{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "45c8dd54-1683-40cc-8b0b-bbef62882a76",
   "metadata": {},
   "source": [
    "# Single meal: Baobab, Honey, and Antelope\n",
    "\n",
    "Here we will build a medium based on the anticipated metabolites present in the human gut for a meal matched to the Hadza people in Tanzania. Please be aware that this is unlikely to represent the full diversity of the diet of the Hadza.\n",
    "\n",
    "The composition of the meal is the following:\n",
    "\n",
    "- 200g Baobab\n",
    "- 50g Baobab seeds (mixed with water)\n",
    "- 100g of Honey\n",
    "- 100g of Antelope meat\n",
    "\n",
    "This sums up to 1051 kcal.\n",
    "\n",
    "We will start by reading individual tables for the specific foods and scale the abundances."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "id": "be313cb2-1e5d-4766-bba1-7479835c5925",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "meal = {\n",
    "    \"Baobab pulp\": 200,\n",
    "    \"Baobab seed\": 50,\n",
    "    \"Honey\": 100,\n",
    "    \"Antelope\": 100\n",
    "}\n",
    "\n",
    "foods = []\n",
    "for food in meal:\n",
    "    mets = pd.read_excel(\"../data/foods_diets.xlsx\", food)\n",
    "    mets[\"amount_g\"] = mets.relative_abundance / mets.relative_abundance.sum() * meal[food]\n",
    "    mets[\"concentration\"] = mets[\"amount_g\"] / mets[\"mw\"] * 1000.0  # to yield mmol/meal\n",
    "    mets[\"flux\"] = mets[\"concentration\"] / 8.0\n",
    "    mets[\"food\"] = food\n",
    "    foods.append(mets)\n",
    "foods = pd.concat(foods)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "693840a9-6642-4a73-b2d4-3d028ec49d55",
   "metadata": {},
   "source": [
    "Now we combine the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "id": "8b0f6c2a-f57c-4cfb-ac7e-c73ee9c0bb9e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>4abut</td>\n",
       "      <td>0.724400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lcystin</td>\n",
       "      <td>0.002091</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ab14lac_L</td>\n",
       "      <td>0.258269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>acgal</td>\n",
       "      <td>1.051659</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>acon_C</td>\n",
       "      <td>1.122785</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  metabolite      flux\n",
       "0      4abut  0.724400\n",
       "1    Lcystin  0.002091\n",
       "2  ab14lac_L  0.258269\n",
       "3      acgal  1.051659\n",
       "4     acon_C  1.122785"
      ]
     },
     "execution_count": 87,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "foods.loc[foods.metabolite == \"h2o\", \"flux\"] += 1000.0 / 18.01 * 1000.0 / 8.0 # add 1L water\n",
    "diet = foods.dropna(subset=[\"metabolite\"]).groupby(\"metabolite\").flux.sum().reset_index()\n",
    "diet.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a1221035-cf11-44d8-aee8-784e2d8d4da8",
   "metadata": {},
   "source": [
    "## Adjust for intestinal adsorption\n",
    "\n",
    "To achieve this we will load the Recon3 human model. AGORA and Recon IDs are very similar so we should be able to match them. We just have to adjust the Recon3 ones a bit. We start by identifying all available exchanges in Recon3 and adjusting the IDs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "id": "eea9b19f-530f-48c9-a1d3-636a1cb6b72d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     5adtststerone\n",
       "1    5adtststerones\n",
       "2             5fthf\n",
       "3             5htrp\n",
       "4             5mthf\n",
       "dtype: object"
      ]
     },
     "execution_count": 88,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from cobra.io import read_sbml_model\n",
    "import pandas as pd\n",
    "\n",
    "recon3 = read_sbml_model(\"../data/Recon3D.xml.gz\")\n",
    "exchanges = pd.Series([r.id for r in recon3.exchanges])\n",
    "exchanges = exchanges.str.replace(\"__\", \"_\").str.replace(\"_e$|EX_\", \"\", regex=True)\n",
    "exchanges.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "id": "730fbbb8-88bf-4961-a61a-64bb1e910a59",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.2    52\n",
       "1.0    19\n",
       "Name: dilution, dtype: int64"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "diet[\"dilution\"] = 1.0\n",
    "diet.loc[diet.metabolite.isin(exchanges), \"dilution\"] = 0.2\n",
    "diet[\"flux\"] = diet[\"flux\"] * diet[\"dilution\"] \n",
    "diet[[\"metabolite\", \"dilution\"]].drop_duplicates().dilution.value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee3884b4-5592-424a-a157-ee7cce0d9d77",
   "metadata": {},
   "source": [
    "## Adding host supplied components\n",
    "\n",
    "Finally we add the host metabolites such as primary bile acids and mucins and a minuscule amount of oxygen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "id": "c2a5763e-0569-455d-9bac-cb32a77ac4bf",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>dilution</th>\n",
       "      <th>reaction</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>4abut</td>\n",
       "      <td>0.144880</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_4abut(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lcystin</td>\n",
       "      <td>0.002091</td>\n",
       "      <td>1.0</td>\n",
       "      <td>EX_Lcystin(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ab14lac_L</td>\n",
       "      <td>0.258269</td>\n",
       "      <td>1.0</td>\n",
       "      <td>EX_ab14lac_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>acgal</td>\n",
       "      <td>0.210332</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_acgal(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>acon_C</td>\n",
       "      <td>1.122785</td>\n",
       "      <td>1.0</td>\n",
       "      <td>EX_acon_C(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>82</th>\n",
       "      <td>gncore2_rl</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_gncore2_rl(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>83</th>\n",
       "      <td>core7</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_core7(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>84</th>\n",
       "      <td>gchola</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_gchola(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>85</th>\n",
       "      <td>tchola</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_tchola(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>86</th>\n",
       "      <td>o2</td>\n",
       "      <td>0.001000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_o2(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>87 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    metabolite      flux  dilution          reaction\n",
       "0        4abut  0.144880       0.2       EX_4abut(e)\n",
       "1      Lcystin  0.002091       1.0     EX_Lcystin(e)\n",
       "2    ab14lac_L  0.258269       1.0   EX_ab14lac_L(e)\n",
       "3        acgal  0.210332       0.2       EX_acgal(e)\n",
       "4       acon_C  1.122785       1.0      EX_acon_C(e)\n",
       "..         ...       ...       ...               ...\n",
       "82  gncore2_rl  1.000000       NaN  EX_gncore2_rl(e)\n",
       "83       core7  1.000000       NaN       EX_core7(e)\n",
       "84      gchola  1.000000       NaN      EX_gchola(e)\n",
       "85      tchola  1.000000       NaN      EX_tchola(e)\n",
       "86          o2  0.001000       NaN          EX_o2(e)\n",
       "\n",
       "[87 rows x 4 columns]"
      ]
     },
     "execution_count": 90,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "diet.set_index(\"metabolite\", inplace=True)\n",
    "annotations = pd.read_csv(\"../data/agora_metabolites.csv\")\n",
    "\n",
    "# mucin\n",
    "for met in annotations.loc[annotations.metabolite.str.contains(\"core\"), \"metabolite\"]:\n",
    "    diet.loc[met, \"flux\"] = 1\n",
    "\n",
    "# primary BAs\n",
    "for met in [\"gchola\", \"tchola\"]:\n",
    "    diet.loc[met, \"flux\"] = 1\n",
    "\n",
    "# anaerobic\n",
    "diet.loc[\"o2\", \"flux\"] = 0.001\n",
    "\n",
    "diet.reset_index(inplace=True)\n",
    "diet[\"reaction\"] = \"EX_\" + diet.metabolite + \"(e)\"\n",
    "diet"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c3024830-4534-4786-8665-932dd02b7b6a",
   "metadata": {},
   "source": [
    "And we will merge this table with some annotations to make it more accessible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "id": "8ddab049-2e20-488b-aa9e-8894698a105d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>dilution</th>\n",
       "      <th>reaction</th>\n",
       "      <th>name</th>\n",
       "      <th>hmdb</th>\n",
       "      <th>kegg.compound</th>\n",
       "      <th>pubchem.compound</th>\n",
       "      <th>inchi</th>\n",
       "      <th>chebi</th>\n",
       "      <th>global_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>4abut</td>\n",
       "      <td>0.144880</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_4abut_m</td>\n",
       "      <td>4-Aminobutanoate</td>\n",
       "      <td>HMDB00112</td>\n",
       "      <td>C00334</td>\n",
       "      <td>119.0</td>\n",
       "      <td>InChI=1S/C4H9NO2/c5-3-1-2-4(6)7/h1-3,5H2,(H,6,7)</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_4abut(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Lcystin</td>\n",
       "      <td>0.002091</td>\n",
       "      <td>1.0</td>\n",
       "      <td>EX_Lcystin_m</td>\n",
       "      <td>L-cystine</td>\n",
       "      <td>HMDB00192</td>\n",
       "      <td>C00491</td>\n",
       "      <td>NaN</td>\n",
       "      <td>InChI=1S/C6H12N2O4S2/c7-3(5(9)10)1-13-14-2-4(8...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_Lcystin(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ab14lac_L</td>\n",
       "      <td>0.258269</td>\n",
       "      <td>1.0</td>\n",
       "      <td>EX_ab14lac_L_m</td>\n",
       "      <td>L-Arabinono-1,4-lactone</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ab14lac_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>acgal</td>\n",
       "      <td>0.210332</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_acgal_m</td>\n",
       "      <td>N-acetyl-D-galactosamine</td>\n",
       "      <td>HMDB00212</td>\n",
       "      <td>C01132</td>\n",
       "      <td>35717.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_acgal(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>acon_C</td>\n",
       "      <td>1.122785</td>\n",
       "      <td>1.0</td>\n",
       "      <td>EX_acon_C_m</td>\n",
       "      <td>cis-Aconitate</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_acon_C(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  metabolite      flux  dilution        reaction                      name  \\\n",
       "0      4abut  0.144880       0.2      EX_4abut_m          4-Aminobutanoate   \n",
       "1    Lcystin  0.002091       1.0    EX_Lcystin_m                 L-cystine   \n",
       "2  ab14lac_L  0.258269       1.0  EX_ab14lac_L_m   L-Arabinono-1,4-lactone   \n",
       "3      acgal  0.210332       0.2      EX_acgal_m  N-acetyl-D-galactosamine   \n",
       "4     acon_C  1.122785       1.0     EX_acon_C_m             cis-Aconitate   \n",
       "\n",
       "        hmdb kegg.compound  pubchem.compound  \\\n",
       "0  HMDB00112        C00334             119.0   \n",
       "1  HMDB00192        C00491               NaN   \n",
       "2        NaN           NaN               NaN   \n",
       "3  HMDB00212        C01132           35717.0   \n",
       "4        NaN           NaN               NaN   \n",
       "\n",
       "                                               inchi chebi        global_id  \n",
       "0   InChI=1S/C4H9NO2/c5-3-1-2-4(6)7/h1-3,5H2,(H,6,7)   NaN      EX_4abut(e)  \n",
       "1  InChI=1S/C6H12N2O4S2/c7-3(5(9)10)1-13-14-2-4(8...   NaN    EX_Lcystin(e)  \n",
       "2                                                NaN   NaN  EX_ab14lac_L(e)  \n",
       "3                                                NaN   NaN      EX_acgal(e)  \n",
       "4                                                NaN   NaN     EX_acon_C(e)  "
      ]
     },
     "execution_count": 91,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "skeleton = pd.merge(diet, annotations, on=\"metabolite\")\n",
    "\n",
    "skeleton[\"global_id\"] = skeleton.reaction\n",
    "skeleton[\"reaction\"] = \"EX_\" + skeleton.metabolite + \"_m\"\n",
    "skeleton.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e958b78-e7a3-4277-a632-044ac8b9d36d",
   "metadata": {},
   "source": [
    "## Complete the medium\n",
    "\n",
    "Great we now have a pretty good skeleton. One issue that this will never be fully complete. There will always be some components missing that are essential for microbial growth. Fortunately, we provide a algorithm in MICOM to complete a medium with the smallest set of additional components to provide growth to all intestinal taxa."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "id": "6ff1606f-40ad-4e89-a5f4-909b6847e7f8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Completing <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">818</span> strain-level models on a medium with <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">87</span> components <span style=\"font-weight: bold\">(</span><span style=\"color: #800000; text-decoration-color: #800000; font-weight: bold\">1</span><span style=\"color: #800000; text-decoration-color: #800000\"> strict</span><span style=\"font-weight: bold\">)</span>.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Completing \u001b[1;36m818\u001b[0m strain-level models on a medium with \u001b[1;36m87\u001b[0m components \u001b[1m(\u001b[0m\u001b[1;31m1\u001b[0m\u001b[31m strict\u001b[0m\u001b[1m)\u001b[0m.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4965aea19a7e43a3bb350fb1fba1097f",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Obtained growth for <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">815</span> models adding additional flux of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.82</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2940.27</span> on average.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Obtained growth for \u001b[1;36m815\u001b[0m models adding additional flux of \u001b[1;36m0.82\u001b[0m/\u001b[1;36m2940.27\u001b[0m on average.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from micom.workflows.db_media import complete_db_medium\n",
    "\n",
    "manifest, imports = complete_db_medium(\n",
    "    \"../data/agora103_strain.qza\", \n",
    "    medium=skeleton, \n",
    "    growth=0.1, \n",
    "    threads=12, \n",
    "    max_added_import=10,  # do not add more than 10 mmol/h of flux per component\n",
    "    strict=[\"EX_o2(e)\"],  # force anaerobic environment\n",
    "    weights=\"mass\"        # minimize added molecular weight\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "id": "32bd665c-ce23-4b28-bfd1-23adfdfd4743",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True     815\n",
       "False      3\n",
       "Name: can_grow, dtype: int64"
      ]
     },
     "execution_count": 93,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "manifest.can_grow.value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "id": "4dd91dec-23c7-4699-8d31-9d3a64680280",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added flux is 153.99/3143.57 mmol/h.\n"
     ]
    }
   ],
   "source": [
    "filled = imports.max()\n",
    "added = filled.sum() - skeleton.loc[skeleton.reaction.isin(filled.index), \"flux\"].sum()\n",
    "\n",
    "print(f\"Added flux is {added.sum():.2f}/{filled.sum():.2f} mmol/h.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b5962e5-3c91-438a-aaec-9f764096e29d",
   "metadata": {},
   "source": [
    "Let's see what was added in large amounts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "id": "8d324305-1e05-4f50-84e1-e6440d66e18c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "EX_h_m         10.000000\n",
       "EX_h2_m        10.000000\n",
       "EX_glyleu_m    10.000000\n",
       "EX_no_m        10.000000\n",
       "EX_ser_L_m     10.000000\n",
       "EX_glyc_m       8.883167\n",
       "EX_succ_m       8.277121\n",
       "EX_co2_m        7.558247\n",
       "EX_asn_L_m      6.665649\n",
       "EX_for_m        5.926777\n",
       "EX_urea_m       4.965212\n",
       "EX_mal_L_m      4.867167\n",
       "EX_acald_m      4.838348\n",
       "EX_cytd_m       4.458517\n",
       "EX_ph2s_m       4.322594\n",
       "EX_acac_m       4.121076\n",
       "EX_fum_m        3.904283\n",
       "EX_nh4_m        3.869570\n",
       "EX_no3_m        3.851188\n",
       "EX_arg_L_m      3.519534\n",
       "dtype: float64"
      ]
     },
     "execution_count": 95,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "added_fluxes = filled.copy()\n",
    "shared = added_fluxes.index[added_fluxes.index.isin(skeleton.reaction)]\n",
    "added_fluxes[shared] -= skeleton.flux[shared]\n",
    "added_fluxes.sort_values(ascending=False)[0:20]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad36b333-bdc5-4a4f-9c90-58cfc3845f94",
   "metadata": {},
   "source": [
    "Looks okay. So we will now assemble the final medium. For this we add the new components to each sample and rebuild the annotations for a nicely formatted medium."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "id": "2abf0ad7-27c0-4d3f-b813-711c7b6874d5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>name</th>\n",
       "      <th>hmdb</th>\n",
       "      <th>kegg.compound</th>\n",
       "      <th>pubchem.compound</th>\n",
       "      <th>inchi</th>\n",
       "      <th>chebi</th>\n",
       "      <th>reaction</th>\n",
       "      <th>global_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>0.427871</td>\n",
       "      <td>L-alanine</td>\n",
       "      <td>HMDB00161</td>\n",
       "      <td>C00041</td>\n",
       "      <td>5950.0</td>\n",
       "      <td>InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ala_L_m</td>\n",
       "      <td>EX_ala_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>3.769624</td>\n",
       "      <td>L-argininium(1+)</td>\n",
       "      <td>HMDB00517</td>\n",
       "      <td>C00062</td>\n",
       "      <td>6322.0</td>\n",
       "      <td>InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_arg_L_m</td>\n",
       "      <td>EX_arg_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>asn_L</td>\n",
       "      <td>6.665649</td>\n",
       "      <td>L-asparagine</td>\n",
       "      <td>HMDB00168</td>\n",
       "      <td>C00152</td>\n",
       "      <td>6267.0</td>\n",
       "      <td>InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asn_L_m</td>\n",
       "      <td>EX_asn_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>3.163472</td>\n",
       "      <td>L-aspartate(1-)</td>\n",
       "      <td>HMDB00191</td>\n",
       "      <td>C00049</td>\n",
       "      <td>5960.0</td>\n",
       "      <td>InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asp_L_m</td>\n",
       "      <td>EX_asp_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ca2</td>\n",
       "      <td>2.710538</td>\n",
       "      <td>calcium(2+)</td>\n",
       "      <td>HMDB00464</td>\n",
       "      <td>C00076</td>\n",
       "      <td>NaN</td>\n",
       "      <td>InChI=1S/Ca/q+2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ca2_m</td>\n",
       "      <td>EX_ca2(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>173</th>\n",
       "      <td>MGlcn54</td>\n",
       "      <td>0.009248</td>\n",
       "      <td>mucin-type O-glycan No 54</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_MGlcn54_m</td>\n",
       "      <td>EX_MGlcn54(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>174</th>\n",
       "      <td>n2</td>\n",
       "      <td>0.028071</td>\n",
       "      <td>Nitrogen</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_n2_m</td>\n",
       "      <td>EX_n2(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>175</th>\n",
       "      <td>cmp</td>\n",
       "      <td>0.008410</td>\n",
       "      <td>CMP</td>\n",
       "      <td>HMDB00095</td>\n",
       "      <td>C00055</td>\n",
       "      <td>6131.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_cmp_m</td>\n",
       "      <td>EX_cmp(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>176</th>\n",
       "      <td>datp</td>\n",
       "      <td>0.002391</td>\n",
       "      <td>dATP</td>\n",
       "      <td>HMDB01532</td>\n",
       "      <td>C00131</td>\n",
       "      <td>15993.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_datp_m</td>\n",
       "      <td>EX_datp(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>177</th>\n",
       "      <td>so3</td>\n",
       "      <td>1.072348</td>\n",
       "      <td>Sulfite</td>\n",
       "      <td>HMDB00240</td>\n",
       "      <td>C00094</td>\n",
       "      <td>1100.0</td>\n",
       "      <td>InChI=1S/H2O3S/c1-4(2)3/h(H2,1,2,3)/p-2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_so3_m</td>\n",
       "      <td>EX_so3(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>178 rows × 10 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    metabolite      flux                       name       hmdb kegg.compound  \\\n",
       "0        ala_L  0.427871                  L-alanine  HMDB00161        C00041   \n",
       "1        arg_L  3.769624           L-argininium(1+)  HMDB00517        C00062   \n",
       "2        asn_L  6.665649               L-asparagine  HMDB00168        C00152   \n",
       "3        asp_L  3.163472            L-aspartate(1-)  HMDB00191        C00049   \n",
       "4          ca2  2.710538                calcium(2+)  HMDB00464        C00076   \n",
       "..         ...       ...                        ...        ...           ...   \n",
       "173    MGlcn54  0.009248  mucin-type O-glycan No 54        NaN           NaN   \n",
       "174         n2  0.028071                   Nitrogen        NaN           NaN   \n",
       "175        cmp  0.008410                        CMP  HMDB00095        C00055   \n",
       "176       datp  0.002391                       dATP  HMDB01532        C00131   \n",
       "177        so3  1.072348                    Sulfite  HMDB00240        C00094   \n",
       "\n",
       "     pubchem.compound                                              inchi  \\\n",
       "0              5950.0  InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...   \n",
       "1              6322.0  InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...   \n",
       "2              6267.0  InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...   \n",
       "3              5960.0  InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...   \n",
       "4                 NaN                                    InChI=1S/Ca/q+2   \n",
       "..                ...                                                ...   \n",
       "173               NaN                                                NaN   \n",
       "174               NaN                                                NaN   \n",
       "175            6131.0                                                NaN   \n",
       "176           15993.0                                                NaN   \n",
       "177            1100.0            InChI=1S/H2O3S/c1-4(2)3/h(H2,1,2,3)/p-2   \n",
       "\n",
       "    chebi      reaction      global_id  \n",
       "0     NaN    EX_ala_L_m    EX_ala_L(e)  \n",
       "1     NaN    EX_arg_L_m    EX_arg_L(e)  \n",
       "2     NaN    EX_asn_L_m    EX_asn_L(e)  \n",
       "3     NaN    EX_asp_L_m    EX_asp_L(e)  \n",
       "4     NaN      EX_ca2_m      EX_ca2(e)  \n",
       "..    ...           ...            ...  \n",
       "173   NaN  EX_MGlcn54_m  EX_MGlcn54(e)  \n",
       "174   NaN       EX_n2_m       EX_n2(e)  \n",
       "175   NaN      EX_cmp_m      EX_cmp(e)  \n",
       "176   NaN     EX_datp_m     EX_datp(e)  \n",
       "177   NaN      EX_so3_m      EX_so3(e)  \n",
       "\n",
       "[178 rows x 10 columns]"
      ]
     },
     "execution_count": 96,
     "metadata": {},
     "output_type": "execute_result"
    }
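   ],
   "source": [
    "# Keep only the components with a non-negligible flux.\n",
    "added_df = filled[filled > 1e-8].reset_index()\n",
    "# Strip the leading \"EX_\" and trailing \"_m\" from the reaction IDs to recover metabolite IDs.\n",
    "added_df.iloc[:, 0] = added_df.iloc[:, 0].str.replace(\"^EX_|_m$\", \"\", regex=True)\n",
    "added_df.columns = [\"metabolite\", \"flux\"]\n",
    "\n",
    "# A left join keeps components that have no annotation entry.\n",
    "completed = pd.merge(added_df, annotations, on=\"metabolite\", how=\"left\")\n",
    "# Rebuild the medium and global exchange reaction IDs from the metabolite IDs.\n",
    "completed[\"reaction\"] = \"EX_\" + completed.metabolite + \"_m\"\n",
    "completed[\"global_id\"] = \"EX_\" + completed.metabolite + \"(e)\"\n",
    "completed"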
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7633139b-8f41-4d49-b833-10f2c5ab0a41",
   "metadata": {},
   "source": [
    "## Validate the medium\n",
    "\n",
    "We now check whether the strain-level models in the database can grow on the completed medium."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "id": "6962f533-2cab-4b4c-9e6b-8c2ccf0f5dec",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Checking <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">818</span> strain-level models on a medium with <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">178</span> components.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Checking \u001b[1;36m818\u001b[0m strain-level models on a medium with \u001b[1;36m178\u001b[0m components.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "92133371410143e19aaf2f7ddd146ea8",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from micom.workflows.db_media import check_db_medium\n",
    "\n",
    "# Test every model in the database for growth on the completed medium.\n",
    "check = check_db_medium(\"../data/agora103_strain.qza\", medium=completed, threads=12)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "id": "2b509db7-f18d-4f1e-82fc-ca86206fa79d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "count    818.000000\n",
       "mean       0.190551\n",
       "std        0.105812\n",
       "min        0.036577\n",
       "25%        0.100000\n",
       "50%        0.156764\n",
       "75%        0.256409\n",
       "max        0.646666\n",
       "Name: growth_rate, dtype: float64"
      ]
     },
     "execution_count": 98,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "check.growth_rate.describe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "id": "7baa0acf-95a4-47f2-bb29-616128d549ad",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'../media/baobab_honey_antelope.qza'"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import qiime2 as q2\n",
    "\n",
    "# Save the completed medium as a QIIME 2 artifact.\n",
    "arti = q2.Artifact.import_data(\"MicomMedium[Global]\", completed)\n",
    "arti.save(\"../media/baobab_honey_antelope.qza\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}