{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "45c8dd54-1683-40cc-8b0b-bbef62882a76",
   "metadata": {},
   "source": [
    "# Single meal: in the Himalayas\n",
    "\n",
    "Here we will build a medium based on the metabolites expected in the human gut after a meal matched to the diet of the Chepang people in Nepal. Please be aware that a single meal is unlikely to represent the full diversity of the diet.\n",
    "\n",
    "The composition of the meal is the following:\n",
    "\n",
    "- 450g of mountain/air yam (cooked)\n",
    "- 250g cooked cowpea\n",
    "- 500g stinging nettles (weight before cooking)\n",
    "- 100g cooked chicken\n",
    "\n",
    "This sums up to 1020.5 kcal.\n",
    "\n",
    "We will start by reading the individual tables for the specific foods and scaling the abundances to the amounts in the meal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "be313cb2-1e5d-4766-bba1-7479835c5925",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# grams of each food in the meal\n",
    "meal = {\n",
    "    \"Mountain yam\": 450,\n",
    "    \"Cowpea\": 250,\n",
    "    \"Himalayan Stinging Nettle\": 500,\n",
    "    \"Chicken\": 100\n",
    "}\n",
    "\n",
    "foods = []\n",
    "for food in meal:\n",
    "    # each food has its own sheet in the workbook\n",
    "    mets = pd.read_excel(\"../data/foods_diets.xlsx\", sheet_name=food)\n",
    "    # distribute the mass of the food (g) across its metabolites by relative abundance\n",
    "    mets[\"amount_g\"] = mets.relative_abundance / mets.relative_abundance.sum() * meal[food]\n",
    "    mets[\"concentration\"] = mets[\"amount_g\"] / mets[\"mw\"] * 1000.0  # g / (g/mol) * 1000 yields mmol/meal\n",
    "    mets[\"flux\"] = mets[\"concentration\"] / 8.0  # divide by 8 to yield mmol/h\n",
    "    mets[\"food\"] = food\n",
    "    foods.append(mets)\n",
    "foods = pd.concat(foods)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9be60e55-e72e-4913-8625-4d66a3079edb",
   "metadata": {},
   "source": [
    "Now we add 1 L of water and combine the data by summing the flux of each metabolite across all foods."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "d21a00b4-278a-4b99-a9f3-32fcde652738",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>16.966227</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>13.388907</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ascb_L</td>\n",
       "      <td>0.030642</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>asn_L</td>\n",
       "      <td>5.593654</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>19.277857</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  metabolite       flux\n",
       "0      ala_L  16.966227\n",
       "1      arg_L  13.388907\n",
       "2     ascb_L   0.030642\n",
       "3      asn_L   5.593654\n",
       "4      asp_L  19.277857"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
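   ],
   "source": [
    "# add 1 L of water: 1000 g / 18.01 g/mol * 1000 mmol/mol, divided by 8 to yield mmol/h\n",
    "# note: the increment is applied to every row where metabolite == \"h2o\"\n",
    "foods.loc[foods.metabolite == \"h2o\", \"flux\"] += 1000.0 / 18.01 * 1000.0 / 8.0\n",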
    "diet = foods.dropna(subset=[\"metabolite\"]).groupby(\"metabolite\").flux.sum().reset_index()\n",
    "diet.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cfe7d9de-6a82-4106-8b41-6a81fa84a742",
   "metadata": {},
   "source": [
    "## Adjust for intestinal absorption\n",
    "\n",
    "To achieve this we will load the Recon3D human model. AGORA and Recon IDs are very similar, so we should be able to match them after adjusting the Recon3D IDs a bit. We start by identifying all available exchanges in Recon3D and adjusting their IDs. Metabolites with a matching human exchange are then diluted to 20% of their original flux."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "75b2c2f1-e575-4085-a0d9-edc88de26801",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     5adtststerone\n",
       "1    5adtststerones\n",
       "2             5fthf\n",
       "3             5htrp\n",
       "4             5mthf\n",
       "dtype: object"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from cobra.io import read_sbml_model\n",
    "import pandas as pd\n",
    "\n",
    "recon3 = read_sbml_model(\"../data/Recon3D.xml.gz\")\n",
    "exchanges = pd.Series([r.id for r in recon3.exchanges])\n",
    "exchanges = exchanges.str.replace(\"__\", \"_\").str.replace(\"_e$|EX_\", \"\", regex=True)\n",
    "exchanges.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "d3dec0ed-7ba5-41a1-9234-4fb76ec1e71e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.2    47\n",
       "1.0     9\n",
       "Name: dilution, dtype: int64"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "diet[\"dilution\"] = 1.0\n",
    "# metabolites that match a Recon3D exchange keep only 20% of their flux\n",
    "diet.loc[diet.metabolite.isin(exchanges), \"dilution\"] = 0.2\n",
    "diet[\"flux\"] = diet[\"flux\"] * diet[\"dilution\"]\n",
    "diet[[\"metabolite\", \"dilution\"]].drop_duplicates().dilution.value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "88e48f71-e76f-4462-be41-61578af7f9b0",
   "metadata": {},
   "source": [
    "## Adding host-supplied components\n",
    "\n",
    "Finally, we add host-supplied metabolites, such as primary bile acids and mucins, along with a minuscule amount of oxygen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "5e7d1abb-ba90-4341-aec5-d7361210328f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>dilution</th>\n",
       "      <th>reaction</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>3.393245</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_ala_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>2.677781</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_arg_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ascb_L</td>\n",
       "      <td>0.006128</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_ascb_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>asn_L</td>\n",
       "      <td>1.118731</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_asn_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>3.855571</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_asp_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>67</th>\n",
       "      <td>gncore2_rl</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_gncore2_rl(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>68</th>\n",
       "      <td>core7</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_core7(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>69</th>\n",
       "      <td>gchola</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_gchola(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>70</th>\n",
       "      <td>tchola</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_tchola(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>71</th>\n",
       "      <td>o2</td>\n",
       "      <td>0.001000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_o2(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>72 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    metabolite      flux  dilution          reaction\n",
       "0        ala_L  3.393245       0.2       EX_ala_L(e)\n",
       "1        arg_L  2.677781       0.2       EX_arg_L(e)\n",
       "2       ascb_L  0.006128       0.2      EX_ascb_L(e)\n",
       "3        asn_L  1.118731       0.2       EX_asn_L(e)\n",
       "4        asp_L  3.855571       0.2       EX_asp_L(e)\n",
       "..         ...       ...       ...               ...\n",
       "67  gncore2_rl  1.000000       NaN  EX_gncore2_rl(e)\n",
       "68       core7  1.000000       NaN       EX_core7(e)\n",
       "69      gchola  1.000000       NaN      EX_gchola(e)\n",
       "70      tchola  1.000000       NaN      EX_tchola(e)\n",
       "71          o2  0.001000       NaN          EX_o2(e)\n",
       "\n",
       "[72 rows x 4 columns]"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "diet.set_index(\"metabolite\", inplace=True)\n",
    "annotations = pd.read_csv(\"../data/agora_metabolites.csv\")\n",
    "\n",
    "# mucin\n",
    "for met in annotations.loc[annotations.metabolite.str.contains(\"core\"), \"metabolite\"]:\n",
    "    diet.loc[met, \"flux\"] = 1\n",
    "\n",
    "# primary BAs\n",
    "for met in [\"gchola\", \"tchola\"]:\n",
    "    diet.loc[met, \"flux\"] = 1\n",
    "\n",
    "# anaerobic\n",
    "diet.loc[\"o2\", \"flux\"] = 0.001\n",
    "\n",
    "diet.reset_index(inplace=True)\n",
    "diet[\"reaction\"] = \"EX_\" + diet.metabolite + \"(e)\"\n",
    "diet"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7483514f-757f-4b60-8587-eee86bfc2ad0",
   "metadata": {},
   "source": [
    "We then merge this table with the metabolite annotations to make it more accessible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "e7d3e530-90fc-4499-b9b2-9e81cd4c1ce5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>dilution</th>\n",
       "      <th>reaction</th>\n",
       "      <th>name</th>\n",
       "      <th>hmdb</th>\n",
       "      <th>kegg.compound</th>\n",
       "      <th>pubchem.compound</th>\n",
       "      <th>inchi</th>\n",
       "      <th>chebi</th>\n",
       "      <th>global_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>3.393245</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_ala_L_m</td>\n",
       "      <td>L-alanine</td>\n",
       "      <td>HMDB00161</td>\n",
       "      <td>C00041</td>\n",
       "      <td>5950.0</td>\n",
       "      <td>InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ala_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>2.677781</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_arg_L_m</td>\n",
       "      <td>L-argininium(1+)</td>\n",
       "      <td>HMDB00517</td>\n",
       "      <td>C00062</td>\n",
       "      <td>6322.0</td>\n",
       "      <td>InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_arg_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ascb_L</td>\n",
       "      <td>0.006128</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_ascb_L_m</td>\n",
       "      <td>L-ascorbate</td>\n",
       "      <td>HMDB00044</td>\n",
       "      <td>C00072</td>\n",
       "      <td>NaN</td>\n",
       "      <td>InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ascb_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>asn_L</td>\n",
       "      <td>1.118731</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_asn_L_m</td>\n",
       "      <td>L-asparagine</td>\n",
       "      <td>HMDB00168</td>\n",
       "      <td>C00152</td>\n",
       "      <td>6267.0</td>\n",
       "      <td>InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asn_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>3.855571</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_asp_L_m</td>\n",
       "      <td>L-aspartate(1-)</td>\n",
       "      <td>HMDB00191</td>\n",
       "      <td>C00049</td>\n",
       "      <td>5960.0</td>\n",
       "      <td>InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asp_L(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  metabolite      flux  dilution     reaction              name       hmdb  \\\n",
       "0      ala_L  3.393245       0.2   EX_ala_L_m         L-alanine  HMDB00161   \n",
       "1      arg_L  2.677781       0.2   EX_arg_L_m  L-argininium(1+)  HMDB00517   \n",
       "2     ascb_L  0.006128       0.2  EX_ascb_L_m       L-ascorbate  HMDB00044   \n",
       "3      asn_L  1.118731       0.2   EX_asn_L_m      L-asparagine  HMDB00168   \n",
       "4      asp_L  3.855571       0.2   EX_asp_L_m   L-aspartate(1-)  HMDB00191   \n",
       "\n",
       "  kegg.compound  pubchem.compound  \\\n",
       "0        C00041            5950.0   \n",
       "1        C00062            6322.0   \n",
       "2        C00072               NaN   \n",
       "3        C00152            6267.0   \n",
       "4        C00049            5960.0   \n",
       "\n",
       "                                               inchi chebi     global_id  \n",
       "0  InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...   NaN   EX_ala_L(e)  \n",
       "1  InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...   NaN   EX_arg_L(e)  \n",
       "2  InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/...   NaN  EX_ascb_L(e)  \n",
       "3  InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...   NaN   EX_asn_L(e)  \n",
       "4  InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...   NaN   EX_asp_L(e)  "
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "skeleton = pd.merge(diet, annotations, on=\"metabolite\")\n",
    "\n",
    "skeleton[\"global_id\"] = skeleton.reaction\n",
    "skeleton[\"reaction\"] = \"EX_\" + skeleton.metabolite + \"_m\"\n",
    "skeleton.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6aaf4682-027e-4a8b-a7e9-ae3033694644",
   "metadata": {},
   "source": [
    "## Complete the medium\n",
    "\n",
    "Great, we now have a pretty good skeleton. One issue is that it will never be fully complete: there will always be some missing components that are essential for microbial growth. Fortunately, MICOM provides an algorithm that completes a medium with the smallest set of additional components needed to allow growth of all intestinal taxa."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "614e9d55-43c6-4ce7-9483-8a7551a1aff5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Completing <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">818</span> strain-level models on a medium with <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">72</span> components <span style=\"font-weight: bold\">(</span><span style=\"color: #800000; text-decoration-color: #800000; font-weight: bold\">1</span><span style=\"color: #800000; text-decoration-color: #800000\"> strict</span><span style=\"font-weight: bold\">)</span>.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Completing \u001b[1;36m818\u001b[0m strain-level models on a medium with \u001b[1;36m72\u001b[0m components \u001b[1m(\u001b[0m\u001b[1;31m1\u001b[0m\u001b[31m strict\u001b[0m\u001b[1m)\u001b[0m.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "52ce325470624bd9bab46fbf01ce1957",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Obtained growth for <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">815</span> models adding additional flux of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.62</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6763.81</span> on average.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Obtained growth for \u001b[1;36m815\u001b[0m models adding additional flux of \u001b[1;36m0.62\u001b[0m/\u001b[1;36m6763.81\u001b[0m on average.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from micom.workflows.db_media import complete_db_medium\n",
    "\n",
    "manifest, imports = complete_db_medium(\n",
    "    \"../data/agora103_strain.qza\", \n",
    "    medium=skeleton, \n",
    "    growth=0.1, \n",
    "    threads=12, \n",
    "    max_added_import=10,  # do not add more than 10 mmol/h of flux per component\n",
    "    strict=[\"EX_o2(e)\"],  # force anaerobic environment\n",
    "    weights=\"mass\"        # minimize added molecular weight\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "72675356-b60f-4ee9-821d-46caeac4d4b3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True     815\n",
       "False      3\n",
       "Name: can_grow, dtype: int64"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "manifest.can_grow.value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "6e91940c-1dfb-4884-aa26-4e0ac3ec56f9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added flux is 119.80/6969.38 mmol/h.\n"
     ]
    }
   ],
   "source": [
    "filled = imports.max()\n",
    "added = filled.sum() - skeleton.loc[skeleton.reaction.isin(filled.index), \"flux\"].sum()\n",
    "\n",
    "print(f\"Added flux is {added.sum():.2f}/{filled.sum():.2f} mmol/h.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8aed4f6a-b328-47d0-90d0-2fc706f6b7a4",
   "metadata": {},
   "source": [
    "Let's see what was added in large amounts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "ebde289e-4e96-4fc7-baed-30e61fb2bcea",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "EX_h_m         10.000000\n",
       "EX_glyleu_m    10.000000\n",
       "EX_h2_m        10.000000\n",
       "EX_glyc_m       8.883167\n",
       "EX_no_m         7.881353\n",
       "EX_ser_L_m      7.426290\n",
       "EX_for_m        7.181399\n",
       "EX_asn_L_m      5.312592\n",
       "EX_urea_m       4.965212\n",
       "EX_ac_m         4.587295\n",
       "EX_fru_m        4.524197\n",
       "EX_cytd_m       4.458517\n",
       "EX_mal_L_m      3.850456\n",
       "EX_co2_m        3.453552\n",
       "EX_no3_m        3.413730\n",
       "EX_ph2s_m       3.074153\n",
       "EX_nh4_m        2.765672\n",
       "EX_no2_m        2.182533\n",
       "EX_glyc3p_m     1.819979\n",
       "EX_fum_m        1.686218\n",
       "dtype: float64"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "added_fluxes = filled.copy()\n",
    "shared = added_fluxes.index[added_fluxes.index.isin(skeleton.reaction)]\n",
    "# index the skeleton fluxes by reaction ID so the subtraction aligns by label\n",
    "added_fluxes[shared] -= skeleton.set_index(\"reaction\").flux[shared]\n",
    "added_fluxes.sort_values(ascending=False).head(20)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49433d80-6a4d-414f-9bc9-45ef15e09498",
   "metadata": {},
   "source": [
    "This looks okay, so we will now assemble the final medium. For this we take the filled-in fluxes for all components, including the newly added ones, and rebuild the annotations to obtain a nicely formatted medium."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "b1bbaaa0-cc1b-4393-abff-1d98fb774ff0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>name</th>\n",
       "      <th>hmdb</th>\n",
       "      <th>kegg.compound</th>\n",
       "      <th>pubchem.compound</th>\n",
       "      <th>inchi</th>\n",
       "      <th>chebi</th>\n",
       "      <th>reaction</th>\n",
       "      <th>global_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>3.393245</td>\n",
       "      <td>L-alanine</td>\n",
       "      <td>HMDB00161</td>\n",
       "      <td>C00041</td>\n",
       "      <td>5950.0</td>\n",
       "      <td>InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ala_L_m</td>\n",
       "      <td>EX_ala_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>3.769624</td>\n",
       "      <td>L-argininium(1+)</td>\n",
       "      <td>HMDB00517</td>\n",
       "      <td>C00062</td>\n",
       "      <td>6322.0</td>\n",
       "      <td>InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_arg_L_m</td>\n",
       "      <td>EX_arg_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>asn_L</td>\n",
       "      <td>6.431323</td>\n",
       "      <td>L-asparagine</td>\n",
       "      <td>HMDB00168</td>\n",
       "      <td>C00152</td>\n",
       "      <td>6267.0</td>\n",
       "      <td>InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asn_L_m</td>\n",
       "      <td>EX_asn_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>3.855571</td>\n",
       "      <td>L-aspartate(1-)</td>\n",
       "      <td>HMDB00191</td>\n",
       "      <td>C00049</td>\n",
       "      <td>5960.0</td>\n",
       "      <td>InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asp_L_m</td>\n",
       "      <td>EX_asp_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>but</td>\n",
       "      <td>0.000322</td>\n",
       "      <td>butyrate</td>\n",
       "      <td>HMDB00039</td>\n",
       "      <td>C00246</td>\n",
       "      <td>264.0</td>\n",
       "      <td>InChI=1S/C4H8O2/c1-2-3-4(5)6/h2-3H2,1H3,(H,5,6...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_but_m</td>\n",
       "      <td>EX_but(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>164</th>\n",
       "      <td>indole</td>\n",
       "      <td>0.005516</td>\n",
       "      <td>Indole</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_indole_m</td>\n",
       "      <td>EX_indole(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>165</th>\n",
       "      <td>cmp</td>\n",
       "      <td>0.008410</td>\n",
       "      <td>CMP</td>\n",
       "      <td>HMDB00095</td>\n",
       "      <td>C00055</td>\n",
       "      <td>6131.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_cmp_m</td>\n",
       "      <td>EX_cmp(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>166</th>\n",
       "      <td>coa</td>\n",
       "      <td>0.000619</td>\n",
       "      <td>Coenzyme A</td>\n",
       "      <td>HMDB01423</td>\n",
       "      <td>C00010</td>\n",
       "      <td>6816.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_coa_m</td>\n",
       "      <td>EX_coa(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>167</th>\n",
       "      <td>datp</td>\n",
       "      <td>0.002391</td>\n",
       "      <td>dATP</td>\n",
       "      <td>HMDB01532</td>\n",
       "      <td>C00131</td>\n",
       "      <td>15993.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_datp_m</td>\n",
       "      <td>EX_datp(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>168</th>\n",
       "      <td>urea</td>\n",
       "      <td>4.965212</td>\n",
       "      <td>Urea</td>\n",
       "      <td>HMDB00294</td>\n",
       "      <td>C00086</td>\n",
       "      <td>1176.0</td>\n",
       "      <td>InChI=1S/CH4N2O/c2-1(3)4/h(H4,2,3,4)</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_urea_m</td>\n",
       "      <td>EX_urea(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>169 rows × 10 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    metabolite      flux              name       hmdb kegg.compound  \\\n",
       "0        ala_L  3.393245         L-alanine  HMDB00161        C00041   \n",
       "1        arg_L  3.769624  L-argininium(1+)  HMDB00517        C00062   \n",
       "2        asn_L  6.431323      L-asparagine  HMDB00168        C00152   \n",
       "3        asp_L  3.855571   L-aspartate(1-)  HMDB00191        C00049   \n",
       "4          but  0.000322          butyrate  HMDB00039        C00246   \n",
       "..         ...       ...               ...        ...           ...   \n",
       "164     indole  0.005516            Indole        NaN           NaN   \n",
       "165        cmp  0.008410               CMP  HMDB00095        C00055   \n",
       "166        coa  0.000619        Coenzyme A  HMDB01423        C00010   \n",
       "167       datp  0.002391              dATP  HMDB01532        C00131   \n",
       "168       urea  4.965212              Urea  HMDB00294        C00086   \n",
       "\n",
       "     pubchem.compound                                              inchi  \\\n",
       "0              5950.0  InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...   \n",
       "1              6322.0  InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...   \n",
       "2              6267.0  InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...   \n",
       "3              5960.0  InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...   \n",
       "4               264.0  InChI=1S/C4H8O2/c1-2-3-4(5)6/h2-3H2,1H3,(H,5,6...   \n",
       "..                ...                                                ...   \n",
       "164               NaN                                                NaN   \n",
       "165            6131.0                                                NaN   \n",
       "166            6816.0                                                NaN   \n",
       "167           15993.0                                                NaN   \n",
       "168            1176.0               InChI=1S/CH4N2O/c2-1(3)4/h(H4,2,3,4)   \n",
       "\n",
       "    chebi     reaction     global_id  \n",
       "0     NaN   EX_ala_L_m   EX_ala_L(e)  \n",
       "1     NaN   EX_arg_L_m   EX_arg_L(e)  \n",
       "2     NaN   EX_asn_L_m   EX_asn_L(e)  \n",
       "3     NaN   EX_asp_L_m   EX_asp_L(e)  \n",
       "4     NaN     EX_but_m     EX_but(e)  \n",
       "..    ...          ...           ...  \n",
       "164   NaN  EX_indole_m  EX_indole(e)  \n",
       "165   NaN     EX_cmp_m     EX_cmp(e)  \n",
       "166   NaN     EX_coa_m     EX_coa(e)  \n",
       "167   NaN    EX_datp_m    EX_datp(e)  \n",
       "168   NaN    EX_urea_m    EX_urea(e)  \n",
       "\n",
       "[169 rows x 10 columns]"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# keep only the components with a non-negligible flux\n",
    "added_df = filled[filled > 1e-8].reset_index()\n",
    "# strip the exchange prefix and medium suffix to recover the metabolite IDs\n",
    "added_df.iloc[:, 0] = added_df.iloc[:, 0].str.replace(\"^EX_|_m$\", \"\", regex=True)\n",
    "added_df.columns = [\"metabolite\", \"flux\"]\n",
    "\n",
    "# attach the annotations and rebuild the exchange reaction IDs\n",
    "completed = pd.merge(added_df, annotations, on=\"metabolite\", how=\"left\")\n",
    "completed[\"reaction\"] = \"EX_\" + completed.metabolite + \"_m\"\n",
    "completed[\"global_id\"] = \"EX_\" + completed.metabolite + \"(e)\"\n",
    "completed"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f59284ae-71d9-40fe-b78e-f2d923d16f55",
   "metadata": {},
   "source": [
    "## Validate the medium\n",
    "\n",
    "We now validate the medium by checking whether the strain-level models in the model database can grow on it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "3a115b65-ef0d-485c-89c0-311779a53e6a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Checking <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">818</span> strain-level models on a medium with <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">169</span> components.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Checking \u001b[1;36m818\u001b[0m strain-level models on a medium with \u001b[1;36m169\u001b[0m components.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "1cf5ca034bc44fc09ffe7ae2a83e2cba",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from micom.workflows.db_media import check_db_medium\n",
    "\n",
    "# check the growth of each model in the database on the completed medium\n",
    "check = check_db_medium(\"../data/agora103_strain.qza\", medium=completed, threads=12)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "11d0cb77-480b-44af-b1e1-e19233aa693d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "count    818.000000\n",
       "mean       0.191422\n",
       "std        0.109046\n",
       "min        0.036577\n",
       "25%        0.100000\n",
       "50%        0.156764\n",
       "75%        0.256409\n",
       "max        0.646666\n",
       "Name: growth_rate, dtype: float64"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "check.growth_rate.describe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "2b5462a6-4e6c-4e5d-8e9b-c6e1245bef5d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'../media/himalaya.qza'"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import qiime2 as q2\n",
    "\n",
    "# store the completed medium as a QIIME 2 artifact\n",
    "arti = q2.Artifact.import_data(\"MicomMedium[Global]\", completed)\n",
    "arti.save(\"../media/himalaya.qza\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}