{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "45c8dd54-1683-40cc-8b0b-bbef62882a76",
   "metadata": {},
   "source": [
    "# Single meal: in the mountains of Guerrero\n",
    "\n",
    "Here we will build a medium based on the anticipated metabolites present in the human gut for a meal matched to the Me'Phaa people in Guerrero, Mexico. Please be aware that this is unlikely to represent the full diversity of the diet.\n",
    "\n",
    "The composition of the meal is the following:\n",
    "\n",
    "- 150g corn (mixed with water)\n",
    "- 200g cooked beans\n",
    "- 300g cooked chayote\n",
    "- 100g cooked chicken\n",
    "\n",
    "This sums up to 1022 kcal.\n",
    "\n",
    "We will start by reading individual tables for the specific foods and scale the abundances."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "be313cb2-1e5d-4766-bba1-7479835c5925",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "meal = {\n",
    "    \"Corn\": 150,\n",
    "    \"Pinto beans\": 200,\n",
    "    \"Chayote\": 300,\n",
    "    \"Chicken\": 100\n",
    "}\n",
    "\n",
    "foods = []\n",
    "for food in meal:\n",
    "    mets = pd.read_excel(\"../data/foods_diets.xlsx\", food)\n",
    "    mets[\"amount_g\"] = mets.relative_abundance / mets.relative_abundance.sum() * meal[food]\n",
    "    mets[\"concentration\"] = mets[\"amount_g\"] / mets[\"mw\"] * 1000.0  # to yield mmol/meal\n",
    "    mets[\"flux\"] = mets[\"concentration\"] / 8.0 # to yield mmol/h\n",
    "    mets[\"food\"] = food\n",
    "    foods.append(mets)\n",
    "foods = pd.concat(foods)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9be60e55-e72e-4913-8625-4d66a3079edb",
   "metadata": {},
   "source": [
    "Now we combine the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "d21a00b4-278a-4b99-a9f3-32fcde652738",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>3.357585</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arach</td>\n",
       "      <td>0.032473</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>1.761688</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ascb_L</td>\n",
       "      <td>0.185524</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>5.841686</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  metabolite      flux\n",
       "0      ala_L  3.357585\n",
       "1      arach  0.032473\n",
       "2      arg_L  1.761688\n",
       "3     ascb_L  0.185524\n",
       "4      asp_L  5.841686"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "foods.loc[foods.metabolite == \"h2o\", \"flux\"] += 1000.0 / 18.01 * 1000.0 / 8.0 # add 1L water\n",
    "diet = foods.dropna(subset=[\"metabolite\"]).groupby(\"metabolite\").flux.sum().reset_index()\n",
    "diet.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cfe7d9de-6a82-4106-8b41-6a81fa84a742",
   "metadata": {},
   "source": [
    "## Adjust for intestinal adsorption\n",
    "\n",
    "To achieve this we will load the Recon3 human model. AGORA and Recon IDs are very similar so we should be able to match them. We just have to adjust the Recon3 ones a bit. We start by identifying all available exchanges in Recon3 and adjusting the IDs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "75b2c2f1-e575-4085-a0d9-edc88de26801",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     5adtststerone\n",
       "1    5adtststerones\n",
       "2             5fthf\n",
       "3             5htrp\n",
       "4             5mthf\n",
       "dtype: object"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from cobra.io import read_sbml_model\n",
    "import pandas as pd\n",
    "\n",
    "recon3 = read_sbml_model(\"../data/Recon3D.xml.gz\")\n",
    "exchanges = pd.Series([r.id for r in recon3.exchanges])\n",
    "exchanges = exchanges.str.replace(\"__\", \"_\").str.replace(\"_e$|EX_\", \"\", regex=True)\n",
    "exchanges.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "id": "d3dec0ed-7ba5-41a1-9234-4fb76ec1e71e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.2    49\n",
       "1.0    10\n",
       "Name: dilution, dtype: int64"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "diet[\"dilution\"] = 1.0\n",
    "diet.loc[diet.metabolite.isin(exchanges), \"dilution\"] = 0.2\n",
    "diet[\"flux\"] = diet[\"flux\"] * diet[\"dilution\"] \n",
    "diet[[\"metabolite\", \"dilution\"]].drop_duplicates().dilution.value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "88e48f71-e76f-4462-be41-61578af7f9b0",
   "metadata": {},
   "source": [
    "## Adding host supplied components\n",
    "\n",
    "Finally we add the host metabolites such as primary bile acids and mucins and a minuscule amount of oxygen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "5e7d1abb-ba90-4341-aec5-d7361210328f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>dilution</th>\n",
       "      <th>reaction</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>0.671517</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_ala_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arach</td>\n",
       "      <td>0.006495</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_arach(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>0.352338</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_arg_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ascb_L</td>\n",
       "      <td>0.037105</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_ascb_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>1.168337</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_asp_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>70</th>\n",
       "      <td>gncore2_rl</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_gncore2_rl(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>71</th>\n",
       "      <td>core7</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_core7(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72</th>\n",
       "      <td>gchola</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_gchola(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>73</th>\n",
       "      <td>tchola</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_tchola(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>74</th>\n",
       "      <td>o2</td>\n",
       "      <td>0.001000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_o2(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>75 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    metabolite      flux  dilution          reaction\n",
       "0        ala_L  0.671517       0.2       EX_ala_L(e)\n",
       "1        arach  0.006495       0.2       EX_arach(e)\n",
       "2        arg_L  0.352338       0.2       EX_arg_L(e)\n",
       "3       ascb_L  0.037105       0.2      EX_ascb_L(e)\n",
       "4        asp_L  1.168337       0.2       EX_asp_L(e)\n",
       "..         ...       ...       ...               ...\n",
       "70  gncore2_rl  1.000000       NaN  EX_gncore2_rl(e)\n",
       "71       core7  1.000000       NaN       EX_core7(e)\n",
       "72      gchola  1.000000       NaN      EX_gchola(e)\n",
       "73      tchola  1.000000       NaN      EX_tchola(e)\n",
       "74          o2  0.001000       NaN          EX_o2(e)\n",
       "\n",
       "[75 rows x 4 columns]"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "diet.set_index(\"metabolite\", inplace=True)\n",
    "annotations = pd.read_csv(\"../data/agora_metabolites.csv\")\n",
    "\n",
    "# mucin\n",
    "for met in annotations.loc[annotations.metabolite.str.contains(\"core\"), \"metabolite\"]:\n",
    "    diet.loc[met, \"flux\"] = 1\n",
    "\n",
    "# primary BAs\n",
    "for met in [\"gchola\", \"tchola\"]:\n",
    "    diet.loc[met, \"flux\"] = 1\n",
    "\n",
    "# anaerobic\n",
    "diet.loc[\"o2\", \"flux\"] = 0.001\n",
    "\n",
    "diet.reset_index(inplace=True)\n",
    "diet[\"reaction\"] = \"EX_\" + diet.metabolite + \"(e)\"\n",
    "diet"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7483514f-757f-4b60-8587-eee86bfc2ad0",
   "metadata": {},
   "source": [
    "And we will merge this table with some annotations to make it more accessible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "id": "e7d3e530-90fc-4499-b9b2-9e81cd4c1ce5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>dilution</th>\n",
       "      <th>reaction</th>\n",
       "      <th>name</th>\n",
       "      <th>hmdb</th>\n",
       "      <th>kegg.compound</th>\n",
       "      <th>pubchem.compound</th>\n",
       "      <th>inchi</th>\n",
       "      <th>chebi</th>\n",
       "      <th>global_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>0.671517</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_ala_L_m</td>\n",
       "      <td>L-alanine</td>\n",
       "      <td>HMDB00161</td>\n",
       "      <td>C00041</td>\n",
       "      <td>5950.0</td>\n",
       "      <td>InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ala_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arach</td>\n",
       "      <td>0.006495</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_arach_m</td>\n",
       "      <td>arachidate</td>\n",
       "      <td>HMDB02212</td>\n",
       "      <td>C06425</td>\n",
       "      <td>10467.0</td>\n",
       "      <td>InChI=1S/C20H40O2/c1-2-3-4-5-6-7-8-9-10-11-12-...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_arach(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>0.352338</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_arg_L_m</td>\n",
       "      <td>L-argininium(1+)</td>\n",
       "      <td>HMDB00517</td>\n",
       "      <td>C00062</td>\n",
       "      <td>6322.0</td>\n",
       "      <td>InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_arg_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ascb_L</td>\n",
       "      <td>0.037105</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_ascb_L_m</td>\n",
       "      <td>L-ascorbate</td>\n",
       "      <td>HMDB00044</td>\n",
       "      <td>C00072</td>\n",
       "      <td>NaN</td>\n",
       "      <td>InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ascb_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>1.168337</td>\n",
       "      <td>0.2</td>\n",
       "      <td>EX_asp_L_m</td>\n",
       "      <td>L-aspartate(1-)</td>\n",
       "      <td>HMDB00191</td>\n",
       "      <td>C00049</td>\n",
       "      <td>5960.0</td>\n",
       "      <td>InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asp_L(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  metabolite      flux  dilution     reaction              name       hmdb  \\\n",
       "0      ala_L  0.671517       0.2   EX_ala_L_m         L-alanine  HMDB00161   \n",
       "1      arach  0.006495       0.2   EX_arach_m        arachidate  HMDB02212   \n",
       "2      arg_L  0.352338       0.2   EX_arg_L_m  L-argininium(1+)  HMDB00517   \n",
       "3     ascb_L  0.037105       0.2  EX_ascb_L_m       L-ascorbate  HMDB00044   \n",
       "4      asp_L  1.168337       0.2   EX_asp_L_m   L-aspartate(1-)  HMDB00191   \n",
       "\n",
       "  kegg.compound  pubchem.compound  \\\n",
       "0        C00041            5950.0   \n",
       "1        C06425           10467.0   \n",
       "2        C00062            6322.0   \n",
       "3        C00072               NaN   \n",
       "4        C00049            5960.0   \n",
       "\n",
       "                                               inchi chebi     global_id  \n",
       "0  InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...   NaN   EX_ala_L(e)  \n",
       "1  InChI=1S/C20H40O2/c1-2-3-4-5-6-7-8-9-10-11-12-...   NaN   EX_arach(e)  \n",
       "2  InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...   NaN   EX_arg_L(e)  \n",
       "3  InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/...   NaN  EX_ascb_L(e)  \n",
       "4  InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...   NaN   EX_asp_L(e)  "
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "skeleton = pd.merge(diet, annotations, on=\"metabolite\")\n",
    "\n",
    "skeleton[\"global_id\"] = skeleton.reaction\n",
    "skeleton[\"reaction\"] = \"EX_\" + skeleton.metabolite + \"_m\"\n",
    "skeleton.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6aaf4682-027e-4a8b-a7e9-ae3033694644",
   "metadata": {},
   "source": [
    "## Complete the medium\n",
    "\n",
    "Great we now have a pretty good skeleton. One issue that this will never be fully complete. There will always be some components missing that are essential for microbial growth. Fortunately, we provide a algorithm in MICOM to complete a medium with the smallest set of additional components to provide growth to all intestinal taxa."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "614e9d55-43c6-4ce7-9483-8a7551a1aff5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Completing <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">818</span> strain-level models on a medium with <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">74</span> components <span style=\"font-weight: bold\">(</span><span style=\"color: #800000; text-decoration-color: #800000; font-weight: bold\">1</span><span style=\"color: #800000; text-decoration-color: #800000\"> strict</span><span style=\"font-weight: bold\">)</span>.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Completing \u001b[1;36m818\u001b[0m strain-level models on a medium with \u001b[1;36m74\u001b[0m components \u001b[1m(\u001b[0m\u001b[1;31m1\u001b[0m\u001b[31m strict\u001b[0m\u001b[1m)\u001b[0m.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "fcf6085a8e4f41eb840d28dfe90c639e",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Obtained growth for <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">815</span> models adding additional flux of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1.00</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4844.08</span> on average.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Obtained growth for \u001b[1;36m815\u001b[0m models adding additional flux of \u001b[1;36m1.00\u001b[0m/\u001b[1;36m4844.08\u001b[0m on average.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from micom.workflows.db_media import complete_db_medium\n",
    "\n",
    "manifest, imports = complete_db_medium(\n",
    "    \"../data/agora103_strain.qza\", \n",
    "    medium=skeleton, \n",
    "    growth=0.1, \n",
    "    threads=12, \n",
    "    max_added_import=10,  # do not add more than 10 mmol/h of flux per component\n",
    "    strict=[\"EX_o2(e)\"],  # force anaerobic environment\n",
    "    weights=\"mass\"        # minimize added molecular weight\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "72675356-b60f-4ee9-821d-46caeac4d4b3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True     815\n",
       "False      3\n",
       "Name: can_grow, dtype: int64"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "manifest.can_grow.value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "id": "6e91940c-1dfb-4884-aa26-4e0ac3ec56f9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added flux is 168.07/5076.05 mmol/h.\n"
     ]
    }
   ],
   "source": [
    "filled = imports.max()\n",
    "added = filled.sum() - skeleton.loc[skeleton.reaction.isin(filled.index), \"flux\"].sum()\n",
    "\n",
    "print(f\"Added flux is {added.sum():.2f}/{filled.sum():.2f} mmol/h.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8aed4f6a-b328-47d0-90d0-2fc706f6b7a4",
   "metadata": {},
   "source": [
    "Let's see what was added in large amounts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "id": "ebde289e-4e96-4fc7-baed-30e61fb2bcea",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "EX_h_m         10.000000\n",
       "EX_h2_m        10.000000\n",
       "EX_no_m        10.000000\n",
       "EX_glyleu_m    10.000000\n",
       "EX_ser_L_m      9.833701\n",
       "EX_glyc_m       8.883167\n",
       "EX_succ_m       6.735561\n",
       "EX_co2_m        6.703640\n",
       "EX_asn_L_m      6.587626\n",
       "EX_mal_L_m      6.100775\n",
       "EX_for_m        5.677523\n",
       "EX_n2o_m        5.317345\n",
       "EX_urea_m       4.965212\n",
       "EX_fum_m        4.863709\n",
       "EX_fru_m        4.797330\n",
       "EX_cytd_m       4.458517\n",
       "EX_ph2s_m       4.322594\n",
       "EX_no3_m        4.321645\n",
       "EX_nh4_m        3.869570\n",
       "EX_acald_m      3.835911\n",
       "dtype: float64"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "added_fluxes = filled.copy()\n",
    "shared = added_fluxes.index[added_fluxes.index.isin(skeleton.reaction)]\n",
    "added_fluxes[shared] -= skeleton.flux[shared]\n",
    "added_fluxes.sort_values(ascending=False)[0:20]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49433d80-6a4d-414f-9bc9-45ef15e09498",
   "metadata": {},
   "source": [
    "Looks okay. So we will now assemble the final medium. For this we add the new components to each sample and rebuild the annotations for a nicely formatted medium."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "id": "b1bbaaa0-cc1b-4393-abff-1d98fb774ff0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>metabolite</th>\n",
       "      <th>flux</th>\n",
       "      <th>name</th>\n",
       "      <th>hmdb</th>\n",
       "      <th>kegg.compound</th>\n",
       "      <th>pubchem.compound</th>\n",
       "      <th>inchi</th>\n",
       "      <th>chebi</th>\n",
       "      <th>reaction</th>\n",
       "      <th>global_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ala_L</td>\n",
       "      <td>0.742297</td>\n",
       "      <td>L-alanine</td>\n",
       "      <td>HMDB00161</td>\n",
       "      <td>C00041</td>\n",
       "      <td>5950.0</td>\n",
       "      <td>InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_ala_L_m</td>\n",
       "      <td>EX_ala_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>arg_L</td>\n",
       "      <td>3.769624</td>\n",
       "      <td>L-argininium(1+)</td>\n",
       "      <td>HMDB00517</td>\n",
       "      <td>C00062</td>\n",
       "      <td>6322.0</td>\n",
       "      <td>InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_arg_L_m</td>\n",
       "      <td>EX_arg_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>asn_L</td>\n",
       "      <td>6.587626</td>\n",
       "      <td>L-asparagine</td>\n",
       "      <td>HMDB00168</td>\n",
       "      <td>C00152</td>\n",
       "      <td>6267.0</td>\n",
       "      <td>InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asn_L_m</td>\n",
       "      <td>EX_asn_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>asp_L</td>\n",
       "      <td>3.015252</td>\n",
       "      <td>L-aspartate(1-)</td>\n",
       "      <td>HMDB00191</td>\n",
       "      <td>C00049</td>\n",
       "      <td>5960.0</td>\n",
       "      <td>InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_asp_L_m</td>\n",
       "      <td>EX_asp_L(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>but</td>\n",
       "      <td>0.000322</td>\n",
       "      <td>butyrate</td>\n",
       "      <td>HMDB00039</td>\n",
       "      <td>C00246</td>\n",
       "      <td>264.0</td>\n",
       "      <td>InChI=1S/C4H8O2/c1-2-3-4(5)6/h2-3H2,1H3,(H,5,6...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_but_m</td>\n",
       "      <td>EX_but(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>176</th>\n",
       "      <td>mantr</td>\n",
       "      <td>0.015442</td>\n",
       "      <td>mannotriose (beta-1,4)</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_mantr_m</td>\n",
       "      <td>EX_mantr(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>177</th>\n",
       "      <td>cmp</td>\n",
       "      <td>0.008410</td>\n",
       "      <td>CMP</td>\n",
       "      <td>HMDB00095</td>\n",
       "      <td>C00055</td>\n",
       "      <td>6131.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_cmp_m</td>\n",
       "      <td>EX_cmp(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>178</th>\n",
       "      <td>datp</td>\n",
       "      <td>0.002391</td>\n",
       "      <td>dATP</td>\n",
       "      <td>HMDB01532</td>\n",
       "      <td>C00131</td>\n",
       "      <td>15993.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_datp_m</td>\n",
       "      <td>EX_datp(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>179</th>\n",
       "      <td>xylan</td>\n",
       "      <td>0.002183</td>\n",
       "      <td>Oat spelt xylan, MW 79200 (DOI: 10.1002/masy.2...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_xylan_m</td>\n",
       "      <td>EX_xylan(e)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>180</th>\n",
       "      <td>so3</td>\n",
       "      <td>1.225733</td>\n",
       "      <td>Sulfite</td>\n",
       "      <td>HMDB00240</td>\n",
       "      <td>C00094</td>\n",
       "      <td>1100.0</td>\n",
       "      <td>InChI=1S/H2O3S/c1-4(2)3/h(H2,1,2,3)/p-2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>EX_so3_m</td>\n",
       "      <td>EX_so3(e)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>181 rows × 10 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    metabolite      flux                                               name  \\\n",
       "0        ala_L  0.742297                                          L-alanine   \n",
       "1        arg_L  3.769624                                   L-argininium(1+)   \n",
       "2        asn_L  6.587626                                       L-asparagine   \n",
       "3        asp_L  3.015252                                    L-aspartate(1-)   \n",
       "4          but  0.000322                                           butyrate   \n",
       "..         ...       ...                                                ...   \n",
       "176      mantr  0.015442                             mannotriose (beta-1,4)   \n",
       "177        cmp  0.008410                                                CMP   \n",
       "178       datp  0.002391                                               dATP   \n",
       "179      xylan  0.002183  Oat spelt xylan, MW 79200 (DOI: 10.1002/masy.2...   \n",
       "180        so3  1.225733                                            Sulfite   \n",
       "\n",
       "          hmdb kegg.compound  pubchem.compound  \\\n",
       "0    HMDB00161        C00041            5950.0   \n",
       "1    HMDB00517        C00062            6322.0   \n",
       "2    HMDB00168        C00152            6267.0   \n",
       "3    HMDB00191        C00049            5960.0   \n",
       "4    HMDB00039        C00246             264.0   \n",
       "..         ...           ...               ...   \n",
       "176        NaN           NaN               NaN   \n",
       "177  HMDB00095        C00055            6131.0   \n",
       "178  HMDB01532        C00131           15993.0   \n",
       "179        NaN           NaN               NaN   \n",
       "180  HMDB00240        C00094            1100.0   \n",
       "\n",
       "                                                 inchi chebi    reaction  \\\n",
       "0    InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...   NaN  EX_ala_L_m   \n",
       "1    InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...   NaN  EX_arg_L_m   \n",
       "2    InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...   NaN  EX_asn_L_m   \n",
       "3    InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...   NaN  EX_asp_L_m   \n",
       "4    InChI=1S/C4H8O2/c1-2-3-4(5)6/h2-3H2,1H3,(H,5,6...   NaN    EX_but_m   \n",
       "..                                                 ...   ...         ...   \n",
       "176                                                NaN   NaN  EX_mantr_m   \n",
       "177                                                NaN   NaN    EX_cmp_m   \n",
       "178                                                NaN   NaN   EX_datp_m   \n",
       "179                                                NaN   NaN  EX_xylan_m   \n",
       "180            InChI=1S/H2O3S/c1-4(2)3/h(H2,1,2,3)/p-2   NaN    EX_so3_m   \n",
       "\n",
       "       global_id  \n",
       "0    EX_ala_L(e)  \n",
       "1    EX_arg_L(e)  \n",
       "2    EX_asn_L(e)  \n",
       "3    EX_asp_L(e)  \n",
       "4      EX_but(e)  \n",
       "..           ...  \n",
       "176  EX_mantr(e)  \n",
       "177    EX_cmp(e)  \n",
       "178   EX_datp(e)  \n",
       "179  EX_xylan(e)  \n",
       "180    EX_so3(e)  \n",
       "\n",
       "[181 rows x 10 columns]"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "added_df = filled[filled > 1e-8].reset_index() \n",
    "added_df.iloc[:, 0] = added_df.iloc[:, 0].str.replace(\"EX_|_m$\", \"\", regex=True)\n",
    "added_df.columns = [\"metabolite\", \"flux\"]\n",
    "\n",
    "completed = pd.merge(added_df, annotations, on=\"metabolite\", how=\"left\")\n",
    "completed[\"reaction\"] = \"EX_\" + completed.metabolite + \"_m\"\n",
    "completed[\"global_id\"] = \"EX_\" + completed.metabolite + \"(e)\"\n",
    "completed"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f59284ae-71d9-40fe-b78e-f2d923d16f55",
   "metadata": {},
   "source": [
    "## Validate the medium\n",
    "\n",
    "And we will now validate whether the medium works."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "3a115b65-ef0d-485c-89c0-311779a53e6a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Checking <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">818</span> strain-level models on a medium with <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">181</span> components.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Checking \u001b[1;36m818\u001b[0m strain-level models on a medium with \u001b[1;36m181\u001b[0m components.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "866aea2a20cd45bf852da7df4d871da9",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from micom.workflows.db_media import check_db_medium\n",
    "\n",
    "check = check_db_medium(\"../data/agora103_strain.qza\", medium=completed, threads=12)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "11d0cb77-480b-44af-b1e1-e19233aa693d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "count    818.000000\n",
       "mean       0.187691\n",
       "std        0.107104\n",
       "min        0.036577\n",
       "25%        0.100000\n",
       "50%        0.156764\n",
       "75%        0.256409\n",
       "max        0.646666\n",
       "Name: growth_rate, dtype: float64"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "check.growth_rate.describe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "2b5462a6-4e6c-4e5d-8e9b-c6e1245bef5d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'../media/guerrero_mountains.qza'"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import qiime2 as q2\n",
    "\n",
    "arti = q2.Artifact.import_data(\"MicomMedium[Global]\", completed)\n",
    "arti.save(\"../media/guerrero_mountains.qza\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ec59dcb0-53b2-4489-aefe-e8687cd409e0",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}