{
"cells": [
{
"cell_type": "markdown",
"id": "45c8dd54-1683-40cc-8b0b-bbef62882a76",
"metadata": {},
"source": [
"# Single meal: in the mountains of Guerrero\n",
"\n",
"Here we will build a medium based on the metabolites we expect to reach the human gut after a meal typical of the Me'Phaa people in Guerrero, Mexico. Please be aware that a single meal is unlikely to represent the full diversity of the diet.\n",
"\n",
"The composition of the meal is the following:\n",
"\n",
"- 150g corn (mixed with water)\n",
"- 200g cooked beans\n",
"- 300g cooked chayote\n",
"- 100g cooked chicken\n",
"\n",
"This sums up to 1022 kcal.\n",
"\n",
"We will start by reading individual tables for the specific foods and scale the abundances."
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "be313cb2-1e5d-4766-bba1-7479835c5925",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Portion sizes in grams; the keys are sheet names in the Excel workbook\n",
"meal = {\n",
"    \"Corn\": 150,\n",
"    \"Pinto beans\": 200,\n",
"    \"Chayote\": 300,\n",
"    \"Chicken\": 100\n",
"}\n",
"\n",
"foods = []\n",
"for food in meal:\n",
"    mets = pd.read_excel(\"../data/foods_diets.xlsx\", food)\n",
"    # scale relative abundances to grams of each metabolite in the portion\n",
"    mets[\"amount_g\"] = mets.relative_abundance / mets.relative_abundance.sum() * meal[food]\n",
"    mets[\"concentration\"] = mets[\"amount_g\"] / mets[\"mw\"] * 1000.0  # to yield mmol/meal\n",
"    mets[\"flux\"] = mets[\"concentration\"] / 8.0  # to yield mmol/h\n",
"    mets[\"food\"] = food\n",
"    foods.append(mets)\n",
"foods = pd.concat(foods)"
]
},
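{
"cell_type": "markdown",
"id": "sketch-unit-conversion-note",
"metadata": {},
"source": [
"To make the unit conversion above concrete, here is a minimal sketch for a single hypothetical component. The 10 g of glucose and its approximate molecular weight are illustrative assumptions and are not taken from the food tables; the 8 h divisor mirrors the one used in the loop above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-unit-conversion-code",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative values only (assumed, not read from ../data/foods_diets.xlsx)\n",
"amount_g = 10.0  # grams of the component in the meal\n",
"mw = 180.16  # g/mol, approximate molecular weight of glucose\n",
"hours = 8.0  # same divisor as in the cell above\n",
"\n",
"mmol_per_meal = amount_g / mw * 1000.0  # g -> mol -> mmol\n",
"mmol_per_hour = mmol_per_meal / hours  # spread the meal over the time window\n",
"print(f\"{mmol_per_meal:.2f} mmol/meal, {mmol_per_hour:.2f} mmol/h\")"
]
},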
{
"cell_type": "markdown",
"id": "9be60e55-e72e-4913-8625-4d66a3079edb",
"metadata": {},
"source": [
"Now we combine the data: we add 1 L of water and sum the fluxes for each metabolite across all foods."
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "d21a00b4-278a-4b99-a9f3-32fcde652738",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>metabolite</th><th>flux</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>ala_L</td><td>3.357585</td></tr>\n",
"    <tr><th>1</th><td>arach</td><td>0.032473</td></tr>\n",
"    <tr><th>2</th><td>arg_L</td><td>1.761688</td></tr>\n",
"    <tr><th>3</th><td>ascb_L</td><td>0.185524</td></tr>\n",
"    <tr><th>4</th><td>asp_L</td><td>5.841686</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" metabolite flux\n",
"0 ala_L 3.357585\n",
"1 arach 0.032473\n",
"2 arg_L 1.761688\n",
"3 ascb_L 0.185524\n",
"4 asp_L 5.841686"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"foods.loc[foods.metabolite == \"h2o\", \"flux\"] += 1000.0 / 18.01 * 1000.0 / 8.0 # add 1L water\n",
"diet = foods.dropna(subset=[\"metabolite\"]).groupby(\"metabolite\").flux.sum().reset_index()\n",
"diet.head()"
]
},
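{
"cell_type": "markdown",
"id": "sketch-water-flux-note",
"metadata": {},
"source": [
"As a quick sanity check on the water term, the sketch below recomputes it step by step. It only restates the arithmetic of the cell above (1 L taken as 1000 g, 18.01 g/mol, 8 h); the variable names are made up for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-water-flux-code",
"metadata": {},
"outputs": [],
"source": [
"water_g = 1000.0  # 1 L of water, assuming a density of 1 g/mL\n",
"water_mw = 18.01  # g/mol, as used in the cell above\n",
"hours = 8.0  # same time window as for the foods\n",
"\n",
"water_mmol = water_g / water_mw * 1000.0  # mmol of water in 1 L\n",
"water_flux = water_mmol / hours  # mmol/h\n",
"print(f\"{water_flux:.1f} mmol/h\")"
]
},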
{
"cell_type": "markdown",
"id": "cfe7d9de-6a82-4106-8b41-6a81fa84a742",
"metadata": {},
"source": [
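"## Adjust for intestinal absorption\n",
"\n",
"To account for nutrients taken up by the host, we will load the Recon3 human model. AGORA and Recon IDs are very similar, so we should be able to match them after adjusting the Recon3 ones a bit. Any diet component that has an exchange reaction in Recon3 is treated as absorbable by the host, and its flux is reduced to 20% of its original value. We start by identifying all available exchanges in Recon3 and adjusting the IDs."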
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "75b2c2f1-e575-4085-a0d9-edc88de26801",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 5adtststerone\n",
"1 5adtststerones\n",
"2 5fthf\n",
"3 5htrp\n",
"4 5mthf\n",
"dtype: object"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from cobra.io import read_sbml_model\n",
"import pandas as pd\n",
"\n",
"recon3 = read_sbml_model(\"../data/Recon3D.xml.gz\")\n",
"exchanges = pd.Series([r.id for r in recon3.exchanges])\n",
"exchanges = exchanges.str.replace(\"__\", \"_\").str.replace(\"_e$|EX_\", \"\", regex=True)\n",
"exchanges.head()"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "d3dec0ed-7ba5-41a1-9234-4fb76ec1e71e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.2 49\n",
"1.0 10\n",
"Name: dilution, dtype: int64"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"diet[\"dilution\"] = 1.0\n",
"diet.loc[diet.metabolite.isin(exchanges), \"dilution\"] = 0.2\n",
"diet[\"flux\"] = diet[\"flux\"] * diet[\"dilution\"] \n",
"diet[[\"metabolite\", \"dilution\"]].drop_duplicates().dilution.value_counts()"
]
},
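{
"cell_type": "markdown",
"id": "sketch-id-matching-note",
"metadata": {},
"source": [
"The ID matching and dilution above can be illustrated on a toy example. The exchange IDs below are written in the style of Recon3 IDs but are hand-picked for illustration (they are assumptions and need not match the actual model contents), and the metabolites and flux values in the toy diet are made up."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-id-matching-code",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hand-written IDs in Recon3 style (illustrative, not read from the model)\n",
"toy_exchanges = pd.Series([\"EX_ala__L_e\", \"EX_glc__D_e\", \"EX_5fthf_e\"])\n",
"# collapse double underscores, then strip the EX_ prefix and the _e suffix\n",
"toy_ids = toy_exchanges.str.replace(\"__\", \"_\", regex=False).str.replace(\"_e$|EX_\", \"\", regex=True)\n",
"\n",
"# Made-up fluxes; components with a matching host exchange keep 20% of their flux\n",
"toy_diet = pd.DataFrame({\"metabolite\": [\"ala_L\", \"glc_D\", \"inulin\"], \"flux\": [1.0, 2.0, 3.0]})\n",
"toy_diet[\"dilution\"] = 1.0\n",
"toy_diet.loc[toy_diet.metabolite.isin(toy_ids), \"dilution\"] = 0.2\n",
"toy_diet[\"flux\"] = toy_diet[\"flux\"] * toy_diet[\"dilution\"]\n",
"toy_diet"
]
},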
{
"cell_type": "markdown",
"id": "88e48f71-e76f-4462-be41-61578af7f9b0",
"metadata": {},
"source": [
"## Adding host supplied components\n",
"\n",
"Finally, we add host-supplied metabolites, such as primary bile acids and mucins, along with a minuscule amount of oxygen."
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "5e7d1abb-ba90-4341-aec5-d7361210328f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>metabolite</th><th>flux</th><th>dilution</th><th>reaction</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>ala_L</td><td>0.671517</td><td>0.2</td><td>EX_ala_L(e)</td></tr>\n",
"    <tr><th>1</th><td>arach</td><td>0.006495</td><td>0.2</td><td>EX_arach(e)</td></tr>\n",
"    <tr><th>2</th><td>arg_L</td><td>0.352338</td><td>0.2</td><td>EX_arg_L(e)</td></tr>\n",
"    <tr><th>3</th><td>ascb_L</td><td>0.037105</td><td>0.2</td><td>EX_ascb_L(e)</td></tr>\n",
"    <tr><th>4</th><td>asp_L</td><td>1.168337</td><td>0.2</td><td>EX_asp_L(e)</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>70</th><td>gncore2_rl</td><td>1.000000</td><td>NaN</td><td>EX_gncore2_rl(e)</td></tr>\n",
"    <tr><th>71</th><td>core7</td><td>1.000000</td><td>NaN</td><td>EX_core7(e)</td></tr>\n",
"    <tr><th>72</th><td>gchola</td><td>1.000000</td><td>NaN</td><td>EX_gchola(e)</td></tr>\n",
"    <tr><th>73</th><td>tchola</td><td>1.000000</td><td>NaN</td><td>EX_tchola(e)</td></tr>\n",
"    <tr><th>74</th><td>o2</td><td>0.001000</td><td>NaN</td><td>EX_o2(e)</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>75 rows × 4 columns</p>\n",
"</div>"
],
"text/plain": [
" metabolite flux dilution reaction\n",
"0 ala_L 0.671517 0.2 EX_ala_L(e)\n",
"1 arach 0.006495 0.2 EX_arach(e)\n",
"2 arg_L 0.352338 0.2 EX_arg_L(e)\n",
"3 ascb_L 0.037105 0.2 EX_ascb_L(e)\n",
"4 asp_L 1.168337 0.2 EX_asp_L(e)\n",
".. ... ... ... ...\n",
"70 gncore2_rl 1.000000 NaN EX_gncore2_rl(e)\n",
"71 core7 1.000000 NaN EX_core7(e)\n",
"72 gchola 1.000000 NaN EX_gchola(e)\n",
"73 tchola 1.000000 NaN EX_tchola(e)\n",
"74 o2 0.001000 NaN EX_o2(e)\n",
"\n",
"[75 rows x 4 columns]"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"diet.set_index(\"metabolite\", inplace=True)\n",
"annotations = pd.read_csv(\"../data/agora_metabolites.csv\")\n",
"\n",
"# mucin core glycans\n",
"for met in annotations.loc[annotations.metabolite.str.contains(\"core\"), \"metabolite\"]:\n",
"    diet.loc[met, \"flux\"] = 1\n",
"\n",
"# primary bile acids\n",
"for met in [\"gchola\", \"tchola\"]:\n",
"    diet.loc[met, \"flux\"] = 1\n",
"\n",
"# near-anaerobic: allow only a trace amount of oxygen\n",
"diet.loc[\"o2\", \"flux\"] = 0.001\n",
"\n",
"diet.reset_index(inplace=True)\n",
"diet[\"reaction\"] = \"EX_\" + diet.metabolite + \"(e)\"\n",
"diet"
]
},
{
"cell_type": "markdown",
"id": "7483514f-757f-4b60-8587-eee86bfc2ad0",
"metadata": {},
"source": [
"Next, we merge this table with the metabolite annotations to make it more accessible."
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "e7d3e530-90fc-4499-b9b2-9e81cd4c1ce5",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>metabolite</th><th>flux</th><th>dilution</th><th>reaction</th><th>name</th><th>hmdb</th><th>kegg.compound</th><th>pubchem.compound</th><th>inchi</th><th>chebi</th><th>global_id</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>ala_L</td><td>0.671517</td><td>0.2</td><td>EX_ala_L_m</td><td>L-alanine</td><td>HMDB00161</td><td>C00041</td><td>5950.0</td><td>InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...</td><td>NaN</td><td>EX_ala_L(e)</td></tr>\n",
"    <tr><th>1</th><td>arach</td><td>0.006495</td><td>0.2</td><td>EX_arach_m</td><td>arachidate</td><td>HMDB02212</td><td>C06425</td><td>10467.0</td><td>InChI=1S/C20H40O2/c1-2-3-4-5-6-7-8-9-10-11-12-...</td><td>NaN</td><td>EX_arach(e)</td></tr>\n",
"    <tr><th>2</th><td>arg_L</td><td>0.352338</td><td>0.2</td><td>EX_arg_L_m</td><td>L-argininium(1+)</td><td>HMDB00517</td><td>C00062</td><td>6322.0</td><td>InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...</td><td>NaN</td><td>EX_arg_L(e)</td></tr>\n",
"    <tr><th>3</th><td>ascb_L</td><td>0.037105</td><td>0.2</td><td>EX_ascb_L_m</td><td>L-ascorbate</td><td>HMDB00044</td><td>C00072</td><td>NaN</td><td>InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/...</td><td>NaN</td><td>EX_ascb_L(e)</td></tr>\n",
"    <tr><th>4</th><td>asp_L</td><td>1.168337</td><td>0.2</td><td>EX_asp_L_m</td><td>L-aspartate(1-)</td><td>HMDB00191</td><td>C00049</td><td>5960.0</td><td>InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...</td><td>NaN</td><td>EX_asp_L(e)</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" metabolite flux dilution reaction name hmdb \\\n",
"0 ala_L 0.671517 0.2 EX_ala_L_m L-alanine HMDB00161 \n",
"1 arach 0.006495 0.2 EX_arach_m arachidate HMDB02212 \n",
"2 arg_L 0.352338 0.2 EX_arg_L_m L-argininium(1+) HMDB00517 \n",
"3 ascb_L 0.037105 0.2 EX_ascb_L_m L-ascorbate HMDB00044 \n",
"4 asp_L 1.168337 0.2 EX_asp_L_m L-aspartate(1-) HMDB00191 \n",
"\n",
" kegg.compound pubchem.compound \\\n",
"0 C00041 5950.0 \n",
"1 C06425 10467.0 \n",
"2 C00062 6322.0 \n",
"3 C00072 NaN \n",
"4 C00049 5960.0 \n",
"\n",
" inchi chebi global_id \n",
"0 InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5... NaN EX_ala_L(e) \n",
"1 InChI=1S/C20H40O2/c1-2-3-4-5-6-7-8-9-10-11-12-... NaN EX_arach(e) \n",
"2 InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9... NaN EX_arg_L(e) \n",
"3 InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/... NaN EX_ascb_L(e) \n",
"4 InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,... NaN EX_asp_L(e) "
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"skeleton = pd.merge(diet, annotations, on=\"metabolite\")\n",
"\n",
"skeleton[\"global_id\"] = skeleton.reaction\n",
"skeleton[\"reaction\"] = \"EX_\" + skeleton.metabolite + \"_m\"\n",
"skeleton.head()"
]
},
{
"cell_type": "markdown",
"id": "6aaf4682-027e-4a8b-a7e9-ae3033694644",
"metadata": {},
"source": [
"## Complete the medium\n",
"\n",
"Great, we now have a pretty good skeleton. One issue is that it will never be fully complete: there will always be some components missing that are essential for microbial growth. Fortunately, MICOM provides an algorithm that completes a medium with the smallest set of additional components needed to allow growth of all intestinal taxa."
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "614e9d55-43c6-4ce7-9483-8a7551a1aff5",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Completing 818 strain-level models on a medium with 74 components (1 strict).\n",
"\n"
],
"text/plain": [
"Completing \u001b[1;36m818\u001b[0m strain-level models on a medium with \u001b[1;36m74\u001b[0m components \u001b[1m(\u001b[0m\u001b[1;31m1\u001b[0m\u001b[31m strict\u001b[0m\u001b[1m)\u001b[0m.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "fcf6085a8e4f41eb840d28dfe90c639e",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n"
],
"text/plain": []
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Obtained growth for 815 models adding additional flux of 1.00/4844.08 on average.\n",
"\n"
],
"text/plain": [
"Obtained growth for \u001b[1;36m815\u001b[0m models adding additional flux of \u001b[1;36m1.00\u001b[0m/\u001b[1;36m4844.08\u001b[0m on average.\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from micom.workflows.db_media import complete_db_medium\n",
"\n",
"manifest, imports = complete_db_medium(\n",
" \"../data/agora103_strain.qza\", \n",
" medium=skeleton, \n",
" growth=0.1, \n",
" threads=12, \n",
" max_added_import=10, # do not add more than 10 mmol/h of flux per component\n",
" strict=[\"EX_o2(e)\"], # force anaerobic environment\n",
" weights=\"mass\" # minimize added molecular weight\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "72675356-b60f-4ee9-821d-46caeac4d4b3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True 815\n",
"False 3\n",
"Name: can_grow, dtype: int64"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"manifest.can_grow.value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "6e91940c-1dfb-4884-aa26-4e0ac3ec56f9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Added flux is 168.07/5076.05 mmol/h.\n"
]
}
],
"source": [
"filled = imports.max()\n",
"added = filled.sum() - skeleton.loc[skeleton.reaction.isin(filled.index), \"flux\"].sum()\n",
"\n",
"print(f\"Added flux is {added.sum():.2f}/{filled.sum():.2f} mmol/h.\")"
]
},
{
"cell_type": "markdown",
"id": "8aed4f6a-b328-47d0-90d0-2fc706f6b7a4",
"metadata": {},
"source": [
"Let's see what was added in large amounts."
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "ebde289e-4e96-4fc7-baed-30e61fb2bcea",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"EX_h_m 10.000000\n",
"EX_h2_m 10.000000\n",
"EX_no_m 10.000000\n",
"EX_glyleu_m 10.000000\n",
"EX_ser_L_m 9.833701\n",
"EX_glyc_m 8.883167\n",
"EX_succ_m 6.735561\n",
"EX_co2_m 6.703640\n",
"EX_asn_L_m 6.587626\n",
"EX_mal_L_m 6.100775\n",
"EX_for_m 5.677523\n",
"EX_n2o_m 5.317345\n",
"EX_urea_m 4.965212\n",
"EX_fum_m 4.863709\n",
"EX_fru_m 4.797330\n",
"EX_cytd_m 4.458517\n",
"EX_ph2s_m 4.322594\n",
"EX_no3_m 4.321645\n",
"EX_nh4_m 3.869570\n",
"EX_acald_m 3.835911\n",
"dtype: float64"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"added_fluxes = filled.copy()\n",
"shared = added_fluxes.index[added_fluxes.index.isin(skeleton.reaction)]\n",
"added_fluxes[shared] -= skeleton.set_index(\"reaction\").flux[shared]  # align on reaction IDs, not the integer row index\n",
"added_fluxes.sort_values(ascending=False)[0:20]"
]
},
{
"cell_type": "markdown",
"id": "49433d80-6a4d-414f-9bc9-45ef15e09498",
"metadata": {},
"source": [
"Looks okay, so we will now assemble the final medium. For this we take the largest import required for each component across all models and rebuild the annotations to get a nicely formatted medium."
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "b1bbaaa0-cc1b-4393-abff-1d98fb774ff0",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>metabolite</th><th>flux</th><th>name</th><th>hmdb</th><th>kegg.compound</th><th>pubchem.compound</th><th>inchi</th><th>chebi</th><th>reaction</th><th>global_id</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>ala_L</td><td>0.742297</td><td>L-alanine</td><td>HMDB00161</td><td>C00041</td><td>5950.0</td><td>InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...</td><td>NaN</td><td>EX_ala_L_m</td><td>EX_ala_L(e)</td></tr>\n",
"    <tr><th>1</th><td>arg_L</td><td>3.769624</td><td>L-argininium(1+)</td><td>HMDB00517</td><td>C00062</td><td>6322.0</td><td>InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...</td><td>NaN</td><td>EX_arg_L_m</td><td>EX_arg_L(e)</td></tr>\n",
"    <tr><th>2</th><td>asn_L</td><td>6.587626</td><td>L-asparagine</td><td>HMDB00168</td><td>C00152</td><td>6267.0</td><td>InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...</td><td>NaN</td><td>EX_asn_L_m</td><td>EX_asn_L(e)</td></tr>\n",
"    <tr><th>3</th><td>asp_L</td><td>3.015252</td><td>L-aspartate(1-)</td><td>HMDB00191</td><td>C00049</td><td>5960.0</td><td>InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...</td><td>NaN</td><td>EX_asp_L_m</td><td>EX_asp_L(e)</td></tr>\n",
"    <tr><th>4</th><td>but</td><td>0.000322</td><td>butyrate</td><td>HMDB00039</td><td>C00246</td><td>264.0</td><td>InChI=1S/C4H8O2/c1-2-3-4(5)6/h2-3H2,1H3,(H,5,6...</td><td>NaN</td><td>EX_but_m</td><td>EX_but(e)</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>176</th><td>mantr</td><td>0.015442</td><td>mannotriose (beta-1,4)</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>EX_mantr_m</td><td>EX_mantr(e)</td></tr>\n",
"    <tr><th>177</th><td>cmp</td><td>0.008410</td><td>CMP</td><td>HMDB00095</td><td>C00055</td><td>6131.0</td><td>NaN</td><td>NaN</td><td>EX_cmp_m</td><td>EX_cmp(e)</td></tr>\n",
"    <tr><th>178</th><td>datp</td><td>0.002391</td><td>dATP</td><td>HMDB01532</td><td>C00131</td><td>15993.0</td><td>NaN</td><td>NaN</td><td>EX_datp_m</td><td>EX_datp(e)</td></tr>\n",
"    <tr><th>179</th><td>xylan</td><td>0.002183</td><td>Oat spelt xylan, MW 79200 (DOI: 10.1002/masy.2...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>EX_xylan_m</td><td>EX_xylan(e)</td></tr>\n",
"    <tr><th>180</th><td>so3</td><td>1.225733</td><td>Sulfite</td><td>HMDB00240</td><td>C00094</td><td>1100.0</td><td>InChI=1S/H2O3S/c1-4(2)3/h(H2,1,2,3)/p-2</td><td>NaN</td><td>EX_so3_m</td><td>EX_so3(e)</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>181 rows × 10 columns</p>\n",
"</div>"
],
"text/plain": [
" metabolite flux name \\\n",
"0 ala_L 0.742297 L-alanine \n",
"1 arg_L 3.769624 L-argininium(1+) \n",
"2 asn_L 6.587626 L-asparagine \n",
"3 asp_L 3.015252 L-aspartate(1-) \n",
"4 but 0.000322 butyrate \n",
".. ... ... ... \n",
"176 mantr 0.015442 mannotriose (beta-1,4) \n",
"177 cmp 0.008410 CMP \n",
"178 datp 0.002391 dATP \n",
"179 xylan 0.002183 Oat spelt xylan, MW 79200 (DOI: 10.1002/masy.2... \n",
"180 so3 1.225733 Sulfite \n",
"\n",
" hmdb kegg.compound pubchem.compound \\\n",
"0 HMDB00161 C00041 5950.0 \n",
"1 HMDB00517 C00062 6322.0 \n",
"2 HMDB00168 C00152 6267.0 \n",
"3 HMDB00191 C00049 5960.0 \n",
"4 HMDB00039 C00246 264.0 \n",
".. ... ... ... \n",
"176 NaN NaN NaN \n",
"177 HMDB00095 C00055 6131.0 \n",
"178 HMDB01532 C00131 15993.0 \n",
"179 NaN NaN NaN \n",
"180 HMDB00240 C00094 1100.0 \n",
"\n",
" inchi chebi reaction \\\n",
"0 InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5... NaN EX_ala_L_m \n",
"1 InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9... NaN EX_arg_L_m \n",
"2 InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2... NaN EX_asn_L_m \n",
"3 InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,... NaN EX_asp_L_m \n",
"4 InChI=1S/C4H8O2/c1-2-3-4(5)6/h2-3H2,1H3,(H,5,6... NaN EX_but_m \n",
".. ... ... ... \n",
"176 NaN NaN EX_mantr_m \n",
"177 NaN NaN EX_cmp_m \n",
"178 NaN NaN EX_datp_m \n",
"179 NaN NaN EX_xylan_m \n",
"180 InChI=1S/H2O3S/c1-4(2)3/h(H2,1,2,3)/p-2 NaN EX_so3_m \n",
"\n",
" global_id \n",
"0 EX_ala_L(e) \n",
"1 EX_arg_L(e) \n",
"2 EX_asn_L(e) \n",
"3 EX_asp_L(e) \n",
"4 EX_but(e) \n",
".. ... \n",
"176 EX_mantr(e) \n",
"177 EX_cmp(e) \n",
"178 EX_datp(e) \n",
"179 EX_xylan(e) \n",
"180 EX_so3(e) \n",
"\n",
"[181 rows x 10 columns]"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"added_df = filled[filled > 1e-8].reset_index() \n",
"added_df.iloc[:, 0] = added_df.iloc[:, 0].str.replace(\"EX_|_m$\", \"\", regex=True)\n",
"added_df.columns = [\"metabolite\", \"flux\"]\n",
"\n",
"completed = pd.merge(added_df, annotations, on=\"metabolite\", how=\"left\")\n",
"completed[\"reaction\"] = \"EX_\" + completed.metabolite + \"_m\"\n",
"completed[\"global_id\"] = \"EX_\" + completed.metabolite + \"(e)\"\n",
"completed"
]
},
{
"cell_type": "markdown",
"id": "f59284ae-71d9-40fe-b78e-f2d923d16f55",
"metadata": {},
"source": [
"## Validate the medium\n",
"\n",
"We will now check whether the taxa can actually grow on the completed medium."
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "3a115b65-ef0d-485c-89c0-311779a53e6a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Checking 818 strain-level models on a medium with 181 components.\n",
"\n"
],
"text/plain": [
"Checking \u001b[1;36m818\u001b[0m strain-level models on a medium with \u001b[1;36m181\u001b[0m components.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "866aea2a20cd45bf852da7df4d871da9",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n"
],
"text/plain": []
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from micom.workflows.db_media import check_db_medium\n",
"\n",
"check = check_db_medium(\"../data/agora103_strain.qza\", medium=completed, threads=12)"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "11d0cb77-480b-44af-b1e1-e19233aa693d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 818.000000\n",
"mean 0.187691\n",
"std 0.107104\n",
"min 0.036577\n",
"25% 0.100000\n",
"50% 0.156764\n",
"75% 0.256409\n",
"max 0.646666\n",
"Name: growth_rate, dtype: float64"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"check.growth_rate.describe()"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "2b5462a6-4e6c-4e5d-8e9b-c6e1245bef5d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'../media/guerrero_mountains.qza'"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import qiime2 as q2\n",
"\n",
"arti = q2.Artifact.import_data(\"MicomMedium[Global]\", completed)\n",
"arti.save(\"../media/guerrero_mountains.qza\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ec59dcb0-53b2-4489-aefe-e8687cd409e0",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}