{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introductions to constrainat-based modeling using cobrapy\n",
"\n",
"## Part 3: In-silico gene knockouts\n",
"\n",
"### Instructor:\n",
"* Miguel Ponce de León from (Barcelona Supercomputing Center)\n",
"* Contact: miguel.ponce@bsc.es\n",
"\n",
"11 December, 2020"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import cobra\n",
"from cobra.io import read_sbml_model\n",
"\n",
"# State the path to the file iJO1366.xml\n",
"sbml_fname = './data/iJO1366.xml'\n",
"\n",
"# Read the model\n",
"model = read_sbml_model(sbml_fname)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Pick a gene of interest\n",
"gene = model.genes.b0720\n",
"\n",
"# Inspect the reactions associated to b0720\n",
"print(\"id\\treaction_name\")\n",
"for r in gene.reactions: \n",
" print(\"%s \\t%s\" % (r.id,r.name))\n",
"\n",
"print()\n",
"# We can also check the genes associated to this reaction\n",
"reaction = model.reactions.CS\n",
"print(\"GPR:\",reaction.gene_reaction_rule)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise 3.1: Single knock out study.\n",
"\n",
"Documentation: [https://cobrapy.readthedocs.io/en/latest/deletions.html#Knocking-out-single-genes-and-reactions](https://cobrapy.readthedocs.io/en/latest/deletions.html#Knocking-out-single-genes-and-reactions)\n",
"\n",
"We will use gene b0720 as an example\n",
"\n",
"1. COBRA can find the proper reaction to be disabled when a gene is knocked out as follows:\n",
"\n",
"```\n",
"gene = model.genes.b0720\n",
"with model:\n",
" gene.knock_out()\n",
" ko_solution = model.optimize()\n",
"```\n",
"\n",
"\n",
"(This codes knocks out the gene b0720, recalculates the FBA and stores the new solution in ko_solution and If we perform the knockout using the \"with\" block we don't need to care about restoring the knocked out gene afterwards; it is automatically restored out of the \"with\" block..)\n",
"\n",
"2. Check the growth value (Hint: ko_solution.objective_values)\n",
"3. Is the gene predicted as essential or non-essential\n",
"4. Go to the Ecocyc database and check the in vivo experimental result for the knockout of b0720 by accessing the following link:\n",
"* [https://ecocyc.org/gene?orgid=ECOLI&id=EG10402](https://ecocyc.org/gene?orgid=ECOLI&id=EG10402)\n",
"\n",
"Is b0720 essential or not?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## TODO\n",
"## Write your code below\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Systems-wide knock out study of *E. coli*.\n",
" \n",
"COBRA has a special function to run the single gene knock outs of a list of genes. \n",
"\n",
"The function's name is single_gene_deletion\n",
"\n",
"First import the function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Import the function single_gene_deletion\n",
"from cobra.flux_analysis import single_gene_deletion"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Then get the list of all the genes\n",
"all_genes = [g.id for g in model.genes]\n",
"\n",
"# Running in silico (takes a while)\n",
"knockout = single_gene_deletion(model, gene_list=all_genes)\n",
"\n",
"# This is a fix to get the gene's id as the index\n",
"knockout['ids'] = [list(i)[0] for i in knockout.ids]\n",
"knockout = knockout.set_index('ids')\n",
"\n",
"# The output of the function single_gene_deletion is a dataframe\n",
"knockout.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# We define a threshold to define whether the reduction on the biomass flux is considered lethal.\n",
"threshold = 0.01\n",
"\n",
"# Use this threshold to find which set of genes' knock out reduce the predicted growth below the threshold.\n",
"insilico_lethals = set(knockout.index[knockout.growth< threshold])\n",
"# The set of non-essential genes are the genes with a growth value above the threshold.\n",
"insilico_non_lethals = set(knockout.index[knockout.growth > threshold])\n",
"\n",
"print(\"in-silico lethals:\", len(insilico_lethals))\n",
"print(\"in-silico non lethals:\", len(insilico_non_lethals))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Now we need to experimentally verify essential and non-essential gene sets.\n",
"\n",
"# Read the set of essential genes in vivo\n",
"import json\n",
"fname = './data/m9_invivo_lethals.json'\n",
"with open(fname) as fh:\n",
" invivo_lethals = json.load(fh)\n",
" invivo_lethals = set(invivo_lethals)\n",
" \n",
"# Convert the list of all model genes into a set.\n",
"all_genes = set(all_genes)\n",
"\n",
"# We can use the difference to obtain the list of in vivo non-lethals\n",
"invivo_non_lethals = all_genes - invivo_lethals\n",
"\n",
"# Print the size of both sets\n",
"print(\"in-vivo lethals:\", len(invivo_lethals))\n",
"print(\"in-vivo non lethals:\", len(invivo_non_lethals))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# https://en.wikipedia.org/wiki/Receiver_operating_characteristic\n",
"\n",
"# True Positives, genes predicted as essentials that are essentials in vivo (correctly predicted)\n",
"TP = insilico_lethals & invivo_lethals\n",
"\n",
"# True Negatives, genes predicted as NON-essentials that are NON-essential in vivo (correctly predicted)\n",
"TN = insilico_non_lethals & invivo_non_lethals\n",
"\n",
"# False Positives, wrongly predicted as NON-essential genes\n",
"FN = insilico_non_lethals & invivo_lethals\n",
"\n",
"# False Positives, wrongly predicted as essential genes\n",
"FP = insilico_lethals & invivo_non_lethals\n",
"\n",
"# True in vivo esssential genes\n",
"P = TP | FN\n",
"# True in vivo NON-esssential genes\n",
"N = TN | FP"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Excercise 3.3:\n",
"Complete the following table using the values from Exercise 3.2 (*E. coli*)\n",
"\n",
"| In vivo \\ In silico | in silico lethal | in silico non-lethal |\n",
"| -------------------------- |:----------------:| --------------------:|\n",
"| in vivo lethal | ? | ? |\n",
"| in vivo non-lehtal | ? | ? |\n",
"\n",
"\n",
"Total negatives = {{N}}\n",
"\n",
"### Excercise 3.4:\n",
"Acces the following link:\n",
"\n",
"https://en.wikipedia.org/wiki/Sensitivity_and_specificity\n",
"\n",
"Get the formulas and calculate the following metrics:\n",
"* sensitivity\n",
"* specificity\n",
"* precision\n",
"* accuracy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sensitivity, recall, hit rate, or true positive rate (TPR)\n",
"# We computed the sensitivity as follows\n",
"sensitivity = len(TP) / len(P) \n",
"\n",
"# TODO\n",
"# complete the following code\n",
"\n",
"# Specificity, selectivity or true negative rate (TNR)\n",
"specificity = None ## COMPLETE HERE \n",
"\n",
"# Precision or positive predictive value (PPV)\n",
"precision = None ## COMPLETE HERE\n",
"\n",
"# Accuracy (ACC)\n",
"accuracy = None ## COMPLETE HERE\n",
"\n",
"# Print the different values and discuss their meanings"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "venv"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"latex_metadata": {
"affiliation": "Barcelona Supercomputing Center, Barcelona Spain",
"authors": "Miguel Ponce de Leon, Arnau Montagud",
"title": "Constraint-based modelling tutorial"
}
},
"nbformat": 4,
"nbformat_minor": 1
}