{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introductions to constrainat-based modeling using cobrapy\n", "\n", "## Part 3: In-silico gene knockouts\n", "\n", "### Instructor:\n", "* Miguel Ponce de León from (Barcelona Supercomputing Center)\n", "* Contact: miguel.ponce@bsc.es\n", "\n", "11 December, 2020" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import cobra\n", "from cobra.io import read_sbml_model\n", "\n", "# State the path to the file iJO1366.xml\n", "sbml_fname = './data/iJO1366.xml'\n", "\n", "# Read the model\n", "model = read_sbml_model(sbml_fname)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Pick a gene of interest\n", "gene = model.genes.b0720\n", "\n", "# Inspect the reactions associated to b0720\n", "print(\"id\\treaction_name\")\n", "for r in gene.reactions: \n", " print(\"%s \\t%s\" % (r.id,r.name))\n", "\n", "print()\n", "# We can also check the genes associated to this reaction\n", "reaction = model.reactions.CS\n", "print(\"GPR:\",reaction.gene_reaction_rule)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 3.1: Single knock out study.\n", "\n", "Documentation: [https://cobrapy.readthedocs.io/en/latest/deletions.html#Knocking-out-single-genes-and-reactions](https://cobrapy.readthedocs.io/en/latest/deletions.html#Knocking-out-single-genes-and-reactions)\n", "\n", "We will use gene b0720 as an example\n", "\n", "1. COBRA can find the proper reaction to be disabled when a gene is knocked out as follows:\n", "\n", "```\n", "gene = model.genes.b0720\n", "with model:\n", " gene.knock_out()\n", " ko_solution = model.optimize()\n", "```\n", "\n", "\n", "(This codes knocks out the gene b0720, recalculates the FBA and stores the new solution in ko_solution and If we perform the knockout using the \"with\" block we don't need to care about restoring the knocked out gene afterwards; it is automatically restored out of the \"with\" block..)\n", "\n", "2. Check the growth value (Hint: ko_solution.objective_values)\n", "3. Is the gene predicted as essential or non-essential\n", "4. Go to the Ecocyc database and check the in vivo experimental result for the knockout of b0720 by accessing the following link:\n", "* [https://ecocyc.org/gene?orgid=ECOLI&id=EG10402](https://ecocyc.org/gene?orgid=ECOLI&id=EG10402)\n", "\n", "Is b0720 essential or not?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## TODO\n", "## Write your code below\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Systems-wide knock out study of *E. coli*.\n", " \n", "COBRA has a special function to run the single gene knock outs of a list of genes. \n", "\n", "The function's name is single_gene_deletion\n", "\n", "First import the function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import the function single_gene_deletion\n", "from cobra.flux_analysis import single_gene_deletion" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Then get the list of all the genes\n", "all_genes = [g.id for g in model.genes]\n", "\n", "# Running in silico (takes a while)\n", "knockout = single_gene_deletion(model, gene_list=all_genes)\n", "\n", "# This is a fix to get the gene's id as the index\n", "knockout['ids'] = [list(i)[0] for i in knockout.ids]\n", "knockout = knockout.set_index('ids')\n", "\n", "# The output of the function single_gene_deletion is a dataframe\n", "knockout.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We define a threshold to define whether the reduction on the biomass flux is considered lethal.\n", "threshold = 0.01\n", "\n", "# Use this threshold to find which set of genes' knock out reduce the predicted growth below the threshold.\n", "insilico_lethals = set(knockout.index[knockout.growth< threshold])\n", "# The set of non-essential genes are the genes with a growth value above the threshold.\n", "insilico_non_lethals = set(knockout.index[knockout.growth > threshold])\n", "\n", "print(\"in-silico lethals:\", len(insilico_lethals))\n", "print(\"in-silico non lethals:\", len(insilico_non_lethals))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Now we need to experimentally verify essential and non-essential gene sets.\n", "\n", "# Read the set of essential genes in vivo\n", "import json\n", "fname = './data/m9_invivo_lethals.json'\n", "with open(fname) as fh:\n", " invivo_lethals = json.load(fh)\n", " invivo_lethals = set(invivo_lethals)\n", " \n", "# Convert the list of all model genes into a set.\n", "all_genes = set(all_genes)\n", "\n", "# We can use the difference to obtain the list of in vivo non-lethals\n", "invivo_non_lethals = all_genes - invivo_lethals\n", "\n", "# Print the size of both sets\n", "print(\"in-vivo lethals:\", len(invivo_lethals))\n", "print(\"in-vivo non lethals:\", len(invivo_non_lethals))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# https://en.wikipedia.org/wiki/Receiver_operating_characteristic\n", "\n", "# True Positives, genes predicted as essentials that are essentials in vivo (correctly predicted)\n", "TP = insilico_lethals & invivo_lethals\n", "\n", "# True Negatives, genes predicted as NON-essentials that are NON-essential in vivo (correctly predicted)\n", "TN = insilico_non_lethals & invivo_non_lethals\n", "\n", "# False Positives, wrongly predicted as NON-essential genes\n", "FN = insilico_non_lethals & invivo_lethals\n", "\n", "# False Positives, wrongly predicted as essential genes\n", "FP = insilico_lethals & invivo_non_lethals\n", "\n", "# True in vivo esssential genes\n", "P = TP | FN\n", "# True in vivo NON-esssential genes\n", "N = TN | FP" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Excercise 3.3:\n", "Complete the following table using the values from Exercise 3.2 (*E. coli*)\n", "\n", "| In vivo \\ In silico | in silico lethal | in silico non-lethal |\n", "| -------------------------- |:----------------:| --------------------:|\n", "| in vivo lethal | ? | ? |\n", "| in vivo non-lehtal | ? | ? |\n", "\n", "\n", "Total negatives = {{N}}\n", "\n", "### Excercise 3.4:\n", "Acces the following link:\n", "\n", "https://en.wikipedia.org/wiki/Sensitivity_and_specificity\n", "\n", "Get the formulas and calculate the following metrics:\n", "* sensitivity\n", "* specificity\n", "* precision\n", "* accuracy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sensitivity, recall, hit rate, or true positive rate (TPR)\n", "# We computed the sensitivity as follows\n", "sensitivity = len(TP) / len(P) \n", "\n", "# TODO\n", "# complete the following code\n", "\n", "# Specificity, selectivity or true negative rate (TNR)\n", "specificity = None ## COMPLETE HERE \n", "\n", "# Precision or positive predictive value (PPV)\n", "precision = None ## COMPLETE HERE\n", "\n", "# Accuracy (ACC)\n", "accuracy = None ## COMPLETE HERE\n", "\n", "# Print the different values and discuss their meanings" ] } ], "metadata": { "kernelspec": { "display_name": "venv", "language": "python", "name": "venv" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" }, "latex_metadata": { "affiliation": "Barcelona Supercomputing Center, Barcelona Spain", "authors": "Miguel Ponce de Leon, Arnau Montagud", "title": "Constraint-based modelling tutorial" } }, "nbformat": 4, "nbformat_minor": 1 }