{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Predict gene knockout strategies\n", "\n", "In cameo there are two ways to predict gene knockout targets: evolutionary algorithms (OptGene) or linear programming (OptKnock)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "If you're running this notebook on [try.cameo.bio](http://try.cameo.bio), things might run very slowly because we cannot provide access to the proprietary [CPLEX](https://www-01.ibm.com/software/commerce/optimization/cplex-optimizer/) solver on a public webserver. Furthermore, Jupyter kernels might crash and restart due to memory limitations on the server.\n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from cameo import models" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [], "source": [ "model = models.bigg.iJO1366" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [], "source": [ "wt_solution = model.solve()\n", "growth = wt_solution.fluxes[\"BIOMASS_Ec_iJO1366_core_53p95M\"]\n", "acetate_production = wt_solution.fluxes[\"EX_ac_e\"]" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from cameo import phenotypic_phase_plane" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "p = phenotypic_phase_plane(model, variables=['BIOMASS_Ec_iJO1366_core_53p95M'], objective='EX_ac_e')\n", "p.plot(points=[(growth, acetate_production)])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## OptGene\n", "\n", "OptGene is an approach to search for gene or reaction knockouts that relies on evolutionary algorithms[1]. The following image from authors summarizes the OptGene workflow.\n", "\n", "\n", "\n", "Every iteration we keep the best 50 individuals so we can generate a library of targets." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from cameo.strain_design.heuristic.evolutionary_based import OptGene" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [], "source": [ "optgene = OptGene(model)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Starting optimization at Fri, 17 Jun 2016 15:01:57\n", "Finished after 00:01:48\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/joao/.virtualenvs/cameo-py3/lib/python3.4/site-packages/bokeh/io.py:532: UserWarning:\n", "\n", "Cannot find a last shown plot to update. Call output_notebook() and show() before push_notebook()\n", "\n" ] } ], "source": [ "result = optgene.run(target=\"EX_ac_e\", \n", " biomass=\"BIOMASS_Ec_iJO1366_core_53p95M\",\n", " substrate=\"glc__D_e\",\n", " max_evaluations=5000,\n", " plot=False)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n", "

OptGene Result

\n", " \n", "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
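<table>
<thead>
<tr><th></th><th>reactions</th><th>genes</th><th>size</th><th>fva_min</th><th>fva_max</th><th>target_flux</th><th>biomass_flux</th><th>yield</th><th>fitness</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>(UM4PL, PGCD, CPH4S, UM3PL, HEPT4, ATPS4rpp)</td><td>((b4233, b2913, b2765, b3738, b3623),)</td><td>5.0</td><td>0.0</td><td>14.976296</td><td>13.006011</td><td>0.388364</td><td>1.300601</td><td>0.505107</td></tr>
<tr><th>1</th><td>(PGCD, ARBabcpp, ACNAMt2pp, ATPS4rpp)</td><td>((b1900, b2913, b1033, b3738, b3224),)</td><td>5.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
<tr><th>2</th><td>(PGCD, ATPS4rpp, ALAt4pp, GLYt4pp)</td><td>((b2913, b0007, b1612, b1033, b3738), (b2913, ...</td><td>5.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
<tr><th>3</th><td>(HPYRI, PGCD, MALDDH, QUINDH, ATPS4rpp)</td><td>((b1800, b0508, b2913, b3738, b1692),)</td><td>5.0</td><td>0.0</td><td>14.976296</td><td>14.471909</td><td>0.388364</td><td>1.447191</td><td>0.562037</td></tr>
<tr><th>4</th><td>(PGCD, QUINDH, ATPS4rpp)</td><td>((b4024, b2913, b1033, b3738, b1692),)</td><td>5.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
<tr><th>5</th><td>(ALAt4pp, PGCD, GLYt4pp, QUINDH, ATPS4rpp)</td><td>((b4226, b2913, b0007, b3738, b1692),)</td><td>5.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
<tr><th>6</th><td>(PGCD, DDGLK, CPH4S, QUINDH, ATPS4rpp)</td><td>((b3738, b2913, b2765, b3526, b1692),)</td><td>5.0</td><td>0.0</td><td>14.976296</td><td>14.471909</td><td>0.388364</td><td>1.447191</td><td>0.562037</td></tr>
<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
<tr><th>41</th><td>(PGCD, CINNDO, ATPS4rpp, PPPNDO, ARGDCpp)</td><td>((b2913, b1033, b2938, b4226, b2542, b3738, b2...</td><td>7.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
<tr><th>42</th><td>(PGCD, SUCptspp, EDA, ACMUMptspp, AGMt2pp, ATP...</td><td>((b2913, b0007, b2429, b3738, b1850, b0433, b1...</td><td>7.0</td><td>0.0</td><td>14.976296</td><td>14.597554</td><td>0.388364</td><td>1.459755</td><td>0.566916</td></tr>
<tr><th>43</th><td>(FUCtpp, PGCD, G6PDA, QUINDH, ATPS4rpp, NO3R1bpp)</td><td>((b2913, b1033, b2204, b1692, b3738, b0678, b2...</td><td>7.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
<tr><th>44</th><td>(HKNDDH, CITL, PGCD, FRD3, FRD2, SUCptspp, ACM...</td><td>((b4154, b2913, b2429, b3738, b0614, b3517, b0...</td><td>7.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
<tr><th>45</th><td>(PGCD, HCYSMT2, QUINDH, HCYSMT, ATPS4rpp, ALAt...</td><td>((b2913, b0007, b1033, b1692, b3738, b0261, b2...</td><td>7.0</td><td>0.0</td><td>14.976296</td><td>14.304026</td><td>0.388364</td><td>1.430403</td><td>0.555517</td></tr>
<tr><th>46</th><td>(PGCD, FRD3, FRD2, SUCptspp, ACMUMptspp, HKNTD...</td><td>((b4154, b2913, b2429, b3738, b0614, b1101, b0...</td><td>7.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
<tr><th>47</th><td>(PGCD, QUINDH, ATPS4rpp)</td><td>((b2913, b1033, b4226, b1692, b3738, b4244, b3...</td><td>8.0</td><td>0.0</td><td>14.976296</td><td>14.801391</td><td>0.388364</td><td>1.480139</td><td>0.574833</td></tr>
</tbody>
</table>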
\n", "

48 rows × 9 columns

\n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "result.plot(0)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/joao/.virtualenvs/cameo-py3/lib/python3.4/site-packages/escher/plots.py:155: UserWarning:\n", "\n", "Map not in cache. Attempting download from https://escher.github.io/1-0-0/5/maps/Escherichia%20coli/iJO1366.Central%20metabolism.json\n", "\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", " \n", " \n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", "
\n", "\n", " \n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "result.display_on_map(0, \"iJO1366.Central metabolism\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## OptKnock\n", "\n", "OptKnock uses a bi-level mixed-integer linear programming approach to identify reaction knockouts[2]:\n", "\n", "$$\n", "\\begin{matrix}\n", "maximize & \\mathit{v_{chemical}} & & (\\mathbf{OptKnock}) \\\\\n", "\\mathit{y_j} & & & \\\\\n", "subject~to & maximize & \\mathit{v_{biomass}} & (\\mathbf{Primal}) \\\\\n", "& \\mathit{v_j} & & & & \\\\\n", "\\end{matrix}\\\\\n", "\\begin{bmatrix}\n", "subject~to & \\sum_{j=1}^{M}S_{ij}v_{j} = 0,\\\\ \n", "& v_{carbon\\_uptake} = v_{carbon~target}\\\\ \n", "& v_{atp} \\ge v_{atp\\_main}\\\\ \n", "& v_{biomass} \\ge v_{target\\_biomass}\\\\ \n", "& v_{j}^{min} \\cdot y_j \\le v_j \\le v_{j}^{max} \\cdot y_j, \\forall j \\in \\boldsymbol{M} \\\\\n", "\\end{bmatrix}\\\\\n", "\\begin{align}\n", " & y_j \\in \\{0, 1\\}, & & \\forall j \\in \\boldsymbol{M} & \\\\\n", " & \\sum_{j \\in M} (1 - y_j) \\le K& & & \\\\\n", "\\end{align}\n", "$$\n", "\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from cameo.strain_design.deterministic.linear_programming import OptKnock" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [], "source": [ "optknock = OptKnock(model, fraction_of_optimum=0.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Searching for multiple simultaneous knockouts with OptKnock can take a few hours or even days, so this example limits the search to a single knockout." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "\n", " jQuery(\"#07ef2349-73d6-4731-ba71-482680ad55a0\").remove();\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "result = optknock.run(max_knockouts=1, target=\"EX_ac_e\", biomass=\"BIOMASS_Ec_iJO1366_core_53p95M\")" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "

OptKnock:

\n", "
    \n", "
  • Target: EX_ac_e
  • \n", "
\n", "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
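<table>
<thead>
<tr><th></th><th>reactions</th><th>size</th><th>EX_ac_e</th><th>biomass</th><th>fva_min</th><th>fva_max</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>{ATPS4rpp}</td><td>1.0</td><td>13.942943</td><td>0.402477</td><td>0.0</td><td>14.187817</td></tr>
</tbody>
</table>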
\n", "
" ], "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "result.plot(0)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", " \n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", "
\n", "\n", " \n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "result.display_on_map(0, \"iJO1366.Central metabolism\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[1] Patil, K. R., Rocha, I., Förster, J., & Nielsen, J. (2005). Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics, 6, 308. doi:10.1186/1471-2105-6-308\n", "\n", "[2] Burgard, A. P., Pharkya, P., & Maranas, C. D. (2003). OptKnock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnology and Bioengineering, 84(6), 647-657." ] } ], "metadata": { "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.5" } }, "nbformat": 4, "nbformat_minor": 0 }