{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Pathway pYPK0_SsXYL1_SsXYL2_ScXKS1_ScTAL1\n", "\n", "This notebook describes the assembly of 4 single gene expression cassettes into a single pathway. \n", "Notebooks describing the single gene expression vectors are linked at the end of this document as are notebooks \n", "describing pYPKa promoter, gene and terminator vectors. Specific primers needed are also listed below.\n", "\n", "![pathway with N genes](pw.png \"pathway with N genes\")\n", "\n", "The [pydna](https://pypi.python.org/pypi/pydna/) package is imported in the code cell below. \n", "There is a [publication](http://www.biomedcentral.com/1471-2105/16/142) describing pydna as well as\n", "[documentation](http://pydna.readthedocs.org/en/latest/) available online. \n", "Pydna is developed on [Github](https://github.com/BjornFJohansson/pydna).\n", "\n", "The assembly performed here is based on content of the [INDATA_pth6.txt](INDATA_pth6.txt) text file.\n", "The assembly log can be viewed [here](log.txt)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from pydna.parsers import parse_primers\n", "from pydna.readers import read\n", "from pydna.amplify import pcr\n", "from pydna.assembly import Assembly" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initiate the standard primers needed to amplify each cassette.\n", "The first cassette in the pathway is amplified with standard\n", "primers 577 and 778, the last with\n", "775 and 578 and all others with 775 and 778.\n", "Standard primers are listed [here](standard_primers.txt)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "p = { x.id: x for x in parse_primers(\"standard_primers.txt\") }" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The backbone vector is linearized with [EcoRV](http://rebase.neb.com/rebase/enz/EcoRV.html)." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from Bio.Restriction import EcoRV, NotI, PacI\n", "\n", "pYPKpw = read(\"pYPKpw.gb\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The cassette_products variable holds the list of expression cassette PCR products to\n", "be assembled." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "cassette_products = []" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The expression cassettes come from a series of single gene expression vectors \n", "held in the template_vectors list." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[File(id?)(o8024), File(id?)(o8580), File(id?)(o9223), File(id?)(o8383)]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cassette_vectors ='''\n", " \n", " pYPK0_TEF1_SsXYL1_TDH3.gb\n", " \n", " pYPK0_TDH3_SsXYL2_PGI.gb\n", " \n", " pYPK0_PGI_ScXKS1_FBA1.gb\n", " \n", " pYPK0_FBA1_ScTAL1_PDC1.gb'''.splitlines()\n", "\n", "template_vectors = [read(v.strip()) for v in cassette_vectors if v.strip()]\n", "\n", "template_vectors" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first cassette in the pathway is amplified with standard primers 577 and 778. Suggested PCR conditions can be found at the end of this document." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "cassette_products.append( pcr( p['577'], p['778'], template_vectors[0] ) )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cassettes in the middle of the pathway are amplified with standard primers 775 and 778. Suggested PCR conditions can be found at the end of this document." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "cassette_products.extend( pcr( p['775'], p['778'], v) for v in template_vectors[1:-1] ) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The last cassette in the pathway is amplified with standard primers 775 and 578. Suggested PCR conditions can be found at the end of this document." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "cassette_products.append( pcr( p['775'], p['578'], template_vectors[-1] ) )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The cassettes are given names based on their order in the final construct in the code cell below." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cassette 1\n", "Cassette 2\n", "Cassette 3\n", "Cassette 4\n" ] } ], "source": [ "for i, cp in enumerate(cassette_products):\n", " cp.name = \"Cassette {}\".format(i+1)\n", " print(cp.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cassettes and plasmid backbone are joined by homologous recombination in a Saccharomyces cerevisiae ura3 host\n", "which selects for the URA3 gene in pYPKpw." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Assembly:\n", "Sequences........................: [5603] [2524] [2924] [3567] [3009]\n", "Sequences with shared homologies.: [5603] [2524] [3009] [2924] [3567]\n", "Homology limit (bp)..............: 110\n", "Number of overlaps...............: 5\n", "Nodes in graph(incl. 
5' & 3')....: 7\n", "Only terminal overlaps...........: No\n", "Circular products................: [14800]\n", "Linear products..................: [15838] [15536] [15468] [15044] [14941] [13650] [13153] [12939] [12703] [11267] [10751] [10174] [9582] [8368] [7986] [7794] [7241] [5908] [5453] [4712] [1038] [736] [668] [244] [141]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "asm = Assembly( [pYPKpw.linearize(EcoRV)] + cassette_products, limit=167-47-10)\n", "asm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Normally, only one circular product should be formed since the \n", "homology limit is quite large (see cell above). More than one \n", "circular product might indicate an incorrect strategy. \n", "The largest recombination product is chosen as candidate for \n", "the pYPK0_SsXYL1_SsXYL2_ScXKS1_ScTAL1 pathway." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "candidate = asm.assemble_circular()[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This assembly figure shows how the fragments came together." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " -|pYPKpw|124\n", "| \\/\n", "| /\\\n", "| 124|Cassette 1|711\n", "| \\/\n", "| /\\\n", "| 711|Cassette 2|1013\n", "| \\/\n", "| /\\\n", "| 1013|Cassette 3|643\n", "| \\/\n", "| /\\\n", "| 643|Cassette 4|242\n", "| \\/\n", "| /\\\n", "| 242-\n", "| |\n", " -------------------------------------------------------------------------" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "candidate.figure()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The final pathway is synchronized to the backbone vector. This means that\n", "the plasmid origin is shifted so that it matches the original." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": true }, "outputs": [], "source": [ "pw = candidate.synced(pYPKpw)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The cseguid checksum for the resulting plasmid is calculated for future reference.\n", "The [cseguid checksum](http://pydna.readthedocs.org/en/latest/pydna.html#pydna.utils.cseguid) \n", "uniquely identifies a circular double stranded sequence." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3RjS0AfdAqVibyeJfuLPrPqWkl4" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pw.cseguid()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The file is given a name based on the sequence of expressed genes." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": true }, "outputs": [], "source": [ "pw.locus = \"pw\"\n", "pw.definition = \"pYPK0_SsXYL1_SsXYL2_ScXKS1_ScTAL1\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Stamp sequence with cseguid checksum. This can be used to verify the \n", "integrity of the sequence file." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "cSEGUID_3RjS0AfdAqVibyeJfuLPrPqWkl4" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pw.stamp()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Write sequence to a local file." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "pYPK0_SsXYL1_SsXYL2_ScXKS1_ScTAL1.gb
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "pw.write(\"pYPK0_SsXYL1_SsXYL2_ScXKS1_ScTAL1.gb\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The pathway can be extended by digestion with either NotI or PacI or both provided that the enzymes cut once in the final pathway sequence." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "NotI cuts 1 time(s) and PacI cuts 1 time(s) in the final pathway.\n" ] } ], "source": [ "print(\"NotI cuts {} time(s) and PacI cuts {} time(s) in the final pathway.\".format(len(pw.cut(NotI)), len(pw.cut(PacI))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DOWNLOAD [pYPK0_SsXYL1_SsXYL2_ScXKS1_ScTAL1](pYPK0_SsXYL1_SsXYL2_ScXKS1_ScTAL1.gb)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "cSEGUID_3RjS0AfdAqVibyeJfuLPrPqWkl4" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pydna\n", "\n", "reloaded = read(\"pYPK0_SsXYL1_SsXYL2_ScXKS1_ScTAL1.gb\")\n", "\n", "reloaded.verify_stamp()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### New Primers needed for assembly.\n", "\n", "This list contains all needed primers that are not in the standard primer [list](standard_primers.txt) above." 
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ">fw630 FBA1\n", "ttaaatATAACAATACTGACAGTACTAAATAATT\n", ">rv630 FBA1\n", "taattaaTTTGAATATGTATTACTTGGTTATG\n", ">fw630 FBA1\n", "ttaaatATAACAATACTGACAGTACTAAATAATT\n", ">rv630 FBA1\n", "taattaaTTTGAATATGTATTACTTGGTTATG\n", ">fw955 PDC1\n", "ttaaatAGGGTAGCCTCCCC\n", ">rv955 PDC1\n", "taattaaTTTGATTGATTTGACTGTGTT\n", ">fw955 PDC1\n", "ttaaatAGGGTAGCCTCCCC\n", ">rv955 PDC1\n", "taattaaTTTGATTGATTTGACTGTGTT\n", ">fw1000 PGI\n", "ttaaatAATTCAGTTTTCTGACTGAGT\n", ">rv1000 PGI\n", "taattaaTTTTAGGCTGGTATCTTGAT\n", ">fw1000 PGI\n", "ttaaatAATTCAGTTTTCTGACTGAGT\n", ">rv1000 PGI\n", "taattaaTTTTAGGCTGGTATCTTGAT\n", ">fw698 TDH3\n", "ttaaatATAAAAAACACGCTTTTTC\n", ">rv698 TDH3\n", "taattaaTTTGTTTGTTTATGTGTGTTT\n", ">fw698 TDH3\n", "ttaaatATAAAAAACACGCTTTTTC\n", ">rv698 TDH3\n", "taattaaTTTGTTTGTTTATGTGTGTTT\n", ">fw579 TEF1\n", "ttaaatACAATGCATACTTTGTACGT\n", ">rv579 TEF1\n", "taattaaTTTGTAATTAAAACTTAGATTAGATTG\n", ">fw579 TEF1\n", "ttaaatACAATGCATACTTTGTACGT\n", ">rv579 TEF1\n", "taattaaTTTGTAATTAAAACTTAGATTAGATTG\n", ">fw1008 ScTAL1\n", "tgcccactttctcactagtgacctgcagccgacAAATGTCTGAACCAGCTCA\n", ">rv1008 ScTAL1\n", "AAatcctgatgcgtttgtctgcacagatggCACTTAAGCGGTAACTTTCTTTTC\n", ">fw1008 ScTAL1\n", "tgcccactttctcactagtgacctgcagccgacAAATGTCTGAACCAGCTCA\n", ">rv1008 ScTAL1\n", "AAatcctgatgcgtttgtctgcacagatggCACTTAAGCGGTAACTTTCTTTTC\n", ">fw1803 ScXKS1\n", "tgcccactttctcactagtgacctgcagccgacAAATGTTGTGTTCAGTAATTCAG\n", ">rv1803 ScXKS1\n", "AAatcctgatgcgtttgtctgcacagatggCACTTAGATGAGAGTCTTTTCCAG\n", ">fw1803 ScXKS1\n", "tgcccactttctcactagtgacctgcagccgacAAATGTTGTGTTCAGTAATTCAG\n", ">rv1803 ScXKS1\n", "AAatcctgatgcgtttgtctgcacagatggCACTTAGATGAGAGTCTTTTCCAG\n", ">fw1092 SsXYL2\n", "tgcccactttctcactagtgacctgcagccgacAAATGACTGCTAACCCTTCC\n", ">rv1092 SsXYL2\n", "AAatcctgatgcgtttgtctgcacagatggCACTTACTCAGGGCCGTCA\n", ">fw1092 SsXYL2\n", 
"tgcccactttctcactagtgacctgcagccgacAAATGACTGCTAACCCTTCC\n", ">rv1092 SsXYL2\n", "AAatcctgatgcgtttgtctgcacagatggCACTTACTCAGGGCCGTCA\n", ">fw957 SsXYL1\n", "tgcccactttctcactagtgacctgcagccgacAAATGCCTTCTATTAAGTTGAAC\n", ">rv957 SsXYL1\n", "AAatcctgatgcgtttgtctgcacagatggCACTTAGACGAAGATAGGAATCTTG\n", ">fw957 SsXYL1\n", "tgcccactttctcactagtgacctgcagccgacAAATGCCTTCTATTAAGTTGAAC\n", ">rv957 SsXYL1\n", "AAatcctgatgcgtttgtctgcacagatggCACTTAGACGAAGATAGGAATCTTG\n", "\n" ] } ], "source": [ "try:\n", " with open(\"new_primers.txt\") as f: \n", " text = f.read()\n", "except IOError:\n", " text = \"no new primers needed.\"\n", "print(text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### New single gene expression vectors (pYPK0_prom_gene_term) needed for assembly.\n", "\n", "Hyperlinks to notebook files describing the singlke gene expression plasmids needed for the assembly.\n", "\n", "[pYPK0_TEF1_SsXYL1_TDH3](pYPK0_TEF1_SsXYL1_TDH3.ipynb) \n", "[pYPK0_TDH3_SsXYL2_PGI](pYPK0_TDH3_SsXYL2_PGI.ipynb) \n", "[pYPK0_PGI_ScXKS1_FBA1](pYPK0_PGI_ScXKS1_FBA1.ipynb) \n", "[pYPK0_FBA1_ScTAL1_PDC1](pYPK0_FBA1_ScTAL1_PDC1.ipynb) \n", "\n", "\n", "### New pYPKa vectors needed for assembly of the single gene expression vectors above.\n", "\n", "Hyperlinks to notebook files describing the pYPKa plasmids needed for the assembly of the single gene clones listed above.\n", "\n", "[pYPKa_ZE_TEF1](pYPKa_ZE_TEF1.ipynb) \n", "[pYPKa_ZE_TDH3](pYPKa_ZE_TDH3.ipynb) \n", "[pYPKa_ZE_PGI](pYPKa_ZE_PGI.ipynb) \n", "[pYPKa_ZE_FBA1](pYPKa_ZE_FBA1.ipynb) \n", "[pYPKa_ZE_PDC1](pYPKa_ZE_PDC1.ipynb) \n", "\n", "\n", "### Suggested PCR conditions" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\n", "\n", "\n", "product name: Cassette 1\n", "forward primer 577\n", "reverse primer 778\n", "\n", "Taq (rate 30 nt/s) 35 cycles |2524bp\n", "95.0°C |95.0°C | |Tm formula: Biopython Tm_NN\n", "|_________|_____ 
72.0°C |72.0°C|SaltC 50mM\n", "| 03min00s|30s \\ ________|______|Primer1C 1.0µM\n", "| | \\ 53.1°C/ 1min16s| 5min |Primer2C 1.0µM\n", "| | \\_____/ | |GC 40%\n", "| | 30s | |4-12°C\n", "\n", "\n", "\n", "\n", "\n", "product name: Cassette 2\n", "forward primer 775\n", "reverse primer 778\n", "\n", "Taq (rate 30 nt/s) 35 cycles |2924bp\n", "95.0°C |95.0°C | |Tm formula: Biopython Tm_NN\n", "|_________|_____ 72.0°C |72.0°C|SaltC 50mM\n", "| 03min00s|30s \\ ________|______|Primer1C 1.0µM\n", "| | \\ 53.4°C/ 1min28s| 5min |Primer2C 1.0µM\n", "| | \\_____/ | |GC 42%\n", "| | 30s | |4-12°C\n", "\n", "\n", "\n", "\n", "\n", "product name: Cassette 3\n", "forward primer 775\n", "reverse primer 778\n", "\n", "Taq (rate 30 nt/s) 35 cycles |3567bp\n", "95.0°C |95.0°C | |Tm formula: Biopython Tm_NN\n", "|_________|_____ 72.0°C |72.0°C|SaltC 50mM\n", "| 03min00s|30s \\ ________|______|Primer1C 1.0µM\n", "| | \\ 52.6°C/ 1min47s| 5min |Primer2C 1.0µM\n", "| | \\_____/ | |GC 39%\n", "| | 30s | |4-12°C\n", "\n", "\n", "\n", "\n", "\n", "product name: Cassette 4\n", "forward primer 775\n", "reverse primer 578\n", "\n", "Taq (rate 30 nt/s) 35 cycles |3009bp\n", "95.0°C |95.0°C | |Tm formula: Biopython Tm_NN\n", "|_________|_____ 72.0°C |72.0°C|SaltC 50mM\n", "| 03min00s|30s \\ ________|______|Primer1C 1.0µM\n", "| | \\ 56.0°C/ 1min30s| 5min |Primer2C 1.0µM\n", "| | \\_____/ | |GC 39%\n", "| | 30s | |4-12°C\n" ] } ], "source": [ "for prd in cassette_products:\n", " print(\"\\n\\n\\n\\n\")\n", " print(\"product name:\", prd.name)\n", " print(\"forward primer\", prd.forward_primer.name)\n", " print(\"reverse primer\", prd.reverse_primer.name)\n", " print(prd.program())" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", 
"version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 2 }