{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Construction of promoter vector pYPKa_Z_TDH3 and terminator vector pYPKa_E_TDH3\n",
    "\n",
    "This notebook describe the construction of _E. coli_ vectors [pYPKa_Z_TDH3](pYPKa_Z_TDH3.gb) and [pYPKa_E_TDH3](pYPKa_E_TDH3.gb)\n",
    "with the same insert for which PCR primers are also designed.\n",
    "\n",
    "The insert defined below is cloned in pYPKa using the blunt restriction\n",
    "enzymes [ZraI](http://rebase.neb.com/rebase/enz/ZraI.html) and [EcoRV](http://rebase.neb.com/rebase/enz/EcoRV.html) in\n",
    "two different plasmids. The insert cloned in [ZraI](http://rebase.neb.com/rebase/enz/ZraI.html)\n",
    "will be used as a promoter, while in the [EcoRV](http://rebase.neb.com/rebase/enz/EcoRV.html) site the insert will be used as a\n",
    "terminator.\n",
    "\n",
    "![pYPKa_Z and pYPKa_E plasmids](pYPK_ZE.png \"pYPKa_Z and pYPKa_E plasmids\")\n",
    "\n",
    "The [pydna](https://pypi.python.org/pypi/pydna/) package is imported in the code cell below.\n",
    "There is a [publication](http://www.biomedcentral.com/1471-2105/16/142) describing pydna as well as\n",
    "[documentation](http://pydna.readthedocs.org/en/latest/) available online.\n",
    "Pydna is developed on [Github](https://github.com/BjornFJohansson/pydna)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pydna.readers import read\n",
    "from pydna.parsers import parse\n",
    "from pydna.parsers import parse_primers\n",
    "from pydna.design import primer_design\n",
    "from pydna.amplify import pcr\n",
    "from pydna.amplify import Anneal"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The vector backbone [pYPKa](pYPKa.gb) is read from a local file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "pYPKa = read(\"pYPKa.gb\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both restriction enzymes are imported from [Biopython](http://biopython.org)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from Bio.Restriction import ZraI, EcoRV"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The vector is linearized with both enzymes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "pYPKa_ZraI  = pYPKa.linearize(ZraI)\n",
    "pYPKa_EcoRV = pYPKa.linearize(EcoRV)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The insert sequence is read from a local file. This sequence was parsed from the ypkpathway data file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "ins = read(\"TDH3.gb\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Primers for the terminator promoter need specific tails in order to produce\n",
    "a [SmiI](http://rebase.neb.com/rebase/enz/SmiI.html) and a [PacI](http://rebase.neb.com/rebase/enz/PacI.html) \n",
    "when cloned in pYPKa in the EcoRV cloning position."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "fp_tail = \"ttaaat\"\n",
    "rp_tail = \"taattaa\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Primers with the tails above are designed in the code cell below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "ins = primer_design(ins)\n",
    "fp = fp_tail + ins.forward_primer\n",
    "rp = rp_tail + ins.reverse_primer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The primers are included in the [new_primer.txt](new_primers.txt) list and in the end of the [pathway notebook](pw.ipynb) file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ">fw698 TDH3\n",
      "ttaaatATAAAAAACACGCTTTTTC\n",
      "\n",
      ">rv698 TDH3\n",
      "taattaaTTTGTTTGTTTATGTGTGTTT\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(fp.format(\"fasta\"))\n",
    "print(rp.format(\"fasta\"))\n",
    "with open(\"new_primers.txt\", \"a+\") as f:\n",
    "    f.write(fp.format(\"fasta\"))\n",
    "    f.write(rp.format(\"fasta\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "PCR to create the insert using the newly designed primers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "prd = pcr(fp, rp, ins)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The PCR product has this length in bp."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "711"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(prd)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A figure of the primers annealing on template."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "      5ATAAAAAACACGCTTTTTC...AAACACACATAAACAAACAAA3\n",
       "                             ||||||||||||||||||||| tm 50.8 (dbd) 56.4\n",
       "                            3TTTGTGTGTATTTGTTTGTTTaattaat5\n",
       "5ttaaatATAAAAAACACGCTTTTTC3\n",
       "       ||||||||||||||||||| tm 48.8 (dbd) 55.3\n",
       "      3TATTTTTTGTGCGAAAAAG...TTTGTGTGTATTTGTTTGTTT5"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prd.figure()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A suggested PCR program."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\n",
       "Taq (rate 30 nt/s) 35 cycles             |711bp\n",
       "95.0°C    |95.0°C                 |      |Tm formula: Biopython Tm_NN\n",
       "|_________|_____          72.0°C  |72.0°C|SaltC 50mM\n",
       "| 03min00s|30s  \\         ________|______|Primer1C 1.0µM\n",
       "|         |      \\ 50.9°C/ 0min21s| 5min |Primer2C 1.0µM\n",
       "|         |       \\_____/         |      |GC 34%\n",
       "|         |         30s           |      |4-12°C"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prd.program()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The final vectors are:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "pYPKa_Z_TDH3 = (pYPKa_ZraI  + prd).looped().synced(pYPKa)\n",
    "pYPKa_E_TDH3 = (pYPKa_EcoRV + prd).looped().synced(pYPKa)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The final vectors with reverse inserts are created below. These vectors theoretically make up\n",
    "fifty percent of the clones. The PCR strategy below is used to identify the correct clones."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "pYPKa_Z_TDH3b = (pYPKa_ZraI  + prd.rc()).looped().synced(pYPKa)\n",
    "pYPKa_E_TDH3b = (pYPKa_EcoRV + prd.rc()).looped().synced(pYPKa)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A combination of standard primers and the newly designed primers are\n",
    "used for the strategy to identify correct clones.\n",
    "Standard primers are listed [here](standard_primers.txt)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "p = { x.id: x for x in parse_primers(\"standard_primers.txt\") }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Diagnostic PCR confirmation\n",
    "\n",
    "The correct structure of pYPKa_Z_TDH3 is confirmed by PCR using standard primers\n",
    "577 and 342 that are vector specific together with the TDH3fw primer specific for the insert\n",
    "in a multiplex PCR reaction with\n",
    "all three primers present.\n",
    "\n",
    "Two PCR products are expected if the insert was cloned, the sizes depend\n",
    "on the orientation. If the vector is empty or contains another insert, only one\n",
    "product is formed.\n",
    "\n",
    "#### Expected PCR products sizes from pYPKa_Z_TDH3:\n",
    "\n",
    "pYPKa_Z_TDH3 with insert in correct orientation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Amplicon(1645), Amplicon(1477)]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Anneal( (p['577'], p['342'], fp), pYPKa_Z_TDH3).products"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "pYPKa_Z_TDH3 with insert in reverse orientation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Amplicon(1645), Amplicon(879)]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Anneal( (p['577'], p['342'], fp), pYPKa_Z_TDH3b).products"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Empty pYPKa clone."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Amplicon(934)]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Anneal( (p['577'], p['342'], fp), pYPKa).products"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Expected PCR products sizes pYPKa_E_TDH3:\n",
    "\n",
    "pYPKa_E_TDH3 with insert in correct orientation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Amplicon(1645), Amplicon(1396)]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Anneal( (p['577'], p['342'], fp), pYPKa_E_TDH3).products"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "pYPKa_E_TDH3 with insert in reverse orientation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Amplicon(1645), Amplicon(960)]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Anneal( (p['577'], p['342'], fp), pYPKa_E_TDH3b).products\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cseguid checksum for the resulting plasmids are calculated for future reference.\n",
    "The [cseguid checksum](http://pydna.readthedocs.org/en/latest/pydna.html#pydna.utils.cseguid)\n",
    "uniquely identifies a circular double stranded sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Z8LFAVHm3ruuq_dov31lqpmfeqA\n",
      "8zg87HoFsdPFl5Ao-Sup64AGvFs\n"
     ]
    }
   ],
   "source": [
    "print(pYPKa_Z_TDH3.cseguid())\n",
    "print(pYPKa_E_TDH3.cseguid())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sequences are named based on the name of the cloned insert."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "pYPKa_Z_TDH3.locus = \"pYPKa_Z_TDH3\"[:16]\n",
    "pYPKa_E_TDH3.locus = \"pYPKa_Z_TDH3\"[:16]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sequences are stamped with the cseguid checksum.\n",
    "This can be used to verify the integrity of the sequence file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "cSEGUID_8zg87HoFsdPFl5Ao-Sup64AGvFs"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pYPKa_Z_TDH3.stamp()\n",
    "pYPKa_E_TDH3.stamp()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sequences are written to local files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<font face=monospace><a href='pYPKa_Z_TDH3.gb' target='_blank'>pYPKa_Z_TDH3.gb</a></font><br>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<font face=monospace><a href='pYPKa_E_TDH3.gb' target='_blank'>pYPKa_E_TDH3.gb</a></font><br>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "pYPKa_Z_TDH3.write(\"pYPKa_Z_TDH3.gb\")\n",
    "pYPKa_E_TDH3.write(\"pYPKa_E_TDH3.gb\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Download [pYPKa_Z_TDH3](pYPKa_Z_TDH3.gb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "cSEGUID_Z8LFAVHm3ruuq_dov31lqpmfeqA"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pydna\n",
    "reloaded = read(\"pYPKa_Z_TDH3.gb\")\n",
    "reloaded.verify_stamp()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Download [pYPKa_E_TDH3](pYPKa_E_TDH3.gb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "cSEGUID_8zg87HoFsdPFl5Ao-Sup64AGvFs"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pydna\n",
    "reloaded = read(\"pYPKa_E_TDH3.gb\")\n",
    "reloaded.verify_stamp()"
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 2
}