# Construction of promoter vector pYPKa_Z_TEF1 and terminator vector pYPKa_E_TEF1

This notebook describes the construction of the _E. coli_ vectors [pYPKa_Z_TEF1](pYPKa_Z_TEF1.gb) and [pYPKa_E_TEF1](pYPKa_E_TEF1.gb),
which carry the same insert. PCR primers for this insert are also designed.

The insert defined below is cloned in pYPKa using the blunt restriction
enzymes [ZraI](http://rebase.neb.com/rebase/enz/ZraI.html) and [EcoRV](http://rebase.neb.com/rebase/enz/EcoRV.html) in
two different plasmids. The insert cloned in [ZraI](http://rebase.neb.com/rebase/enz/ZraI.html)
will be used as a promoter, while in the [EcoRV](http://rebase.neb.com/rebase/enz/EcoRV.html) site the insert will be used as a
terminator.

![pYPKa_Z and pYPKa_E plasmids](pYPK_ZE.png "pYPKa_Z and pYPKa_E plasmids")

The [pydna](https://pypi.python.org/pypi/pydna/) package is imported in the code cell below.
There is a [publication](http://www.biomedcentral.com/1471-2105/16/142) describing pydna as well as
[documentation](http://pydna.readthedocs.org/en/latest/) available online.
Pydna is developed on [Github](https://github.com/BjornFJohansson/pydna).

In [1]:
from pydna.readers import read
from pydna.parsers import parse
from pydna.parsers import parse_primers
from pydna.design import primer_design
from pydna.amplify import pcr
from pydna.amplify import Anneal

The vector backbone [pYPKa](pYPKa.gb) is read from a local file.

In [2]:
pYPKa = read("pYPKa.gb")

Both restriction enzymes are imported from [Biopython](http://biopython.org).

In [3]:
from Bio.Restriction import ZraI, EcoRV

The vector is linearized with both enzymes.

In [4]:
pYPKa_ZraI  = pYPKa.linearize(ZraI)
pYPKa_EcoRV = pYPKa.linearize(EcoRV)
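
To make the linearization step concrete, here is a minimal plain-Python sketch of what cutting a circular sequence at a unique blunt site amounts to. It is not pydna's implementation; the helper `linearize_blunt` and the toy sequence are hypothetical, and the recognition sites used are GAC^GTC for ZraI and GAT^ATC for EcoRV.

```python
def linearize_blunt(circular_seq, site, cut_offset):
    """Open a circular sequence at a unique blunt site (illustrative helper).

    cut_offset is the number of bases of the site left of the cut,
    e.g. 3 for GAC^GTC (ZraI) and GAT^ATC (EcoRV).
    """
    seq = circular_seq.upper()
    n = len(seq)
    # Extend the string so that a site spanning the origin is also found.
    extended = seq + seq[: len(site) - 1]
    hits = [i for i in range(n) if extended.startswith(site, i)]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one {site} site, found {len(hits)}")
    cut = (hits[0] + cut_offset) % n
    # The linear molecule starts right after the cut and ends right before it.
    return seq[cut:] + seq[:cut]


# Toy circular sequence (hypothetical, not pYPKa) with one site for each enzyme.
toy = "AAGACGTCTTTTGATATCCC"
toy_zrai = linearize_blunt(toy, "GACGTC", 3)
toy_ecorv = linearize_blunt(toy, "GATATC", 3)
print(toy_zrai)
print(toy_ecorv)
```

Both sites are palindromic, so searching one strand is sufficient. The linear molecule ends with the left half of the site and begins with the right half, which is what a blunt insert is later ligated between.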

The insert sequence is read from a local file. This sequence was parsed from the ypkpathway data file.

In [5]:
ins = read("TEF1.gb")

Primers for the terminator/promoter insert need specific tails so that
a [SmiI](http://rebase.neb.com/rebase/enz/SmiI.html) site and a [PacI](http://rebase.neb.com/rebase/enz/PacI.html) site
are formed when the insert is cloned in the EcoRV site of pYPKa.

In [6]:
fp_tail = "ttaaat"
rp_tail = "taattaa"
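
The effect of the tails can be checked with a few lines of plain Python. This is an illustrative sketch: it assumes only the EcoRV half sites (GAT^ATC) and the recognition sequences ATTTAAAT (SmiI) and TTAATTAA (PacI). The names `left_junction` and `right_junction` are introduced here for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string, preserving case."""
    return seq.translate(str.maketrans("acgtACGT", "tgcaTGCA"))[::-1]


fp_tail = "ttaaat"
rp_tail = "taattaa"

# EcoRV cuts GAT^ATC, leaving ...GAT on one vector end and ATC... on the other.
left_half, right_half = "GAT", "ATC"

# Vector end followed by the start of the PCR product (forward primer tail).
left_junction = (left_half + fp_tail).upper()
# End of the PCR product (reverse complement of the reverse primer tail)
# followed by the other vector end.
right_junction = (reverse_complement(rp_tail) + right_half).upper()

SMII_SITE = "ATTTAAAT"
PACI_SITE = "TTAATTAA"

print(left_junction, SMII_SITE in left_junction)
print(right_junction, PACI_SITE in right_junction)
```

The last base of each new site comes from the vector (the `AT` of `GAT` for SmiI, the `A` of `ATC` for PacI), so both sites arise only when the insert is ligated into the EcoRV site in this orientation.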

Primers with the tails above are designed in the code cell below.

In [7]:
ins = primer_design(ins)
fp = fp_tail + ins.forward_primer
rp = rp_tail + ins.reverse_primer

The primers are included in the [new_primers.txt](new_primers.txt) list and at the end of the [pathway notebook](pw.ipynb) file.

In [8]:
print(fp.format("fasta"))
print(rp.format("fasta"))
with open("new_primers.txt", "a+") as f:
    f.write(fp.format("fasta"))
    f.write(rp.format("fasta"))

>fw579 TEF1
ttaaatACAATGCATACTTTGTACGT

>rv579 TEF1
taattaaTTTGTAATTAAAACTTAGATTAGATTG



PCR to create the insert using the newly designed primers.

In [9]:
prd = pcr(fp, rp, ins)

The PCR product has this length in bp.

In [10]:
len(prd)

592
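
The reported length can be cross-checked by hand from the primers printed above. The sketch below assumes that lowercase letters mark the tails and that the number 579 in the primer names is the template length; the latter is an inference from the names and is not stated in this notebook.

```python
# Primer sequences as printed above; lowercase bases are the added tails.
fp_seq = "ttaaatACAATGCATACTTTGTACGT"
rp_seq = "taattaaTTTGTAATTAAAACTTAGATTAGATTG"

product_length = 592  # length reported for the PCR product


def tail_length(primer):
    """Count the lowercase (tail) bases of a primer."""
    return sum(1 for base in primer if base.islower())


tails = tail_length(fp_seq) + tail_length(rp_seq)
template_length = product_length - tails
print(tails, template_length)
```

Under these assumptions the two tails add 13 bp, which leaves 579 bp of template sequence between the outer ends of the annealing primers.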

A figure of the primers annealing on template.

In [11]:
prd.figure()

      5ACAATGCATACTTTGTACGT...CAATCTAATCTAAGTTTTAATTACAAA3
                              ||||||||||||||||||||||||||| tm 50.0 (dbd) 56.5
                             3GTTAGATTAGATTCAAAATTAATGTTTaattaat5
5ttaaatACAATGCATACTTTGTACGT3
       |||||||||||||||||||| tm 53.5 (dbd) 56.4
      3TGTTACGTATGAAACATGCA...GTTAGATTAGATTCAAAATTAATGTTT5

A suggested PCR program.

In [12]:
prd.program()


Taq (rate 30 nt/s) 35 cycles             |592bp
95.0°C    |95.0°C                 |      |Tm formula: Biopython Tm_NN
|_________|_____          72.0°C  |72.0°C|SaltC 50mM
| 03min00s|30s  \         ________|______|Primer1C 1.0µM
|         |      \ 51.1°C/ 0min18s| 5min |Primer2C 1.0µM
|         |       \_____/         |      |GC 34%
|         |         30s           |      |4-12°C

The final vectors are:

In [13]:
pYPKa_Z_TEF1 = (pYPKa_ZraI  + prd).looped().synced(pYPKa)
pYPKa_E_TEF1 = (pYPKa_EcoRV + prd).looped().synced(pYPKa)

The final vectors with reverse inserts are created below. These vectors theoretically make up
fifty percent of the clones. The PCR strategy below is used to identify the correct clones.

In [14]:
pYPKa_Z_TEF1b = (pYPKa_ZraI  + prd.rc()).looped().synced(pYPKa)
pYPKa_E_TEF1b = (pYPKa_EcoRV + prd.rc()).looped().synced(pYPKa)

A combination of standard primers and the newly designed primers is
used in the strategy to identify correct clones.
Standard primers are listed [here](standard_primers.txt).

In [15]:
p = { x.id: x for x in parse_primers("standard_primers.txt") }

## Diagnostic PCR confirmation

The correct structure of pYPKa_Z_TEF1 is confirmed by a multiplex PCR reaction
with three primers present: the vector-specific standard primers 577 and 342
and the insert-specific TEF1fw primer.

Two PCR products are expected if the insert was cloned; their sizes depend
on the orientation. If the vector is empty or contains another insert, only one
product is formed.

#### Expected PCR product sizes from pYPKa_Z_TEF1:

pYPKa_Z_TEF1 with insert in correct orientation.

In [16]:
Anneal( (p['577'], p['342'], fp), pYPKa_Z_TEF1).products

[Amplicon(1526), Amplicon(1358)]

pYPKa_Z_TEF1 with insert in reverse orientation.

In [17]:
Anneal( (p['577'], p['342'], fp), pYPKa_Z_TEF1b).products

[Amplicon(1526), Amplicon(760)]

Empty pYPKa clone.

In [18]:
Anneal( (p['577'], p['342'], fp), pYPKa).products

[Amplicon(934)]
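
In practice these three patterns are compared with band sizes estimated from a gel. The following sketch shows one way to automate that comparison for the pYPKa_Z_TEF1 sizes listed above. The function `classify_clone`, the 5 % tolerance and the example band sizes are hypothetical choices for illustration, not part of pydna or of the original protocol.

```python
# Expected product sizes (bp) for the ZraI clones, as reported above.
EXPECTED_Z = {
    "correct orientation": {1526, 1358},
    "reverse orientation": {1526, 760},
    "empty vector": {934},
}


def classify_clone(observed_bands, expected=EXPECTED_Z, tolerance=0.05):
    """Match band sizes estimated from a gel to an expected pattern.

    tolerance is a relative error allowed on each band (assumed value).
    """

    def close(observed, target):
        return abs(observed - target) <= tolerance * target

    for label, sizes in expected.items():
        same_count = len(observed_bands) == len(sizes)
        if same_count and all(
            any(close(band, size) for band in observed_bands) for size in sizes
        ):
            return label
    return "unexpected pattern"


# Hypothetical gel estimates.
print(classify_clone([1500, 1350]))
print(classify_clone([1500, 780]))
print(classify_clone([950]))
```

The 1526 bp band is shared by both orientations, so the second band is the one that discriminates between them.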

#### Expected PCR product sizes from pYPKa_E_TEF1:

pYPKa_E_TEF1 with insert in correct orientation.

In [19]:
Anneal( (p['577'], p['342'], fp), pYPKa_E_TEF1).products

[Amplicon(1526), Amplicon(1277)]

pYPKa_E_TEF1 with insert in reverse orientation.

In [20]:
Anneal( (p['577'], p['342'], fp), pYPKa_E_TEF1b).products


[Amplicon(1526), Amplicon(841)]
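
The sizes reported above are internally consistent, which can be verified with simple arithmetic. The sketch assumes that the product primed by the insert-specific primer spans the whole 592 bp insert plus the vector sequence between the cloning site and one vector primer; the name `flank` for that vector part is introduced here for illustration.

```python
INSERT = 592         # PCR product cloned into the vector
EMPTY_VECTOR = 934   # product of the two vector primers on empty pYPKa

# (correct orientation, reverse orientation) products primed by the insert primer
reported = {
    "pYPKa_Z_TEF1": (1358, 760),
    "pYPKa_E_TEF1": (1277, 841),
}

flanks = {}
for name, (correct, reverse) in reported.items():
    # Vector sequence on either side of the cloning site.
    flanks[name] = (correct - INSERT, reverse - INSERT)
    print(name, flanks[name], sum(flanks[name]))

# The vector-primer product on a clone is the empty product plus the insert.
full_product = EMPTY_VECTOR + INSERT
print(full_product)
```

For both vectors the two flanks add up to 934 bp, the size of the empty-vector product, and 934 + 592 gives the 1526 bp band common to all clones with an insert.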

The cseguid checksums for the resulting plasmids are calculated for future reference.
The [cseguid checksum](http://pydna.readthedocs.org/en/latest/pydna.html#pydna.utils.cseguid)
uniquely identifies a circular double stranded sequence.

In [21]:
print(pYPKa_Z_TEF1.cseguid())
print(pYPKa_E_TEF1.cseguid())

0s47KRvklrl6IzfaACskS5ZBWPM
AJXtr6m2hFB9TWlbAgnFlGedTi8
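
The idea behind a checksum for circular double-stranded DNA can be sketched with the standard library alone: choose a canonical representative among all rotations of both strands, then hash it. The function `circular_checksum` below is an assumption-laden illustration of that idea. It is not pydna's implementation and is not guaranteed to reproduce the cseguid strings printed above.

```python
import base64
import hashlib

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase-normalized DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def smallest_rotation(seq):
    """Lexicographically smallest rotation (naive, fine for short examples)."""
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def circular_checksum(seq):
    """Checksum that ignores the origin and the strand of a circular sequence."""
    seq = seq.upper()
    canonical = min(
        smallest_rotation(seq),
        smallest_rotation(reverse_complement(seq)),
    )
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    # URL-safe base64 without padding gives a 27-character string.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


plasmid = "GATTACAGGCT"               # toy circular sequence (hypothetical)
rotated = plasmid[4:] + plasmid[:4]   # same circle, different origin
print(circular_checksum(plasmid) == circular_checksum(rotated))
print(circular_checksum(plasmid) == circular_checksum(reverse_complement(plasmid)))
```

Because the checksum depends only on the canonical form, the same plasmid yields the same value regardless of where the sequence file starts or which strand it lists.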


The sequences are named based on the name of the cloned insert.

In [22]:
pYPKa_Z_TEF1.locus = "pYPKa_Z_TEF1"[:16]
pYPKa_E_TEF1.locus = "pYPKa_E_TEF1"[:16]

Sequences are stamped with the cseguid checksum.
This can be used to verify the integrity of the sequence file.

In [23]:
pYPKa_Z_TEF1.stamp()
pYPKa_E_TEF1.stamp()

cSEGUID_AJXtr6m2hFB9TWlbAgnFlGedTi8

Sequences are written to local files.

In [24]:
pYPKa_Z_TEF1.write("pYPKa_Z_TEF1.gb")
pYPKa_E_TEF1.write("pYPKa_E_TEF1.gb")

# Download [pYPKa_Z_TEF1](pYPKa_Z_TEF1.gb)

In [25]:
import pydna
reloaded = read("pYPKa_Z_TEF1.gb")
reloaded.verify_stamp()

cSEGUID_0s47KRvklrl6IzfaACskS5ZBWPM

# Download [pYPKa_E_TEF1](pYPKa_E_TEF1.gb)

In [26]:
import pydna
reloaded = read("pYPKa_E_TEF1.gb")
reloaded.verify_stamp()

cSEGUID_AJXtr6m2hFB9TWlbAgnFlGedTi8