{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Defining alignment targets\n", "Example of how to define alignment targets." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set up for analysis\n", "Import necessary Python modules:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import tempfile\n", "\n", "import Bio.SeqIO\n", "\n", "from alignparse.targets import Target, Targets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A single target\n", "First we show how to define a single `Target`, using as an example an amplicon for PacBio sequencing of RecA in a deep mutational scanning experiment.\n", "The amplicon is defined in [GenBank Flat File format](https://www.ncbi.nlm.nih.gov/genbank/samplerecord/).\n", "First, let's look at that file:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "LOCUS RecA_PacBio_amplicon 1342 bp ds-DNA linear 06-AUG-2018\n", "DEFINITION PacBio amplicon for deep mutational scanning of E.
coli RecA.\n", "ACCESSION None\n", "VERSION \n", "SOURCE Danny Lawrence\n", " ORGANISM .\n", "COMMENT PacBio amplicon for RecA libraries.\n", "COMMENT There are single nucleotide tags in the 5' and 3' termini to measure strand exchange.\n", "FEATURES Location/Qualifiers\n", " termini5 1..147\n", " /label=\"termini 5' of gene\"\n", " gene 148..1206\n", " /label=\"RecA gene\"\n", " spacer 1207..1285\n", " /label=\"spacer between gene & barcode\"\n", " barcode 1286..1303\n", " /label=\"18 nucleotide barcode\"\n", " termini3 1304..1342\n", " /label=\"termini 3' of barcode\"\n", " variant_tag5 33..33\n", " /label=\"5' variant tag\"\n", " variant_tag3 1311..1311\n", " /label=\"3' variant tag\"\n", "ORIGIN\n", " 1 gcacggcgtc acactttgct atgccatagc atRtttatcc ataagattag cggatcctac\n", " 61 ctgacgcttt ttatcgcaac tctctactgt ttctccataa cagaacatat tgactatccg\n", " 121 gtattacccg gcatgacagg agtaaaaATG GCTATCGACG AAAACAAACA GAAAGCGTTG\n", " 181 GCGGCAGCAC TGGGCCAGAT TGAGAAACAA TTTGGTAAAG GCTCCATCAT GCGCCTGGGT\n", " 241 GAAGACCGTT CCATGGATGT GGAAACCATC TCTACCGGTT CGCTTTCACT GGATATCGCG\n", " 301 CTTGGGGCAG GTGGTCTGCC GATGGGCCGT ATCGTCGAAA TCTACGGACC GGAATCTTCC\n", " 361 GGTAAAACCA CGCTGACGCT GCAGGTGATC GCCGCAGCGC AGCGTGAAGG TAAAACCTGT\n", " 421 GCGTTTATCG ATGCTGAACA CGCGCTGGAC CCAATCTACG CACGTAAACT GGGCGTCGAT\n", " 481 ATCGACAACC TGCTGTGCTC CCAGCCGGAC ACCGGCGAGC AGGCACTGGA AATCTGTGAC\n", " 541 GCCCTGGCGC GTTCTGGCGC AGTAGACGTT ATCGTCGTTG ACTCCGTGGC GGCACTGACG\n", " 601 CCGAAAGCGG AAATCGAAGG CGAAATCGGC GACTCTCATA TGGGCCTTGC GGCACGTATG\n", " 661 ATGAGCCAGG CGATGCGTAA GCTGGCGGGT AACCTGAAGC AGTCCAACAC GCTGCTGATC\n", " 721 TTCATCAACC AGATCCGTAT GAAAATTGGT GTGATGTTCG GCAACCCGGA AACCACTACC\n", " 781 GGTGGTAACG CGCTGAAATT CTACGCCTCT GTTCGTCTCG ACATCCGTCG TATCGGCGCG\n", " 841 GTGAAAGAGG GCGAAAACGT GGTGGGTAGC GAAACCCGCG TGAAAGTGGT GAAGAACAAA\n", " 901 ATCGCTGCGC CGTTTAAACA GGCTGAATTC CAGATCCTCT ACGGCGAAGG TATCAACTTC\n", " 961 TACGGCGAAC TGGTTGACCT GGGCGTAAAA GAGAAGCTGA TCGAGAAAGC 
AGGCGCGTGG\n", " 1021 TACAGCTACA AAGGTGAGAA GATCGGTCAG GGTAAAGCGA ATGCGACTGC CTGGCTGAAA\n", " 1081 GATAACCCGG AAACCGCGAA AGAGATCGAG AAGAAAGTAC GTGAGTTGCT GCTGAGCAAC\n", " 1141 CCGAACTCAA CGCCGGATTT CTCTGTAGAT GATAGCGAAG GCGTAGCAGA AACTAACGAA\n", " 1201 GATTTTTAAt cgtcttgttt gatacacaag ggtcgcatct gcggcccttt tgctttttta\n", " 1261 agttgtaagg atatgccatt ctagannnnn nnnnnnnnnn nnnagatcgg Yagagcgtcg\n", " 1321 tgtagggaaa gagtgtggta cc \n", "//\n", "\n" ] } ], "source": [ "recA_targetfile = \"../notebooks/input_files/recA_amplicon.gb\"\n", "\n", "with open(recA_targetfile) as f:\n", "    print(f.read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read the GenBank file for the target into a Biopython `SeqRecord`:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "recA_seqrecord = Bio.SeqIO.read(recA_targetfile, format=\"genbank\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a `Target` object:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "target = Target(\n", "    seqrecord=recA_seqrecord,\n", "    req_features=[\n", "        \"termini5\",\n", "        \"gene\",\n", "        \"spacer\",\n", "        \"barcode\",\n", "        \"termini3\",\n", "        \"variant_tag5\",\n", "        \"variant_tag3\",\n", "    ],\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can get specific features out of the `Target` object.\n", "Below we look for two features and print the one that exists:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "target lacks feature non-existent\n", "\n", "Here is feature termini5:\n", "Feature(name=termini5, seq=GCACGGCGTCACACTTTGCTATGCCATAGCATRTTTATCCATAAGATTAGCGGATCCTACCTGACGCTTTTTATCGCAACTCTCTACTGTTTCTCCATAACAGAACATATTGACTATCCGGTATTACCCGGCATGACAGGAGTAAAA, start=0, end=147)\n" ] } ], "source": [
"for feature in [\"non-existent\", \"termini5\"]:\n", "    if target.has_feature(feature):\n", "        print(f\"Here is feature {feature}:\\n{target.get_feature(feature)}\")\n", "    else:\n", "        print(f\"target lacks feature {feature}\\n\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can get a [dna_features_viewer](https://edinburgh-genome-foundry.github.io/DnaFeaturesViewer/) `GraphicRecord` with `Target.image` and then plot it using its `.plot` method, whose first return value is a `matplotlib` `Axes` instance.\n", "Note that `Target.image` also provides options for setting colors and labels:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "image = target.image()\n", "ax, _ = image.plot()\n", "_ = ax.set_title(target.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Multiple targets\n", "We can read multiple targets into a `Targets` object. Below is an example with the two LASV GP constructs - wildtype and codon optimized - from the Josiah strain.\n", "\n", "First, let's look at these files:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "LOCUS LASV_Josiah_WT 1730 bp ds-DNA linear 14-JUN-2019\n", "DEFINITION .\n", "ACCESSION \n", "VERSION \n", "SOURCE Kate Crawford\n", " ORGANISM .\n", "COMMENT PacBio amplicon for LASV Josiah WT sequence\n", "FEATURES Location/Qualifiers\n", " T2A 85..147\n", " /label=\"T2A\"\n", " WPRE 1639..1730\n", " /label=\"WPRE\"\n", " ZsGreen 15..84\n", " /label=\"ZsGreen\"\n", " termini3 1639..1730\n", " /label=\"3'Termini\"\n", " index 9..14\n", " /label=\"index\"\n", " leader5 1..8\n", " /label=\"5' leader\"\n", " termini5 1..147\n", " /label=\"5'Termini\"\n", " variant_tag5 34..34\n", " /variant_1=T\n", " /variant_2=C\n", " /label=\"5'VariantTag\"\n", " variant_tag3 1702..1702\n", " /variant_1=G\n", " /variant_2=A\n", " /label=\"3'VariantTag\"\n", " spacer 1624..1638\n", " /label=\"3'Spacer\"\n", " gene 148..1623\n", " /label=\"LASV_Josiah_WT\"\n", "\n", "ORIGIN\n", " 1 GACTGATANN NNNNcagcga cgccaagaac cagYagtggc acctgaccga gcacgccatc\n", " 61 gcctccggcT CCGCCTTGCC CGCTGGATCC GGCGAGGGCA GAGGAAGTCT GCTAACATGC\n", " 121 GGTGACGTCG AGGAGAATCC TGGCCCAATG GGACAAATAG TGACATTCTT CCAGGAAGTG\n", " 181 CCTCATGTAA TAGAAGAGGT GATGAACATT GTTCTCATTG CACTGTCTGT ACTAGCAGTG\n", " 241 CTGAAAGGTC TGTACAATTT TGCAACGTGT GGCCTTGTTG GTTTGGTCAC TTTCCTCCTG\n", " 301 TTGTGTGGTA GGTCTTGCAC AACCAGTCTT TATAAAGGGG TTTATGAGCT TCAGACTCTG\n", " 361 GAACTAAACA TGGAGACACT CAATATGACC ATGCCTCTCT 
CCTGCACAAA GAACAACAGT\n", " 421 CATCATTATA TAATGGTGGG CAATGAGACA GGACTAGAAC TGACCTTGAC CAACACGAGC\n", " 481 ATTATTAATC ACAAATTTTG CAATCTGTCT GATGCCCACA AAAAGAACCT CTATGACCAC\n", " 541 GCTCTTATGA GCATAATCTC AACTTtccac ttgtccatcc ccaacTTCAA TCAGTATGAG\n", " 601 GCAATGAGCT GCGATTTTAA TGGGGGAAAG ATTAGTGTGC AGTACAACCT GAGTCACAGC\n", " 661 TATGCTGGGG ATGCAGCCAA CCATTGTGGT ACTGTTGCAA ATGGTGTGTT ACAGACTTTT\n", " 721 ATGAGGATGG CTTGGGGTGG GAGCTACATT GCTCTTGACT CAGGCCGTGG CAACTGGGAC\n", " 781 TGTATTATGA CTAGTTATCA ATATCTGATA ATCCAAAATA CAACCTGGGA AGATCACTGC\n", " 841 CAATTCTCGA GACCATCTCC CATCGGTTAT CTCGGGCTCC TCTCACAAAG GACTAGAGAT\n", " 901 ATTTATATTA GTAGAAGATT GCTAGGCACA TTCACATGGA CACTGTCAGA TTCTGAAGGT\n", " 961 AAAGACACAC CAGGGGGATA TTGTCTGACC AGGTGGATGC TAATTGAGGC TGAACTAAAA\n", " 1021 TGCTTCGGGA ACACAGCTGT GGCAAAATGT AATGAGAAGC ATGATGAgga attttgtgac\n", " 1081 atgctgaggc TGTTTGACTT CAACAAACAA GCCATTCAAA GGTTGAAAGC TGAAGCACAA\n", " 1141 ATGAGCATTC AGTTGATCAA CAAAGCAGTA AATGCTTTGA TAAATGACCA ACTTATAATG\n", " 1201 AAGAACCATC TACGGGACAT CATGGGAATT CCATACTGTA ATTACAGCAA GTATTGGTAC\n", " 1261 CTCAACCACA CAACTACTGG GAGAACATCA CTGCCCAAAT GTTGGCTTGT ATCAAATGGT\n", " 1321 TCATACTTGA ACGAGACCCA CTTTTCTGAT GATATTGAAC AACAAGCTGA CAATATGATC\n", " 1381 ACTGAGATGT TACAGAAGGA GTATATGGAG AGGCAGGGGA AGACACCATT GGGTCTAGTT\n", " 1441 GACCTCTTTG TGTTCAGCAC AAGTTTCTAT CTTATTAGCA TCTTCCTTCA CCTAGTCAAA\n", " 1501 ATACCAACTC ATAGGCATAT TGTAGGCAAG TCGTGTCCCA AACCTCACAG ATTGAATCAT\n", " 1561 ATGGGCATTT GTTCCTGTGG ACTCTACAAA CAGCCTGGTG TGCCTGTGAA ATGGAAGAGA\n", " 1621 TGAGCTAGCT AAACGCGTTG ATCCtaatca acctctggat tacaaaattt gtgaaagatt\n", " 1681 gactggtatt cttaactatg tRgctccttt tacgctatgt ggatacgctg \n", "//\n", "\n", "LOCUS LASV_Josiah_OPT 1730 bp ds-DNA linear 14-JUN-2019\n", "DEFINITION .\n", "ACCESSION \n", "VERSION \n", "SOURCE Kate Crawford\n", " ORGANISM .\n", "COMMENT PacBio amplicon for LASV Josiah OPT sequence\n", "FEATURES Location/Qualifiers\n", " T2A 85..147\n", " /label=\"T2A\"\n", " 
WPRE 1639..1730\n", " /label=\"WPRE\"\n", " ZsGreen 15..84\n", " /label=\"ZsGreen\"\n", " termini3 1639..1730\n", " /label=\"3'Termini\"\n", " index 9..14\n", " /label=\"index\"\n", " leader5 1..8\n", " /label=\"5' leader\"\n", " termini5 1..147\n", " /label=\"5'Termini\"\n", " variant_tag5 34..34\n", " /variant_1=T\n", " /variant_2=C\n", " /label=\"5'VariantTag\"\n", " variant_tag3 1702..1702\n", " /variant_1=G\n", " /variant_2=A\n", " /label=\"3'VariantTag\"\n", " spacer 1624..1638\n", " /label=\"3'Spacer\"\n", " gene 148..1623\n", " /label=\"LASV_Josiah_OPT\"\n", "\n", "ORIGIN\n", " 1 GACTGATANN NNNNcagcga cgccaagaac cagYagtggc acctgaccga gcacgccatc\n", " 61 gcctccggcT CCGCCTTGCC CGCTGGATCC GGCGAGGGCA GAGGAAGTCT GCTAACATGC\n", " 121 GGTGACGTCG AGGAGAATCC TGGCCCAATG GGCCAGATCG TGACCTTCTT CCAAGAAGTG\n", " 181 CCTCATGTGA TTGAGGAGGT GATGAATATC GTGCTGATCG CTTTAAGCGT GCTGGCCGTT\n", " 241 CTTAAGGGCC TCTATAACTT CGCCACTTGT GGTTTAGTCG GACTGGTGAC ATTTCTGCTG\n", " 301 CTGTGTGGCA GATCTTGTAC CACATCTTTA TACAAGGGCG TGTACGAGCT GCAGACTTTA\n", " 361 GAACTGAACA TGGAGACTTT AAACATGACC ATGCCTTTAA GCTGTACCAA GAACAATAGC\n", " 421 CACCACTACA TCATGGTGGG CAACGAGACC GGTTTAGAAC TGACACTCAC CAACACCAGC\n", " 481 ATTATCAACC ATAAGTTCTG CAACCTCTCC GACGCTCACA AGAAGAATTT ATACGACCAC\n", " 541 GCTTTAATGA GCATCATCTC CACCTTCCAT CTCTCCATTC CTAATttcaa ccagtacgag\n", " 601 gccatgAGCT GCGACTTTAA CGGCGGCAAG ATCTCCGTGC AGTACAATTT ATCCCATAGC\n", " 661 TACGCCGGCG ATGCCGCCAA TCACTGCGGA ACCGTGGCCA ACGGCGTGCT GCAGACATTC\n", " 721 ATGAGGATGG CTTGGGGCGG CTCCTATATC GCTTTAGACT CCGGCAGAGG AAACTGGGAC\n", " 781 TGTATCATGA CCAGCTACCA ATATTTAATC ATTCAGAACA CCACATGGGA GGACCACTGC\n", " 841 CAATTCTCTC GTCCCTCTCC TATCGGCTAT CTGGGACTGC TGTCCCAGAG GACCAGAGAC\n", " 901 ATCTACATCT CTCGTAGGCT GCTGGGCACA TTCACTTGGA CTTTAAGCGA CAGCGAAGGC\n", " 961 AAAGATACTC CCGGTGGCTA CTGTTTAACA AGATGGATGC TGATCGAGGC CGAGCTCAAG\n", " 1021 TGCTTCGGAA ATACCGCCGT GGCCAAATGC AACGAGAAAC ACGACGAGGA GTTCTGCGAC\n", " 1081 ATGCTGAGGC TCTTCGACTT CAacaagcaa 
gccattcaga ggcTGAAGGC CGAAGCCCAG\n", " 1141 ATGTCCATCC AGCTGATTAA TAAGGCCGTG AATGCCCTCA TTAACGACCA GCTGATCATG\n", " 1201 AAGAACCATT TAAGGGACAT CATGGGCATC CCTTATTGCA ACTACAGCAA ATACTGGTAT\n", " 1261 TTAAATCATA CCACCACCGG TCGTACATCC TTACCTAAGT GCTGGCTGGT CAGCAATGGC\n", " 1321 TCCTATTTAA ACGAGACACA CTTCTCCGAC GACATCGAGC AGCAAGCCGA CAACATGATC\n", " 1381 ACCGAAATGC TCCAGAAGGA GTACATGGAG AGGCAAGGTA AGACTCCTCT GGGTTTAGTG\n", " 1441 GATTTATTCG TCTTCAGCAC CTCCTTCTAT TTAATCTCCA TCTTTCTTCA TCTGGTGAAG\n", " 1501 ATTCCTACCC ACAGACACAT TGTGGGCAAG AGCTGTCCTA AGCCTCATAG ACTGAACCAC\n", " 1561 ATGGGCATCT GTAGCTGCGG TTTATATAAA CAGCCCGGTG TTCCCGTTAA GTGGAAGAGG\n", " 1621 TGAGCTAGCT AAACGCGTTG ATCCtaatca acctctggat tacaaaattt gtgaaagatt\n", " 1681 gactggtatt cttaactatg tRgctccttt tacgctatgt ggatacgctg \n", "//\n", "\n" ] } ], "source": [ "target_file_names = [\"LASV_Josiah_WT\", \"LASV_Josiah_OPT\"]\n", "\n", "targetfiles = [\n", " f\"../notebooks/input_files/{target_file_name}.gb\"\n", " for target_file_name in target_file_names\n", "]\n", "\n", "for targetfile in targetfiles:\n", " with open(targetfile) as f:\n", " print(f.read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read the sequences into a `Targets`:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [], "source": [ "lasv_parse_specs_file = \"../notebooks/input_files/lasv_feature_parse_specs.yaml\"\n", "\n", "lasv_targets = Targets(\n", " seqsfile=targetfiles,\n", " feature_parse_specs=lasv_parse_specs_file,\n", " allow_extra_features=True,\n", " allow_clipped_muts_seqs=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Iterate through lasv_targets to identify features present:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "target = LASV_Josiah_WT\n", "target lacks feature non-existent\n", "\n", "Here is feature termini5:\n", 
"Feature(name=termini5, seq=GACTGATANNNNNNCAGCGACGCCAAGAACCAGYAGTGGCACCTGACCGAGCACGCCATCGCCTCCGGCTCCGCCTTGCCCGCTGGATCCGGCGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCA, start=0, end=147)\n", "\n", "target = LASV_Josiah_OPT\n", "target lacks feature non-existent\n", "\n", "Here is feature termini5:\n", "Feature(name=termini5, seq=GACTGATANNNNNNCAGCGACGCCAAGAACCAGYAGTGGCACCTGACCGAGCACGCCATCGCCTCCGGCTCCGCCTTGCCCGCTGGATCCGGCGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCA, start=0, end=147)\n" ] } ], "source": [ "for lasv_target in lasv_targets.targets:\n", "    print(f\"\\ntarget = {lasv_target.name}\")\n", "    for feature in [\"non-existent\", \"termini5\"]:\n", "        if lasv_target.has_feature(feature):\n", "            # Query the LASV target from this loop, not the earlier RecA `target`\n", "            print(f\"Here is feature {feature}:\\n{lasv_target.get_feature(feature)}\")\n", "        else:\n", "            print(f\"target lacks feature {feature}\\n\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can plot the `Targets`:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "_ = lasv_targets.plot(ax_width=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can write them to a file for alignment:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ">LASV_Josiah_WT\n", "GACTGATANNNNNNCAGCGACGCCAAGAACCAGYAGTGGCACCTGACCGAGCACGCCATCGCCTCCGGCTCCGCCTTGCCCGCTGGATCCGGCGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAATGGGACAAATAGTGACATTCTTCCAGGAAGTGCCTCATGTAATAGAAGAGGTGATGAACATTGTTCTCATTGCACTGTCTGTACTAGCAGTGCTGAAAGGTCTGTACAATTTTGCAACGTGTGGCCTTGTTGGTTTGGTCACTTTCCTCCTGTTGTGTGGTAGGTCTTGCACAACCAGTCTTTATAAAGGGGTTTATGAGCTTCAGACTCTGGAACTAAACATGGAGACACTCAATATGACCATGCCTCTCTCCTGCACAAAGAACAACAGTCATCATTATATAATGGTGGGCAATGAGACAGGACTAGAACTGACCTTGACCAACACGAGCATTATTAATCACAAATTTTGCAATCTGTCTGATGCCCACAAAAAGAACCTCTATGACCACGCTCTTATGAGCATAATCTCAACTTTCCACTTGTCCATCCCCAACTTCAATCAGTATGAGGCAATGAGCTGCGATTTTAATGGGGGAAAGATTAGTGTGCAGTACAACCTGAGTCACAGCTATGCTGGGGATGCAGCCAACCATTGTGGTACTGTTGCAAATGGTGTGTTACAGACTTTTATGAGGATGGCTTGGGGTGGGAGCTACATTGCTCTTGACTCAGGCCGTGGCAACTGGGACTGTATTATGACTAGTTATCAATATCTGATAATCCAAAATACAACCTGGGAAGATCACTGCCAATTCTCGAGACCATCTCCCATCGGTTATCTCGGGCTCCTCTCACAAAGGACTAGAGATATTTATATTAGTAGAAGATTGCTAGGCACATTCACATGGACACTGTCAGATTCTGAAGGTAAAGACACACCAGGGGGATATTGTCTGACCAGGTGGATGCTAATTGAGGCTGAACTAAAATGCTTCGGGAACACAGCTGTGGCAAAATGTAATGAGAAGCATGATGAGGAATTTTGTGACATGCTGAGGCTGTTTGACTTCAACAAACAAGCCATTCAAAGGTTGAAAGCTGAAGCACAAATGAGCATTCAGTTGATCAACAAAGCAGTAAATGCTTTGATAAATGACCAACTTATAATGAAGAACCATCTACGGGACATCATGGGAATTCCATACTGTAATTACAGCAAGTATTGGTACCTCAACCACACAACTACTGGGAGAACATCACTGCCCAAATGTTGGCTTGTATCAAATGGTTCATACTTGAACGAGACCCACTTTTCTGATGATATTGAACAACAAGCTGACAATATGATCACTGAGATGTTACAGAAGGAGTATATGGAGAGGCAGGGGAAGACACCATTGGGTCTAGTTGACCTCTTTGTGTTCAGCACAAGTTTCTATCTTATTAGCATCTTCCTTCACCTAGTCAAAATACCAACTCATAGGCATATTGTAGGCAAGTCGTGTCCCAAACCTCACAGATTGAATCATATGGGCATTTGTTCCTGTGGACTCTACAAACAGCCTGGTGTG
CCTGTGAAATGGAAGAGATGAGCTAGCTAAACGCGTTGATCCTAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTRGCTCCTTTTACGCTATGTGGATACGCTG\n", ">LASV_Josiah_OPT\n", "GACTGATANNNNNNCAGCGACGCCAAGAACCAGYAGTGGCACCTGACCGAGCACGCCATCGCCTCCGGCTCCGCCTTGCCCGCTGGATCCGGCGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAATGGGCCAGATCGTGACCTTCTTCCAAGAAGTGCCTCATGTGATTGAGGAGGTGATGAATATCGTGCTGATCGCTTTAAGCGTGCTGGCCGTTCTTAAGGGCCTCTATAACTTCGCCACTTGTGGTTTAGTCGGACTGGTGACATTTCTGCTGCTGTGTGGCAGATCTTGTACCACATCTTTATACAAGGGCGTGTACGAGCTGCAGACTTTAGAACTGAACATGGAGACTTTAAACATGACCATGCCTTTAAGCTGTACCAAGAACAATAGCCACCACTACATCATGGTGGGCAACGAGACCGGTTTAGAACTGACACTCACCAACACCAGCATTATCAACCATAAGTTCTGCAACCTCTCCGACGCTCACAAGAAGAATTTATACGACCACGCTTTAATGAGCATCATCTCCACCTTCCATCTCTCCATTCCTAATTTCAACCAGTACGAGGCCATGAGCTGCGACTTTAACGGCGGCAAGATCTCCGTGCAGTACAATTTATCCCATAGCTACGCCGGCGATGCCGCCAATCACTGCGGAACCGTGGCCAACGGCGTGCTGCAGACATTCATGAGGATGGCTTGGGGCGGCTCCTATATCGCTTTAGACTCCGGCAGAGGAAACTGGGACTGTATCATGACCAGCTACCAATATTTAATCATTCAGAACACCACATGGGAGGACCACTGCCAATTCTCTCGTCCCTCTCCTATCGGCTATCTGGGACTGCTGTCCCAGAGGACCAGAGACATCTACATCTCTCGTAGGCTGCTGGGCACATTCACTTGGACTTTAAGCGACAGCGAAGGCAAAGATACTCCCGGTGGCTACTGTTTAACAAGATGGATGCTGATCGAGGCCGAGCTCAAGTGCTTCGGAAATACCGCCGTGGCCAAATGCAACGAGAAACACGACGAGGAGTTCTGCGACATGCTGAGGCTCTTCGACTTCAACAAGCAAGCCATTCAGAGGCTGAAGGCCGAAGCCCAGATGTCCATCCAGCTGATTAATAAGGCCGTGAATGCCCTCATTAACGACCAGCTGATCATGAAGAACCATTTAAGGGACATCATGGGCATCCCTTATTGCAACTACAGCAAATACTGGTATTTAAATCATACCACCACCGGTCGTACATCCTTACCTAAGTGCTGGCTGGTCAGCAATGGCTCCTATTTAAACGAGACACACTTCTCCGACGACATCGAGCAGCAAGCCGACAACATGATCACCGAAATGCTCCAGAAGGAGTACATGGAGAGGCAAGGTAAGACTCCTCTGGGTTTAGTGGATTTATTCGTCTTCAGCACCTCCTTCTATTTAATCTCCATCTTTCTTCATCTGGTGAAGATTCCTACCCACAGACACATTGTGGGCAAGAGCTGTCCTAAGCCTCATAGACTGAACCACATGGGCATCTGTAGCTGCGGTTTATATAAACAGCCCGGTGTTCCCGTTAAGTGGAAGAGGTGAGCTAGCTAAACGCGTTGATCCTAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTRGCTCCTTTTACGCTATGTGGATACGCTG\n", "\n" ] } ], "source": [ "with tempfile.NamedTemporaryFile(mode=\"w\") as f:\n", " 
lasv_targets.write_fasta(f.name)\n", " f.flush()\n", " fasta_text = open(f.name).read()\n", "print(fasta_text)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 4 }