{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook 2: \n", "\n", "This is an Jupyter/IPython notebook. Most of the code is composed of bash scripts, indicated by %%bash at the top of the cell, otherwise it is IPython code. This notebook includes code to download, assemble and analyze a published RADseq data set, and further code below to analyze missing data in that data set." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [] } ], "source": [ "### Notebook 2\n", "### Data set 2 (Phrynosomatidae)\n", "### Authors: Leache et al. (2015)\n", "### Data Location: NCBI SRA SRP063316" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download the sequence data\n", "Sequence data for this study are archived on the NCBI sequence read archive (SRA). Below I read in [SraRunTable.txt](https://raw.githubusercontent.com/dereneaton/RADmissing/master/empirical_2_SraRunTable.txt) downloaded from the SRA website for this project. It contains all of the information on the id numbers needed to download data for all samples for this project. \n", "\n", "+ Project SRA: SRP063316\n", "+ BioProject ID: PRJNA294316\n", "+ Biosample numbers: SAMN04027506 -- SAMN04027579\n", "+ Runs: SRR2240500 -- SRR2240573\n" ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%bash\n", "## make a new directory for this analysis\n", "mkdir -p empirical_2/fastq/" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " BioSample_s Library_Name_s MBases_l MBytes_l Organism_s \\\n", "0 SAMN04027526 PHCE3 88 58 Phrynosoma cerroense \n", "1 SAMN04027532 PHCN3 67 44 Phrynosoma cornutum \n", "2 SAMN04027531 PHCN2 79 52 Phrynosoma cornutum \n", "3 SAMN04027530 PHCN1 47 31 Phrynosoma cornutum \n", "4 SAMN04027529 PHCE6 45 29 Phrynosoma cerroense \n", "\n", " Run_s SRA_Sample_s Sample_Name_s \\\n", "0 SRR2240520 SRS1054806 PHCE3 \n", "1 SRR2240526 SRS1054807 PHCN3 \n", "2 SRR2240525 SRS1054808 PHCN2 \n", "3 SRR2240524 SRS1054809 PHCN1 \n", "4 SRR2240523 SRS1054810 PHCE6 \n", "\n", " geo_loc_name_s sex_s \\\n", "0 Mexico:Colonia Guerrero, Baja California male \n", "1 USA:3.4 mi (by gravel road) N Steins, Hidalgo ... not collected \n", "2 USA:Hwy 180, 8.4 mi NW of Hwy 10 (at Deming), ... not collected \n", "3 USA:0.5 mi E Portal, Cochise Co., Arizona not collected \n", "4 Mexico:7.5 mi S Guerrero Negro, Baja Californi... female \n", "\n", " voucher_s Assay_Type_s AssemblyName_s BioProject_s \\\n", "0 MVZ:Herp:161187 OTHER PRJNA294316 \n", "1 AMCC:Herp:106073 OTHER PRJNA294316 \n", "2 CAS:Herp:228870 OTHER PRJNA294316 \n", "3 MVZ:Herp:238582 OTHER PRJNA294316 \n", "4 MVZ:Herp:161210 OTHER PRJNA294316 \n", "\n", " BioSampleModel_s Center_Name_s Consent_s InsertSize_l \\\n", "0 Model organism or animal NREMC public 0 \n", "1 Model organism or animal NREMC public 0 \n", "2 Model organism or animal NREMC public 0 \n", "3 Model organism or animal NREMC public 0 \n", "4 Model organism or animal NREMC public 0 \n", "\n", " LibraryLayout_s LoadDate_s \n", "0 SINGLE Sep 04, 2015 ... \n", "1 SINGLE Sep 04, 2015 ... \n", "2 SINGLE Sep 04, 2015 ... \n", "3 SINGLE Sep 04, 2015 ... \n", "4 SINGLE Sep 04, 2015 ... \n", "\n", "[5 rows x 29 columns]\n" ] } ], "source": [ "## IPython code\n", "\n", "## import libraries\n", "import pandas as pd\n", "import urllib2\n", "import os\n", "\n", "## read in the SRA run table from public github url\n", "## as a pandas data frame\n", "url = \"https://raw.githubusercontent.com/\"+\\\n", " \"dereneaton/RADmissing/master/empirical_2_SraRunTable.txt\"\n", "intable = urllib2.urlopen(url)\n", "indata = pd.read_table(intable, sep=\"\\t\")\n", "\n", "## print first few rows\n", "print indata.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download sequence data using the SRA information\n", "This is a function to make `wget` calls to download data base on SRA IDs" ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def wget_download(SRR, outdir, outname):\n", " \"\"\" Python function to get sra data from ncbi and write to\n", " outdir with a new name using bash call wget \"\"\"\n", " \n", " ## get output name\n", " output = os.path.join(outdir, outname+\".sra\")\n", " \n", " ## create a call string \n", " call = \"wget -q -r -nH --cut-dirs=9 -O \"+output+\" \"+\\\n", " \"ftp://ftp-trace.ncbi.nlm.nih.gov/\"+\\\n", " \"sra/sra-instant/reads/ByRun/sra/SRR/\"+\\\n", " \"{}/{}/{}.sra;\".format(SRR[:6], SRR, SRR)\n", " \n", " ## call bash script\n", " ! $call " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we pass the SRR number and the sample name to the `wget_download` function so that the files are saved with their sample names. " ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "collapsed": false }, "outputs": [], "source": [ "for ID, SRR in zip(indata.Library_Name_s, indata.Run_s):\n", " wget_download(SRR, \"empirical_2/fastq/\", ID)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert from SRA format to FASTQ format" ] }, { "cell_type": "code", "execution_count": 93, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Read 1883604 spots for empirical_2/fastq/CADR1.sra\n", "Written 1883604 spots for empirical_2/fastq/CADR1.sra\n", "Read 1168210 spots for empirical_2/fastq/CADR2.sra\n", "Written 1168210 spots for empirical_2/fastq/CADR2.sra\n", "Read 1452471 spots for empirical_2/fastq/COTE1.sra\n", "Written 1452471 spots for empirical_2/fastq/COTE1.sra\n", "Read 5406187 spots for empirical_2/fastq/GAWI1.sra\n", "Written 5406187 spots for empirical_2/fastq/GAWI1.sra\n", "Read 699921 spots for empirical_2/fastq/HOMA1.sra\n", "Written 699921 spots for empirical_2/fastq/HOMA1.sra\n", "Read 2590961 spots for empirical_2/fastq/PETH1.sra\n", "Written 2590961 spots for empirical_2/fastq/PETH1.sra\n", "Read 1200924 spots for empirical_2/fastq/PHAS1.sra\n", "Written 1200924 spots for empirical_2/fastq/PHAS1.sra\n", "Read 2698423 spots for empirical_2/fastq/PHAS2.sra\n", "Written 2698423 spots for empirical_2/fastq/PHAS2.sra\n", "Read 1622179 spots for empirical_2/fastq/PHAS3.sra\n", "Written 1622179 spots for empirical_2/fastq/PHAS3.sra\n", "Read 1230297 spots for empirical_2/fastq/PHAS4.sra\n", "Written 1230297 spots for empirical_2/fastq/PHAS4.sra\n", "Read 1723043 spots for empirical_2/fastq/PHBL1.sra\n", "Written 1723043 spots for empirical_2/fastq/PHBL1.sra\n", "Read 1753039 spots for empirical_2/fastq/PHBL2.sra\n", "Written 1753039 spots for empirical_2/fastq/PHBL2.sra\n", "Read 1598240 spots for empirical_2/fastq/PHBL3.sra\n", "Written 1598240 spots for empirical_2/fastq/PHBL3.sra\n", "Read 1744633 spots for empirical_2/fastq/PHBL4.sra\n", "Written 1744633 spots for empirical_2/fastq/PHBL4.sra\n", "Read 651033 spots for empirical_2/fastq/PHBR1.sra\n", "Written 651033 spots for empirical_2/fastq/PHBR1.sra\n", "Read 1047774 spots for empirical_2/fastq/PHBR2.sra\n", "Written 1047774 spots for empirical_2/fastq/PHBR2.sra\n", "Read 2913849 spots for empirical_2/fastq/PHBR3.sra\n", "Written 2913849 spots for empirical_2/fastq/PHBR3.sra\n", "Read 3075277 spots for empirical_2/fastq/PHBR4.sra\n", "Written 3075277 spots for empirical_2/fastq/PHBR4.sra\n", "Read 1560169 spots for empirical_2/fastq/PHCE1.sra\n", "Written 1560169 spots for empirical_2/fastq/PHCE1.sra\n", "Read 1655737 spots for empirical_2/fastq/PHCE2.sra\n", "Written 1655737 spots for empirical_2/fastq/PHCE2.sra\n", "Read 2066291 spots for empirical_2/fastq/PHCE3.sra\n", "Written 2066291 spots for empirical_2/fastq/PHCE3.sra\n", "Read 1385481 spots for empirical_2/fastq/PHCE4.sra\n", "Written 1385481 spots for empirical_2/fastq/PHCE4.sra\n", "Read 2873860 spots for empirical_2/fastq/PHCE5.sra\n", "Written 2873860 spots for empirical_2/fastq/PHCE5.sra\n", "Read 1068532 spots for empirical_2/fastq/PHCE6.sra\n", "Written 1068532 spots for empirical_2/fastq/PHCE6.sra\n", "Read 1112293 spots for empirical_2/fastq/PHCN1.sra\n", "Written 1112293 spots for empirical_2/fastq/PHCN1.sra\n", "Read 1854834 spots for empirical_2/fastq/PHCN2.sra\n", "Written 1854834 spots for empirical_2/fastq/PHCN2.sra\n", "Read 1572449 spots for empirical_2/fastq/PHCN3.sra\n", "Written 1572449 spots for empirical_2/fastq/PHCN3.sra\n", "Read 1434649 spots for empirical_2/fastq/PHCN4.sra\n", "Written 1434649 spots for empirical_2/fastq/PHCN4.sra\n", "Read 755638 spots for empirical_2/fastq/PHCO1.sra\n", "Written 755638 spots for empirical_2/fastq/PHCO1.sra\n", "Read 1333197 spots for empirical_2/fastq/PHDI1.sra\n", "Written 1333197 spots for empirical_2/fastq/PHDI1.sra\n", "Read 1461823 spots for empirical_2/fastq/PHDI2.sra\n", "Written 1461823 spots for empirical_2/fastq/PHDI2.sra\n", "Read 1625284 spots for empirical_2/fastq/PHDO1.sra\n", "Written 1625284 spots for empirical_2/fastq/PHDO1.sra\n", "Read 2479285 spots for empirical_2/fastq/PHDO2.sra\n", "Written 2479285 spots for empirical_2/fastq/PHDO2.sra\n", "Read 993112 spots for empirical_2/fastq/PHGO1.sra\n", "Written 993112 spots for empirical_2/fastq/PHGO1.sra\n", "Read 848505 spots for empirical_2/fastq/PHGO2.sra\n", "Written 848505 spots for empirical_2/fastq/PHGO2.sra\n", "Read 2298845 spots for empirical_2/fastq/PHGO3.sra\n", "Written 2298845 spots for empirical_2/fastq/PHGO3.sra\n", "Read 4136233 spots for empirical_2/fastq/PHGO4.sra\n", "Written 4136233 spots for empirical_2/fastq/PHGO4.sra\n", "Read 1130365 spots for empirical_2/fastq/PHHE1.sra\n", "Written 1130365 spots for empirical_2/fastq/PHHE1.sra\n", "Read 1828106 spots for empirical_2/fastq/PHHE2.sra\n", "Written 1828106 spots for empirical_2/fastq/PHHE2.sra\n", "Read 1782490 spots for empirical_2/fastq/PHHE3.sra\n", "Written 1782490 spots for empirical_2/fastq/PHHE3.sra\n", "Read 2137199 spots for empirical_2/fastq/PHHE4.sra\n", "Written 2137199 spots for empirical_2/fastq/PHHE4.sra\n", "Read 1326138 spots for empirical_2/fastq/PHHE5.sra\n", "Written 1326138 spots for empirical_2/fastq/PHHE5.sra\n", "Read 3215345 spots for empirical_2/fastq/PHMC1.sra\n", "Written 3215345 spots for empirical_2/fastq/PHMC1.sra\n", "Read 1083680 spots for empirical_2/fastq/PHMC2.sra\n", "Written 1083680 spots for empirical_2/fastq/PHMC2.sra\n", "Read 1119901 spots for empirical_2/fastq/PHMC3.sra\n", "Written 1119901 spots for empirical_2/fastq/PHMC3.sra\n", "Read 2292044 spots for empirical_2/fastq/PHMC4.sra\n", "Written 2292044 spots for empirical_2/fastq/PHMC4.sra\n", "Read 1769810 spots for empirical_2/fastq/PHMO1.sra\n", "Written 1769810 spots for empirical_2/fastq/PHMO1.sra\n", "Read 1780446 spots for empirical_2/fastq/PHMO2.sra\n", "Written 1780446 spots for empirical_2/fastq/PHMO2.sra\n", "Read 566270 spots for empirical_2/fastq/PHOR1.sra\n", "Written 566270 spots for empirical_2/fastq/PHOR1.sra\n", "Read 2065124 spots for empirical_2/fastq/PHOR2.sra\n", "Written 2065124 spots for empirical_2/fastq/PHOR2.sra\n", "Read 2024585 spots for empirical_2/fastq/PHOR3.sra\n", "Written 2024585 spots for empirical_2/fastq/PHOR3.sra\n", "Read 718036 spots for empirical_2/fastq/PHOR4.sra\n", "Written 718036 spots for empirical_2/fastq/PHOR4.sra\n", "Read 1307372 spots for empirical_2/fastq/PHPL1.sra\n", "Written 1307372 spots for empirical_2/fastq/PHPL1.sra\n", "Read 2900231 spots for empirical_2/fastq/PHPL2.sra\n", "Written 2900231 spots for empirical_2/fastq/PHPL2.sra\n", "Read 1960451 spots for empirical_2/fastq/PHPL3.sra\n", "Written 1960451 spots for empirical_2/fastq/PHPL3.sra\n", "Read 1154546 spots for empirical_2/fastq/PHSH1.sra\n", "Written 1154546 spots for empirical_2/fastq/PHSH1.sra\n", "Read 1650407 spots for empirical_2/fastq/PHSH2.sra\n", "Written 1650407 spots for empirical_2/fastq/PHSH2.sra\n", "Read 940706 spots for empirical_2/fastq/PHSH3.sra\n", "Written 940706 spots for empirical_2/fastq/PHSH3.sra\n", "Read 814375 spots for empirical_2/fastq/PHSH4.sra\n", "Written 814375 spots for empirical_2/fastq/PHSH4.sra\n", "Read 978350 spots for empirical_2/fastq/PHSO1.sra\n", "Written 978350 spots for empirical_2/fastq/PHSO1.sra\n", "Read 1523072 spots for empirical_2/fastq/PHSO2.sra\n", "Written 1523072 spots for empirical_2/fastq/PHSO2.sra\n", "Read 1668677 spots for empirical_2/fastq/PHSO3.sra\n", "Written 1668677 spots for empirical_2/fastq/PHSO3.sra\n", "Read 1311410 spots for empirical_2/fastq/PHTA1.sra\n", "Written 1311410 spots for empirical_2/fastq/PHTA1.sra\n", "Read 1685806 spots for empirical_2/fastq/PHTA2.sra\n", "Written 1685806 spots for empirical_2/fastq/PHTA2.sra\n", "Read 1740419 spots for empirical_2/fastq/PHTA3.sra\n", "Written 1740419 spots for empirical_2/fastq/PHTA3.sra\n", "Read 1522268 spots for empirical_2/fastq/PHTA4.sra\n", "Written 1522268 spots for empirical_2/fastq/PHTA4.sra\n", "Read 1898189 spots for empirical_2/fastq/SCAN1.sra\n", "Written 1898189 spots for empirical_2/fastq/SCAN1.sra\n", "Read 1742309 spots for empirical_2/fastq/SCGA1.sra\n", "Written 1742309 spots for empirical_2/fastq/SCGA1.sra\n", "Read 1595531 spots for empirical_2/fastq/SCMA1.sra\n", "Written 1595531 spots for empirical_2/fastq/SCMA1.sra\n", "Read 1404985 spots for empirical_2/fastq/SCOC1.sra\n", "Written 1404985 spots for empirical_2/fastq/SCOC1.sra\n", "Read 806846 spots for empirical_2/fastq/UMNO1.sra\n", "Written 806846 spots for empirical_2/fastq/UMNO1.sra\n", "Read 2923832 spots for empirical_2/fastq/URBI1.sra\n", "Written 2923832 spots for empirical_2/fastq/URBI1.sra\n", "Read 3465996 spots for empirical_2/fastq/UROR1.sra\n", "Written 3465996 spots for empirical_2/fastq/UROR1.sra\n", "Read 4818547 spots for empirical_2/fastq/UTST1.sra\n", "Written 4818547 spots for empirical_2/fastq/UTST1.sra\n", "Read 131630146 spots total\n", "Written 131630146 spots total\n" ] } ], "source": [ "%%bash\n", "## convert sra files to fastq using fastq-dump tool\n", "## output as gzipped into the fastq directory\n", "fastq-dump --gzip -O empirical_2/fastq/ empirical_2/fastq/*.sra\n", "\n", "## remove .sra files\n", "rm empirical_2/fastq/*.sra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make a params file for pyrad analysis" ] }, { "cell_type": "code", "execution_count": 97, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "pyRAD 3.0.63\n" ] } ], "source": [ "%%bash\n", "pyrad --version" ] }, { "cell_type": "code", "execution_count": 104, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\tnew params.txt file created\n" ] } ], "source": [ "%%bash\n", "## create a new default params file\n", "pyrad -n " ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%bash\n", "## substitute new parameters into file\n", "sed -i '/## 1. /c\\empirical_2/ ## 1. working directory ' params.txt\n", "sed -i '/## 6. /c\\TGCAGG ## 6. cutters ' params.txt\n", "sed -i '/## 7. /c\\30 ## 7. N processors ' params.txt\n", "sed -i '/## 9. /c\\6 ## 9. NQual ' params.txt\n", "sed -i '/## 10./c\\.85 ## 10. clust threshold ' params.txt\n", "sed -i '/## 12./c\\4 ## 12. MinCov ' params.txt\n", "sed -i '/## 13./c\\10 ## 13. maxSH ' params.txt\n", "sed -i '/## 14./c\\empirical_2_m4 ## 14. output name ' params.txt\n", "sed -i '/## 18./c\\empirical_2/fastq/*.gz ## 18. data location ' params.txt\n", "sed -i '/## 29./c\\2,2 ## 29. trim overhang ' params.txt\n", "sed -i '/## 30./c\\p,n,s ## 30. output formats ' params.txt" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "==** parameter inputs for pyRAD version 3.0.63 **======================== affected step ==\r\n", "empirical_2/ ## 1. working directory \r\n", "./*.fastq.gz ## 2. Loc. of non-demultiplexed files (if not line 18) (s1)\r\n", "./*.barcodes ## 3. Loc. of barcode file (if not line 18) (s1)\r\n", "vsearch ## 4. command (or path) to call vsearch (or usearch) (s3,s6)\r\n", "muscle ## 5. command (or path) to call muscle (s3,s7)\r\n", "TGCAGG ## 6. cutters \r\n", "30 ## 7. N processors \r\n", "6 ## 8. Mindepth: min coverage for a cluster (s4,s5)\r\n", "6 ## 9. NQual \r\n", ".85 ## 10. clust threshold \r\n", "rad ## 11. Datatype: rad,gbs,pairgbs,pairddrad,(others:see docs)(all)\r\n", "4 ## 12. MinCov \r\n", "10 ## 13. maxSH \r\n", "empirical_2_m4 ## 14. output name \r\n", "==== optional params below this line =================================== affected step ==\r\n", " ## 15.opt.: select subset (prefix* only selector) (s2-s7)\r\n", " ## 16.opt.: add-on (outgroup) taxa (list or prefix*) (s6,s7)\r\n", " ## 17.opt.: exclude taxa (list or prefix*) (s7)\r\n", "empirical_2/fastq/*.gz ## 18. data location \r\n", " ## 19.opt.: maxM: N mismatches in barcodes (def= 1) (s1)\r\n", " ## 20.opt.: phred Qscore offset (def= 33) (s2)\r\n", " ## 21.opt.: filter: def=0=NQual 1=NQual+adapters. 2=strict (s2)\r\n", " ## 22.opt.: a priori E,H (def= 0.001,0.01, if not estimated) (s5)\r\n", " ## 23.opt.: maxN: max Ns in a cons seq (def=5) (s5)\r\n", " ## 24.opt.: maxH: max heterozyg. sites in cons seq (def=5) (s5)\r\n", " ## 25.opt.: ploidy: max alleles in cons seq (def=2;see docs) (s4,s5)\r\n", " ## 26.opt.: maxSNPs: (def=100). Paired (def=100,100) (s7)\r\n", " ## 27.opt.: maxIndels: within-clust,across-clust (def. 3,99) (s3,s7)\r\n", " ## 28.opt.: random number seed (def. 112233) (s3,s6,s7)\r\n", "2,2 ## 29. trim overhang \r\n", "p,n,s ## 30. output formats \r\n", " ## 31.opt.: maj. base call at depth>x> log.txt 2>&1 " ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%bash\n", "sed -i '/## 12./c\\2 ## 12. MinCov ' params.txt\n", "sed -i '/## 14./c\\empirical_2_m2 ## 14. output name ' params.txt" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%bash\n", "pyrad -p params.txt -s 7 >> log.txt 2>&1 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Results\n", "We are interested in the relationship between the amount of input (raw) data between any two samples, the average coverage they recover when clustered together, and the phylogenetic distances separating samples. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Raw data amounts\n", "The average number of raw reads per sample is 1.5M." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "count 74.000000\n", "mean 1500455.675676\n", "std 753933.061083\n", "min 469134.000000\n", "25% 1023280.500000\n", "50% 1366480.000000\n", "75% 1648418.250000\n", "max 4436366.000000\n", "Name: passed.total, dtype: float64\n", "\n", "most raw data in sample:\n", "3 GAWI1\n", "Name: sample , dtype: object\n" ] } ], "source": [ "## read in the data\n", "s2dat = pd.read_table(\"empirical_2/stats/s2.rawedit.txt\", header=0, nrows=74)\n", "\n", "## print summary stats\n", "print s2dat[\"passed.total\"].describe()\n", "\n", "## find which sample has the most raw data\n", "maxraw = s2dat[\"passed.total\"].max()\n", "print \"\\nmost raw data in sample:\"\n", "print s2dat['sample '][s2dat['passed.total']==maxraw]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Look at distributions of coverage\n", "pyrad v.3.0.63 outputs depth information for each sample which I read in here and plot. First let's ask which sample has the highest depth of coverage. The std here is the std in means across samples. The std of depths within individuals is much higher." ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "summary of means\n", "==================\n", "count 74.000000\n", "mean 38.482919\n", "std 15.210132\n", "min 13.655000\n", "25% 27.747500\n", "50% 36.933500\n", "75% 45.540000\n", "max 97.009000\n", "Name: dpt.me, dtype: float64\n", "\n", "summary of std\n", "==================\n", "count 74.000000\n", "mean 339.613689\n", "std 304.751343\n", "min 108.546000\n", "25% 172.106500\n", "50% 235.372000\n", "75% 349.702750\n", "max 2090.841000\n", "Name: dpt.sd, dtype: float64\n", "\n", "summary of proportion lowdepth\n", "==================\n", "count 74.000000\n", "mean 0.716992\n", "std 0.057099\n", "min 0.618293\n", "25% 0.677747\n", "50% 0.702102\n", "75% 0.756614\n", "max 0.894946\n", "dtype: float64\n", "\n", "highest coverage in sample:\n", "17 PHBR4\n", "Name: taxa, dtype: object\n" ] } ], "source": [ "## read in the s3 results\n", "s3dat = pd.read_table(\"empirical_2/stats/s3.clusters.txt\", header=0, nrows=74)\n", "\n", "## print summary stats\n", "print \"summary of means\\n==================\"\n", "print s3dat['dpt.me'].describe()\n", "\n", "## print summary stats\n", "print \"\\nsummary of std\\n==================\"\n", "print s3dat['dpt.sd'].describe()\n", "\n", "## print summary stats\n", "print \"\\nsummary of proportion lowdepth\\n==================\"\n", "print pd.Series(1-s3dat['d>5.tot']/s3dat[\"total\"]).describe()\n", "\n", "## find which sample has the greatest depth of retained loci\n", "max_hiprop = (s3dat[\"d>5.tot\"]/s3dat[\"total\"]).max()\n", "print \"\\nhighest coverage in sample:\"\n", "print s3dat['taxa'][s3dat['d>5.tot']/s3dat[\"total\"]==max_hiprop]" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "57.1020689655 368.84812359\n" ] } ], "source": [ "## print mean and std of coverage for the highest coverage sample\n", "with open(\"empirical_2/clust.85/PHBR4.depths\", 'rb') as indat:\n", " depths = np.array(indat.read().strip().split(\",\"), dtype=int)\n", " \n", "print depths.mean(), depths.std()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Plot the coverage for the sample with highest mean coverage\n", "Green shows the loci that were discarded and orange the loci that were retained. The majority of data were discarded for being too low of coverage. " ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "
0102030Depth of coverage (N reads)050001000015000N locidataset2/sample=PHBR4
  • Save as .csv
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import toyplot\n", "import toyplot.svg\n", "import numpy as np\n", "\n", "## read in the depth information for this sample\n", "with open(\"empirical_2/clust.85/PHBR4.depths\", 'rb') as indat:\n", " depths = np.array(indat.read().strip().split(\",\"), dtype=int)\n", " \n", "## make a barplot in Toyplot\n", "canvas = toyplot.Canvas(width=350, height=300)\n", "axes = canvas.axes(xlabel=\"Depth of coverage (N reads)\", \n", " ylabel=\"N loci\",\n", " label=\"dataset2/sample=PHBR4\")\n", "\n", "## select the loci with depth > 5 (kept)\n", "keeps = depths[depths>5]\n", "\n", "## plot kept and discarded loci\n", "edat = np.histogram(depths, range(30)) # density=True)\n", "kdat = np.histogram(keeps, range(30)) #, density=True)\n", "axes.bars(edat)\n", "axes.bars(kdat)\n", "\n", "#toyplot.svg.render(canvas, \"empirical_2_depthplot.svg\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Print final stats table" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "\r\n", "41086 ## loci with > minsp containing data\r\n", "41033 ## loci with > minsp containing data & paralogs removed\r\n", "41011 ## loci with > minsp containing data & paralogs removed & final filtering\r\n", "\r\n", "## number of loci recovered in final data set for each taxon.\r\n", "taxon\tnloci\r\n", "CADR1\t2794\r\n", "CADR2\t2368\r\n", "COTE1\t2169\r\n", "GAWI1\t770\r\n", "HOMA1\t1473\r\n", "PETH1\t2020\r\n", "PHAS1\t4764\r\n", "PHAS2\t7858\r\n", "PHAS3\t6246\r\n", "PHAS4\t5945\r\n", "PHBL1\t9814\r\n", "PHBL2\t7591\r\n", "PHBL3\t6895\r\n", "PHBL4\t6951\r\n", "PHBR1\t4095\r\n", "PHBR2\t4235\r\n", "PHBR3\t8615\r\n", "PHBR4\t9529\r\n", "PHCE1\t8592\r\n", "PHCE2\t9942\r\n", "PHCE3\t8272\r\n", "PHCE4\t7031\r\n", "PHCE5\t9280\r\n", "PHCE6\t7123\r\n", "PHCN1\t5035\r\n", "PHCN2\t6544\r\n", "PHCN3\t5789\r\n", "PHCN4\t6183\r\n", "PHCO1\t5207\r\n", "PHDI1\t6054\r\n", "PHDI2\t6145\r\n", "PHDO1\t6949\r\n", "PHDO2\t8862\r\n", "PHGO1\t5427\r\n", "PHGO2\t4957\r\n", "PHGO3\t7282\r\n", "PHGO4\t7148\r\n", "PHHE1\t6596\r\n", "PHHE2\t8473\r\n", "PHHE3\t6519\r\n", "PHHE4\t8101\r\n", "PHHE5\t5674\r\n", "PHMC1\t8894\r\n", "PHMC2\t6284\r\n", "PHMC3\t6505\r\n", "PHMC4\t6808\r\n", "PHMO1\t5078\r\n", "PHMO2\t5500\r\n", "PHOR1\t3478\r\n", "PHOR2\t8272\r\n", "PHOR3\t8218\r\n", "PHOR4\t4160\r\n", "PHPL1\t5890\r\n", "PHPL2\t8079\r\n", "PHPL3\t6535\r\n", "PHSH1\t6346\r\n", "PHSH2\t7586\r\n", "PHSH3\t4994\r\n", "PHSH4\t5671\r\n", "PHSO1\t3819\r\n", "PHSO2\t5372\r\n", "PHSO3\t6552\r\n", "PHTA1\t5481\r\n", "PHTA2\t6787\r\n", "PHTA3\t6527\r\n", "PHTA4\t7901\r\n", "SCAN1\t1165\r\n", "SCGA1\t1383\r\n", "SCMA1\t1489\r\n", "SCOC1\t1651\r\n", "UMNO1\t1099\r\n", "URBI1\t1277\r\n", "UROR1\t1481\r\n", "UTST1\t1861\r\n", "\r\n", "\r\n", "## nloci = number of loci with data for exactly ntaxa\r\n", "## ntotal = number of loci for which at least ntaxa have data\r\n", "ntaxa\tnloci\tsaved\tntotal\r\n", "1\t-\r\n", "2\t-\t\t-\r\n", "3\t-\t\t-\r\n", "4\t10107\t*\t41011\r\n", "5\t5693\t*\t30904\r\n", "6\t4159\t*\t25211\r\n", "7\t3133\t*\t21052\r\n", "8\t2510\t*\t17919\r\n", "9\t1924\t*\t15409\r\n", "10\t1545\t*\t13485\r\n", "11\t1310\t*\t11940\r\n", "12\t1056\t*\t10630\r\n", "13\t986\t*\t9574\r\n", "14\t781\t*\t8588\r\n", "15\t690\t*\t7807\r\n", "16\t584\t*\t7117\r\n", "17\t529\t*\t6533\r\n", "18\t492\t*\t6004\r\n", "19\t413\t*\t5512\r\n", "20\t391\t*\t5099\r\n", "21\t329\t*\t4708\r\n", "22\t319\t*\t4379\r\n", "23\t299\t*\t4060\r\n", "24\t289\t*\t3761\r\n", "25\t264\t*\t3472\r\n", "26\t255\t*\t3208\r\n", "27\t237\t*\t2953\r\n", "28\t211\t*\t2716\r\n", "29\t185\t*\t2505\r\n", "30\t192\t*\t2320\r\n", "31\t179\t*\t2128\r\n", "32\t138\t*\t1949\r\n", "33\t119\t*\t1811\r\n", "34\t168\t*\t1692\r\n", "35\t151\t*\t1524\r\n", "36\t106\t*\t1373\r\n", "37\t107\t*\t1267\r\n", "38\t109\t*\t1160\r\n", "39\t96\t*\t1051\r\n", "40\t90\t*\t955\r\n", "41\t85\t*\t865\r\n", "42\t93\t*\t780\r\n", "43\t71\t*\t687\r\n", "44\t49\t*\t616\r\n", "45\t56\t*\t567\r\n", "46\t45\t*\t511\r\n", "47\t47\t*\t466\r\n", "48\t46\t*\t419\r\n", "49\t50\t*\t373\r\n", "50\t42\t*\t323\r\n", "51\t30\t*\t281\r\n", "52\t33\t*\t251\r\n", "53\t31\t*\t218\r\n", "54\t18\t*\t187\r\n", "55\t26\t*\t169\r\n", "56\t16\t*\t143\r\n", "57\t16\t*\t127\r\n", "58\t11\t*\t111\r\n", "59\t15\t*\t100\r\n", "60\t11\t*\t85\r\n", "61\t11\t*\t74\r\n", "62\t8\t*\t63\r\n", "63\t11\t*\t55\r\n", "64\t6\t*\t44\r\n", "65\t10\t*\t38\r\n", "66\t7\t*\t28\r\n", "67\t7\t*\t21\r\n", "68\t6\t*\t14\r\n", "69\t0\t*\t8\r\n", "70\t4\t*\t8\r\n", "71\t2\t*\t4\r\n", "72\t1\t*\t2\r\n", "73\t0\t*\t1\r\n", "74\t1\t*\t1\r\n", "\r\n", "\r\n", "## nvar = number of loci containing n variable sites (pis+autapomorphies).\r\n", "## sumvar = sum of variable sites (SNPs).\r\n", "## pis = number of loci containing n parsimony informative sites.\r\n", "## sumpis = sum of parsimony informative sites.\r\n", "\tnvar\tsumvar\tPIS\tsumPIS\r\n", "0\t8048\t0\t16090\t0\r\n", "1\t5782\t5782\t7148\t7148\r\n", "2\t4644\t15070\t5054\t17256\r\n", "3\t4018\t27124\t3757\t28527\r\n", "4\t3561\t41368\t2869\t40003\r\n", "5\t3270\t57718\t2249\t51248\r\n", "6\t2957\t75460\t1492\t60200\r\n", "7\t2201\t90867\t893\t66451\r\n", "8\t1702\t104483\t608\t71315\r\n", "9\t1332\t116471\t377\t74708\r\n", "10\t1037\t126841\t211\t76818\r\n", "11\t698\t134519\t128\t78226\r\n", "12\t542\t141023\t71\t79078\r\n", "13\t372\t145859\t36\t79546\r\n", "14\t262\t149527\t14\t79742\r\n", "15\t198\t152497\t8\t79862\r\n", "16\t115\t154337\t2\t79894\r\n", "17\t87\t155816\t2\t79928\r\n", "18\t67\t157022\t1\t79946\r\n", "19\t43\t157839\t1\t79965\r\n", "20\t28\t158399\t0\t79965\r\n", "21\t16\t158735\t0\t79965\r\n", "22\t8\t158911\t0\t79965\r\n", "23\t9\t159118\t0\t79965\r\n", "24\t7\t159286\t0\t79965\r\n", "25\t6\t159436\t0\t79965\r\n", "26\t0\t159436\t0\t79965\r\n", "27\t0\t159436\t0\t79965\r\n", "28\t0\t159436\t0\t79965\r\n", "29\t0\t159436\t0\t79965\r\n", "30\t1\t159466\t0\t79965\r\n", "total var= 159466\r\n", "total pis= 79965\r\n", "sampled unlinked SNPs= 32963\r\n", "sampled unlinked bi-allelic SNPs= 18577\r\n" ] } ], "source": [ "cat empirical_2/stats/empirical_2_m4.stats\n" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "87440 ## loci with > minsp containing data\n", "87387 ## loci with > minsp containing data & paralogs removed\n", "87325 ## loci with > minsp containing data & paralogs removed & final filtering\n", "\n", "## number of loci recovered in final data set for each taxon.\n", "taxon\tnloci\n", "CADR1\t7329\n", "CADR2\t6453\n" ] } ], "source": [ "%%bash \n", "head -n 10 empirical_2/stats/empirical_2_m2.stats\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Infer ML phylogeny in _raxml_ as an unrooted tree" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%bash\n", "## raxml argumement w/ ...\n", "raxmlHPC-PTHREADS-AVX -f a -m GTRGAMMA -N 100 -x 12345 -p 12345 -T 20 \\\n", " -w /home/deren/Documents/RADmissing/empirical_2/ \\\n", " -n empirical_2_m4 -s empirical_2/outfiles/empirical_2_m4.phy\n", " " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%bash\n", "## raxml argumement w/ ...\n", "raxmlHPC-PTHREADS-AVX -f a -m GTRGAMMA -N 100 -x 12345 -p 12345 -T 20 \\\n", " -w /home/deren/Documents/RADmissing/empirical_2/ \\\n", " -n empirical_2_m2 -s empirical_2/outfiles/empirical_2_m2.phy\n", " " ] }, { "cell_type": "code", "execution_count": 227, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "This is RAxML version 8.0.16 released by Alexandros Stamatakis on March 21 2014.\n", "\n", "With greatly appreciated code contributions by:\n", "Andre Aberer (HITS)\n", "Simon Berger (HITS)\n", "Alexey Kozlov (HITS)\n", "Nick Pattengale (Sandia)\n", "Wayne Pfeiffer (SDSC)\n", "Akifumi S. Tanabe (NRIFS)\n", "David Dao (KIT)\n", "Charlie Taylor (UF)\n", "\n", "\n", "Alignment has 296560 distinct alignment patterns\n", "\n", "Proportion of gaps and completely undetermined characters in this alignment: 86.18%\n", "\n", "RAxML rapid bootstrapping and subsequent ML search\n" ] } ], "source": [ "%%bash \n", "head -n 20 empirical_2/RAxML_info.empirical_2_m4" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%bash \n", "head -n 20 empirical_2/RAxML_info.empirical_2_m2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Plot the tree in R using `ape`" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%load_ext rpy2.ipython" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAlgAAAMgCAMAAADIvuz2AAADAFBMVEUAAAABAQECAgIDAwMEBAQF\nBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUWFhYXFxcY\nGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJycoKCgpKSkqKior\nKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6Ojo7Ozs8PDw9PT0+\nPj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tMTExNTU1OTk5PT09QUFBR\nUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1eXl5fX19gYGBhYWFiYmJjY2Nk\nZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29wcHBxcXFycnJzc3N0dHR1dXV2dnZ3\nd3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGCgoKDg4OEhISFhYWGhoaHh4eIiIiJiYmK\nioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OUlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJyd\nnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWmpqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+w\nsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4uLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLD\nw8PExMTFxcXGxsbHx8fIyMjJycnKysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW\n1tbX19fY2NjZ2dna2trb29vc3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp\n6enq6urr6+vs7Ozt7e3u7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8\n/Pz9/f3+/v7////isF19AAAgAElEQVR4nO2dC3gU1dnHD7cCQcFIVWwUIdwNkEggC4IEJeAFUQFT\nxRuiEBWx9auXyEXiBZpoWxTlFmz5CtqqAbQgfNysWq1WJSqVWqoYBEWUKgEicmfPN2dmluwse052\nLoednfP/PU8zO9nNzFB/z+zmn/c9L6EASIAk+wJAMIFYQAoQC0gBYgEpQCwgBYgFpACxgBQgFpAC\nxAJSgFhAChALSAFiASlALCAFiAWkALGAFCAWkALEAlKAWEAKEAtIAWIBKUAsIAWIBaQAsYAUIBaQ\nAsQCUoBYQAoQC0gBYgEpQCwgBYgFpACxgBQgFpACxAJSgFhAChALSAFiASlALCAFiAWkALGAFCAW\nkALEAlKAWEAKEAtIAWIBKUAsIAWIBaQAsYAUIBaQAsQCUoBYQAoQC0gBYgEpQCwgBYgFpACxgBQg\nFpACxAJSgFhAChALSAFiASlALCAFiAWkALGAFCAWkALEAlKAWEAKEAtIAWIBKUAsIAWIBaQAsYAU\nIBaQAsQCUoBYQAoQC0gBYgEpQCwgBYgFpACxgBTkiLXrlvPP/o22vSqP0j3NQ53OfZtWNw/l9SgO\nT8npMFPKKYG/kCLW/pwl4b0ZW+kb17Wl9LWbaPgPWXTNWEq/bvHKiCNVLWScEvgMKWJNnax92VB9\ntP9XWbvpY9odak8anTafHigd9eZmuvxSGacEPsNbsVoTxp09PtP3Fk6lQz6gV79H6afZ9CrtiW6H\n6IEnz6/y9JTAn3grVsfOZRp/S/ue0i+n/niW5lIF7XiA0tKJNHM//aIJ3ZT/6/2enhH4FG/F6jVE\n3/SYRfcPXDz1YUofLf1vL0rf7rprey92sp19v/T0fMC3SBHro1DPPgu/zfqR0mfHrEgP5Q3/mi4d\nz05WekZW1lBPzwh8ihSxAIBYQAqSxOImpJSGh2319JzAl8gRi5+Q0lfanxb29JzAl3grVod2xRqT\ntvMTUnqkCgGpCngrVstG6RqnrhEkpHTJg56eEvgTOW+F/ISU0glLPT0l8CdyxOImpNr/Bn3t6SmB\nP5EjFjch1X4pzPL0jMCnIMcCUlBLLCNe23t3v5zBn9NLQyd3D81f1SoUCs1BuuY1SollxGv/yftT\nOPxEf22/vfZe/chs9gzSNa/xrVh7qr1mr1mAOK5E+1qTRul/s7UHQyvZ2ZCueY1fxdpTj3hNw41G\nvHYuqzTcpP0OsWKM9rtEK+2ZQUjXPMevYu0k15R7zB8OGPFa2g/a8adMpPSheZR+lWOcD+max/hX\nrBleHaoWI17Lf4qGn8vR7LrsI0pfGmc8hXTNY5QSS4/X6OcF5/UdX6O9C55zSLtTtdZ+KVyOdM1z\nfCPW6WdmRtOG/I+H1wVOOH4RK0yyCqO5kkz08sLAicY/Yj1k2ZfwVvjCryg92vqoGY/ObBKmdEG9\nGqMYEQGp1ygk1q9epPTfBfvMePSmjG/p/t5dqVGMiIDUa/wq1vfk/CJPGf/tBVu0W9TESSVUj0ez\n73iX/u4XN1OjGBEBqdf4Vay9Ta0f5l3T/v1mLCV92YxHa3rOeGFXwZOzzGJEBKRe41exvGfDAO3L\nRdvMePT125aWPbDmxvfNYkQEpF6jjljzJ1B6uBM149HHf//x0KG02wGjGBEBqeeoI9Ydyyj96KpI\nPFr4zxpSuSvPLEZEQOo56ogFTigQyyuOnHOU0l9WRPKxuW2OUNrliJmaqZeTQSyv+Ocg7UvvrZF8\n7JZ2K+j2/Ehqpl5OlpJi7fqkynccnjWZ0oNtwpF8rMeKy+nSeyeVUD01Uy8nS0mxWnteBOie4pGv\nUPr+lZF8bG9uOGvL5BePFRUql5OlpFh9Olf4ji1td1D6RGkkH3trLH1qwsWbjxUVKpeTpaRYBf2c\nnkYiLd6n2zK3RPKx6c/Q3W07hY8VFSqXk0Esr3i+Xfc+b0TyMTpyPaVFl9QWFSqXk0EsIAWIBaQA\nsVRkX/GATgO3GTWO2Svpay0pLX14/WVe5rgQS0H29X0uHH56nFHjqIl1aevwoY7fzSvxMsdVWqyL\nLixUkTcenKr9448eNGocs1e+N/qiPc/fSccu9zLHVVqsn5yeqyLPdvmCziDkaaPGMXvliE9HbDn/\nc5qzw8scV2mxTrnbm+OkGs1qtC9j1hg1jtm/G0nHzLma7mtHvcxxIZaCXPg0DS9q/IBR45h91j/p\nfWe/S98ZQb3McSGWgmzKzx44M2TWOGYPpfTXF1A6o8zTHBdiASlArBRh+3U5rUuPjfyIDqL8WUQI\nsVKEgcvCXzeOjPywBFH+LCKEWC44uqryRPFlxpw99JvIUGRLEOXPIkKI5YKyE1dHeNrHt3bovZqa\nIz8sQZQ/iwghlgueJ8+sPUH89jANL29ljvywBlH+LCKEWC6oIP/y5kLqJv0temh+obkmoTWI8mcR\nobditUpzvLYCuS7x0ygo1pLO3bLu3m2uSWgNovxZROitWDf3cfyXUVKU+GkUFCvl8FYs53j0Vjh5\nVLENGo9yd9EQi0/AxDrJ3u9aF8b+/G3N7Lx9n0H+4exfy8OSgkaaqM1M1JcxKJ+AidX5WjsnPf6t\ncFgTO4u5FZAP7JyubqJT0MhkFjMT9WcMygdiWRj/Uzs/H3krHNStwBPuj05BI5NZzEzUnzEoH4hl\nwZlYPz3DG7Huik5BI03UZibqzxiUD8Sy4EysLtfY+Sk+i6NT0EgTtZmJ+jMG5QOxLCRXLEsKGmmi\nNjNRf8agfCCWBXtiLSC36rFFy4vt/BQfSwoaaaI2M1F/xqB8IJYFe2L9wZx9Vy/F/qOfCCCWBXti\nRfDqrVDHkmWl7vwMiGXBB2JZKvpSd34GxLLgTKz2Azwrj3kzbKnoS935GRDLgjOxmnlY0fdudJaV\nwvMzIJYFZ2JVvuZZCfKHi6KzrBSen6G0WCcPjJ0bfeEpdn5eBpYsK4XnZygtVqPj34ka2Pl5GViy\nrBSen6G0WJs/i11U+6aWdn4e8FFarONx9hnLMb175XWeRFe1CoVCcy4Nndw9NN8cA5tKiVV8IJaF\nEyvWwY6UftGMPjLb2G3/Y2QMbEolVvGBWBZOrFiVI+nO+wbSoZX63n+ztS/GGNiUSqziA7EsHCfW\nntO6yFsB7VZCGvbfHG6l/dIwiNIVYyg9YoyBTanEKj4Qy8JxYm0huR4u0hhDwRp2iq9yjFM9NO/Y\nGNiUSqziA7EsHCfWNjLP3RFFdK1mX18aZ+xd9tGxMbAplVjFB2JZOKFi1XTSNxNaa78ULqfhcw4d\nGwObUolVfCCWhRN7xwoyEMsCxPIKiGXB12KZw4GNJNWoBzQe+zFOhVgWfC2WORzYSFKNekDjsR/j\nVIhlYVCDmC761uQB9v0fpiZ70GZFxQfmcGAjSTXqAY3HfoxTIZaFJ3rHdNFfT9iqjHSSh6V8Tulo\nDAc2klRq1AMaj/0Yp0IsMeZb4XPk1WRMMLfwvTEc2EhSzd5W/bEv41T/iJWe+CovTc/iHUaWWH8i\nn3p8YAcYw4GNJNWsB9Qf+zJO9YtYtP/wxFd5yejIO0qgxTKGAxtJqlkPqD/2ZZzqG7HscOLfCn0h\nVkoBscT4WSzrsAq9pXVumyOUdjnig2ALYonxs1iW1lajpfWWdivo9nw/BFsQS8znJC1doxlZ5/GB\nPcDS2mq0tPZYcTldeq8fgi2IJeZwwR1sPZmh5JNj35rTtK2dhUqlEaqMHlaht7TuzQ1nbZn8oh+C\nLYiVENFvhQ+RG+wsVCqNKw5EtbYaLa1vjaVPTbh4sx+CLYiVENFizSHfyDmJTSytrUZL6/Rn6O62\nncJ+CLYgVkL4Uazo1lazpXXkekqLLvFFsAWxEsKPYvkbiJUQEMsuECsh5pKxxwalDCbb5ZzEDWYR\nYHXzUF6PYro4I9RhAl1/WRLXAIRYCTG7/inpEdLIF3JO4gazCHDNWEq/bkHvf47uSKfzSpKYlEIs\n2yT0VrhoSuwKSVJZZBYBTptPD5SOohduOjzjSjp2eRKTUohlm4TEOv3EFgE2GmEUAV6lPe526Ghz\nbfMizdmRxKQUYtkmIbH6X1B9IvnBLALM3E+/aEI3av//vNtyXzuaxKQUYtkmIbHyj5tYJxejCHB7\nL8r+ky68X3sv7vPOCJrEpBRi2WYOWV33WqI98uVfSDRGEeDS8ZT9J72rTajHgM9mlCUzKYVYtrkz\noU89p8u/EF8DsWyzsyyBXq2sPvIvJCa8MieymnvJrvWDWHI4IZ+xLOHVPnMiqxllJbvWD2LJ4YSI\nZQmvJpVQfSKrGWUlu9YPYskho2G6dPqOjA6vIhNZjb2k1/pBLDlMGlEsnact4VVkIquxl/RaP4iV\nwljCK3Miq7mX9Fo/iJXCWMIrcyKruZf0Wj+IBaQAsYAUIJYYYyiJOYjEqJ8zE8j1fXraWCBHOSCW\nEHMoiTmIxKifMxLIcNeN3yd9Bp2PgVgcjo56qKysbIExlMQcRGLUzxkJ5MF1+58Y5NXJAgjE4vCh\n/qfk8/WhJOYgErN+LpJAXkMWe3WyAAKxOGwiz2pfR+tDScxBJEb9nJlAHg4ffvXEzqBLLSAWB0Ms\nYyiJOYjEqJ8zE8gm39Bl+V6dLIBALA66WOZQEnMQiVE/ZyaQ03r3HrHNq5MFEIjFwbhjAadALA61\nYnGjrPCUnA4zvTpf0FBerO84fS+VEbH4UdYrI45UtXD+rwg2qov1EL9qfYLxikpulPXmZro86Svn\n+RXVxXqSlMXvLX6U/NTwaxg/yjrw5PlVLv8pgUV1sZ4hX8V/YhPJ6FjGuJIbZW3K//V+m6dTB4jF\nFav9xfoDbpS1s++XNk+mEhCrDrH4UVbpGVlZQ22eTh0gVl13LOAIiAWxpACxOGL9m5zUnW35lX7J\n7jX2NxCLI9b2Fk07UFE8mvReY38DsThiUdrngkH3T1rNjUeT3mvsbyAWX6z2pPmpdwkq/ZI/V8TH\nQCy+WN3JRkGlX9J7jf0NxBKLxa/0S3qvsb+BWFyxcs8hGwWVfknvNfY3KSlWdiveRKxT7K4eJBDr\nNELecXmhCpOSYuW14M3wa5Rt81ACsbaUaG+FohyLrhzi7t8RZFJSLD5evhXS5zWxBDkWfTOvxNXF\nBhqIJRLrver3+TlWZdlda9xdbZCBWHyxHiaEnMfNsf41PRyqcXe1QQZi8cXadsvc8iHcHKufplcX\nF5cacCAWXyyGKMf6dJizi1QCiCUUS5Rj4U86IiCW+I4FHAKxIJYUIJYtsSxx6fbrclqXRnZQ9mdF\nIbGG9Ss8np625vBa49KBy8JfN47soOzPikJitTg193jOIZsSPvh3eaMscWnGnD30m8gOyv6sKCRW\nh+vjfNPOW+EKclp0XEo/vrVD79WRHfyOaAViJS7We2RQdFy6+DANL29l7qDsLwaIZUesc6Lj0vS3\n6KH5heYOyv5igFh2xDpL35px6ZLO3bLu3m3uoOwvBohlR6wV9q9IVSCWPbH4QRYmVViAWHWKtWeV\nOZl+AVnBD7IwqcIKxKpTrH61i/xNqeQGWZhUYQVi1SnWmOZrDR4jZLwgyMKkimggVp1i3dXSfLCa\nDL6RG2RhUoUViGVHrNu7coMsTKqwArHsiHWrpe7PEmRhUoUViGXrjuXsqlQEYkEsKUCsxMWqIMNF\nlX5ojI4GYiUu1odkvKDSD43RFiBW4mJ9R9rN51f6oTHaAsSyI1baWG5AisZoKxDLjli5o7kBKRqj\nrUAsW2LxA1I0RluBWHbEOpcfkKLo3QrESlysarOCFCQAxEpcLNpxJDfGOoIxvlZUF2su+fPaOrjq\nWP1ep2u4MRbG+Maguli/5I/uPUaDyIvb53NjLIzxjUF1sX5cUlkX16RHXpzWkF/nhzG+VlQXKwFq\nP2OtHMyNsTDGNwaIVSe1YlFujIUxvrFArDqpFauGG2NhjG8sEKtOou5YIGEgVp1ALCdArDqxiGVJ\nSC8Nndw9NJ/Sq/LYU6jziwZi1Um0WNZCP0rb/6h9eeO6thR1fjFArDphYt2bWaBzgyUhpf9lM6GO\n9v8qazfq/GKAWHXCxOrX1BCrvyUhpSvGaM8vnEqHfIA6vxggVp0wsUa1NR5bCv0ofWgepT+epUlW\ngTq/GCBWnUSLZUlIKb3sI0qnPkzpo6Wo84sBYtVJlFjWhJSGzzlEv83SPr8/OwZ1fjFArDqJvmOB\nRIFYdRItFjfGwnp+MUCsOhlzckVFfob+kBtjYT2/WCBWnegr+jU+On/t2rWv8WIsrOcXC8Sqkz0f\nVFUNP+f3TK+LeDEW1vOLBWIlxKi2q8kfKiuv58VYWM8vFoiVEKPariF/F8RYWM8vFoiVEIZY/BgL\n6/nFArESwrxjgYSBWAlRKxYvyQqjY9UCxEqIXvVPIi+xB9wkCx2rViBWQswccg15lT2o5CVZ6Fi1\nArESRH8rfPTULG6ShY5VCxArQXSxbqzXYSXbiZNkoWPVCsRKEF2syQ15SdZ96Fi1ArESxBSLl2QR\ndKxagVgJspRcUVTUo76bQygFxEqQNQ1bpKc3qefmEEoBsewwuaGoZTU8bKvrMwQGiGUHTSx+y+or\n7U8Luz5DYIBYdri3QcVfuC2rR6oQkNYCsexwOSHn8Wv90KcTBcSyw4/rqm7g1fpROmGp6xMEB4hl\nE0HL6qCvvThBQIBY9hC0rIazPDh+YIBYQAoQy3MsQZcRcRkzLJQKuiCW18QEXXrEZcywUCroglhe\nYy0FNCIuY4aFUkEXxPKOgy3OzMzMLLQEXUbEpc+wUCvogljesZv0LioqKogOuoyIy5hhoVbQBbG8\nYw/5HY0JuoyIy5hhoVbQBbG8QxfLEnSZEZcxw0KtoCtgYp1xdiGXkwZIPrlxxwI6AROr+9m5XBr3\nknxyiBVFwMQScYI+Y1F+RKpSt7SKYs2/vUwKDxNjtUhuRKpSt7SKYp2VwLheZ9xI6aIp5St5EalK\n3dIqinVxHzkn0N8KTyfkSm5EqlC3NMTyDl2s/hdU38yLSFXqloZY3qGLlX8hNyIdqFK3NMTyDlMs\nbkT6gErd0hDLOyJ3LEAhlpfsIVOrq/uyj1b8Wj9l5rBCLO/YpUcOrUS1furMYYVYHvI/vykv75gr\nqvVTZw4rxPKY/E4NTunIC7IUmsMKsTwmvz25vQ8vyFJoDivE8pj8LmSzoNZPmTmsEMtjNLE28Gv9\n1Cl7h1ge07kZ+VTi4VMGiOUxl51K/i3x8CkDxPKaP5LN/IBUnQm/EMtrNLG4AalCE34hltfMIvcX\nX8sJSBWa8AuxvObXhDTkN0MrM+EXYklgNC8gVWjCL8SSADcgVWjCL8TyHn4ztEITfiEWkALE8hhL\nhDWzSVj7xbBezd67++UM/lylflWI5THWCOumjG/p/t5d9+X9KRx+or9K/aoQy2OsNX7Zd7xLf/eL\nmyeVaM/UpKnUrwqxPORQA0IGRUdYNT1nvLCr4MlZ57Iu1U1ZKvWrQiwP+ZFcWjYkOsJ6/balZQ+s\nufH9tB+03SkTVepXhVge8iN5zBphPf77j4cOpd0O5D9Fw8/lbFWpXxVieYgmlnVwReE/a0jlrjz6\necF5fcfXlKrUrwqxsjvwl2qzyXnkrmT8u/yJ8mI1OqfAKy4k45Px7/Inyot10r2enYB9xhIFpErN\nPIFY3oolCkiVmnkCsbwUa0DxNEFAqtTME4jlnViHGjVL7ywKSNVp/oJYXorFGC0ISJWaeQKxvBVL\nEJD+oNTME4jlqViigJQqNfMEYnl7xwImEMuJWJawygypqpuH8noUU3XW7BMDsRyIZQmrIiHVmrGU\nft1CoTX7xEAsW2LtrGZYqvkmlVA9pJo2nx4oHaXQmn1iIJYdsSqMySbDosOqSEh1lbbT7ZBCa/aJ\ngVh2xHqF3F+uMSI6rIqEVJn76RdNqEJr9omBWHbEWk3eZhtLWGWGVNvZNET2/6Yya/aJgVj2xbKG\nVWZItZSVzLD/NxX6s40IiOXgjgXqBmJBLClALGdiCSJSdVbtEwGxasUqyYkdfB9Lf2JmVIKIVKFV\n+0SoK9bcnrorDW889oJejetql+hAZvYcU1RUdPtL/IhUoVX7RKgr1oAGmYz6tR1Z13eo6xCryZ3k\nLPZT4/gRqUKr9olQV6zbztR3ot4KExFrKtnAHozmR6QKrdonAmI5EksQkSq0ap8IiOVELFFEqtCq\nfSIglqM7FqgLiGVHrFVktCGWqNJPoa5UARDLjljztN/7/krFlX4qdaUKgFh2xDq66DGy4ZOqqlWC\nSj+VulIFQCw7YlG6iJRod616ohgL5Q0MiGVXrBmktOKiVexx3BhLqa5UARDLrlgzyTpxpZ9CXakC\nIJYDsYSVfip1pQqAWPbEmk7SyGrPLy2AQCx7Ym3IG0T+7vmlBRCIZU8s1qmzThSQKjXXRADEciKW\nICBVaq6JAIhlV6xnSUHhFH5AqtRcEwEKiXVaa7PAuFU7tutUrNUnZedeLAhIVZprIkAhsbJamwXG\nzTPYrlOxGII6P6XmmghQSKxjuHsrZPAD0p1KzTURoLxYTfKKI5ybkeABBAGpWnNNBCgvVhqppVlS\nLytYKC9WFIm/FcbHEm7taR7qdO7bdP1lilb+QaxaXIplHUrx2k00/IcsOq9E0co/iFWLO7FWv2kZ\nSvHYTEr3pNGxyxWt/INYtbgSawbpFR1u0avfo/TTbJqzQ9HKP4hViyuxlpDB0eEW7XiA0tKJ+1gW\nq2TlH8SqxZVYfyHto8Ot//ai9O2uu94ZQRWt/INYtbgUq42+NcOtFemhvOFf0xllqlb+QaxaRGJd\nX4/UxavyrjgFgVi1iMQa2bRYzHDynrwrTkEgVi0iscbXtYDMX8iH8QNSRSv/IFYtLsV6L35Aqmjl\nH8SqxZVYz5Bb4wekilb+QaxaXIn12mlXcgJSNSv/IFYtrsSKqf6rDUgVrfyDWLW4FKtr3IBU1co/\niFWLO7Gs1X/HAlJVK/8gVi0u71ggGohVi2Ox4gVYRheroiEWhVjROBUrXoWf2cWqaIhFIVY0TsT6\n20tr18ar8JtUQlkXq6IhFoVY0Qw/s4LLJZzxOE0JIUPiBFiRLlY1QywKsaLpKqpdaBD/WG0uq6y8\nIU6AZXaxKhpiUYgVzXcfVnG5qWX8Y7W/IX6AZXSxblU0xKIQK1Ein7EOVlvJvCF+gGV0saoaYlGI\nlSimWDvqx75FZku4wCAAsRLDFOsrMrzMwk8vp6IcS9FuVQqxEsUUaxuZZ/0++4wlyLEU7ValECtR\nRGJV8nMsRbtVKcRKFJ5Yp5yam3u9IMdSs1uVQqxE4Yk1uE9h4QB+jqVotyqFWInCE4vBz7F+ULRb\nlUKsRBGIJcixVO1WpRArUUR3LBAHiJUYEMsmECsxhjYqYFxAHor/vCUjNTerWoVCoTlrM7Kyyt1f\ndKoBsRLj3tP1lby7k8lxn7ZmpObmkdnsqVv/4/qCUxGIZYuYt8IdU81yrb9aMlJzM7RSe8nBNp1b\nLvDiqlMMiGWLGLFujPwpemh0Rmpuwq20r4M23vPdAhUbMSCWLWLE+i1Zb5ZrRWek5uarHPNVuxJd\nPz5IQCxbxIg1new2HlgyUnPz0jj2vcEb6fRiV5ebmkAsW3DEsmak5mZCa+2XwuWLew++54DLC05F\nIJYteHcsEAvEsgVHLH6KhUI/hfBcLEGKhUI/hXAl1g3lURQaYlVyUywU+qmEK7FiCKVnZnaZwk2x\nUOinEi7EojsszV/TSMefFRXdfq0gxUKhnzq4EcvKdNL3QipKsVDopxJeiyVIsVDopxKe37HA8UAs\nN0AsLhDLDRax+Cnp+j49OfWBwQViueFh0iY38pifkoa7bvyes7xWcIFYbvglIc1mGVnpwvf4tX7r\n9j8xyP3JUguI5Y7ux6LSBwUp6TVksRcnSyUgljtyyXNGVrpnNDclPRw+/KpyRaQQyx255C/mI35K\n2uQbuizfi5OlEhDLHcfEEqSk03r3HrHNi5OlEhDLHbV3LGABYrmjhymWoNSPrhzixZlSDIjljpbk\nN2wjKPWjb+aVeHGmFANiueNB8gLbCEr9KsvuWuPFmVIMiOWOD0n9dI1W3BDrX9PDoRovzpRiQCx3\nfEjIr4qLi3NXs514IVY/za4uXpwpxYBY7tDE+pGKS/0+HebFiVINFcVq3yRTo3lTL45liCUq9VO0\n6l1Fse4eUKTRpbkXx3rbuGOBWFQUy8Cbt8LNaWSvF8cJHBDLJY/pdyxuPjpb1dm9EMslulj8fFTZ\n2b0QyyWTyaNl09/k5qPKzu6FWC65gxBS/15+kZ+qs3shlhfwi/yUnd0LsbyAm4/uVHZ2L8TyAH4+\n2k/Z2b0QC0gBYsnCEm1tvy6ndak5zjesRLQFsSRhjbYGLgt/3dgc56tGtAWxJGEt/cuYs4d+M6mE\nsnG+akRbEEsCjQlpPMEyz/fjWzv0Xh0Z56tEtAWxJNDgorLHr4+OthYfpuHlrcxxvmpEWxBLAo0m\nxURb6W/RQ/MLjXG+W9WItiCWBDSxrNHWks7dsu7ebYzzLVUj2oJYEmB3LNWBWBKAWBBLzAudC51Q\n/2b9p7kRKVVgEArEEjGatM10QL2r2Q/zI1IVBqFALBG/Jo4mwjUqvuyXxcXPcyNSFQahQCwRTsUa\nRZqlp1/Nj0gVaAmDWCKcilVEKmKq/ywRqQqDUCCWCFdicSPSH1QYhAKxRDgUq2EBeUEQkVIVBqFA\nLBEOxapPiHIDA2KBWCIcirX+T+RFUYylwqQKiCXCoVj0X9pnLH6MpcSkCoglwrlY/1v9PjfGUmJS\nBcQS4VSsvxNC+gtiLAUmVUAsEU7FOlw0o/wKboylxKQKxcU6fMe0MgEXOxSLwY+xlJhUobhYS2Kn\n0sfiuIhYEGMpMalCcbHeIYurBTzo4o6lOMqLtUr0GqefsQDEkimWJSKlu245/+zf0LltjlDa5Ujg\nB6FALNFr3IlljUj35ywJ783Yeku7FXR7fvAHoUAs0WucizXo50VF1/08OiKdOln79obqHisup0vv\nDf4gFIgleutCwJMAABA+SURBVI1jsQ6SUzMzz7BEpD0+Y0/szQ1nbZn8YvAHoUAs0WtciDUtptKP\npn1P6ZdT3xpLn5pw8ebgD0KBWKLXuBPLEpHSHrPo/oGLpz9Dd7ftFA7+IBSIJXqNK7GsESn9KNSz\nz0I6cj2lRZfQ4Fe9QyzRa9zdsZQGYole41Isfo51OPCr+kEs0WumkncqHfEP8qgwxwr+qn4QS/Sa\nq+v6GzWfqz6uWjWSm2MFf1U/iCV6zfbZa52xkvQhpB4/xwr+qn4QS8rBD5IhpPwi/dhxc6zgr+oH\nsaQc/CC5kmzn51jfB39VP4gl5eBMrM/4OZYCq/pBLCkH1+9YUo6cKkAsKQeHWBBLysEPkFMIW7NP\nVOoX7FX9IJaco3dpQ74Ul/oFfFU/iCXp8HPJax2HDb2aX+oX8FX9IJakw88lj5MubQURacDrGyCW\npMPPJQvIOkGpX9BX9YNYkg6viyUs9Qv2qn4QS9LhnyQ/J6+LSv0CvqofxJJ0+Mnax6rlko6dCkAs\nWcff8zxZJwyygr2uH8SSdoJXNLEEQVbA1/WDWNJO8ApZWPk6v9Yv4Ov6QSxpJ3hS+5TVQxRkBXpd\nP4gl7QT7Z6xdW8gPsgK+rh/EknoSQZAV8HX9IJbMc4h6VgO+rh/ESvZ1BBSIlezrCCgQS9bxLdHo\nzCZhShfUqzEj0mBnozoQS9LhrdHoTRnf0v29u5oR6ZZgZ6M6EMv7I7/7swEFg+ZaotHsO96lv/vF\nzWZE+m2ws1EdiOX9keeRvIKCcdHRaE3PGS/sKnhylhmRBjwb1YFY3h95GamMWc7v9duWlj2w5sb3\nzYg04NmoDsTy/si6WJZo9PHffzx0KO12wIxIA56N6kAs74/MxLJGo4X/rCGVu/IiEWnAs1EdiOX9\nkfU7lupALO+PbIglyrECP5cCYkWL9cbTFZ5QzMQS5VhbAz+XAmJFi9XK+fp9Mfxy7drFghyrOvBz\nKSBWtFj5eVWeMI+51UKQYwV/LgXEihbrov7eHHkZaVxZebUgxwr+XAqIJUespuIcK/hzKSCWLLGE\nOVbQ121gQKza73h6x1IddcUqOl37nL1YoliCICsc+MEUCovV00gGZtV+xyuxZpH6VBxkzQr8YAqF\nxfrHA+Xl5cXR6yt4JdZnZ/5E+1opCLKWBn4whcJi6Uj5jEXvq986N/daUUFW4AdTQCwpYtXLKCzs\nLwiygj+YAmLJuWMNFwZZ84I/mAJiyRJLFGQpMJgCYskQ65Z6+R4dKXWBWLU7HooV9L8E1g3Eqt3x\nTCzaXfuMJa70C/ZYCgqx5IklrPQL+FgKCrHkiNUpVPHyP0SVfgEfS0EhlhyxWhBC7hJ2rAa+vAFi\n1e54J9aWDVWbRR2rQR9LQSGWHLEY4kq/YI+loBBLmljiSr+Aj6WgEMsQa8n/sratrnnJvpwAAbHY\nppFRm+V8oQ5LZmXs/HGi9n0FVljjALHYpuXPKzXy+jo9jCWzMnfGvay94ymwwhoHiMU2p93Jvjr/\njGUp6jN3em3ThFNghTUOEIttHIk1vWnbTJObozOr2frOwbb6i4K/whoHiMU2jsQqJjcXmVwanVkZ\nAda6K7UvKqywxgFisY0jsZ4guyIPLZmVsTN7mvZFhRXWOEAstnEpliWzMndGtwuFHlFhhTUOEItt\n3N6xwHFALLaBWJ4DsdjGiVijzz8mFiceVaDfmQ/EYhsnYhFCvjMe8eLRV4Lf78wHYrGNE7HqXUw2\n/qKc8WdOPPpm8Pud+UAstnEo1hjjT4zZvHg0+P3OfCAW2zgU6wny72qNUZx4VIF+Zz4Qi20ci6WX\n63Hi0Z0K9DvzgVhs41Cse3SxePFomgL9znwgFts4EitH+0j1uYyLCgQQi20ciXVfWZHxVsir8gt+\nV6oAiMU2jsQqoeW6WLwYiwa/K1UAxGIbV2Lxqvxo8LtSBUAstnEo1r2kRXp6+nBulV/gu1IFQCy2\ncSjW2/3vLy4u5lX5KdCVKgBisY0hVr8udmZ81ZtoHoNX5adAV6oAiMU2hlhn2hvxZX6A4lb5KdCV\nKgBisY0h1ud/rbRBvcnJvXSfA7HYxhDLHtpnLMAHYrGNS7EsCSndVzyg08Bte+/ulzOYBfOBH9Ib\nH4jFNu7Esq7dt6/vc+Hw06Pz/hQOP6H9mhn8Ib3xgVhs41SsLdeUlpWVPWVZu+/BqdqTR+/TnqU1\naVSBIb3xgVhs41Ssifqvh+2iE1La5Qs6Q3vAKvw2ZakwpDc+EIttnIq1UC9vsKzdR5sxkcaQH7Sv\nUyaqMKQ3PhCLbdyJZUlI6YVP0/CixrlP0fBzOT+oMKQ3PhCLbVyJZV27j27Kzx4484rPC87rO57d\nu1T9eyHEYht3dywQB4jFNi7FEuRYyi7pB7HYxpFYxdXVc3SxBDmWukv6QSy2cSKW8ZfoD2jskF5L\njqXukn4Qi22ciDWhrLx8tC7X6fwcS90l/RQXawXpkKvR8BJnP76QNCsrK7uAn2Opu6Sf4mJ93mZo\noUbjy539+EJyGhXmWOou6ae4WCZO3goZuliiHEvdJf0gFsOVWCAeEIsBsTwHYjGcijWdNNO3oko/\nRduhIRbDqVjPNWzKNqJKP1XboSEWw6lY9PZWI28rfuBlQaWfqu3QEIvhQizSND39ZlFCqmh5A8Ri\nuBHrUWGln7Lt0BCL4VIsYaWfou3QEIvhWKxbTiFTxJV+qrZDQyyGY7F6E3Klp1cSGCAWw7FYu9aQ\nh6kox1J2OgXEYjgWix5mn7EEOZay0ykgFsONWBOrq9/n51jKTqeAWAznYh1h2VV/QY6l6nQKiMVw\nLhYtfqy8/Ap+jqXsdAqIxXAhFoOfY21VdjoFxGK4E0uQY5UqO50CYjFc3rHA8UAsBsTyHIjFcCGW\nNRvddcv5Z/+GKl7jpwOxGM7Fsmaj+3OWhPdm/EfxGj8diMVISKyLrik6ntHXRmejU9lKyhvGlVCl\na/x0IBYjEbF2k5aZx/MzSzba4zP2ynMVr/HTgViMRMSqIb+N811rjV/a95R+OTVN8Ro/HYjFcC6W\nNRvtMYvuH7g4X/EaPx2IxXAsVkw2+lGoZ5+FVPUaPx2IxXB+xwIcIBbDsVjxUqxVrUKh0Jy1GVlZ\n5d5facoAsRhOxYqXYm19ZDZ76tb/eH6VKQXEYtgVa//HVTrvjDw+xaoeWqltDrbp3HKBtOtNASAW\nw65Yvc2RhblxUqxwK21n0MZ7vlug5oprJhCLYVesa1saU1aHxUmxvsoxX7QrQ8KVpgwQi2FXrDFn\nGdt4KdZL49j3Bm+k04slXGnKALEYDsWKm2JNaK39Urh8ce/B9xyQcq0pAsRi2BCr5Cfp6emN02Rf\nUcoDsRjpF9c9tX4BKWMvHU/GFhV1VnMogB0gFqNRQmPrr2UvfYQcqf2MJQhIdxTmlSXxX5R0IBbj\nk3fqnlr/JnmcvdQiFj8gPXze3747M4n/oqQDsRLF/Iyli3Vp00Kdq+OU+RkB6WLV1wqBWIkSLdao\nn+bqdOAGpFeMyUp/OrkXnFwgVqJEixUhXpmfEZC2u2/r2uZJuErfALESJZ5Y/IC02Va6bGAyLtMv\nQKxEiSOWICB98vw+I79NzoX6A4iVKPHuWIALxEqU+GJxg6zZqi7lZwKxEiWuWPwgS9ml/EwgVqLE\niPXJ31lquoxb6afsUn4mECtRYsRqqv+VpyE3yFJ2KT8TiJUoMWKdNXitxuXcIEvZpfxMIFai1JBH\nqzUmmGK1HcW+coOsncou5WcCsRJld6TIIUosfpDVT9ml/EwgVsLcMrFMYxATq7PmV69kX4+/gVg2\n0T9jdWhX3GJIsq/E30Asm+hi9RpifsYSlfpRulJh+SCWTWLEEvRC0zfzSpJ3nckGYtkkItYpZxYw\nruUmpLSy7K41Sb7YJAKxbBIRKy9LF6svNyH91/RwqCbZV5s8IJZNImKZ8Ev9+ml2dUnSRfoAiGWT\nGLH4pX6UfjosOZfoCyCWTaxiCUr91F7bFmLZJeaOBThALJvEEYsfZa3v0/OhZF1okoFYNjleLH6U\nFe668XtVm/Ehlk1qxVq+fK3Oy/xl/dbtf2JQkq83WUAsm9SKVc+sdmjFL/aj15DFSb7eZAGxbPIw\n+ayqqjsrOm4wxljUoZAbZR0OH35V1fUiIZZNrtLvUmxRkAZmmsCPspp8Q5flJ+tCkwzEssmnJRUV\nFe0G0B+qTbEEUda03r1HbEvmxSYRiOWEXkPu1m5bVyT7MvwMxHJCryETyNT6vzB2BBVZCo9YhVhO\n6DWklOw33woFFVkqj1iFWE6IFquSX5Gl8ohViOUEXax6Z+qLr3URxFgK/xkaYjlBF+u8AfpykR3+\nj30n/nQKhUesQiwn6GKZj0UVWQqPWIVYTogSS1SRpfKIVYjlhOg7FogLxHICxKoTiOWEeGJZctLF\nGaEOE+gfJ+rPKNm3CrGcEEcsa056/3N0Rzod9zJ7Rs2+VYjlhG49rrOI9djzS9625KQXbjo840ra\ni/0FWtG+VYjlhFMJIftqd9/SdsdE56RHm2tfXzzYVntO1b5ViOWELzfcF33H2kBmVt0cXe63sR+l\n77Zcx+bpqNq3CrGcUWoVa5E1J114P6WL+syepj+rZt8qxHKGIdaWKp2VZJE1J72rTajHgM9GtwuF\nHlH1D4YQyxm6WPcdm5E5LdnX4zsgljN0sR4j08sZJeTP/BxL0aZViOUMXaynyPf6DvuMxcuxVG1a\nhVjOOE6sSk6OpWrTKsRyhkWst8kZmddycixVm1YhljMsYu3Mvr5oMCfHUrVpFWI5wyIWg5djqdq0\nCrGcESsWN8dStWkVYjnjV2R4YeF55LtkX4dvgVjOeLRJZmZmS/JNsq/Dt0AsF0TeCjnZaFjl6b0Q\nywWmWLxsVOnpvRDLBZpYH95TXj712vjZqNLTeyGWCzSx8gkh9XjZqMrTeyGWCzSxfnVSdfUNnGxU\n6em9EMsFmlj3nMzNRtWe3guxXPAU+esNJ3Gz0TSlp/dCLBeM1T5O1U/2RfgUiOWCHQ9VDE2jgl5V\nrOgHHMI+Y3F7VbGiH3AKE4tX44cV/YBjLqmX3rKIW+OnZoOODsRyx7ODi4uHcXIsrOgHXMHvVcWK\nfsA5/F5VrOgHgMdALCAFiAWkALGAFCAWkALEAlKAWEAKEAtIAWIBKUAsIAWIBaQAsYAUIBaQAsQC\nUoBYQAoQC0gBYgEpQCwgBYgFpACxgBQgFpACxAJSgFhAChALSAFiASlALCAFiAWkALGAFCAWkALE\nAlKAWEAKEAtIAWIBKUAsIAWIBaQAsYAUIBaQAsQCUoBYQAoQC0gBYgEpQCwgBYgFpACxgBQgFpAC\nxAJSgFhAChALSAFiASlALCAFiAWkALGAFCAWkALEAlKAWEAKEAtIAWIBKUAsIAWIBaQAsYAUIBaQ\nAsQCUoBYQAoQC0gBYgEpQCwgBYgFpACxgBQgFpACxAJSgFhAChALSAFiASlALCAFiAWkALGAFCAW\nkALEAlKAWEAKEAtIAWIBKUAsIAWIBaQAsYAUIBaQAsQCUoBYQAoQC0gBYgEpQCwgBYgFpACxgBQg\nFpACxAJSgFhACv8PszbkaU4yj1oAAAAASUVORK5CYII=\n" }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%R -w 600 -h 800\n", "library(ape)\n", "tre <- read.tree(\"empirical_2/RAxML_bipartitions.empirical_2\")\n", "ltre <- ladderize(tre)\n", "plot(ltre, cex=0.8, edge.width=2)\n", "#nodelabels(ltre$node.label)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get average phylo distances (GTRgamma distance)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[1] 0.06260378\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%R\n", "mean(cophenetic.phylo(ltre))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }