{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

ipyrad-analysis toolkit: mrbayes

\n", "\n", "In these analyses our interest is primarily in inferring accurate branch lengths under a relaxed molecular clock model. This means that tips are forced to line up at the present (time) but that rates of substitutions are allowed to vary among branches to best explain the variation in the sequence data. \n", "\n", "There is a huge range of models that can be employed using mrbayes by employing different combinations of parameter settings, model definitions, and prior settings. The ipyrad-analysis tool here is intended to make it easy to run such jobs many times (e.g., distributed in parallel) once you have decided on your settings. In addition, we provide a number of pre-set models (e.g., clock_model=2) that may be useful for simple scenarios. \n", "\n", "Here we use simulations to demonstrate the accuracy of branch length estimation when sequences come from a single versus multiple distinct genealogies (e.g., gene tree vs species tree), and show an option to fix the topology to speed up analyses when your only interest is to estimate branch lengths. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# conda install ipyrad -c conda-forge -c bioconda\n", "# conda install mrbayes -c conda-forge -c bioconda\n", "# conda install ipcoal -c conda-forge" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import toytree\n", "import ipcoal\n", "import ipyrad.analysis as ipa" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Simulate a gene tree with 14 tips and MRCA of 1M generations" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
r0r1r2r3r4r5r6r705000001000000
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "TREE = toytree.rtree.bdtree(ntips=8, b=0.8, d=0.2, seed=123)\n", "TREE = TREE.mod.node_scale_root_height(1e6)\n", "TREE.draw(ts='o', layout='d', scalebar=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Simulate sequences on single gene tree and write to NEXUS\n", "When Ne is greater the gene tree is more likely to deviate from the species tree topology and branch lengths. By setting recombination rate to 0 there will only be one true underlying genealogy for the gene tree. We set nsamples=2 because we want to simulate diploid individuals. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "wrote concat locus (8 x 20000bp) to /tmp/mbtest-1.nex\n" ] } ], "source": [ "# init simulator\n", "model = ipcoal.Model(TREE, Ne=2e4, nsamples=2, recomb=0)\n", "\n", "# simulate sequence data on coalescent genealogies\n", "model.sim_loci(nloci=1, nsites=20000)\n", "\n", "# write results to database file\n", "model.write_concat_to_nexus(name=\"mbtest-1\", outdir='/tmp', diploid=True)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
r2-1r2-0r1-1r1-0r0-1r0-0r3-1r3-0r4-1r4-0r5-1r5-0r6-1r6-0r7-1r7-005162721032543
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# the simulated genealogy of haploid alleles\n", "gene = model.df.genealogy[0]\n", "\n", "# draw the genealogy\n", "toytree.tree(gene).draw(ts='o', layout='d', scalebar=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### View an example locus\n", "This shows the 2 haploid samples simulated for each tip in the species tree." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
r0-0r0-1r1-0r1-1r2-0r2-1r3-0r3-1r4-0r4-1r5-0r5-1r6-0r6-1r7-0r7-1
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "model.draw_seqview(idx=0, start=0, end=50);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (1) Infer a tree under a relaxed molecular clock model" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "brlenspr clock:uniform \n", "clockratepr normal(0.01,0.005) \n", "clockvarpr igr \n", "igrvarpr exp(10.0) \n", "nchains 4 \n", "ngen 1000000 \n", "nruns 2 \n", "samplefreq 5000 \n", "topologypr fixed(fixedtree) \n", "\n" ] } ], "source": [ "# init the mb object\n", "mb = ipa.mrbayes(\n", " data=\"/tmp/mbtest-1.nex\",\n", " name=\"itest-1\",\n", " workdir=\"/tmp\",\n", " clock_model=2,\n", " constraints=TREE,\n", " ngen=int(1e6),\n", " nruns=2,\n", ")\n", "\n", "# modify a parameter\n", "mb.params.clockratepr = \"normal(0.01,0.005)\"\n", "mb.params.samplefreq = 5000\n", "\n", "# summary of priors/params\n", "print(mb.params)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "job itest-1 finished successfully\n" ] } ], "source": [ "# start the run\n", "mb.run(force=True)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
r1r0r2r3r4r5r7r6r0r1r2r3r4r5r6r705000001000000
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# load the inferred tree\n", "mbtre = toytree.tree(\"/tmp/itest-1.nex.con.tre\", 10)\n", "\n", "# scale root node to 1e6\n", "mbtre = mbtre.mod.node_scale_root_height(1e6)\n", "\n", "# draw inferred tree \n", "c, a, m = mbtre.draw(ts='o', layout='d', scalebar=True);\n", "\n", "# draw TRUE tree in orange on the same axes\n", "TREE.draw(\n", " axes=a, \n", " ts='o', \n", " layout='d', \n", " scalebar=True, \n", " edge_colors=\"darkorange\",\n", " node_sizes=0,\n", " fixed_order=mbtre.get_tip_labels(),\n", ");" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
MeanVarianceLowerUpperMedianminESSavgESSPSRF
Parameter
TH1.057e-021.270e-068.548e-031.280e-021.058e-0296.759123.8800.997
TL3.975e-021.433e-053.405e-024.663e-023.930e-0293.967122.4840.997
r(A<->C)1.711e-011.907e-041.438e-011.966e-011.694e-0146.51698.7580.998
r(A<->G)1.618e-011.587e-041.335e-011.835e-011.633e-0182.391116.6950.999
r(A<->T)1.646e-011.955e-041.436e-011.969e-011.626e-0121.96986.4841.022
r(C<->G)1.522e-011.635e-041.286e-011.748e-011.528e-017.77579.3881.012
r(C<->T)1.920e-011.813e-041.660e-012.139e-011.928e-0121.51773.6231.023
r(G<->T)1.582e-011.961e-041.355e-011.874e-011.568e-0123.11287.0561.012
pi(A)2.473e-016.721e-062.422e-012.526e-012.477e-01137.517144.2590.998
pi(C)2.490e-019.690e-062.448e-012.545e-012.491e-018.14779.5731.083
pi(G)2.538e-011.512e-052.473e-012.600e-012.536e-015.87347.9141.175
pi(T)2.499e-018.684e-062.456e-012.562e-012.495e-0116.04954.0031.030
alpha1.736e+009.001e-013.765e-013.552e+001.649e+0068.018109.5090.997
igrvar2.744e-041.582e-071.800e-068.916e-041.563e-0485.32998.0340.997
clockrate1.036e-022.030e-053.860e-031.825e-029.978e-038.98428.5291.056
\n", "
" ], "text/plain": [ " Mean Variance Lower Upper Median minESS \\\n", "Parameter \n", "TH 1.057e-02 1.270e-06 8.548e-03 1.280e-02 1.058e-02 96.759 \n", "TL 3.975e-02 1.433e-05 3.405e-02 4.663e-02 3.930e-02 93.967 \n", "r(A<->C) 1.711e-01 1.907e-04 1.438e-01 1.966e-01 1.694e-01 46.516 \n", "r(A<->G) 1.618e-01 1.587e-04 1.335e-01 1.835e-01 1.633e-01 82.391 \n", "r(A<->T) 1.646e-01 1.955e-04 1.436e-01 1.969e-01 1.626e-01 21.969 \n", "r(C<->G) 1.522e-01 1.635e-04 1.286e-01 1.748e-01 1.528e-01 7.775 \n", "r(C<->T) 1.920e-01 1.813e-04 1.660e-01 2.139e-01 1.928e-01 21.517 \n", "r(G<->T) 1.582e-01 1.961e-04 1.355e-01 1.874e-01 1.568e-01 23.112 \n", "pi(A) 2.473e-01 6.721e-06 2.422e-01 2.526e-01 2.477e-01 137.517 \n", "pi(C) 2.490e-01 9.690e-06 2.448e-01 2.545e-01 2.491e-01 8.147 \n", "pi(G) 2.538e-01 1.512e-05 2.473e-01 2.600e-01 2.536e-01 5.873 \n", "pi(T) 2.499e-01 8.684e-06 2.456e-01 2.562e-01 2.495e-01 16.049 \n", "alpha 1.736e+00 9.001e-01 3.765e-01 3.552e+00 1.649e+00 68.018 \n", "igrvar 2.744e-04 1.582e-07 1.800e-06 8.916e-04 1.563e-04 85.329 \n", "clockrate 1.036e-02 2.030e-05 3.860e-03 1.825e-02 9.978e-03 8.984 \n", "\n", " avgESS PSRF \n", "Parameter \n", "TH 123.880 0.997 \n", "TL 122.484 0.997 \n", "r(A<->C) 98.758 0.998 \n", "r(A<->G) 116.695 0.999 \n", "r(A<->T) 86.484 1.022 \n", "r(C<->G) 79.388 1.012 \n", "r(C<->T) 73.623 1.023 \n", "r(G<->T) 87.056 1.012 \n", "pi(A) 144.259 0.998 \n", "pi(C) 79.573 1.083 \n", "pi(G) 47.914 1.175 \n", "pi(T) 54.003 1.030 \n", "alpha 109.509 0.997 \n", "igrvar 98.034 0.997 \n", "clockrate 28.529 1.056 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# check convergence statistics\n", "mb.convergence_stats" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (2) Concatenated sequences from a species tree\n", "Here we use concatenated sequence data from 100 loci where each represents one or more distinct genealogies. In addition, Ne is increased to 1e5, allowing for more genealogical variation. *We expect the accuracy of estimated edge lengths will decrease since we are not adequately modeling the genealogical variation when using concatenation.* Here I set the recombination rate *within* loci to be zero. There is free recombination among loci, however, since they are unlinked." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "wrote concat locus (8 x 20000bp) to /tmp/mbtest-2.nex\n" ] } ], "source": [ "# init simulator\n", "model = ipcoal.Model(TREE, Ne=1e5, nsamples=2, recomb=0)\n", "\n", "# simulate sequence data on coalescent genealogies\n", "model.sim_loci(nloci=100, nsites=200)\n", "\n", "# write results to database file\n", "model.write_concat_to_nexus(name=\"mbtest-2\", outdir='/tmp', diploid=True)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
r2-1r2-0r0-1r1-1r1-0r0-0r3-1r3-0r4-1r4-0r5-1r5-0r6-1r6-0r7-1r7-0r3-1r3-0r4-1r4-0r5-1r5-0r1-1r1-0r0-1r2-0r2-1r0-0r7-1r7-0r6-1r6-0r1-0r0-0r0-1r1-1r2-1r2-0r3-1r3-0r5-1r5-0r4-1r4-0r6-1r6-0r7-1r7-0r4-1r4-0r3-1r3-0r1-1r1-0r0-1r5-1r5-0r2-1r0-0r2-0r6-1r6-0r7-1r7-0
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# the simulated genealogies of haploid alleles\n", "genes = model.df.genealogy[:4]\n", "\n", "# draw the genealogies of the first four loci\n", "toytree.mtree(genes).draw(ts='o', layout='r', height=250);" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "brlenspr clock:uniform \n", "clockratepr normal(0.01,0.005) \n", "clockvarpr igr \n", "igrvarpr exp(10.0) \n", "nchains 4 \n", "ngen 1000000 \n", "nruns 2 \n", "samplefreq 1000 \n", "topologypr fixed(fixedtree) \n", "\n", "job itest-2 finished successfully\n" ] } ], "source": [ "# init the mb object\n", "mb = ipa.mrbayes(\n", " data=\"/tmp/mbtest-2.nex\",\n", " workdir=\"/tmp\",\n", " name=\"itest-2\",\n", " clock_model=2,\n", " constraints=TREE,\n", " ngen=int(1e6),\n", " nruns=2,\n", ")\n", "\n", "# summary of priors/params\n", "print(mb.params)\n", "\n", "# start the run\n", "mb.run(force=True)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
r1r0r2r3r4r5r7r6r0r1r2r3r4r5r6r705000001000000
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# load the inferred tree\n", "mbtre = toytree.tree(\"/tmp/itest-2.nex.con.tre\", 10)\n", "\n", "# scale root node from unitless to 1e6\n", "mbtre = mbtre.mod.node_scale_root_height(1e6)\n", "\n", "# draw inferred tree \n", "c, a, m = mbtre.draw(ts='o', layout='d', scalebar=True);\n", "\n", "# draw true tree in orange on the same axes\n", "TREE.draw(\n", " axes=a, \n", " ts='o', \n", " layout='d', \n", " scalebar=True, \n", " edge_colors=\"darkorange\",\n", " node_sizes=0,\n", " fixed_order=mbtre.get_tip_labels(),\n", ");" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
MeanVarianceLowerUpperMedianminESSavgESSPSRF
Parameter
TH1.371e-029.286e-071.189e-021.567e-021.371e-02536.255571.2120.999
TL5.898e-021.207e-055.240e-026.553e-025.870e-02490.961540.8290.999
r(A<->C)1.496e-011.543e-041.236e-011.715e-011.498e-01510.420543.5071.000
r(A<->G)1.325e-011.460e-041.105e-011.569e-011.317e-01588.878616.8041.000
r(A<->T)2.180e-012.215e-041.871e-012.453e-012.173e-01592.980618.9631.000
r(C<->G)2.347e-012.366e-042.023e-012.616e-012.346e-01549.871584.6781.001
r(C<->T)1.403e-011.450e-041.191e-011.653e-011.403e-01543.051607.1571.001
r(G<->T)1.248e-011.276e-041.047e-011.485e-011.246e-01535.020643.0101.000
pi(A)2.498e-019.551e-062.432e-012.555e-012.498e-01396.019481.1910.999
pi(C)2.500e-018.635e-062.445e-012.558e-012.499e-01531.979532.0530.999
pi(G)2.502e-018.753e-062.440e-012.555e-012.503e-01525.809526.4851.000
pi(T)2.500e-018.477e-062.444e-012.556e-012.501e-01491.522495.1961.000
alpha4.149e-027.388e-046.260e-059.082e-023.843e-02650.513652.0011.000
igrvar1.155e-043.241e-081.220e-063.748e-046.512e-05331.231371.2821.000
clockrate1.196e-021.587e-054.939e-031.966e-021.190e-02133.216165.3190.999
\n", "
" ], "text/plain": [ " Mean Variance Lower Upper Median minESS \\\n", "Parameter \n", "TH 1.371e-02 9.286e-07 1.189e-02 1.567e-02 1.371e-02 536.255 \n", "TL 5.898e-02 1.207e-05 5.240e-02 6.553e-02 5.870e-02 490.961 \n", "r(A<->C) 1.496e-01 1.543e-04 1.236e-01 1.715e-01 1.498e-01 510.420 \n", "r(A<->G) 1.325e-01 1.460e-04 1.105e-01 1.569e-01 1.317e-01 588.878 \n", "r(A<->T) 2.180e-01 2.215e-04 1.871e-01 2.453e-01 2.173e-01 592.980 \n", "r(C<->G) 2.347e-01 2.366e-04 2.023e-01 2.616e-01 2.346e-01 549.871 \n", "r(C<->T) 1.403e-01 1.450e-04 1.191e-01 1.653e-01 1.403e-01 543.051 \n", "r(G<->T) 1.248e-01 1.276e-04 1.047e-01 1.485e-01 1.246e-01 535.020 \n", "pi(A) 2.498e-01 9.551e-06 2.432e-01 2.555e-01 2.498e-01 396.019 \n", "pi(C) 2.500e-01 8.635e-06 2.445e-01 2.558e-01 2.499e-01 531.979 \n", "pi(G) 2.502e-01 8.753e-06 2.440e-01 2.555e-01 2.503e-01 525.809 \n", "pi(T) 2.500e-01 8.477e-06 2.444e-01 2.556e-01 2.501e-01 491.522 \n", "alpha 4.149e-02 7.388e-04 6.260e-05 9.082e-02 3.843e-02 650.513 \n", "igrvar 1.155e-04 3.241e-08 1.220e-06 3.748e-04 6.512e-05 331.231 \n", "clockrate 1.196e-02 1.587e-05 4.939e-03 1.966e-02 1.190e-02 133.216 \n", "\n", " avgESS PSRF \n", "Parameter \n", "TH 571.212 0.999 \n", "TL 540.829 0.999 \n", "r(A<->C) 543.507 1.000 \n", "r(A<->G) 616.804 1.000 \n", "r(A<->T) 618.963 1.000 \n", "r(C<->G) 584.678 1.001 \n", "r(C<->T) 607.157 1.001 \n", "r(G<->T) 643.010 1.000 \n", "pi(A) 481.191 0.999 \n", "pi(C) 532.053 0.999 \n", "pi(G) 526.485 1.000 \n", "pi(T) 495.196 1.000 \n", "alpha 652.001 1.000 \n", "igrvar 371.282 1.000 \n", "clockrate 165.319 0.999 " ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mb.convergence_stats" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### To see the NEXUS file (data, parameters, priors):" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#NEXUS\n", "\n", "[log block]\n", "log start filename=/tmp/itest-2.nex.log replace;\n", "\n", "[data block]\n", "execute /tmp/mbtest-2.nex;\n", "\n", "[tree block]\n", "begin trees;\n", " tree fixedtree = ((r7,r6),(r5,(r4,(r3,(r2,(r1,r0))))));\n", "end;\n", "\n", "[mb block]\n", "begin mrbayes;\n", " set autoclose=yes nowarn=yes;\n", "\n", " lset nst=6 rates=gamma;\n", "\n", " prset brlenspr=clock:uniform;\n", " prset clockvarpr=igr;\n", " prset igrvarpr=exp(10.0);\n", " prset clockratepr=normal(0.01,0.005);\n", " prset topologypr=fixed(fixedtree);\n", "\n", " \n", "\n", " mcmcp ngen=1000000 nrun=2 nchains=4;\n", " mcmcp relburnin=yes burninfrac=0.25;\n", " mcmcp samplefreq=1000;\n", " mcmcp printfreq=10000 diagnfr=5000;\n", " mcmcp filename=/tmp/itest-2.nex;\n", " mcmc;\n", "\n", " sump filename=/tmp/itest-2.nex;\n", " sumt filename=/tmp/itest-2.nex contype=allcompat;\n", "end;\n", "\n", "[log block]\n", "log stop filename=/tmp/itest-2.nex.log append;\n", "\n" ] } ], "source": [ "mb.print_nexus_string()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (3) Tree inference (not fixed topology) and plotting support values\n", "Here we will try to infer the topology from a concatenated data set (i.e., not set a constraint on the topology). I increased the ngen setting since the MCMC chain takes longer to converge when searching over topology space. Take note that the support values from mrbayes newick files are available in the \"prob{percent}\" feature, as shown below. " ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "brlenspr clock:uniform \n", "clockratepr normal(0.01,0.005) \n", "clockvarpr igr \n", "igrvarpr exp(10.0) \n", "nchains 4 \n", "ngen 2000000 \n", "nruns 2 \n", "samplefreq 1000 \n", "topologypr uniform \n", "\n", "job itest-3 finished successfully\n" ] } ], "source": [ "# init the mb object\n", "mb = ipa.mrbayes(\n", " data=\"/tmp/mbtest-2.nex\",\n", " name=\"itest-3\",\n", " workdir=\"/tmp\",\n", " clock_model=2,\n", " ngen=int(2e6),\n", " nruns=2,\n", ")\n", "\n", "# summary of priors/params\n", "print(mb.params)\n", "\n", "# start run\n", "mb.run(force=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The tree topology was correctly inferred " ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
100100100100100100r1r0r2r3r4r5r7r605000001000000
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# load the inferred tree\n", "mbtre = toytree.tree(\"/tmp/itest-3.nex.con.tre\", 10)\n", "\n", "# scale root node from unitless to 1e6\n", "mbtre = mbtre.mod.node_scale_root_height(1e6)\n", "\n", "# draw inferred tree \n", "c, a, m = mbtre.draw(\n", " layout='d',\n", " scalebar=True, \n", " node_sizes=18, \n", " node_labels=\"prob{percent}\",\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The branch lengths are not very accurate in this case:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
r1r0r2r3r4r5r7r6r0r1r2r3r4r5r6r705000001000000
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# load the inferred tree\n", "mbtre = toytree.tree(\"/tmp/itest-3.nex.con.tre\", 10)\n", "\n", "# scale root node from unitless to 1e6\n", "mbtre = mbtre.mod.node_scale_root_height(1e6)\n", "\n", "# draw inferred tree \n", "c, a, m = mbtre.draw(ts='o', layout='d', scalebar=True);\n", "\n", "# draw true tree in orange on the same axes\n", "TREE.draw(\n", " axes=a, \n", " ts='o', \n", " layout='d', \n", " scalebar=True, \n", " edge_colors=\"darkorange\",\n", " node_sizes=0,\n", " fixed_order=mbtre.get_tip_labels(),\n", ");" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.8" } }, "nbformat": 4, "nbformat_minor": 4 }