# Canarium GBS: Concordance analysis
### *Federman et al.*

This notebook provides all code necessary to reproduce the BUCKy concordance analyses. 

### Required software

In [1]:
## conda install ipyrad -c ipyrad
## conda install bucky -c ipyrad
## conda install mrbayes -c biobuilds

### Imports

In [7]:
import ipyrad as ip
import ipyrad.analysis as ipa
print("ipyrad v.{}".format(ip.__version__))

ipyrad v.0.7.20


### Connect to cluster

In [3]:
import ipyparallel as ipp
ipyclient = ipp.Client()
ip.cluster_info(ipyclient)

host compute node: [40 cores] on sacra


## Concordance factors
One sample was selected from each major clade that was recovered in every phylogenetic analysis (see notebook 2). Within each clade, the sample chosen was the one with the most data available that did not appear to be admixed.
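
One way to rank samples by the amount of data available is to count the loci in which each sample appears. The sketch below is illustrative only: it runs on a toy in-memory string with hypothetical sample names, it assumes the usual ipyrad `.loci` layout (one sequence line per allele, loci separated by a line starting with `//`), and the helper `loci_per_sample` is not part of ipyrad. It is not necessarily how the samples above were chosen.

```python
from collections import Counter

# Toy stand-in for a .alleles.loci file (hypothetical names and sequences).
toy_loci = """\
sampleA_0     ACGTACGTAA
sampleA_1     ACGTACGTAA
sampleB_0     ACGTACGTAT
sampleB_1     ACGTACGTAT
//                     *|0|
sampleA_0     TTGACCGTAA
sampleA_1     TTGACCGTAA
//                      |1|
"""

def loci_per_sample(lines):
    """Count the number of loci in which each sample is present."""
    counts = Counter()
    seen = set()
    for line in lines:
        if line.startswith("//"):
            # separator line: the current locus is complete
            counts.update(seen)
            seen = set()
        elif line.strip():
            name = line.split()[0]
            # assumption: allele lines carry a "_0" / "_1" suffix,
            # so both alleles collapse to a single sample name
            seen.add(name.rsplit("_", 1)[0])
    return counts

counts = loci_per_sample(toy_loci.splitlines())
for name, n in counts.most_common():
    print("{}\t{}".format(name, n))
```

On a real file the same function could be called on an open file handle instead of `toy_loci.splitlines()`.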

In [10]:
bucky_samples = [
 "D14269", ## outgroup o
 "SF328", ## betamponae 1A
 "D13052", ## pilicarpum 1B
 "SF276", ## pulchebracteatum 1C 
 "D14483", ## multinervis 2A
 "D14505", ## velutinifolium 2B
 "D14478", ## multiflorum 2C
 "D12950", ## longistipulatum 3A
 "SF224", ## obtusifolium 3B
 "D13097", ## compressum 3C
 ]

In [11]:
## initiate a bucky object
b = ipa.bucky(
 name="ten-clades",
 data="analysis-ipyrad/Canarium-min10_outfiles/Canarium-min10.alleles.loci",
 workdir="analysis-bucky",
 samples=bucky_samples,
 minsnps=2,
 seed=12345,
 mb_mcmc_burnin=1000000,
 mb_mcmc_ngen=4000000,
 mb_mcmc_sample_freq=4000,
 bucky_alpha=[0.1, 1.0, 10.0],
 bucky_nchains=4,
 bucky_nreps=4,
 bucky_niter=int(1e6),
)

In [12]:
## run full analysis [write-nex, mb, mbsum, bucky]
b.run(ipyclient=ipyclient, force=True)

wrote 750 nexus files to ~/Documents/Canarium/analysis-bucky/ten-clades
[####################] 100% [mb] infer gene-tree posteriors | 2:13:22 | 
[####################] 100% [mbsum] sum replicate runs | 0:00:01 | 
[####################] 100% [bucky] infer CF posteriors | 3:16:00 | 


## BUCKy results 

In [55]:
## figures were made by hand by parsing the results in
## the "Splits in the Primary Concordance Tree" section.
! head -n 50 ./analysis-bucky/ten-clades/CF-a1.0.concordance

translate
 1 D14269,
 2 SF224,
 3 D12950,
 4 SF328,
 5 D13097,
 6 D13052,
 7 D14483,
 8 SF276,
 9 D14478,
 10 D14505;

Population Tree:
((((1,(4,(6,8))),((2,5),3)),7),9,10);

Primary Concordance Tree Topology:
((((1,(4,(6,8))),((2,5),3)),7),9,10);

Population Tree, With Branch Lengths In Estimated Coalescent Units:
((((1:10.000,(4:10.000,(6:10.000,8:10.000):0.617):1.512):0.462,((2:10.000,5:10.000):0.161,3:10.000):0.567):0.394,7:10.000):0.103,9:10.000,10:10.000);

Primary Concordance Tree with Sample Concordance Factors:
((((1:1.000,(4:1.000,(6:1.000,8:1.000):0.587):0.771):0.446,((2:1.000,5:1.000):0.375,3:1.000):0.392):0.338,7:1.000):0.305,9:1.000,10:1.000);

Four-way partitions in the Population Tree: sample-wide CF, coalescent units and Ties(if present)
{1,4,6,7,8,9,10; 3|2; 5}	0.432, 0.161, 
{1,2,3,5,7,9,10; 4|6; 8}	0.640, 0.617, 
{1; 2,3,5,7,9,10|4; 6,8}	0.853, 1.512, 
{1,2,3,4,5,6,8; 7|9; 10}	0.399, 0.103, 
{1,4,6,8; 7,9,10|2,5; 3}	0.622, 0.567, 
{1; 4
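
The two numbers on each four-way partition line (sample-wide CF and coalescent units) appear to be linked by the standard quartet relation under the multispecies coalescent, CF = 1 - (2/3)·exp(-t). The sketch below checks this against the five complete rows printed above for alpha = 1.0. The function names are hypothetical helpers, not part of BUCKy or ipyrad, and the comparison is only as precise as the three decimals in the output.

```python
import math

def coalescent_units_to_cf(t):
    """Expected quartet CF for an internal branch of length t (coalescent units)."""
    return 1.0 - (2.0 / 3.0) * math.exp(-t)

def cf_to_coalescent_units(cf):
    """Invert the relation above; CFs at or below 1/3 imply a zero-length branch."""
    if cf <= 1.0 / 3.0:
        return 0.0
    if cf >= 1.0:
        return float("inf")
    return -math.log(1.5 * (1.0 - cf))

# (sample-wide CF, coalescent units) copied from the alpha = 1.0 output above
rows = [
    (0.432, 0.161),
    (0.640, 0.617),
    (0.853, 1.512),
    (0.399, 0.103),
    (0.622, 0.567),
]

for cf, t in rows:
    print("reported CF={:.3f} t={:.3f} | CF from t={:.3f} | t from CF={:.3f}".format(
        cf, t, coalescent_units_to_cf(t), cf_to_coalescent_units(cf)))
```

These quartet-based values belong to the population tree and differ from the split-level sample CFs shown on the primary concordance tree.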

In [5]:
## figures were made by hand by parsing the results in
## the "Splits in the Primary Concordance Tree" section.
! head -n 50 ./analysis-bucky/ten-clades/CF-a0.1.concordance

translate
 1 D14269,
 2 SF224,
 3 D12950,
 4 SF328,
 5 D13097,
 6 D13052,
 7 D14483,
 8 SF276,
 9 D14478,
 10 D14505;

Population Tree:
((((1,(4,(6,8))),((2,5),3)),7),9,10);

Primary Concordance Tree Topology:
((((1,(4,(6,8))),((2,5),3)),7),9,10);

Population Tree, With Branch Lengths In Estimated Coalescent Units:
((((1:10.000,(4:10.000,(6:10.000,8:10.000):0.495):1.514):0.438,((2:10.000,5:10.000):0.170,3:10.000):0.546):0.395,7:10.000):0.140,9:10.000,10:10.000);

Primary Concordance Tree with Sample Concordance Factors:
((((1:1.000,(4:1.000,(6:1.000,8:1.000):0.540):0.775):0.445,((2:1.000,5:1.000):0.372,3:1.000):0.394):0.329,7:1.000):0.314,9:1.000,10:1.000);

Four-way partitions in the Population Tree: sample-wide CF, coalescent units and Ties(if present)
{1,4,6,7,8,9,10; 3|2; 5}	0.438, 0.170, 
{1,2,3,5,7,9,10; 4|6; 8}	0.594, 0.495, 
{1; 2,3,5,7,9,10|4; 6,8}	0.853, 1.514, 
{1,2,3,4,5,6,8; 7|9; 10}	0.420, 0.140, 
{1,4,6,8; 7,9,10|2,5; 3}	0.614, 0.546, 
{1; 4

In [6]:
## figures were made by hand by parsing the results in
## the "Splits in the Primary Concordance Tree" section.
! head -n 50 ./analysis-bucky/ten-clades/CF-a10.0.concordance

translate
 1 D14269,
 2 SF224,
 3 D12950,
 4 SF328,
 5 D13097,
 6 D13052,
 7 D14483,
 8 SF276,
 9 D14478,
 10 D14505;

Population Tree:
((((1,(4,(6,8))),((2,5),3)),7),9,10);

Primary Concordance Tree Topology:
((((1,(4,(6,8))),((2,5),3)),7),9,10);

Population Tree, With Branch Lengths In Estimated Coalescent Units:
((((1:10.000,(4:10.000,(6:10.000,8:10.000):0.472):1.619):0.438,((2:10.000,5:10.000):0.200,3:10.000):0.552):0.464,7:10.000):0.060,9:10.000,10:10.000);

Primary Concordance Tree with Sample Concordance Factors:
((((1:1.000,(4:1.000,(6:1.000,8:1.000):0.545):0.796):0.443,((2:1.000,5:1.000):0.398,3:1.000):0.415):0.375,7:1.000):0.280,9:1.000,10:1.000);

Four-way partitions in the Population Tree: sample-wide CF, coalescent units and Ties(if present)
{1,4,6,7,8,9,10; 3|2; 5}	0.454, 0.200, 
{1,2,3,5,7,9,10; 4|6; 8}	0.584, 0.472, 
{1; 2,3,5,7,9,10|4; 6,8}	0.868, 1.619, 
{1,2,3,4,5,6,8; 7|9; 10}	0.372, 0.060, 
{1,4,6,8; 7,9,10|2,5; 3}	0.616, 0.552, 
{1; 4
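
The split-level concordance factors can also be compared across the three alpha values programmatically rather than by eye. The following is a minimal sketch that assumes the translate table and the "Primary Concordance Tree with Sample Concordance Factors" strings are copied verbatim from the three outputs above. The helpers `clade_cfs` and `ingroup_side` are hypothetical and not part of ipyrad or BUCKy. Because the tree is unrooted, each split is reported as the side that excludes the outgroup (taxon 1, D14269).

```python
import re

translate = {1: "D14269", 2: "SF224", 3: "D12950", 4: "SF328", 5: "D13097",
             6: "D13052", 7: "D14483", 8: "SF276", 9: "D14478", 10: "D14505"}

# keyed by alpha; strings copied from the outputs above
cf_trees = {
    "0.1": "((((1:1.000,(4:1.000,(6:1.000,8:1.000):0.540):0.775):0.445,((2:1.000,5:1.000):0.372,3:1.000):0.394):0.329,7:1.000):0.314,9:1.000,10:1.000);",
    "1.0": "((((1:1.000,(4:1.000,(6:1.000,8:1.000):0.587):0.771):0.446,((2:1.000,5:1.000):0.375,3:1.000):0.392):0.338,7:1.000):0.305,9:1.000,10:1.000);",
    "10.0": "((((1:1.000,(4:1.000,(6:1.000,8:1.000):0.545):0.796):0.443,((2:1.000,5:1.000):0.398,3:1.000):0.415):0.375,7:1.000):0.280,9:1.000,10:1.000);",
}

def clade_cfs(newick):
    """Return {frozenset(tip ids): CF} for every labelled internal clade."""
    tokens = re.findall(r"\(|\)|,|;|:[0-9.]+|[^():,;]+", newick)
    stack, result, last_closed = [], {}, None
    for tok in tokens:
        if tok == "(":
            stack.append([])
            last_closed = None
        elif tok == ")":
            tips = stack.pop()
            if stack:
                stack[-1].extend(tips)  # pass tips up to the enclosing clade
            last_closed = frozenset(tips)
        elif tok.startswith(":"):
            # a value right after ")" labels that clade; tip values are skipped
            if last_closed is not None:
                result[last_closed] = float(tok[1:])
            last_closed = None
        elif tok in ",;":
            last_closed = None
        else:
            stack[-1].append(int(tok))
            last_closed = None
    return result

all_tips = frozenset(translate)

def ingroup_side(clade, outgroup=1):
    """Report a split by the side that does not contain the outgroup."""
    return clade if outgroup not in clade else all_tips - clade

# split -> {alpha: CF}
table = {}
for alpha, newick in cf_trees.items():
    for clade, cf in clade_cfs(newick).items():
        table.setdefault(ingroup_side(clade), {})[alpha] = cf

for split in sorted(table, key=lambda s: (len(s), sorted(s))):
    names = ",".join(translate[i] for i in sorted(split))
    row = "  ".join("a{}={:.3f}".format(a, table[split][a])
                    for a in ("0.1", "1.0", "10.0"))
    print("{:<45s} {}".format(names, row))
```

Keying the table on the ingroup side of each split keeps the rows comparable even if a different run writes the same unrooted tree with a different arrangement of the Newick string.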