---
title: "TreeSearch results"
author: "Martin R. Smith"
---

# Corrected parsimony {#treesearch}

The phylogenetic dataset contains a considerable proportion of inapplicable codings (`r sum(my_chars == '-')`, = `r round(100 * sum(my_chars == '-') / sum(my_chars != '?'), 1)`% of `r sum(my_chars != '?')` non-ambiguous tokens;  `r round(sum(my_chars == '-') / length(my_chars), 3) * 100`% of `r length(my_chars)` total cells), which are known to introduce error and bias to phylogenetic reconstruction when the Fitch algorithm is employed [@Maddison1993;@Brazeau2018].  As such, we used the R package _TreeSearch_ v0.1.2 [@Smith2018TreeSearch] to conduct phylogenetic tree search with a tree-scoring algorithm that avoids logically impossible character transformations when handling inapplicable data [@Brazeau2018], implemented in the _MorphyLib_ C library [@Brazeau2017Morphylib].

## Search parameters

Heuristic searches were conducted using the parsimony ratchet [@Nixon1999] under equal and implied weights [@Goloboff1997].  The consensus tree presented in the main manuscript represents a strict consensus of all trees that are most parsimonious under one or more of the concavity constants (_k_) `r paste(kValues[-length(kValues)], collapse=', ')` and `r kValues[length(kValues)]`, an approach that has been shown to produce higher accuracy (i.e. more nodes and quartets resolved correctly) than equal weights at any given level of precision [@Smith2019].

## Analysis 

The R commands used to conduct the analysis are reproduced below.
The results can most readily be replicated using the [R markdown files](https://github.com/ms609/hyoliths/) used to generate these pages:
in Rstudio, run `r GitLink("index.Rmd", "index.Rmd")`, then run each block in 
`r GitLink("13_TreeSearch.Rmd", "TreeSearch.Rmd")`.  The complete analysis will take several hours.

### Initialize and load data

```{r treesearch-load-morphoBank, echo=TRUE, eval=FALSE}
# Load data from locally downloaded copy of MorphoBank matrix
my_data <- ReadAsPhyDat(nexusFile)
my_data[ignoredTaxa] <- NULL
iw_data <- PrepareDataIW(my_data)
```

### Generate starting tree

Start by quickly rearranging a neighbour-joining tree, rooted on the outgroup.

```{R treesearch-starting-tree, echo=TRUE}
nj.tree <- NJTree(my_data)
rooted.tree <- EnforceOutgroup(nj.tree, outgroup)
start.tree <- TreeSearch(tree=rooted.tree, dataset=my_data, maxIter=3000,
                         EdgeSwapper=RootedNNISwap, verbosity=0)
```

### Implied weights analysis

The position of the root does not affect tree score, so we keep it fixed (using `RootedXXXSwap` functions) to avoid unnecessary swaps.

```{r treesearch-implied-weights-analysis, echo=TRUE, eval=FALSE}
for (k in kValues) {
  iw.tree <- IWRatchet(start.tree, iw_data, concavity=k,
                       ratchHits = 20, ratchIter=4000, searchHits=56,
                       swappers=list(RootedTBRSwap, RootedSPRSwap, RootedNNISwap),
                       verbosity=0L)
  score <- IWScore(iw.tree, iw_data, concavity=k)
  # Write a single best tree
  write.nexus(iw.tree,
              file=paste0("TreeSearch/hk_iw_k", k, "_", 
                          signif(score, 5), ".nex", collapse=''))

  iw.consensus <- IWRatchetConsensus(iw.tree, iw_data, concavity=k,
                  swappers=list(RootedTBRSwap, RootedNNISwap),
                  searchHits=55, searchIter=4000, nSearch=250, verbosity=0L)
  write.nexus(iw.consensus,
              file=paste0("TreeSearch/hk_iw_k", k, "_", 
                          signif(IWScore(iw.consensus[[1]], iw_data, concavity=k), 5),
                          ".all.nex", collapse=''))
}
```


```{R treesearch-deeper-iw, echo=FALSE, eval=FALSE}
# This block provides a crude check that each run has found an optimum;
# if not it continues searching, and you may consider increasing the 
# ratchHits, ratchIter or searchHits parameters above.
for (k in kValues) {
  source('loadTrees.R')
  
  k.scores <- unlist(lapply(iw.exist, function (x) IWScore(x[[1]], iw_data, k)))
  ew.scores <- vapply(ew.trees, IWScore, double(1), iw_data, k)
  this.k <- kValues[iw.treesLoaded] == k
  if (!any(this.k)) next
  my.score <- k.scores[this.k]
  ew.best <- min(ew.scores)
  if (ew.best < min(k.scores)) {
    start.tree <- ew.trees[which.min(ew.scores)]
  } else if (k.scores[this.k] > min(k.scores)) {
    start.tree <- iw.trees[[which.min(k.scores)]][[1]]
  } else {
    cat("\n\n * The best tree at k =", k, "is better than any other optimal tree.")
    next
  }
  
  attr(start.tree, 'score') <- NULL
  cat("\n\n * Did not find global optimum for k =", k, "; continuing search.\n   ")
  iw.tree <- IWRatchet(start.tree, iw_data, concavity=k,
                       ratchHits = 30, ratchIter=4000, searchHits=56,
                       swappers=list(RootedTBRSwap, RootedSPRSwap, RootedNNISwap),
                       verbosity=1L)
  score <- IWScore(iw.tree, iw_data, concavity=k)
  # Write a single best tree
  write.nexus(iw.tree,
              file=paste0("TreeSearch/hk_iw_k", k, "_", 
                          signif(score, 5), ".nex", collapse=''))

  iw.consensus <- IWRatchetConsensus(iw.tree, iw_data, concavity=k,
                  swappers=list(RootedTBRSwap, RootedNNISwap),
                  searchHits=55, nSearch=250, searchIter=4000, verbosity=1L)
  write.nexus(iw.consensus,
              file=paste0("TreeSearch/hk_iw_k", k, "_", 
                          signif(IWScore(iw.consensus[[1]], iw_data, concavity=k), 5),
                          ".all.nex", collapse=''))
}
```

### Equal weights analysis

```{r treesearch-equal-weights-analysis, echo=TRUE, eval=FALSE}
ew.tree <- Ratchet(start.tree, my_data, verbosity=0L,
                   ratchHits = 20, ratchIter=4000, searchHits=55,
                   swappers=list(RootedTBRSwap, RootedSPRSwap, RootedNNISwap))
ew.consensus <- RatchetConsensus(ew.tree, my_data, nSearch=250, searchHits = 85,
                                 swappers=list(RootedTBRSwap, RootedNNISwap),
                                 verbosity=0L)
write.nexus(ew.consensus, file=paste0(collapse='', "TreeSearch/hk_ew_",
                                      Fitch(ew.tree, my_data), ".nex"))
```

## Results

Optimal trees can be downloaded in Nexus format from `r GitLink('TreeSearch', raw=FALSE)`.

```{R treesearch-load-trees, echo=FALSE}
source('loadTrees.R')
tipIndex <- sort(allTrees[[1]]$tip.label)
```

(ref:ts-all-cons) Consensus of all parsimony trees, under equal and implied weights.  

(ref:node-support) Node labels denote the frequency of each clade in most parsimonious trees under all analytical conditions.  

```{r treesearch-maj-consensus, fig.cap="(ref:ts-all-cons) (ref:node-support)", fig.width=6, fig.height=5.1, echo=FALSE}
majCon <- RootTree(consensus(allTrees, p=0.5), rootingTips)
supporters <- SplitFrequency(majCon, allTrees)

ColPlot(majCon)
nodelabels(paste0("\n\n", signif(supporters / length(allTrees), 2)),
           col=SupportColour(round(supporters / length(allTrees), 2)),
           adj=0, pos=2, frame='none', cex=0.75)

```

(ref:ts-trim-cons) Consensus of same trees, with taxa pruned before constructing consensus to give context to clade support.  

```{r treesearch-maj-consensus-pruned, fig.cap="(ref:ts-trim-cons) (ref:node-support)", fig.width=6, fig.height=5.1, echo=FALSE}
#omit <- c("Micrina", 'Mickwitzia_muralensis', 'Heliomedusa_orienta')
omit <- c("Cotyledion_tylodes", "Namacalathus", "Phoronis")
allPruned <- lapply(allTrees, drop.tip, omit)
majCon <- RootTree(consensus(allPruned, p=0.5), rootingTips)

####
supporters <- SplitFrequency(majCon, allPruned)
####
ColPlot(majCon)
nodelabels(paste0("\n", signif(supporters / length(allTrees), 2)),
           col=SupportColour(round(supporters / length(allTrees), 2)),
           adj=0, pos=2, frame='none', cex=0.75)
ColMissing(omit)

```

(ref:ts-iw-indiv) Strict consensus trees of implied weights analyses

\clearpage 

```{r treesearch-iw-indiv-1, fig.width=7.1, fig.height=5.1, fig.cap="(ref:ts-iw-indiv) (ref:first-panels)", echo=FALSE}
# Plot consensus results
par(mfrow=c(1, 2), mar=rep(0.2, 4))
PlotPanel(iw.trees, 1)
PlotPanel(iw.trees, 2)
```

\clearpage 

```{R treesearch-iw-indiv-2, fig.width=7.1, fig.height=5.1, fig.cap="(ref:ts-iw-indiv) (ref:second-panels)", echo=FALSE}
par(mfrow=c(1, 2), mar=rep(0.2, 4))
PlotPanel(iw.trees, 3)
PlotPanel(iw.trees, 4)
```

\clearpage 

```{R treesearch-iw-indiv-3, fig.width=7.1, fig.height=5.1, fig.cap="(ref:ts-iw-indiv) (ref:third-panels)", echo=FALSE}
par(mfrow=c(1, 2), mar=rep(0.2, 4))
PlotPanel(iw.trees, 5)
PlotPanel(iw.trees, 6)
```

\clearpage

```{r treesearch-equal-weights-results, fig.cap="Strict consensus of most parsimonious trees under equally weighted parsimony", fig.width=6, fig.height=7.1, echo=FALSE}
ew.trees <- lapply(ew.trees, drop.tip, 'Siphogonuchites_multa')
ColPlot(RootTree(consensus(ew.trees), rootingTips))
```


```{R treesearch-equal-weights-pruned-consensus, fig.cap="Strict consensus of equal weights results, taxa excluded", fig.width=6, fig.height=5.1, echo=FALSE, eval=FALSE}
omit <- c("Cotyledion_tylodes", "Micrina")
ColPlot(RootTree(ConsensusWithout(ew.trees, omit), rootingTips))
ColMissing(omit)
```

\clearpage