---
title: Solution to week 3 homework
author: "Karl Broman"
date: "4 Feb 2015"
output: html_document
---

Problem 3 in this week's homework said, "Take the script from the last
problem in homework 2 and turn it into an R Markdown document." Here's
my solution.

The script performed an exhaustive permutation test
(with the t-statistic) and plotted the results.

I'm going to use default chunk options of `fig.width=12` (wider figure
width) and `dev="svg"` (use SVG for the figures). Normally I'd use
`include=FALSE` to hide this code chunk, but here I'll leave it in the
output.

```{r set_chunk_opts}
knitr::opts_chunk$set(fig.width=12, dev="svg")
```

Two utility functions were defined, `binary.v()` and
`perm.test()`. I'll define them here, but in a way that will be hidden
in the output (using the chunk option `echo=FALSE`).

```{r define_functions, echo=FALSE}
# Utility function
#     returns binary representation of 1:(2^n)
binary.v <-
    function(n)
{
    x <- 1:(2^n)
    mx <- max(x)
    digits <- floor(log2(mx))
    ans <- 0:(digits-1); lx <- length(x)
    x <- matrix(rep(x,rep(digits, lx)),ncol=lx)
    (x %/% 2^ans) %% 2
}

# exhaustive permutation test with the t-statistic
perm.test <-
    function(x, y, var.equal=TRUE)
{
    # number of data points
    kx <- length(x)
    ky <- length(y)
    n <- kx + ky

    # Data re-compiled
    X <- c(x,y)
    z <- rep(1:0,c(kx,ky))

    tobs <- t.test(x,y,var.equal=var.equal)$statistic

    o <- binary.v(n)  # indicator of all possible samples
    o <- o[,apply(o,2,sum)==kx]
    nc <- choose(n,kx)
    allt <- 1:nc
    for(i in 1:nc) {
        xn <- X[o[,i]==1]
        yn <- X[o[,i]==0]
        allt[i] <- t.test(xn,yn,var.equal=var.equal)$statistic
    }

    attr(allt, "tobs") <- tobs

    allt
}
```

We first create the data objects

```{r define_data}
x <- c(6.20, 5.72, 6.07, 6.75, 5.50, 6.39, 4.30, 4.96, 5.48)
y <- c(6.49, 6.52, 6.28, 8.59, 7.18, 4.92, 6.74, 7.27)
```

Here's a plot of the data, using `stripchart()`. I use `set.seed()` so
that it will appear exactly the same way every time. (The permutation
results below will also then be the same every time I compile this
document.)

```{r plot_data, fig.height=3.5}
set.seed(99693682)
stripchart(list(x=x, y=y), method="jitter", jitter=0.03,
           pch=21, bg="slateblue", las=1)
```

We call `perm.test()` to run the permutation test.

```{r run_perm_test}
permt <- perm.test(x, y)
```

We grab the observed t-statistic (which was saved as attribute)

```{r grab_observed_t}
tobs <- attr(permt, "tobs")
```

We can calculate the p-value from the permutation test, as the
proportion of t-statistics from the permutations that were greater or
equal to the observed t-statistic, in absolute value.
Note that the nominal p-value is `r round(t.test(x,y)$p.value, 3)`

```{r calc_pvalue}
pval <- mean(abs(permt) >= abs(tobs))
```

The observed t-statistic was `r round(tobs, 2)`. The p-value, for the
test of whether the two population averages were different, was
`r round(pval, 3)`.


I'll save the results to a file, but I'll hide this in the output.

```{r save_results, echo=FALSE}
save(permt, tobs, pval, file="permt_results.RData")
```

I'll plot the permutation results, with a vertical line at the
observed t-statistic.

```{r plot_permutations}
hist(permt, breaks=200, xlab="t-statistic", las=1,
     main = paste("P-value =", round(pval, 3)))
abline(v=tobs, lwd=2, col="violetred")
```

### Session info

I try to remember to end every R Markdown document with information
about the R and package versions that were used. R is distributed with
the `sessionInfo()` function (in the utils package); I prefer
`devtools::session_info()`, as the output is nicer, but I won't use it
here.

```{r session_info}
sessionInfo()
```