---
title: Solution to week 3 homework
author: "Karl Broman"
date: "4 Feb 2015"
output: html_document
---
Problem 3 in this week's homework said, "Take the script from the last
problem in homework 2 and turn it into an R Markdown document." Here's
my solution.
The script performed an exhaustive permutation test
(with the t-statistic) and plotted the results.
I'm going to use default chunk options of `fig.width=12` (wider figure
width) and `dev="svg"` (use SVG for the figures). Normally I'd use
`include=FALSE` to hide this code chunk, but here I'll leave it in the
output.
```{r set_chunk_opts}
knitr::opts_chunk$set(fig.width=12, dev="svg")
```
Two utility functions were defined, `binary.v()` and
`perm.test()`. I'll define them here, but in a way that will be hidden
in the output (using the chunk option `echo=FALSE`).
```{r define_functions, echo=FALSE}
# Utility function
# returns binary representation of 1:(2^n)
binary.v <-
function(n)
{
x <- 1:(2^n)
mx <- max(x)
digits <- floor(log2(mx))
ans <- 0:(digits-1); lx <- length(x)
x <- matrix(rep(x,rep(digits, lx)),ncol=lx)
(x %/% 2^ans) %% 2
}
# exhaustive permutation test with the t-statistic
perm.test <-
function(x, y, var.equal=TRUE)
{
# number of data points
kx <- length(x)
ky <- length(y)
n <- kx + ky
# Data re-compiled
X <- c(x,y)
z <- rep(1:0,c(kx,ky))
tobs <- t.test(x,y,var.equal=var.equal)$statistic
o <- binary.v(n) # indicator of all possible samples
o <- o[,apply(o,2,sum)==kx]
nc <- choose(n,kx)
allt <- 1:nc
for(i in 1:nc) {
xn <- X[o[,i]==1]
yn <- X[o[,i]==0]
allt[i] <- t.test(xn,yn,var.equal=var.equal)$statistic
}
attr(allt, "tobs") <- tobs
allt
}
```
We first create the data objects
```{r define_data}
x <- c(6.20, 5.72, 6.07, 6.75, 5.50, 6.39, 4.30, 4.96, 5.48)
y <- c(6.49, 6.52, 6.28, 8.59, 7.18, 4.92, 6.74, 7.27)
```
Here's a plot of the data, using `stripchart()`. I use `set.seed()` so
that it will appear exactly the same way every time. (The permutation
results below will also then be the same every time I compile this
document.)
```{r plot_data, fig.height=3.5}
set.seed(99693682)
stripchart(list(x=x, y=y), method="jitter", jitter=0.03,
pch=21, bg="slateblue", las=1)
```
We call `perm.test()` to run the permutation test.
```{r run_perm_test}
permt <- perm.test(x, y)
```
We grab the observed t-statistic (which was saved as attribute)
```{r grab_observed_t}
tobs <- attr(permt, "tobs")
```
We can calculate the p-value from the permutation test, as the
proportion of t-statistics from the permutations that were greater or
equal to the observed t-statistic, in absolute value.
Note that the nominal p-value is `r round(t.test(x,y)$p.value, 3)`
```{r calc_pvalue}
pval <- mean(abs(permt) >= abs(tobs))
```
The observed t-statistic was `r round(tobs, 2)`. The p-value, for the
test of whether the two population averages were different, was
`r round(pval, 3)`.
I'll save the results to a file, but I'll hide this in the output.
```{r save_results, echo=FALSE}
save(permt, tobs, pval, file="permt_results.RData")
```
I'll plot the permutation results, with a vertical line at the
observed t-statistic.
```{r plot_permutations}
hist(permt, breaks=200, xlab="t-statistic", las=1,
main = paste("P-value =", round(pval, 3)))
abline(v=tobs, lwd=2, col="violetred")
```
### Session info
I try to remember to end every R Markdown document with information
about the R and package versions that were used. R is distributed with
the `sessionInfo()` function (in the utils package); I prefer
`devtools::session_info()`, as the output is nicer, but I won't use it
here.
```{r session_info}
sessionInfo()
```