This page was last updated on October 09, 2019.
In this tutorial we will learn about the binomial distribution and how to use the binomial test to test a hypothesis about a proportion.
NOTE: In order to ensure we all get the same results when using random sampling routines, please run the following line of code every time you open an R/RStudio session (and disregard the warning it gives you):
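Here is that line of code (it reverts R to the older "Rounding" sampler so that random draws match the ones shown in this tutorial; the exact call is taken from the warning message below):
RNGkind("Mersenne-Twister", "Inversion", "Rounding")  # use the pre-R-3.6 sampler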
## Warning in RNGkind("Mersenne-Twister", "Inversion", "Rounding"): non-
## uniform 'Rounding' sampler used
Load the tigerstats package:
library(tigerstats)
Import the bumpus.csv data set:
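A minimal sketch of the import step, assuming bumpus.csv is in your working directory (the original tutorial may instead read it from a course URL):
bumpus <- read.csv("bumpus.csv")  # file location is an assumption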
Let's inspect the bumpus dataset, which contains data about a sparrow population caught in a wind storm in the 1880s. Pay particular attention to the categorical variable sex, which is stored as type "factor":
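The inspection below can be produced with the inspect function (loaded along with tigerstats):
inspect(bumpus)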
##
## categorical variables:
## name class levels n missing
## 1 sex factor 2 136 0
## 2 age factor 3 136 0
## 3 survival logical 2 136 0
## distribution
## 1 m (64%), f (36%)
## 2 a (43.4%), (36%), y (20.6%)
## 3 TRUE (52.9%), FALSE (47.1%)
##
## quantitative variables:
## name class min Q1 median Q3
## 1 BumpusNumber integer 1.000 17.75000 34.500 51.25000
## 2 Total.length.mm. integer 152.000 157.00000 160.000 162.00000
## 3 alar.extent.mm. integer 230.000 242.00000 246.000 249.00000
## 4 weight.g. numeric 22.600 24.57500 25.550 26.50000
## 5 length.beak.and.head.mm. numeric 29.800 31.10000 31.600 32.02500
## 6 length.humerus.in. numeric 0.659 0.71775 0.733 0.74825
## 7 length.femur.in. numeric 0.653 0.70175 0.713 0.73125
## 8 length.tibiotarsus.in. numeric 1.011 1.11200 1.133 1.16225
## 9 skull.width.in. numeric 0.551 0.59200 0.602 0.61100
## 10 keel.length.in. numeric 0.734 0.80900 0.841 0.86525
## max mean sd n missing
## 1 72.000 34.6176471 19.90349265 136 0
## 2 167.000 159.5441176 3.56083146 136 0
## 3 256.000 245.1911765 5.52102340 136 0
## 4 31.000 25.5250000 1.47521499 136 0
## 5 33.400 31.5742647 0.70237653 136 0
## 6 0.780 0.7319412 0.02307821 136 0
## 7 0.767 0.7130294 0.02411329 136 0
## 8 1.230 1.1335662 0.04074450 136 0
## 9 0.640 0.6024779 0.01499579 136 0
## 10 0.927 0.8399338 0.03964924 136 0
Note in the inspection above that of the 136 individual birds, 87 (64%) were male.
The binomial distribution applies to cases in which we conduct a set number of independent random trials, each with the same probability of success.
Let's say there are n trials and the probability of success in each trial is p. Of those n trials, anywhere between 0 and n could yield a "success", and the rest will be failures. Let's call the number of successes X; the probability of observing X successes is:
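\[
P(X) = \binom{n}{X} p^{X} (1-p)^{\,n-X}
\]
where \(\binom{n}{X}\) is the binomial coefficient, the number of ways to choose which X of the n trials are the successes.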
Here we will simulate rolling a fair, 6-sided die to illustrate what a binomial distribution is.
First let’s specify the number of random trials (rolls of the die) to do:
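A minimal sketch of this chunk (the value 5 is assumed from the five rolls shown later in the tutorial):
number.of.trials <- 5  # number of random trials (rolls of the die)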
Now let’s create a vector object that includes all the possible outcomes associated with a single random trial; here, this is the numbers one through six:
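The chunk would look something like this (die.outcomes is the object name used in the code below):
die.outcomes <- 1:6  # all possible outcomes of rolling a six-sided die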
Now let’s specify what a “success” is; here, rolling a “4” will be considered a success, which we know will have an associated probability of 1/6:
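And the corresponding chunk, using the object name success that appears in the code below:
success <- 4  # rolling a 4 counts as a success (probability 1/6)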
With these variables specified above, we now have the ingredients to conduct a simulation to show what a binomial distribution looks like.
First, let's roll our die once. We use the handy sample command that you learned about in the Simulate random trials tutorial, remembering to first set the seed to ensure reproducible results:
set.seed(16)
samp1 <- sample(die.outcomes, size = 1) # this samples one value at random from the numbers 1:6
samp1
## [1] 5
What number did you get? You should have gotten a 5.
Now let’s devise a way to keep track of whether the outcome was a success or not. First, let’s ask whether the outcome is equal to our definition of a success, i.e. whether it equals 4:
set.seed(18)
samp2 <- sample(die.outcomes, size = 1) # get sample
samp2 == success # check if the sample = 4
## [1] FALSE
This time we get the logical value “FALSE”.
A handy feature of R is that it considers a value of “TRUE” as equal to “1”, and a value of “FALSE” as equal to “0”.
Thus, the code below will return a “0” if the roll of the die did not produce a 4, and a “1” if it did!
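A sketch of that code (sum converts the logical value to a number; the original chunk isn't shown, so the exact form may have differed):
sum(samp2 == success)  # returns 1 if the roll was a 4, and 0 otherwise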
## [1] 0
Here we got a zero.
Let's change the arguments to the sample command so that it mimics rolling the single die number.of.trials times (defined above), or equivalently, mimics rolling n independent dice simultaneously. To do this, we need to specify the argument replace = TRUE, which tells sample to put each sampled number back into the pool before the next random trial.
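A sketch of the call (the seed used to generate the output below isn't shown, so your numbers may differ if you run this):
sample(die.outcomes, size = number.of.trials, replace = TRUE)  # 5 independent rolls of the die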
## [1] 5 2 4 6 6
What 5 numbers did you get? How many were a “success” (= 4)?
You can keep track of the number of successes all in one go by nesting the sample routine within the sum command, as follows:
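A sketch of the nested call (again, the seed that produced the output below isn't shown):
sum(sample(die.outcomes, size = number.of.trials, replace = TRUE) == success)  # count the successes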
## [1] 0
Make sure you understand what the output is telling you: it is the number of trials (rolls of a die) out of 5 that yielded a "success"; or equivalently, the number of dice out of the 5 dice rolled simultaneously that yielded a "success". Typically we represent this number as "X", as in the formula for the binomial distribution shown above.
Set the seed with number 12, and roll a fair, six-sided die 12 times. Consider a success to be rolling a number 2.
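One possible approach, reusing the die.outcomes vector defined above:
set.seed(12)
sum(sample(die.outcomes, size = 12, replace = TRUE) == 2)  # number of rolls (out of 12) that yielded a 2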
It turns out there's a handy function, dbinom, that will calculate the exact probability associated with any particular outcome for a random trial with a given sample space (set of outcomes) and probability of success. It uses the equation shown above.
?dbinom
So, for our example involving rolling a fair, 6-sided die n = 6 times (six random trials), and a probability (p) of observing a 4 of 1/6 (0.1667), we can calculate, for example, the probability of observing two successes (i.e. two number 4s) in our 6 rolls of the die (random trials):
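The corresponding code would be along these lines (sampsize is the object name used in the barplot code below):
sampsize <- 6  # number of random trials (n)
dbinom(x = 2, size = sampsize, prob = 1/6)  # probability of exactly two successes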
## [1] 0.2009388
In order to get the probabilities associated with each possible outcome (i.e. 0 through 6 successes), we use the following code:
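Something like this (exact.probs is the object plotted below):
exact.probs <- dbinom(x = 0:sampsize, size = sampsize, prob = 1/6)  # probabilities for X = 0, 1, ..., 6
exact.probs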
## [1] 3.348980e-01 4.018776e-01 2.009388e-01 5.358368e-02 8.037551e-03
## [6] 6.430041e-04 2.143347e-05
The dbinom function will accept a vector of values of x, for which the associated probabilities are calculated.
Now let’s use these exact probabilities to create a barplot showing an exact, discrete probability distribution, corresponding to the binomial distribution with a sample size (number of trials) of n = 6 and a probability of success p = 1/6:
barplot(exact.probs,
names.arg = 0:sampsize,
xlab = "Number of successes (X)",
ylab = "Probability",
las = 1) # rotate axes labels
Figure 1: Probability of obtaining X successes out of 6 random trials, with probability of success = 1/6.
The names.arg argument in the preceding barplot code provides the labels for each bar.
Now let’s increase the sample size to n = 15, and compare the distribution to that observed using n = 6:
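Presumably by simply redefining the two objects:
sampsize <- 15  # new number of random trials (n)
exact.probs <- dbinom(x = 0:sampsize, size = sampsize, prob = 1/6)  # probabilities for X = 0 through 15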
Now plot the binomial probability distribution:
barplot(exact.probs,
names.arg = 0:sampsize,
xlab = "Number of successes (X)",
ylab = "Probability",
las = 1) # rotate axes labels
Figure 2: Probability of obtaining X successes out of 15 random trials, with probability of success = 1/6.
Use the dbinom function to calculate the probability of rolling three "2"s when rolling a fair, six-sided die 20 times.
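A sketch of one way to do this:
dbinom(x = 3, size = 20, prob = 1/6)  # probability of exactly three 2s in 20 rolls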
For this example we'll use the bumpus dataset we imported above. It contains data about sparrows caught in a wind storm in the 1880s.
Let’s assume that, as with many animal species, male sparrows tend to move around more than females. Moving around more makes males more susceptible to predation, and to events such as wind storms. Let’s also assume that in the general population of sparrows, males represent 50% of the population at any given time.
Based on the available data, did the wind storm catch males more than females?
Identify the appropriate test statistic, and calculate the test statistic value using observed data
Simulate data and use them to construct a null distribution for the test statistic, under the assumption of a true null hypothesis
OR: Instead of simulating data to construct a null distribution, use an appropriate statistical test, which provides a theoretical null distribution
Draw the appropriate conclusion and communicate it clearly
Let’s devise an appropriate null and alternative hypothesis for this question.
TIP: Recall that “probability” and “proportion” are related: the probability of an individual randomly drawn from the population belonging to the particular category of interest is equal to the proportion of individuals in the population that belong to that category.
H0: Males and females were caught by the storm in equal proportions (p = 0.5).
HA: Males and females were not caught by the storm in equal proportions (p \(\ne\) 0.5).
We’ll use an \(\alpha\) level of 0.05.
It is a two-tailed alternative hypothesis; even though our research hypothesis suggests that males would have been more susceptible to being caught by the storm than females, we must allow for the possibility that they were less susceptible than females.
The test statistic is the number of males caught (i.e. the number of successes), which easily converts to a proportion.
We'll use a binomial test to test the null hypothesis, because this is the most powerful test when analyzing frequency data for a categorical variable (e.g. sex) that has only two categories (male / female).
The binomial test assumes that the random trials (samples) were independent and that the probability of success was equal in each trial; we'll assume both conditions hold here.
Imagine what the sampling distribution for this test statistic would look like, given a TRUE null hypothesis: it would be centred on 0.5 times the total number of individuals in the sample (here, 136), so 68, and would also be symmetric around this value (because our null value for p is 0.5). We then use this sampling distribution as the reference against which we compare our observed number of males caught.
In practice, however, we use the binom.test function to do this comparison for us (see below).
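First, let's find the observed number of males (successes) in the sample. The original chunk isn't shown; one way to get this count is:
sum(bumpus$sex == "m")  # number of male sparrows caught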
## [1] 87
or using this code, which shows the counts of both the successes (males) and failures (females):
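A sketch using xtabs (one of the tabulation commands used in this tutorial; the original chunk isn't shown):
xtabs(~ sex, data = bumpus)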
##
## f m
## 49 87
This represents our number of successes X, and corresponds to a \(\hat{p}\) = 87/136 = 0.6397. This observed proportion is certainly higher than the expected proportion of 0.5, but is it significantly higher?
Based on the fact that males represent 50% of the general sparrow population, the correct null value to use for p is 0.5.
We now have all we need to calculate the P-value associated with our test statistic, using the binomial test.
The binom.test function in the tigerstats package (in turn taken from the mosaic package) is a modified version of the function of the same name in base R:
?binom.test # select the function associated with the "mosaic" package
You'll see that there is an argument alternative, and that its first (default) value is "two.sided". This default ensures that the P-value calculated from the binomial distribution is appropriately two-sided.
Now let’s conduct the test:
binom.result <- binom.test(~ sex, data = bumpus, p = 0.5, success = "m", alternative = "two.sided", ci.method = "Agresti-Coull")
binom.result
##
## Exact binomial test (Agresti-Coull CI)
##
## data: bumpus$sex [with success = m]
## number of successes = 87, number of trials = 136, p-value =
## 0.001419
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
## 0.5561159 0.7156204
## sample estimates:
## probability of success
## 0.6397059
Alternative syntax for the binom.test function:
binom.result <- binom.test(x = 87, n = 136, p = 0.5, alternative = "two.sided", ci.method = "Agresti-Coull")
binom.result
The output from the binomial test includes the number of successes, the number of trials, and the calculated P-value. It also includes the appropriate Agresti-Coull 95% confidence interval.
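To see the individual components stored in the result object (handy for extracting, say, the P-value or the confidence interval later), you can use:
names(binom.result)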
## [1] "statistic" "parameter" "p.value" "conf.int" "estimate"
## [6] "null.value" "alternative" "method" "data.name"
NOTE: the confidence interval provided by the binom.test function in base R (as opposed to the one in the tigerstats package, which we use here) is not the correct Agresti-Coull interval!
ALWAYS use the Agresti-Coull method to calculate a confidence interval for a proportion, as shown in the Estimating proportions tutorial.
Now we have all the ingredients for a proper concluding statement.
This is an example of an appropriate concluding statement for a binomial test:
Male sparrows were caught by the storm at a significantly higher proportion than females (n = 136; observed proportion of males = 0.64; binomial test; P-value = 0.001; Agresti-Coull 95% confidence interval: 0.556 \(\leq\) p \(\leq\) 0.716).
In the preceding example, we calculated the Agresti-Coull 95% confidence interval as: 0.556 \(\leq\) p \(\leq\) 0.716.
Given that the interval excludes (does not encompass) the null hypothesized proportion of 0.5, we can reject the null hypothesis.
In this case, the appropriate concluding statement would be:
Male sparrows were caught by the storm at a significantly higher proportion than 0.5 (n = 136; observed proportion of males = 0.64; Agresti-Coull 95% confidence interval: 0.556 \(\leq\) p \(\leq\) 0.716).
NOTE: The binomtestGC function available in the tigerstats package can be used to conduct a binomial test, but the 95% confidence interval that it calculates is not the appropriate Agresti-Coull interval. We therefore do not use this function.
In the Hypothesis testing tutorial, Activity 1 asked you to conduct all the steps of a hypothesis test for Example 6.2 in the textbook (concerning handedness in toads) using a simulation approach: you simulated data to construct a null distribution, from which a P-value could be calculated.
Using the present tutorial as a guide, repeat this activity for Example 6.2, but this time use a binomial test to calculate an exact P-value (as shown above). Be sure to include all the steps of a hypothesis test.
Getting started: library
Data frame structure: str, inspect
Tabulation: xtabs
Simulation: set.seed, sample, do, seq
Math: sum
Graphs: barplot
Binomial distribution: binom.confint (from the binom package), binom.test (from the tigerstats package)