{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

GESIS Summer School in Survey Methodology 2018:
Meta-Analysis in Social Research and Survey Methodology

" ] }, { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "

Table of Contents

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preliminaries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Please do not touch anything in this section, otherwise this notebook might not work properly. You have been warned! Also, if you have no clue what you are staring at, please consult our [Preface chapter](1-1_preface.ipynb)." ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "source(\"run_me_first.R\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting to know Jupyter notebooks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a new R \"Code\" and then a new \"Markdown\" cell under this (Markdown) cell. In the code cell, execute the expression `13*17*19`." ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create more Markdown and R cells and execute them. Those of you who know LaTeX can use the `$...$` notation to display math content, e.g. $\\frac{17}{83}$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Delete the following cell (which contains the R code comment \"## Ave Caesar, morituri te salutant\"):" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "## Ave Caesar, morituri te salutant." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using R" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Re)Load the R package `metafor` (again, the message starting with \"Loading required package: Matrix\" means everything is okay and it worked)." 
] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The \"BCG vaccine dataset\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In many of the exercises, we will be using the \"BCG vaccine dataset\", which ships with the package `metafor`. All details about this dataset can be found at [http://www.metafor-project.org/doku.php/analyses:berkey1995](http://www.metafor-project.org/doku.php/analyses:berkey1995). In brief, this dataset contains results from 13 studies examining the effectiveness of the Bacillus Calmette-Guerin (BCG) vaccine against tuberculosis." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "More details can be found by executing the help command `?dat.bcg` in R (a new pane containing the dataset description should open at the bottom of this window):" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "## Remove the ## in the next line to execute the help command. \n", "## ?dat.bcg " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An excerpt of `?dat.bcg` can also be found below. 
The data frame contains the following columns:" ] }, { "cell_type": "raw", "metadata": {}, "source": [ "trial\tnumeric\ttrial number\n", "author\tcharacter\tauthor(s)\n", "year\tnumeric\tpublication year\n", "tpos\tnumeric\tnumber of TB positive cases in the treated (vaccinated) group\n", "tneg\tnumeric\tnumber of TB negative cases in the treated (vaccinated) group\n", "cpos\tnumeric\tnumber of TB positive cases in the control (non-vaccinated) group\n", "cneg\tnumeric\tnumber of TB negative cases in the control (non-vaccinated) group\n", "ablat\tnumeric\tabsolute latitude of the study location (in degrees)\n", "alloc\tcharacter\tmethod of treatment allocation (random, alternate, or systematic assignment)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The 13 studies report their data as 2x2 tables of the form:\n", "\n", "| . | TB positive | TB negative |\n", "|------------------|-------------|-------------|\n", "| vaccinated group | tpos | tneg |\n", "| control group | cpos | cneg |\n", "\n", "Actually, we like this version of the table better (it also corresponds to our slides on RRs):\n", "\n", "| . | vaccinated group | control group |\n", "|-------------|------------------|---------------|\n", "| TB positive | tpos | cpos |\n", "| TB negative | tneg | cneg |\n", "\n", "The goal of the meta-analysis was to examine the overall effectiveness of the BCG vaccine for preventing tuberculosis and to examine moderators that may potentially influence the size of the effect." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The function `escalc`, used in the next cell, will be introduced in more detail tomorrow. For now, you just need to know that it calculates the effect size (here the log relative risk, also called the log risk ratio; stored in `yi`) and the corresponding sampling variance (stored in `vi`)." 
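, "

To make these two quantities concrete, here is a minimal sketch that computes them by hand for the first study (Aronson, 1948: `tpos` = 4, `tneg` = 119, `cpos` = 11, `cneg` = 128). It assumes the usual large-sample variance formula for a log risk ratio, which this notebook does not spell out, so treat it as an illustration rather than a description of what `escalc` does internally.

```r
# Hand computation of the log risk ratio and its sampling variance
# for a single 2x2 table (counts of the first study, Aronson 1948).
tpos <- 4     # TB positive, vaccinated group
tneg <- 119   # TB negative, vaccinated group
cpos <- 11    # TB positive, control group
cneg <- 128   # TB negative, control group

risk_t <- tpos / (tpos + tneg)   # risk of TB in the vaccinated group
risk_c <- cpos / (cpos + cneg)   # risk of TB in the control group

# Log risk ratio (assumed to correspond to the yi column).
yi <- log(risk_t / risk_c)

# Large-sample variance of the log risk ratio (assumed formula for vi).
vi <- 1 / tpos - 1 / (tpos + tneg) + 1 / cpos - 1 / (cpos + cneg)

round(c(yi = yi, vi = vi), 4)
```

A negative `yi` means that the risk was lower in the vaccinated group than in the control group.
"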
] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [], "source": [ "dat.bcg <- escalc(measure = \"RR\", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Inspect the BCG data by typing in the name `dat.bcg`. Make sure that you first load the `metafor` package.\n", "\n", "Please note that here, and in the following notebooks, the `metafor` package is loaded multiple times per notebook (using the command `library(metafor)`). It is usually sufficient to load the `metafor` package only once per notebook. However, sometimes we load a notebook and jump right to a certain exercise without executing all cells. In that case, it helps if each exercise is \"self-sufficient\", i.e., it runs without executing all the cells above. " ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In most cases, it is better to only show the first few rows of a dataset. This can be accomplished using the `head()` function (it also looks nicer in the Jupyter notebook):" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
trial author year tpos tneg cpos cneg ablat alloc yi vi
1 Aronson 1948 4 119 11 128 44 random -0.8893113 0.325584765
2 Ferguson & Simes 1949 6 300 29 274 55 random -1.5853887 0.194581121
3 Rosenthal et al 1960 3 228 11 209 42 random -1.3480731 0.415367965
4 Hart & Sutherland 1977 62 13536 248 12619 52 random -1.4415512 0.020010032
5 Frimodt-Moller et al 1973 33 5036 47 5761 13 alternate -0.2175473 0.051210172
6 Stein & Aronson 1953 180 1361 372 1079 44 alternate -0.7861156 0.006905618
\n" ], "text/latex": [ "\\begin{tabular}{r|lllllllllll}\n", " trial & author & year & tpos & tneg & cpos & cneg & ablat & alloc & yi & vi\\\\\n", "\\hline\n", "\t 1 & Aronson & 1948 & 4 & 119 & 11 & 128 & 44 & random & -0.8893113 & 0.325584765 \\\\\n", "\t 2 & Ferguson \\& Simes & 1949 & 6 & 300 & 29 & 274 & 55 & random & -1.5853887 & 0.194581121 \\\\\n", "\t 3 & Rosenthal et al & 1960 & 3 & 228 & 11 & 209 & 42 & random & -1.3480731 & 0.415367965 \\\\\n", "\t 4 & Hart \\& Sutherland & 1977 & 62 & 13536 & 248 & 12619 & 52 & random & -1.4415512 & 0.020010032 \\\\\n", "\t 5 & Frimodt-Moller et al & 1973 & 33 & 5036 & 47 & 5761 & 13 & alternate & -0.2175473 & 0.051210172 \\\\\n", "\t 6 & Stein \\& Aronson & 1953 & 180 & 1361 & 372 & 1079 & 44 & alternate & -0.7861156 & 0.006905618 \\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "trial | author | year | tpos | tneg | cpos | cneg | ablat | alloc | yi | vi | \n", "|---|---|---|---|---|---|\n", "| 1 | Aronson | 1948 | 4 | 119 | 11 | 128 | 44 | random | -0.8893113 | 0.325584765 | \n", "| 2 | Ferguson & Simes | 1949 | 6 | 300 | 29 | 274 | 55 | random | -1.5853887 | 0.194581121 | \n", "| 3 | Rosenthal et al | 1960 | 3 | 228 | 11 | 209 | 42 | random | -1.3480731 | 0.415367965 | \n", "| 4 | Hart & Sutherland | 1977 | 62 | 13536 | 248 | 12619 | 52 | random | -1.4415512 | 0.020010032 | \n", "| 5 | Frimodt-Moller et al | 1973 | 33 | 5036 | 47 | 5761 | 13 | alternate | -0.2175473 | 0.051210172 | \n", "| 6 | Stein & Aronson | 1953 | 180 | 1361 | 372 | 1079 | 44 | alternate | -0.7861156 | 0.006905618 | \n", "\n", "\n" ], "text/plain": [ " trial author year tpos tneg cpos cneg ablat alloc \n", "1 1 Aronson 1948 4 119 11 128 44 random \n", "2 2 Ferguson & Simes 1949 6 300 29 274 55 random \n", "3 3 Rosenthal et al 1960 3 228 11 209 42 random \n", "4 4 Hart & Sutherland 1977 62 13536 248 12619 52 random \n", "5 5 Frimodt-Moller et al 1973 33 5036 47 5761 13 alternate\n", "6 6 Stein & Aronson 1953 180 1361 372 1079 44 
alternate\n", " yi vi \n", "1 -0.8893113 0.325584765\n", "2 -1.5853887 0.194581121\n", "3 -1.3480731 0.415367965\n", "4 -1.4415512 0.020010032\n", "5 -0.2175473 0.051210172\n", "6 -0.7861156 0.006905618" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "head(dat.bcg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, estimate the simple mean and the simple variance of the effect size distribution (the `yi`’s). The\n", "function for the mean is `mean()` and the function for the variance is `var()`. \n", "\n", "Tip: You probably need to know how to access the variable `yi` in the `dat.bcg` object. Have a look at the sections on \"It’s all about objects\" and \"Accessing elements of a data frame\". \n", "\n", "Tip 2: Encoded in [ROT13](https://rot13.de/index.php), the expression you need is `qng.opt$lv` ;-)" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's learn more about the assignment operator `<-`. Use `<-` to save the result of `mean()` to a new R object `m`, and the result of `var()` to a new object `v`. " ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take the square root of `v` and of `var(dat.bcg$yi)`, and save the results as the new objects `se` and `se_2`, respectively. Can you see the difference?" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calculate the square of `se`." 
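, "

The assignment, square-root, and squaring steps used in these exercises can be sketched on a toy vector. The vector `x` and the object names `s` and `s_2` are made up for this illustration; they are not part of the BCG data.

```r
# Toy data, made up for illustration (not the BCG effect sizes).
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

m <- mean(x)          # store the mean in the object m
v <- var(x)           # store the sample variance in the object v

s   <- sqrt(v)        # square root of a stored object
s_2 <- sqrt(var(x))   # same computation without the intermediate object

m                     # typing an object name prints its value
identical(s, s_2)     # compare the two results
all.equal(s^2, v)     # squaring with ^2 reverses the square root
```

Storing intermediate results in named objects keeps later expressions short and avoids recomputing the same value.
"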
] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, let’s continue the exploration of the effect sizes in the BCG dataset; we will now learn\n", "about R’s graphical capabilities. Your next task is to draw a histogram. A histogram can be drawn with the function `hist(x)`, where `x` is the respective vector of effect sizes." ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The last exercise based on the BCG data investigates the association between the effect size `yi` and the moderator variable `ablat`. We can estimate the Pearson product-moment correlation or run a simple linear regression." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To estimate the product-moment correlation, we can use the function `cor(x, y)` or `cor.test(x, y)`." ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To run a linear regression in R, we can use the function `lm(y ~ x, data = ...)`. The tilde (`~`) separates the outcome variable (left) from the predictor variable(s) (right); more predictor variables can be added using the `+` symbol, e.g., `... ~ x1 + x2 + x3, data = ...`. So, in this exercise you want to fit a simple regression model, where `yi` is the outcome and `ablat` is the predictor variable. 
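
The formula syntax described above can be tried out on data that ships with R. The following minimal sketch uses the built-in datasets `cars` and `mtcars` instead of the BCG data; the object names `fit` and `fit2` are made up for this illustration.

```r
# Built-in example data: stopping distance (dist) and speed of 50 cars.
fit <- lm(dist ~ speed, data = cars)       # outcome ~ predictor
coef(fit)                                  # intercept and slope

# Several predictors are joined with +; mtcars is another built-in dataset.
fit2 <- lm(mpg ~ wt + hp, data = mtcars)   # outcome ~ predictor1 + predictor2
coef(fit2)                                 # intercept plus one slope per predictor
```

The variable names in the formula are looked up in the data frame passed via `data`, so no `$` is needed inside the formula.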
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tip: Compared to other statistical packages, the output of an `lm()` call looks rather minimal: no significance stars, no $R^2$ statistic, etc. The `summary()` function provides much more information about the model fit and the regression coefficients. So, your final R call might look like this: `summary(lm(y ~ x, data = ...))`." ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [], "source": [ "## Please insert your solution here. Of course, feel free to add new code cells." ] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.5.1" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "76px", "width": "182px" }, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "165px" }, "toc_section_display": "block", "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 2 }