{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Reproducible Research with R\n", "## Lab 01 for PH 290: Targeted Learning in Biomedical Big Data\n", "### Author: [Nima Hejazi](https://nimahejazi.org)\n", "### Date: 16 January 2017\n", "### Attribution: adapted from source materials by [David Benkeser](https://github.com/benkeser)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## I. Introduction\n", "\n", "This document provides a brief introduction to a few of the core constructs of the `R` language and environment for statistical computing. These will be necessary for the first laboratory assignment and will be staples that you will rely on throughout the course." ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": true }, "outputs": [], "source": [ "options(repr.plot.width = 4, repr.plot.height = 3) ## resizing plots\n", "options(scipen = 999) ## has scientific notation ever annoyed you?" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\n", "Attaching package: ‘usethis’\n", "\n", "The following objects are masked from ‘package:devtools’:\n", "\n", " use_appveyor, use_build_ignore, use_code_of_conduct, use_coverage,\n", " use_cran_badge, use_cran_comments, use_data, use_data_raw,\n", " use_dev_version, use_git, use_git_hook, use_github,\n", " use_github_links, use_gpl3_license, use_mit_license, use_news_md,\n", " use_package, use_package_doc, use_rcpp, use_readme_md,\n", " use_readme_rmd, use_revdep, use_rstudio, use_test, use_testthat,\n", " use_travis, use_vignette\n", "\n", "here() starts at /Users/nimahejazi/Dropbox/UC_Berkeley-grad/teaching/tlbbd-labs/lab_01\n" ] } ], "source": [ "# packages for project management\n", "library(usethis) # commonly used package components\n", "library(here) # set paths relative to a .Rproj file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a moment to see why we'd use something like `here`:\n", "* https://twitter.com/hadleywickham/status/940021008764846080\n", "\n", "While `usethis` is not a necessity (unlike `here`), it does make life _a lot_ easier\n", "* https://github.com/r-lib/usethis" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\n", "Attaching package: ‘skimr’\n", "\n", "The following objects are masked from ‘package:dplyr’:\n", "\n", " contains, ends_with, everything, matches, num_range, one_of,\n", " starts_with\n", "\n" ] } ], "source": [ "# some more data-oriented packages we'll use\n", "library(tidyverse) ## core tools for \"tidy\" data science\n", "library(skimr) ## take a deeper look at your data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## II. Simulating data\n", "\n", "A big part of what we statisticians do is based on simulating data. You just came up with a method that, in theory, is brilliant/elegant/the-best-thing-since-sliced-bread, but you need to illustrate that the method actually works well in practice. For example, your theoretical results may be based on the assumption of a large sample size and you want to investigate how well the method performs when the sample size is modestly small. Providing such information can help convince readers that your method is in fact a good idea. Simulations can also be helpful for showing when a method breaks down, which is something reviewers also like to see. \n", "\n", "To accomplish these tasks, we need to create a data generating experiment where we know the truth so that we can benchmark our method by it. Let's start with a very simple example. Our simulated experiment enrolls `n=100` subjects and measures an outcome $Y$ on each subject. Suppose that we want to know how our method performs when $Y \\sim N(0,1)$. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
typestatlevelvalueformattedvariable
numeric missing .all 0.000000000 Y
numeric complete .all 100.00000000100 Y
numeric n .all 100.00000000100 Y
numeric mean .all -0.13230679-0.13 Y
numeric sd .all 1.028382321.03 Y
numeric p0 .all -2.27469359-2.27 Y
numeric p25 .all -0.78641356-0.79 Y
numeric median .all -0.07902102-0.079 Y
numeric p75 .all 0.622997590.62 Y
numeric p100 .all 1.972016081.97 Y
numeric hist .all NA▃▂▅▇▇▇▃▂ Y
\n" ], "text/latex": [ "\\begin{tabular}{r|llllll}\n", " type & stat & level & value & formatted & variable\\\\\n", "\\hline\n", "\t numeric & missing & .all & 0.00000000 & 0 & Y \\\\\n", "\t numeric & complete & .all & 100.00000000 & 100 & Y \\\\\n", "\t numeric & n & .all & 100.00000000 & 100 & Y \\\\\n", "\t numeric & mean & .all & -0.13230679 & -0.13 & Y \\\\\n", "\t numeric & sd & .all & 1.02838232 & 1.03 & Y \\\\\n", "\t numeric & p0 & .all & -2.27469359 & -2.27 & Y \\\\\n", "\t numeric & p25 & .all & -0.78641356 & -0.79 & Y \\\\\n", "\t numeric & median & .all & -0.07902102 & -0.079 & Y \\\\\n", "\t numeric & p75 & .all & 0.62299759 & 0.62 & Y \\\\\n", "\t numeric & p100 & .all & 1.97201608 & 1.97 & Y \\\\\n", "\t numeric & hist & .all & NA & ▃▂▅▇▇▇▃▂ & Y \\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "type | stat | level | value | formatted | variable | \n", "|---|---|---|---|---|---|---|---|---|---|---|\n", "| numeric | missing | .all | 0.00000000 | 0 | Y | \n", "| numeric | complete | .all | 100.00000000 | 100 | Y | \n", "| numeric | n | .all | 100.00000000 | 100 | Y | \n", "| numeric | mean | .all | -0.13230679 | -0.13 | Y | \n", "| numeric | sd | .all | 1.02838232 | 1.03 | Y | \n", "| numeric | p0 | .all | -2.27469359 | -2.27 | Y | \n", "| numeric | p25 | .all | -0.78641356 | -0.79 | Y | \n", "| numeric | median | .all | -0.07902102 | -0.079 | Y | \n", "| numeric | p75 | .all | 0.62299759 | 0.62 | Y | \n", "| numeric | p100 | .all | 1.97201608 | 1.97 | Y | \n", "| numeric | hist | .all | NA | ▃▂▅▇▇▇▃▂ | Y | \n", "\n", "\n" ], "text/plain": [ " type stat level value formatted variable\n", "1 numeric missing .all 0.00000000 0 Y \n", "2 numeric complete .all 100.00000000 100 Y \n", "3 numeric n .all 100.00000000 100 Y \n", "4 numeric mean .all -0.13230679 -0.13 Y \n", "5 numeric sd .all 1.02838232 1.03 Y \n", "6 numeric p0 .all -2.27469359 -2.27 Y \n", "7 numeric p25 .all -0.78641356 -0.79 Y \n", "8 numeric median .all -0.07902102 -0.079 Y \n", "9 numeric p75 .all 0.62299759 0.62 Y \n", "10 numeric p100 .all 1.97201608 1.97 Y \n", "11 numeric hist .all NA ▃▂▅▇▇▇▃▂ Y " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# we're simulating -- it's important to set the seed for the PRNG\n", "set.seed(3724591) ## note something like '5' is a BAD seed, it might be worth reading about PRNGs\n", "\n", "# simulate 100 draws from a normal distribution.\n", "Y <- rnorm(n = 100, mean = 0 , sd = 1)\n", "\n", "# we'll use skimr::skim to take a look at y\n", "skim(Y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above code chunk illustrates several important `R` functions.\n", "\n", "* First is `rnorm`, which simulates `n` observations from a standard Normal distribution with `mean = 0` and standard deviation `sd = 1`.\n", "* We also illustrated several functions that are useful for ensuring that the data were created properly.\n", "* The function `skim` from the [`skimr` package](https://cran.r-project.org/web/packages/skimr/index.html) provides a catch-all way to take a look at some important properties of a data set (in this case, just a vector). We can see that we received information about the quantiles (just like `summary` would provide), in addition to missingness/completeness, and even a histogram!\n", "\n", "Now, let's move on to visualizing our data. We'll do this with `ggplot2`, a modern graphics engine that is an implementation of the [Grammar of Graphics](http://vita.had.co.nz/papers/layered-grammar.pdf)." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.\n" ] }, { "data": {}, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeAAAAFoCAYAAACPNyggAAAEGWlDQ1BrQ0dDb2xvclNwYWNl\nR2VuZXJpY1JHQgAAOI2NVV1oHFUUPrtzZyMkzlNsNIV0qD8NJQ2TVjShtLp/3d02bpZJNtoi\n6GT27s6Yyc44M7v9oU9FUHwx6psUxL+3gCAo9Q/bPrQvlQol2tQgKD60+INQ6Ium65k7M5lp\nurHeZe58853vnnvuuWfvBei5qliWkRQBFpquLRcy4nOHj4g9K5CEh6AXBqFXUR0rXalMAjZP\nC3e1W99Dwntf2dXd/p+tt0YdFSBxH2Kz5qgLiI8B8KdVy3YBevqRHz/qWh72Yui3MUDEL3q4\n4WPXw3M+fo1pZuQs4tOIBVVTaoiXEI/MxfhGDPsxsNZfoE1q66ro5aJim3XdoLFw72H+n23B\naIXzbcOnz5mfPoTvYVz7KzUl5+FRxEuqkp9G/Ajia219thzg25abkRE/BpDc3pqvphHvRFys\n2weqvp+krbWKIX7nhDbzLOItiM8358pTwdirqpPFnMF2xLc1WvLyOwTAibpbmvHHcvttU57y\n5+XqNZrLe3lE/Pq8eUj2fXKfOe3pfOjzhJYtB/yll5SDFcSDiH+hRkH25+L+sdxKEAMZahrl\nSX8ukqMOWy/jXW2m6M9LDBc31B9LFuv6gVKg/0Szi3KAr1kGq1GMjU/aLbnq6/lRxc4XfJ98\nhTargX++DbMJBSiYMIe9Ck1YAxFkKEAG3xbYaKmDDgYyFK0UGYpfoWYXG+fAPPI6tJnNwb7C\nlP7IyF+D+bjOtCpkhz6CFrIa/I6sFtNl8auFXGMTP34sNwI/JhkgEtmDz14ySfaRcTIBInmK\nPE32kxyyE2Tv+thKbEVePDfW/byMM1Kmm0XdObS7oGD/MypMXFPXrCwOtoYjyyn7BV29/MZf\nsVzpLDdRtuIZnbpXzvlf+ev8MvYr/Gqk4H/kV/G3csdazLuyTMPsbFhzd1UabQbjFvDRmcWJ\nxR3zcfHkVw9GfpbJmeev9F08WW8uDkaslwX6avlWGU6NRKz0g/SHtCy9J30o/ca9zX3Kfc19\nzn3BXQKRO8ud477hLnAfc1/G9mrzGlrfexZ5GLdn6ZZrrEohI2wVHhZywjbhUWEy8icMCGNC\nUdiBlq3r+xafL549HQ5jH+an+1y+LlYBifuxAvRN/lVVVOlwlCkdVm9NOL5BE4wkQ2SMlDZU\n97hX86EilU/lUmkQUztTE6mx1EEPh7OmdqBtAvv8HdWpbrJS6tJj3n0CWdM6busNzRV3S9KT\nYhqvNiqWmuroiKgYhshMjmhTh9ptWhsF7970j/SbMrsPE1suR5z7DMC+P/Hs+y7ijrQAlhyA\ngccjbhjPygfeBTjzhNqy28EdkUh8C+DU9+z2v/oyeH791OncxHOs5y2AtTc7nb/f73TWPkD/\nqwBnjX8BoJ98VQNcC+8AAC52SURBVHgB7d0JkBTV/cDx397sxX1jBCw8Ihoh4kUSQhGMeFUU\nS4wikY2YSBlNPOIdjRAtDaiFVRYe0TUiCqiBMoIo8YhQaIgHUoKiEYGoYATZBWHZ+8/vJb3/\nnd2dme7ZnunX099XBbPT/fodn9fTv+mePnKaDyQhIYAAAggggEBGBXIzWhuVIYAAAggggIAR\nIACzIiCAAAIIIBCAAAE4AHSqRAABBBBAgADMOoAAAggggEAAAgTgANCpEgEEEEAAAQIw6wAC\nCCCAAAIBCBCAA0CnSgQQQAABBAjArAMIIIAAAggEIJAfQJ1WVVlVVSV1dXWu2lRWVib79u2T\npqYmV/mDyJSTkyPdunWT+vp62bt3bxBNcF1ncXGxaWdDQ4PrZYLI2L17d2lsbJQ9e/YEUb3r\nOouKikze2tpa18sEkbG8vFxyc3Oluro6iOpd11lQUCD5+flSU1PjepkgMpaWloq2VT1tvq9S\nXl6edOnSJRTbJf0s6eddP/epJO1rr169ki4a+QCswG6RFVWDr9v8SfXTkEE3bIWFhaaNNrdT\nu65t1WR7O3Xjpl9sbG+ntlE3wLa3U4OafpZsb6eOu66jtrdTLZ3PvM0BWNfPMIy7sw3NxGeJ\nQ9AmBPAfAggggAACmRUgAGfWm9oQQAABBBAwAgRgVgQEEEAAAQQCECAAB4BOlQgggAACCBCA\nWQcQQAABBBAIQIAAHAA6VSKAAAIIIEAAZh1AAAEEEEAgAAECcADoVIkAAggggAABmHUAAQQQ\nQACBAAQIwAGgUyUCCCCAAAIEYNYBBBBAAAEEAhAgAAeATpUIIIAAAghE/mEMrAIIZIvApEmT\nPHWlsrLSU34yI4CAvwLsAfvrSWkIIIAAAgi4EiAAu2IiEwIIIIAAAv4KEID99aQ0BBBAAAEE\nXAkQgF0xkQkBBBBAAAF/BQjA/npSGgIIIIAAAq4ECMCumMiEAAIIIICAvwIEYH89KQ0BBBBA\nAAFXAgRgV0xkQgABBBBAwF8BArC/npSGAAIIIICAKwECsCsmMiGAAAIIIOCvAAHYX09KQwAB\nBBBAwJUAAdgVE5kQQAABBBDwV4AA7K8npSGAAAIIIOBKgADsiolMCCCAAAII+CtAAPbXk9IQ\nQAABBBBwJUAAdsVEJgQQQAABBPwVIAD760lpCCCAAAIIuBIgALtiIhMCCCCAAAL+ChCA/fWk\nNAQQQAABBFwJEIBdMZEJAQQQQAABfwUIwP56UhoCCCCAAAKuBPJd5criTPn5+VJYWOiqh3l5\neVJSUiLNzc2u8geRKScnx1Sr/SorKwuiCa7rLCgoEG2vvtqecnNzrff0ahjU+qGWOu5B1e/W\nST9D+pkPQzu1T9pOm7dNOu5h2S6pZ3FxsRQVFemfnpPbcYh8AFaoxsZGV8Cat6mpyfxztUAA\nmZwA7KVfATTTVOl4uvUPqp1abxg8vfoE7R50/cm8NGDoP9vbqeumJm2n83eyvgUxX9sWhs+R\nbuM16Wu6xz7yAViB6+rqXK2P+o1o//79aR8UV42Jk0k3GN26dTNtrKmpiZPLjsm656v2tbW1\ndjQoTivUUzcctnvGaX7cyUH1p7S01OwBB1V/XJA2M5xgZns7u3TpYo4iaTudNrfpihVvnaON\ntns6R+R0u1RfX5+SnR45cZP4DdiNEnkQQAABBBDwWYAA7DMoxSGAAAIIIOBGgADsRok8CCCA\nAAII+CxAAPYZlOIQQAABBBBwI0AAdqNEHgQQQAABBHwWIAD7DEpxCCCAAAIIuBEgALtRIg8C\nCCCAAAI+CxCAfQalOAQQQAABBNwIEIDdKJEHAQQQQAABnwUifycsnz0pDgEEskygoqLCU48q\nKys95SdzdAXYA47u2NNzBBBAAIEABQjAAeJTNQIIIIBAdAUIwNEde3qOAAIIIBCgAAE4QHyq\nRgABBBCIrgABOLpjT88RQAABBAIUIAAHiE/VCCCAAALRFSAAR3fs6TkCCCCAQIACBOAA8aka\nAQQQQCC6AgTg6I49PUcAAQQQCFCAABwgPlUjgAACCERXgAAc3bGn5wgggAACAQoQgAPEp2oE\nEEAAgegKEICjO/b0HAEEEEAgQAECcID4VI0AAgggEF0BAnB0x56eI4AAAggEKEAADhCfqhFA\nAAEEoitAAI7u2NNzBBBAAIEABQjAAeJTNQIIIIBAdAUIwNEde3qOAAIIIBCgAAE4QHyqRgAB\nBBCIrgABOLpjT88RQAABBAIUIAAHiE/VCCCAAALRFSAAR3fs6TkCCCCAQIACBOAA8akaAQQQ\nQCC6AgTg6I49PUcAAQQQCFCAABwgPlUjgAACCERXgAAc3bGn5wgggAACAQoQgAPEp2oEEEAA\ngegKEICjO/b0HAEEEEAgQAECcID4VI0AAgggEF0BAnB0x56eI4AAAggEKEAADhCfqhFAAAEE\noitAAI7u2NNzBBBAAIEABfIDrDth1Vu3bpXVq1dLz549ZfTo0VJWVpYwvzPzrbfekqqqKhk/\nfrwziVcEAheoqKjw3IbKykrPy6R7gVT64aVNNvbZS/vJi4AXASv3gOfNmydTpkyRDRs2yKJF\ni2T69Omya9eupP368ssv5eabb5YVK1YkzUsGBBBAAAEEghSwLgDrnq9+C54zZ47MmDFDHnjg\nASkqKpKFCxcmdGpqapKZM2dKTk5OwnzMRAABBBBAwAYB6wLwmjVrZODAgTJixAjjk5+fLxMm\nTEi6V/vUU0+Z4Dtu3DgbXGkDAggggAACCQWs+w1427ZtMmjQoJhGa0DesWOH6F5ubm777wwb\nN24UDcB/+tOf5IknnohZtvWbVatWyRVXXNF6ktxzzz0yZsyYmGnx3ujede/evePNtmp6ly5d\npF+/fla1qW1j1LOkpESam5vbzrLuvX4RzLRnuutLd/mpDKJNbUr1aFqm++C0s2/fvqmQZ3QZ\nbWumfbx20PHU849STfX19a4WtS4Ab9++Xbp27RrT+PLychN8q6urpUePHjHzamtrzaHnyy67\nTPr37x8zr+0bDUoDBgyImVxYWCiNjY0x0+K9ycvLM+2wPWDolxRto9t+xetvuqeHxVODr6ZM\ne6a7vnSXn8r6Y1Ob9HPkbIy99CXTfdDPkbYz0/V6MdG82kY1tb2d2sbObpt0Z9FNsi4AFxQU\nSENDQ0zbnfe6t9Q23X///TJ48GA59dRT285q937UqFGydOnSmOk7d+40e9cxE+O80W9E+iXA\n5hVIVx79hqlfTPRscJuTftHSduo/m5N+sdN1UI/CZDKlu750l5+KlU1t0i/s+gXda8p0H3Sn\nRNuq2zKbdw70i6zuTLk5odaruZ/5dbtUWlpqtp9u92Tb1q8BvLi4uO3kdu+tC8B6iHfz5s0x\nDd29e7fZ89WTsVonPet58eLFcvTRR8t1111nZn3yySdSV1dn3t9www3SvXv31ovwNwIIIIAA\nAlYIWBeAhw4dKsuXLzd7HM6hv/Xr17f7XVj19BvGtGnTYiC//vpr2bt3rxx55JGie9MkBBBA\nAAEEbBSwLgDrDTTmzp0r8+fPN9cC697wsmXL5MYbb2zx03l6lvTw4cPloosuapmuf3z11Vfm\nX9vpMZl4gwACCCCAQMAC7U8pDrhBephZr+fVQ8t6+dGVV14pEydONHfDcpqm1wavXbvWecsr\nAggggAACoROwbg9YBUeOHClLliwR/Y23T58+7S49WrlyZVzoa665Ju48ZiCAAAIIIGCLgJUB\n2MGx/Xoxp528IoAAAggg4FXAukPQXjtAfgQQQAABBMIoQAAO46jRZgQQQACB0AsQgEM/hHQA\nAQQQQCCMAgTgMI4abUYAAQQQCL0AATj0Q0gHEEAAAQTCKEAADuOo0WYEEEAAgdALEIBDP4R0\nAAEEEEAgjAIE4DCOGm1GAAEEEAi9AAE49ENIBxBAAAEEwihAAA7jqNFmBBBAAIHQCxCAQz+E\ndAABBBBAIIwCBOAwjhptRgABBBAIvQABOPRDSAcQQAABBMIoQAAO46jRZgQQQACB0AsQgEM/\nhHQAAQQQQCCMAgTgMI4abUYAAQQQCL0AATj0Q0gHEEAAAQTCKEAADuOo0WYEEEAAgdALEIBD\nP4R0AAEEEEAgjAIE4DCOGm1GAAEEEAi9AAE49ENIBxBAAAEEwiiQH8ZGR7XNFRUVnrteWVnp\neRkWiIZAKutTumVSaZNt63gm+uC1DtuM0r0ehaV89oDDMlK0EwEEEEAgqwQIwFk1nHQGAQQQ\nQCAsAgTgsIwU7UQAAQQQyCoBAnBWDSedQQABBBAIiwABOCwjRTsRQAABBLJKwHMA/uMf/yhT\np06VV199VZqbm7MKg84ggAACCCCQKQHPAfiggw6SJUuWyLhx4+SQQw6RW2+9VTZt2pSp9lIP\nAggggAACWSHgOQBfcMEFsn37dlmwYIEceeSRcscdd8iwYcNkzJgx8uijj8qePXuyAoZOIIAA\nAgggkE4BzwFYG9OlSxc577zzZOnSpfLZZ5/J3XffLfX19TJt2jTp37+//OxnP+MQdTpHjbIR\nQAABBEIvkFIAbt3rfv36yZVXXimPPPKI/OpXv5La2lqZN2+eOUR9xBFHyOLFi1tn528EEEAA\nAQQQOCDQqQC8detWufPOO+Woo46S4cOHy4MPPihnn3222TNevny5DBkyRM455xx57LHHwEYA\nAQQQQACBVgKe7wVdXV0tTz/9tDzxxBPy+uuvmzOhR44cKffdd5/o78O9evVqKf7kk08W3QvW\n34b1zGkSAggggAACCPxXwHMAvueee2TGjBnSu3dvueKKK0RvCn7MMcd06JmbmysDBgwQPUxN\nQgABBBBAAIH/F/AcgI899lh59tln5YwzzpDCwsL/LynOX6+99prk5OTEmctkBBBAAAEEoing\n+TfgqqoqefPNN+MGX71GePDgwVJTU2NECb7RXLHoNQIIIIBAYgFXe8BfffWV1NXVmZLeffdd\nWbNmjXz++eftStY8y5YtEz05a//+/VJcXNwuDxMQQAABBBBAQMRVANaHOV933XUxXnpHrHhp\nxIgR0qNHj3izmY4AAggggEDkBVwFYL3Ot6GhwdxsQ+8BvWXLlg7Pas7PzzeB99xzz408LAAI\nIIAAAggkEnAVgAsKCuTGG2805ehlRRs2bDD3gE5UMPMQQAABBBBAIL6AqwDcenG9BWU2Jd1r\nd3M2t/Y5Ly9PSkpKQvUUqLKyMmuHS7/Y6Ul6+mp70kvqMm2Z6fpsH4N47UuXk24b9DOfiZSu\nPjhtT3f5Tj2JXvUzpKY2tCVRO53tkZ7DVFRUlChr3HlunxSYNAB/8cUX8uMf/1hGjx4tDz30\nkNx///0yd+7cuBU7M95//33nT6tfFaqxsdFVGzVvU1OT+edqAQsyue1bEE11PG1uo+PiZT1x\nlunsaxhcOttHP5ZPl5MGDP2XiZSuPjhtT3f5Tj2JXvUzFMTnKFGbOpqn23hN+pput6QB2Pnm\nrw9g0KR7i7Z/gzENdfmfAjtneCdbRL8R6dnd6R6UZO3wMt+5HMzLMpnKq9801V7vH25z6tat\nm9lwZNoy0/XZPAaJ2pYuJw0WmUrp6oPT/nSX79ST6NU52mhDWxK109kD1u2SPmQoleT2yEnS\nAKxPN9Lrfp10ySWXiP4jIYAAAggggEDqAikfX2m9F6hnSL/yyisyf/58+frrr1NvDUsigAAC\nCCAQEYGUAvC9994rgwYNModj1eniiy+WH/3oR3LhhReau2CtX78+Inx0EwEEEEAAgdQEPAfg\nlStXytVXXy19+/Y1t5t8++235fHHH5cxY8bIokWLzCMINRCTEEAAAQQQQCC+QNLfgNsuqrea\n1CccrV271pwhqPd+1jR79mw57rjjzI/WGoD37Nkj5eXlbRfnPQIIIIAAAggcEPC8B/zRRx+Z\nS5Kc0/NfeOEF6dOnj4waNcqADh8+3JwxunnzZoARQAABBBBAII6A5wDcs2dP2bhxoylu27Zt\n8s4775jrhJ2nHunJWJp0L5mEAAIIIIAAAh0LeA7AEyZMEL3JxmWXXSbnn3++2dudPHmyuTZW\nD0PffvvtcsIJJ0jv3r07rpGpCCCAAAIIIODuaUitnc4++2y5/PLLzR2x9DD0b3/7Wzn11FNN\nAL755pvN2dB6ljQJAQQQQAABBOILeD4JS4PunDlz5A9/+IMp1TnRSu/8oTfs0EcRkhBAAAEE\nEEAgsYDnAOwU5wRe572+Enxba/A3AggggAAC8QVSDsDxi2QOAtkvUFFRkf2dDEkPs2EssqEP\nIVldrGpmSgH42Weflbvvvlu2bNlibsbR0U3Ld+3aZVVHaQwCCCCAAAI2CXgOwKtXrxZ9JrA+\nGeiYY44xd8RyLkGyqWO0BQEEEEAAAZsFPAfgp59+WvTRhHr976GHHmpz32gbAggggAAC1gp4\nvg5Yb76hd70i+Fo7pjQMAQQQQCAEAp4DsAZf3fvdt29fCLpHExFAAAEEELBTwHMAnjp1qgwc\nOFB+//vfS11dnZ29olUIIIAAAghYLuD5N+BXX33VPHxh1qxZct9998lBBx0kpaWl7br53nvv\ntZvGBAQQQAABBBD4r4DnAKyXF9XW1ppHD4KIAAIIIIAAAqkJeA7Av/jFL0T/kRBAAAEEEEAg\ndQHPAbh1VevWrRN9PrDelvKUU04xN+YYPHhw6yz8jQACCCCAAAIdCHg+CUvL2LBhg4wZM8bc\niOPcc8+VyspKU7TemOOWW24xh6g7qItJCCCAAAIIIPA/Ac97wLt375bTTjtN6uvr5eqrrxa9\nM5amxsZG0WcFz5w5Uz7//HN55JFH/lcFLwgggAACCCDQVsDzHvBDDz0k1dXV8sYbb8js2bPN\nWdBaqD6OcMGCBXLVVVfJ448/Lnv37m1bF+8RQAABBBBA4H8CngPwu+++K2PHjpWDDz64Q8Sf\n/vSn0tDQIJs3b+5wPhMRQAABBBBAQMRzAC4pKTG/AcfDc+6Q1atXr3hZmI4AAggggEDkBTwH\n4OOPP96c+bx48eJ2ePr78G233WbulNW/f/9285mAAAIIIIAAAv8V8HwSlj44Wn8Hnjhxopx0\n0kmiQVcfTTh58mTRoFxTUyMLFy7EFwEEEEAAAQQSCHgOwPn5+bJs2TK5/vrr5bHHHpOmpiZT\n/FtvvSUDBgwwwXnSpEkJqmQWAggggAACCHgOwErWp08fc5nR3XffLR9//LHs2LFDDjnkEPOv\noKAAVQQQQAABBBBIIpBSAHbK7N69O/eEdjB4RQABBBBAwINA0gCsN9X4/ve/76HI/2b99NNP\nPS/DAggggAACCERFIGkA1t98hw0bFuPxr3/9y1znq9cC6+0ne/bsKV988YWsXLnS3BHrvPPO\ni8nPGwQQQAABBBCIFUgagPv16ycrVqxoWUqD7wknnCB33XWXuRWl3gHLSRqEzzjjDOnSpYsz\niVcEEEAAAQQQ6EDA83XAeubzYYcdJtdee625/WTrMgcOHCh6YpY+nOGbb75pPYu/EUAAAQQQ\nQKCVgOcArL/t6l5xvNStWzdzGFrPjCYhgAACCCCAQMcCngPwuHHj5JVXXjF3w+qoyFmzZpk9\n5CFDhnQ0m2kIIIAAAgggcEAg6W/AbZXOPPNM88hBvSXltGnTzElYZWVlsnXrVvMUpLVr18rD\nDz/cdjHeI4AAAggggEArAc8BuG/fvqJ3vbrgggvknnvukebm5pbi9ND0kiVLRIM0CQEEEEAA\nAQTiC3gOwFpU79695aWXXjL3gV63bp3s3LlTRowYIYMHD45fE3MQQAABBBBAoEUgpQDsLN21\na9eUbtLhLM8rAggggAACURXwfBJWVKHoNwIIIIAAAn4KdGoP2M+GtC1LT+pavXq1ucvW6NGj\nRU/0SpScO3HpjUE0v16TTEIAAQQQQMBWASv3gOfNmydTpkyRDRs2yKJFi2T69Omya9euuIa/\n+93vZOrUqebSKH1Uoi77xhtvxM3PDAQQQAABBIIWsG4PWPd89U5ac+bMMSd2NTQ0yKWXXioL\nFy40r23BNm7cKK+//ro8/fTTomdoa7rtttvkvvvuk5NOOqltdt4jgAACCCBghYB1AXjNmjXm\n8LGeVa1JHwYxYcIEeeqppzoMwLpnfPHFF7cEX11m5MiR8tprr5lLpHJycnQSKUICFRUVnnur\nX/psS6n0w7Y+0B4EghTw+hnK9HbAugC8bds2GTRoUMyY6e+5emvLpqYmyc2NPWp+4okniv5r\nnV5++WX59re/LW2D76pVq+Tyyy9vnVXuvfdeGTNmTMy0eG+0PL0EK0wp0W1Dg+6HepaUlMRc\nSx5Um5I56RfBZHmCajv1IpBMwIZ119ke29CWeF7aNqed+pS/VFN9fb2rRa0LwNu3bxe9vKl1\nKi8vN8G3urpaevTo0XpWu7/1UPV7770nDz74YLt5+pSmtsG9sLDQ3Lu6XeYOJugJXvoloPXN\nRzrIZtWkxsZGq9rTujE2eSZy0uCrKVGe1v3ibwRsE7Bh3dXApjtQNrQl3vho27SNnd02aZxw\nk6wLwAUFBaK/+7ZOznvdW0qUHn30UZk/f77cfvvtcvjhh7fLOmrUKHn++edjputNRNw+OEK/\nEemXAJtXoJjOHXjjtm9tl8vEe/2iVVtba/5lor5EdSRy6t+/v1knE+VJVDbzEAhawIZ1V7/I\n6s5UohNqbXDS7VJpaalUVVWJ2z3Ztu3WAF5cXNx2crv31gVgPcS7efPmmIbu3r3b7PkWFRXF\nTHfe6LcNfQzi3/72N5k9e7b5DdiZxysCCCCAAAI2CsT+oGpBC4cOHSoffvhhzF7w+vXr2x06\nbt3UmTNnmsuO5s6dS/BtDcPfCCCAAALWClgXgMePH2+w9FCy7tlu2rRJnGt7HUWdp0FZ0wsv\nvGD2fPU64D179pjff/U3YP0XpkPFTt94RQABBBCIhoB1h6D1MLPu0eq1vBpo9Tj6xIkTzd2t\nnCF54IEHzCVJw4cPl2eeecZM1ucQt00vvviiOcu27XTeI4AAAgggELSAdQFYQfQ6Xn2s4Zdf\nfil9+vRpd+nRypUrW9weeeSRlr/5AwEEEEAAgbAIWBmAHTybrxdz2sgrAggggAACqQhY9xtw\nKp1gGQQQQAABBMImQAAO24jRXgQQQACBrBAgAGfFMNIJBBBAAIGwCRCAwzZitBcBBBBAICsE\nCMBZMYx0AgEEEEAgbAIE4LCNGO1FAAEEEMgKAQJwVgwjnUAAAQQQCJsAAThsI0Z7EUAAAQSy\nQoAAnBXDSCcQQAABBMImQAAO24jRXgQQQACBrBAgAGfFMNIJBBBAAIGwCRCAwzZitBcBBBBA\nICsECMBZMYx0AgEEEEAgbAIE4LCNGO1FAAEEEMgKAQJwVgwjnUAAAQQQCJsAAThsI0Z7EUAA\nAQSyQoAAnBXDSCcQQAABBMImQAAO24jRXgQQQACBrBAgAGfFMNIJBBBAAIGwCRCAwzZitBcB\nBBBAICsECMBZMYx0AgEEEEAgbAIE4LCNGO1FAAEEEMgKAQJwVgwjnUAAAQQQCJsAAThsI0Z7\nEUAAAQSyQiA/K3pBJ3wTqKio8K0svwqqrKz0qyjKQQABlwJetwVeP6dey9dme63DZVcDy8Ye\ncGD0VIwAAgggEGUBAnCUR5++I4AAAggEJkAADoyeihFAAAEEoixAAI7y6NN3BBBAAIHABAjA\ngdFTMQIIIIBAlAUIwFEeffqOAAIIIBCYAAE4MHoqRgABBBCIsgABOMqjT98RQAABBAITIAAH\nRk/FCCCAAAJRFiAAR3n06TsCCCCAQGACBODA6KkYAQQQQCDKAgTgKI8+fUcAAQQQCEyAABwY\nPRUjgAACCERZgAAc5dGn7wgggAACgQkQgAOjp2IEEEAAgSgLEICjPPr0HQEEEEAgMIH8wGq2\npOL8/HwpLCx01Zq8vDwpKSmR5uZmV/ltyFRWVmZDMzrVhkz0IVkdubm5kixPpzrJwgikUSAT\n626yOvQzpNvbZPkSMXRm2UTlOvO0/IKCAvO2uLhYioqKnFmeXt3GiMgHYIVqbGx0hat5m5qa\nzD9XC1iQyW3fLGhq3CZkog/J6vCynsTtCDMQCEgg2frtR7OS1aGfoc5+jpLV0dl+aPm6jdek\nr+muL/IBWIHr6upcjZt+I9q/f3/aB8VVY1xmqqmpcZnT3myZ6EOiOrp162Y2HIny2KtHyxAQ\nycS6m6wO52hjsnyJxqszyyYq15mn5Tt7wLW1tVJfX+/M8vSqR0vdJH4DdqNEHgQQQAABBHwW\nIAD7DEpxCCCAAAIIuBEgALtRIg8CCCCAAAI+CxCAfQalOAQQQAABBNwIEIDdKJEHAQQQQAAB\nnwUIwD6DUhwCCCCAAAJuBAjAbpTIgwACCCCAgM8CBGCfQSkOAQQQQAABNwIEYDdK5EEAAQQQ\nQMBnAQKwz6AUhwACCCCAgBsBArAbJfIggAACCCDgswAB2GdQikMAAQQQQMCNAAHYjRJ5EEAA\nAQQQ8FmAAOwzKMUhgAACCCDgRiDnwPMZw/N0eTc98phn586drh9H2LNnT6muro77OMKKigqP\ntZMdAQQQyA6BysrKhB3RxxGWl5fLrl27TD4bt5fah65du0ppaans2LGjU48j7Nu3b0IPncke\ncFIiMiCAAAIIIOC/AAHYf1NKRAABBBBAIKkAATgpERkQQAABBBDwX4AA7L8pJSKAAAIIIJBU\ngACclIgMCCCAAAII+C9AAPbflBIRQAABBBBIKkAATkpEBgQQQAABBPwXIAD7b0qJCCCAAAII\nJBUgACclIgMCCCCAAAL+CxCA/TelRAQQQAABBJIKEICTEpEBAQQQQAAB/wUIwP6bUiICCCCA\nAAJJBQjASYnIgAACCCCAgP8CBGD/TSkRAQQQQACBpAIE4KREZEAAAQQQQMB/AQKw/6aUiAAC\nCCCAQFIBAnBSIjIggAACCCDgvwAB2H9TSkQAAQQQQCCpAAE4KREZEEAAAQQQ8F+AAOy/KSUi\ngAACCCCQVIAAnJSIDAgggAACCPgvQAD235QSEUAAAQQQSCpAAE5KRAYEEEAAAQT8FyAA+29K\niQgggAACCCQVIAAnJSIDAggggAAC/gsQgP03pUQEEEAAAQSSChCAkxKRAQEEEEAAAf8FCMD+\nm1IiAggggAACSQXyk+YIKMPWrVtl9erV0rNnTxk9erSUlZUlbInX/AkLYyYCCCCAAAJpFrBy\nD3jevHkyZcoU2bBhgyxatEimT58uu3btikvhNX/cgpiBAAIIIIBAhgSsC8C6J1tZWSlz5syR\nGTNmyAMPPCBFRUWycOHCDkm85u+wECYigAACCCCQYQHrAvCaNWtk4MCBMmLECEORn58vEyZM\nkBUrVnRI4zV/h4UwEQEEEEAAgQwLWPcb8LZt22TQoEExDBqQd+zYIU1NTZKbG/udwUv+VatW\nyeWXXx5T9r333itjxoyJmRbvTU5OjvTu3TvebKYjgAACkRXo169fwr7r9lNTsnwJC0nzTG2b\n0049/yjVVF9f72pR6wLw9u3bpWvXrjGNLy8vN8G3urpaevToETPPS/4uXbq0C+6FhYXS2NgY\nU2a8N3l5eaYdzc3NHWb561//2uH0TE8sKCgw7XTbr0y3z6kvmaeTL+hXPQqjqaGhIeimJKzf\n+XKqX1RtTmHy1I1xGD5HOvZuN/rpWjeSOamlttPJZ8v2srWHtk3b2Nltk9vPoHUBWINH2w2d\n876kpKS1lfnbS/5Ro0bJ888/H1PGzp07zd51zMQ4b/QbkX4JcFagONkCnawrj36Lq62tlaqq\nqkDbkqxy/aKl7dR/Nqf+/fubdVKPwticSktLRb8c7tu3z+ZmmqNIuoGz3VO/sOsX9N27d1vt\nqTsl2lbdlsXbObChA/rFS3emEp1Qa0M7dbuknyXdfqb6pUbX7+Li4qTdiT2emzR7+jPoId49\ne/bEVKQfAF3J9GSstslr/rbL8x4BBBBAAIEgBKwLwEOHDpUPP/wwZi94/fr17Q4dO1he8zvL\n8YoAAggggECQAtYF4PHjxxuP+fPnm98xN23aJMuWLTPXBTtQOk+DsiY3+Z3leEUAAQQQQMAW\nAesCsB5mnjlzpixevNhcfnTllVfKxIkTzd2wHDS9Nnjt2rXmrZv8znK8IoAAAgggYIuAdSdh\nKczIkSNlyZIl8uWXX0qfPn3aXXq0cuXKGL9k+WMy8wYBBBBAAAELBKwMwI6L1+vFvOZ36uEV\nAQQQQACBTAtYdwg60wDUhwACCCCAQBACBOAg1KkTAQQQQCDyAgTgyK8CACCAAAIIBCFAAA5C\nnToRQAABBCIvQACO/CoAAAIIIIBAEAI5B+4d2vGTBYJoTQB17t27N+auW4maoPf3tPk+0Nr2\nmpoaeemll8ydw/Te1zanzt7wPFN905vG6z1sx44dm6kqU6pH7wOuye2N4FOqxIeFXnnlFdm/\nf7+cdtppPpSWviLaPjwgfTV1ruR//OMfog+l0ce2dnS73s6V7u/SYdiGvv/++/LJJ5+Yp+S1\nffiPWw39LOo2I1mKfABOBhS2+fpB/OEPf2g+jHPmzAlb861s71FHHSXDhg0z16Zb2cCQNUoD\nr66n77zzTshabmdzL730Unn11VfljTfekM48Qs/O3mW+VXfccYf8+c9/lkWLFskxxxyT1gZw\nCDqtvBSOAAIIIIBAxwIE4I5dmIoAAggggEBaBQjAaeWlcAQQQAABBDoW4Dfgjl1CO7WhoUG2\nbNkiZWVlwq05/RlGfSKXPpj9oIMO8qfAiJfy73//25zMOGTIkIhL+NN9vWf+N998I+qpJzmR\nOiewc+dOqaqqMp/3dJ/URgDu3FixNAIIIIAAAikJcAg6JTYWQgABBBBAoHMCBODO+bE0Aggg\ngAACKQlY/TjClHrEQkZg3759snr1avniiy9Er2P97ne/i4wPAq+//rq5wF6fQU3yLrB161az\nXur1qqNHjzbnKngvhSXaCrBethVJ7X2mt5vsAac2TlYvtXz5cjnzzDPl+eeflw8//FCuuuoq\nmT17ttVtDkPj1q5dK7fccots2LAhDM21ro3z5s2TKVOmGD+9ycH06dNl165d1rUzbA1ivfRn\nxILYbrIH7M/YWVOK3oZQ7+Kid8c599xzTbv02/FNN90kZ511lrmjkzWNDUlD9MxyDR76T29P\nSPIuoHu+lZWVondnGzFihLn9q66jCxcuNOuq9xJZgvXSv3UgqO0me8D+jaEVJX399ddy3HHH\nycknn9zSHudwqR6OJnkXWLZsmSxdulT0FnXf+ta3vBfAErJmzRoZOHCgCb7KkZ+fb26XumLF\nCnRSFGC9TBGug8WC2m4SgDsYjDBP6t27tznk3L1795ZuvPzyy+b6wMMPP7xlGn+4F/je974n\nCxYskBNPPNH9QuSMEdi2bZt5QEjriRqQd+zYYf3DI1q32aa/WS/9G42gtpsEYP/G0MqS9Kke\nDz74oEyePJkbc6Q4Qr169TJ7bCkuzmIHBPThC127do2x0KfF6KG/6urqmOm8cSfAeunOKZVc\nmdpu8htwKqNjyTLvvfeefPDBBy2tOfbYY+XQQw9teb9u3Tq5/vrrZdy4cXLxxRe3TOePjgX0\nMJQ+ytFJffv2NXbOe15TFygoKGj32E/9DVNTSUlJ6gWzJAI+C2Ryu0kA9nnwMlmcno2rz6p1\nkj670gnAq1atkltvvVUmTZokv/zlL50svCYQ0D2x5557riXHEUccQQBu0ejcH3qIb/PmzTGF\n7N69W3SdTfft/mIq5Q0CCQQyvd0kACcYDNtnnX/++aL/2iZ9NujMmTPl17/+tfzkJz9pO5v3\ncQSGDh0qTz75ZJy5TO6MgNrqZR6616snYGlav359u9+FO1MHyyLQGYEgtpv8BtyZEbNwWb2R\n+J133iljx441N2fXw9TOPz3ESkIgCIHx48ebaufPn29+99UHXOhZvHpdMAmBoAWC2m6yBxz0\nyPtc/wsvvCB6Nxe9vKPtJR76e/Dpp5/uc40Uh0ByAT3MrEdlbrvtNtEgXFxcLBMnTjR3w0q+\nNDkQSK9AUNtNnoaU3nGldAQQaCOgj8/r06eP5OZyAK4NDW8jJkAAjtiA010EEEAAATsE+Apq\nxzjQCgQQQACBiAkQgCM24HQXAQQQQMAOAQKwHeNAKxBAAAEEIiZAAI7YgNNdBBBAAAE7BAjA\ndowDrUAAAQQQiJgAAThiA053EUAAAQTsECAA2zEOtAIBqwX0WdL6RCMSAgj4J8B1wP5ZUhIC\nWStw9NFHS2lpqbz55ptZ20c6hkCmBdgDzrQ49SGAAAIIIHBAgADMaoAAAggggEAAAjyMIQB0\nqkQgnQL/+c9/5P7775cf/OAH4jyFyKnvo48+Mg9DOOecc+Q73/mOmaxPJVq5cqV8/PHH0r17\ndxk+fLhccsklUlZW5izW7lXLLywsNPlaz3z88cdlx44dctVVV7VM1kcQVlZWypo1a8yDQkaO\nHGmW69atW0se/kAgigL8BhzFUafPWS3Q3Nws+vxdfeDBP//5z5i+XnHFFTJ37lz5/PPPpW/f\nvjJ58mTzDOTDDjtMjjrqKFm9erU52erQQw+V999/3wRZLaDtb8Bt3zuVnHzyyaJBfsuWLWbS\nV199Jaeddpq89dZbonUceeSR8tprr4kGXw38+p6EQFQFOAQd1ZGn31krkJOTI1OnTjVBb+PG\njS391D3RBQsWmICowVcfQP7kk0/KtddeK5rv2WefFT3befr06WZv+MUXX2xZNtU/9BGYGnz/\n8pe/mDoWL15snk9dV1cnl156aarFshwCWSFAAM6KYaQTCMQKXHTRRaKBWJ+966Tly5eL7pFW\nVFSYSbqXrAH4pptucrKYZfQ5vZo0b2dSVVWVOfR80kknydlnn91S1MEHHywXXHCBOey9bt26\nlun8gUDUBPgNOGojTn8jIaDBdezYsSYAz5gxw/R53rx55rD06aefbt4PGTJE9J8epn777bfl\ngw8+MP+cS410L7UzSX9T1sPhu3fvlkmTJsUU9dlnn5n3erja+S06JgNvEIiAAHvAERhkuhhN\ngZ///OeyadMm87tudXW1PPfcc3LhhRdKQUGBAdHAOGbMGDn++OPNSVMaiIcNGybXXHNNymCN\njY0ty+rJWJqKi4slNzc35p/uBZ933nlSXl7ekp8/EIiaAHvAURtx+hsZAT2UfNlll5nffUeM\nGCH79+9vOfysCHroWc9+fvjhh0UPWTuB+ZlnnjFGuvcaL+Xl5Ul9fX272Vu3bm2Zdsghh5i/\n9eSr1ofCdaIGai2DhECUBdgDjvLo0/esFigpKTF7mXri06JFi+TYY481ZzM7ndY9Xs3TOvjq\nvKVLl5osetJWvKSXK+mZzvv27WvJonvbztnPOlEDcP/+/UXr173t1knPvnbKaD2dvxGIkgAB\nOEqjTV8jJ6CHofX3Vj2j2Tn5ykHQvWINoDfccIM561l/+9U95qeeespk0cPW8dIpp5wiO3fu\nlClTpsjLL78sjz32mOi0Hj16tCyie9SzZs2SmpoaOeuss+Tvf/+7+b356quvloULF4peEjV4\n8OCW/PyBQOQEDhxmIiGAQBYLHHHEEc1FRUXNBwJmTC/1/bRp05oPXJKkx5qbDxwSbj7jjDOa\nP/300+YBAwY0jxs3riX/gWuEm0844YSW93v37m0+cKmTWUaXPfBbbvNdd93V/Jvf/Kb5wO+7\nLfn0jwPBtnngwIGmDs2bn5/ffOCLQXNtbW1MPt4gEDUBbsQRua9cdBiBWIGmpiZz8wzdG9UT\nprwk3YPW3331xh3JftPVpynpXrOeea0PdiAhEHUBAnDU1wD6jwACCCAQiAC/AQfCTqUIIIAA\nAlEXIABHfQ2g/wgggAACgQgQgANhp1IEEEAAgagLEICjvgbQfwQQQACBQAQIwIGwUykCCCCA\nQNQFCMBRXwPoPwIIIIBAIAIE4EDYqRQBBBBAIOoC/wfJw+VPPIvO4AAAAABJRU5ErkJggg==", "text/plain": [ "plot without title" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# let's take a closer look at the histogram of Y\n", "p1 <- gather(as_tibble(Y)) %>%\n", " ggplot(., aes(x = value)) +\n", " geom_histogram(aes(y = ..density..))\n", "p1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, so there's a lot going on above. Let's take a moment to dissect it. The `ggplot2` engine requires that we pass in a table of [\"tidy\" data](http://vita.had.co.nz/papers/tidy-data.pdf).\n", "\n", "What's tidy data?\n", "1. each observation in a single row\n", "2. each variable in a single column\n", "3. each value in a cell.\n", "\n", "To tidy up our data, we can use the `gather` function from the [`tidyr` package](http://tidyr.tidyverse.org/).\n", "\n", "Some questions:\n", "* Really, are you kidding? What's a [`tibble`](http://tibble.tidyverse.org/)? It's just a cleaner implementation of a `data.frame`. There's a few re-implementations of this core object because of inefficiencies. Another is the very useful [`data.table`](https://github.com/Rdatatable/data.table/wiki).\n", "* What's this \"`%>%`\" thing all about? It's a \"pipe\" operator, introduced initially by the [`magrittr` package](https://cran.r-project.org/web/packages/magrittr/index.html). We'll come back to this.\n", "\n", "How does `ggplot2` work?\n", "* We're composing a plot in _layers_.\n", "* First, we add a core data layer: this is the `tibble` that is passed in by the pipe. With `aes`, we specify that the x-axis of our plot is going to be the variable `value` (n.b., this is created implicitly by `tidyr::gather`).\n", "* Next, we specify the type of plot -- we're making a histogram, hence the use of the aptly named `geom_histogram`. By specifying `aes` here, we note that the y-axis is going to be a density (we could have also done `aes(y = ..counts..)`.\n", "* We could just stop here, but there's more we can do to improve our plot." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.\n" ] }, { "data": {}, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeAAAAFoCAYAAACPNyggAAAEGWlDQ1BrQ0dDb2xvclNwYWNl\nR2VuZXJpY1JHQgAAOI2NVV1oHFUUPrtzZyMkzlNsNIV0qD8NJQ2TVjShtLp/3d02bpZJNtoi\n6GT27s6Yyc44M7v9oU9FUHwx6psUxL+3gCAo9Q/bPrQvlQol2tQgKD60+INQ6Ium65k7M5lp\nurHeZe58853vnnvuuWfvBei5qliWkRQBFpquLRcy4nOHj4g9K5CEh6AXBqFXUR0rXalMAjZP\nC3e1W99Dwntf2dXd/p+tt0YdFSBxH2Kz5qgLiI8B8KdVy3YBevqRHz/qWh72Yui3MUDEL3q4\n4WPXw3M+fo1pZuQs4tOIBVVTaoiXEI/MxfhGDPsxsNZfoE1q66ro5aJim3XdoLFw72H+n23B\naIXzbcOnz5mfPoTvYVz7KzUl5+FRxEuqkp9G/Ajia219thzg25abkRE/BpDc3pqvphHvRFys\n2weqvp+krbWKIX7nhDbzLOItiM8358pTwdirqpPFnMF2xLc1WvLyOwTAibpbmvHHcvttU57y\n5+XqNZrLe3lE/Pq8eUj2fXKfOe3pfOjzhJYtB/yll5SDFcSDiH+hRkH25+L+sdxKEAMZahrl\nSX8ukqMOWy/jXW2m6M9LDBc31B9LFuv6gVKg/0Szi3KAr1kGq1GMjU/aLbnq6/lRxc4XfJ98\nhTargX++DbMJBSiYMIe9Ck1YAxFkKEAG3xbYaKmDDgYyFK0UGYpfoWYXG+fAPPI6tJnNwb7C\nlP7IyF+D+bjOtCpkhz6CFrIa/I6sFtNl8auFXGMTP34sNwI/JhkgEtmDz14ySfaRcTIBInmK\nPE32kxyyE2Tv+thKbEVePDfW/byMM1Kmm0XdObS7oGD/MypMXFPXrCwOtoYjyyn7BV29/MZf\nsVzpLDdRtuIZnbpXzvlf+ev8MvYr/Gqk4H/kV/G3csdazLuyTMPsbFhzd1UabQbjFvDRmcWJ\nxR3zcfHkVw9GfpbJmeev9F08WW8uDkaslwX6avlWGU6NRKz0g/SHtCy9J30o/ca9zX3Kfc19\nzn3BXQKRO8ud477hLnAfc1/G9mrzGlrfexZ5GLdn6ZZrrEohI2wVHhZywjbhUWEy8icMCGNC\nUdiBlq3r+xafL549HQ5jH+an+1y+LlYBifuxAvRN/lVVVOlwlCkdVm9NOL5BE4wkQ2SMlDZU\n97hX86EilU/lUmkQUztTE6mx1EEPh7OmdqBtAvv8HdWpbrJS6tJj3n0CWdM6busNzRV3S9KT\nYhqvNiqWmuroiKgYhshMjmhTh9ptWhsF7970j/SbMrsPE1suR5z7DMC+P/Hs+y7ijrQAlhyA\ngccjbhjPygfeBTjzhNqy28EdkUh8C+DU9+z2v/oyeH791OncxHOs5y2AtTc7nb/f73TWPkD/\nqwBnjX8BoJ98VQNcC+8AAEAASURBVHgB7Z0J/BXT///f7fuqjdJCKJVKWoSQpVD2rEWKLBFR\n34iEFNJClspSvpJUyC5bKEK0ahfSSntp3+Z/Xuf/Pfc3937uMnefO/M6j8fnc+/MnO39PHPn\nPeec93mffJYKwkACJEACJEACJJBRAvkzWhoLIwESIAESIAES0ASogHkjkAAJkAAJkEAWCFAB\nZwE6iyQBEiABEiABKmDeAyRAAiRAAiSQBQJUwFmAziJJgARIgARIgAqY9wAJkAAJkAAJZIEA\nFXAWoLNIEiABEiABEqAC5j1AAiRAAiRAAlkgUDALZbquyB9++EH69u2bp14FChSQ4sWLy9FH\nHy3XXHONnHHGGXniuO3Evn37pG3btlKhQgWZPHlyUPX+/fdfOXDggJQvXz7ofLIH9913n8yZ\nM0f69esnrVu3jpjdhAkT5KWXXpILL7xQevfureNFq2/EjMJcSJdsYYry1aktW7bI3XffLdOn\nT5cNGzbIueeeKx9++GFEBnfddZcsXLgw7PVChQpJ6dKl5dhjj5XLL79cmjdvHjae/eTcuXPl\n3nvvlaJFi8oHH3wgyCNawH2I+7FatWoybty4aFHzXMP9+91330mNGjVk7Nixki9fvjxxzIk/\n/vhDbr75ZoEfo2effVZOOukkc4mfJOCcADxh+T2oHza8gcX8O/PMM60lS5a4Gtfu3bu1HFWr\nVg2q5/vvv28deeSR1vfffx90PhUH55xzji5TKdio2T355JM6XpcuXQLxItU3EMHBl3TK5qB4\nT0dRL0u6zZTis5o0aWIpZRhV3pYtW+r4SnlZ+fPnD/qz/8YKFixovfzyy1HzwkX1wqjvW6RV\nij9m/Ntvv12X/9BDD8WMGxrht99+s9QLt07/wgsvhF4OHB8+fNg6++yzdbyOHTsGzvMLCcRL\ngEPQtneVxo0by+rVqwN/eMvFGzjeho8//nj59ttvpUOHDoJem1sDegiDBg2S+++/P6iKr776\nqqxfvz7onBsOItU3nrq5VbZ4ZHBjXPUwkS+//FJXbdmyZfLLL7/I0KFDHVX13XfflUOHDgX9\n4XejlJz06dNHDh48KLfccovMmDEjan5KUcuNN96o47zxxhtR4yL/t956S/dcu3btGjVuuIu1\na9eWJ554Ql9CHVetWhUumowePVq+/vprPTL2/PPPh43DkyTghAAVsI1SkSJF9NAVhq/wV6tW\nLWnUqJF07txZ5s+fL82aNdPDaw8++KAtlbu+4oH1wAMPyJ133umuikWoTa7VN4IYnjy9efNm\n2b9/v9SsWVP/FpIVsnDhwgIlp0ZCRI2C6Ow++eSTmNnedNNNOg6GoDHVECng+tatW/UwOeqc\nSMAQeqtWrWTnzp3SrVu3PFlAKf/nP//RSv61116TMmXK5InDEyTglADngB2SwhwU3sDREx4+\nfLj06NFDqlevHpQaDwc1HCrz5s2TXbt2ScOGDfV8Z2g8zDlDoaM3rYaz5NNPP5Uff/xRz4+1\naNFCLrnkkrDzT+iNI/+//vpLKleuLCeeeKK0a9cuaE4XPYtXXnlFSpQoIZ06ddL1wFzYypUr\ndV3fe+89WbBggVx66aWC75jnQk9EDRcGyYID1AtlIW6VKlXyXE/FidD62vOMJS8Yh5Pt6quv\nlnLlygWywsgF5jAxooH5vVNOOUUuuuiisIzR60P7Qfa1a9cK2uPKK6+UX3/9Vc9zt2/fXtTw\nvs5bDefr8ygPc49TpkzRbY72q1Spko6zYsUKPXKydOlS2bRpkxxzzDHSoEGDPG2MuXn05PHS\n16ZNG/nmm290L2vbtm2ipj70ObQpenmoG66jrriG+dR4AvJET3Hx4sW6J4r6YF4ebEx45513\nNC8co8xRo0bpS1CcUKTJBjVULWPGjBEwjBXwm4NSRBuiZ216xKHp/vvf/+pTmJtNNOD3gHph\nTvezzz4TKFm8gJsApYzf+T333BPV3sHE5ycJRCUQ75i1F+ObOWD1sI0pnjLE0nM/ofNRynjD\nUg9PfQ3zX+ohpb8roxPr9ddfD8pXGYnoa+ohaCnlrL+rRgp8KqVqqYdeUBrV6w5cx3yciX/U\nUUdZ6mEciBs6p6qG1ANxTRp8KkMZ6+STT9bX1DBjIL35snfvXkspMats2bLWnj17zOmwn+mY\nA3YibzTZUNHt27dbF198cUB+OzfM4amXiyB5IDPuAcMJ85T4Dk433HBDHlaqt6TPYW4bbW7S\nPffccxbmCXFejaro85gPtcc57bTTLNXLCpSvHuo6nnrZsbp37x7Iy+SJuVjEUUZQea5h3tNp\nUIZ5ljLQ03mgPsrQUH9Xyt0aOXJkIBv1kpKnHNQFTKMFMwesXkaiRbPUELHO/9prr40az1xU\nylXHP++888ypoM+///7bQntBttDfTlBEhwcjRozQ5SmDRUu9OOlU6gVcn1MvvjF/Ew6LYTSf\nE8BbtO9DPArYPHSfeuqpALd169ZZpUqV0g9bZRGpH5R4CEydOjVgQGI3fjIKGEoaigDGS8rC\n1FJv9xYUNh50+LGbMGvWLH1O9QQsKHooCigfZbmtz9erV89EtUIVsJqHs9Tcr3X++efruJAV\nxzBugaJAWeoNP5DefHn77bf1tdtuu82civhpFLCad7ZU7yzinxpK1HnGMsJyKm802VBZUx5e\nmlQP1kJ81Qu21MiDrgeUDM6ZAA7gcdVVV2m+YPTFF19YajpCn8c1+8uKuRfU6IilrL+tF198\nUSsWPLA//vhjnQZp8V31Oi0oCdW7spRVvb4G/iYYBYyXhGLFilmq52Xhvpo4cWIgPozo1DSI\n9dNPP1nKOtlSvX+tdKBIYUAUK8CAEHmjvqgr6oT7BUoZigbywaANATKAGc7hvsM9gz+8WEQL\nsRQw7t3BgwcHFL8ago6WXeCaGu3Qvw28MIBLaFBz07quPXv2DL2U0DHkVKMLOk81QmSpoW1L\njWpYaJ/Zs2cnlCcTkUAoASpgRSQeBTxw4ED9o1TDYAGW5sH99NNPB86ZLx999JGOj4e9CUYB\n16lTx1JzbOa0/hw2bJiOr5Y9Bc5DqeNBqIyrAufwBcoDykINNesHMs6FKmCcQzA9QfuLgJrj\n0y8NeHlAOnsw8dXQuP102O9GAaOOTv5iKeB45EWFTF3tsmFUAHWBAgzXgz/11FP1dbUsSsuk\nhjf1sRqODVLKuGheCJBfOAUMBRVaBtoEihFKMjQ8/PDDuiz7PWQUMMqAkrYHNeeo48NCd82a\nNfZLlhoe19fw8hYrnHXWWTquMhzKE1UNt+prNWrUCNyT//zzjz6HHp/TYBSwWmqkraZhOW3+\nYJlvRgEwItC/f3+n2ep4t956q64PfiOhAe0GdosWLQq9lPDx77//bmFkAHXGCxbyHzBgQML5\nMSEJhBLgHLD6VcUTMFeHgHkxE5SS1fOJxrDEnMen6nnq+UhYkGLuTQ3pBi4rxZFnXWPdunX1\n9R07dgTiYY4OAXPPmFvEHCPW+WLeVvWQAvHi/YL1wKgD1gurlxDBXCYC5ioxz6heEByt1TTl\nQlbM10UKqucgmP+OFVIhL9ZzIqjhXL2GNLRMrEPG3Knq4eo5cKVkdRQ1nJtnPrxp06aCOcuZ\nM2eGZqOP1bBonjLUtIO2loXtgD2oH6AYAyGldO2XAt8xH2sP6mVBH8KmwMw/m+vGvgBrdKMF\n9bKm2auXLS1vaFy0HeY9YR8AS2XYFyQTlPIKmxxz3JhTx72mFHPYOJFOwrIZFsiwxVA93UA0\n2Apgjh5tlGy9A5mqL5ivVyNd2qBx2rRp2h4ABo4MJJAqAlTAcZI0DxY8SBCgVKGw4LQDD7Fw\nwShtPNjwMDfBPDzNMT6Nkww8ME2AAxA1VyZwZAEDExiC4OGFBzUMudQQtIka9ycsTKGA8VAz\nChjloM6RjF0iFYK84LAkUsDDzIkCToW8arhVV+OEE04IWx28XCDAOAoBRlQIsNINF/BiEUkB\nH3fcceGSiBru1WnUkLE2eFq+fLmoHprAuhjB3sYmAxhawcDOHnAOIVT54hwcxSBAsUcLf/75\np35pBI9IRlRgAgUMJskqMhhLwXgP9cKLBl5S77jjDlG9ar3CIF7lC9nw28HLGdoK7WteVlNh\nfBWJHeqMZX1q2Ft/4nfOQAKpIpDX9DVVOXs0H6OA8XaMgAcKAh5q6JGG+8ODAg8PNcyr45p/\neEA7CViqM378eP13wQUXCJZL/fzzz/Loo49K/fr1taegcA9zJ3njpUEZcmmLT7xIIKD3Bjlg\nRZ2NkAp5TY/QPuJglwUemRDwAoWg5jf1J9iGC0bRhbt2xBFH5DmN+wKenpSxlW4frCXfuHGj\nfmmCp6ZIwbyAhbuONkk0xOKBfEOZJFoW0qlhW/2HOmOpzvXXXy+ff/65fumAZy3cY4kEY+GM\n3wMCXhTffPNNXXc1HZNIllHTQA4zihHtHoiaCS+SQAQCif+iI2To5dNQvhhKxo8SD1YE9GJx\njAc3hjGj/SmjjoTxoIzrrrtOsG4S7gGxRALDqygXQ9NYApNIwBu9svDVDzIsPcGyGciIYdVw\nPa5EykgkTbLymhckOFYJF4yThYoVK+rLcI+IECu+juTgH4ZZcS9gudPXymkDphSw7AeKx/T+\nYvVaHRTjOEosHsgolInjzB1GxAuJcXSBURwz7O8wuY4GRY573ihgTJXgxQYjL2akIJ78GJcE\nskmACjgO+ui5wDEB1t6a4S/0YqGE0ZMyw5j2LBEfw1hw3mGGHu3XnXzHwwbzX1ifioA3cvRc\n4YUH65ERsEYy0aCMyHRSKHGsM0aId/hZJ0rRv1TIa4aesQY4XFBGWvo05lURzNQAhkpDg1p6\no30Eh56PdAyFgDlo9JjwUnPWWWcFhoqRBooYIdFRC504zn9Yx40eLpQshqNDAxxPYI4eIZ1+\njbF+FjxgQ4F7zG5LEVqncMcYbcDQ9sqVK/WLorGBMD3jcGl4jgTcSoAKOEbLQIHCwAM9Gign\nKD9lvRmUCt5zEKAMQ4eZH3nkEVHrK0Ut6wnM7wYldnCABzacA8BZfGiAowwEo0hCr5tjM4wG\n5RAaoKxgwIKeGpQfhgwvu+yy0GgZO45X3nCyYfgczjjQ44STE3uAAhoyZIgeZjcvGviE97NJ\nkyZpZw8mPtofXsXMULU5H+0TaRCgXMzQtomPlzS1/Chw3ZzPxCfuTyh9vEgauwRTrrLM1i+I\n2ExDWUKb0yn/xMgGHMXgxRVzzWpVQVAZeAnAX+jvyB7JuJmErQJemHDvmxcoezzz3UmeJi4/\nSSCjBNQwmO+DWYakhrYsZYQT+MO6S/XAsFSD6D/10NDrQkOBqQduwKkFlqRguRDWeKpdifR6\nR6z3tS9fMcuQlOej0Kz0Gk+Up7whBa5hzS/WIOK8etBYau5Xr6WEUwbUDw5AlGLV8bGcCPGw\n5MMeTJmQCc4eQpdrwDG+kVMND9qTxvxuliFhPXO04HQzhnjkRXmRZINM4AP+WLML3sqKVTsX\nwfnQ5T5Yt12yZEnNAcuUlFGahWU5qicbWAusrGEDIpp1wOGWGhmHHko5WFiepuYpLTjMwHpb\n9cKjy8DSGRPMMiSsEQ4Nau5Ux8eSs9CATQfQbnYnGqFxzDHW0uL+RHwsi1MvIRaWJKnRFH1O\n+UIPLGdDmmSWIcVyxGHuBayrxXpjE8w9qKZBzKk8n+olQrcL2hXx7eup80RWJ5zkGS6dOaeG\n73UeTpbkmTT8JAEnBLgOWFEyCtj8UM0nHg5qeNnCw/jxxx/XD6RIUOFgQC1tCTzATR5QmKpn\nGZTMKAynChiJ1fBzYC2iyRsvBFgDC6cJJkRSwHiYoi7mhQKehewBHo6gaJC3sva1X4r5PdUK\nGAU6lRdxo8n21VdfBRQeZEObYq2q3dEJ8jABygCOOvBSo+aHLXglU5bB+oUI6e2KIZoChpct\neG0ybYVPeBaDAxfV+9QveWgL1RvXRWdCAaMglAPnKmrUIFA3eFPDumX7fYS46VTAYICXE3DB\nywqUKoLhZeesL4T8UyNLOi7kgFOSaMFpnpHyoAKORIbnkyWQDxmoG5QhRQSAE/NTGOqtWbNm\nwCdwirLXBlgYdobVNZaNxLssAsZAymmErpdSAIFqYcgPy19geGWW5gQuZvELDM6cyhtJNlQf\nTvphRIe5+3DGOvDJHc3KWCkJwXIi5GEMmpxgUUpM3w+wNMe+0m4JGIqGwR3ugWhrt91SX9aD\nBLxIgArYi62agEyYY8bcmhoulV69eiWQQ24ngSMSyI+1zMpVYpAwmLeFBS+WCCl3klppBUXg\nAQmQAAkkQICOOBKA5pUk6KnDwAw9TFhpw/jKr9akak5Ub0Gn3BxqIySst4bVsJo+EBjSYdcm\neEGyjxp45T6gHCRAAtkhwB5wdri7olRsho5t3hAwlI1eYKgbRFdUNEOVMJ7GwlngwlpXbX6R\noZqwGBIgAT8QoAL2QytHkBFLa9QuMtpdIHxCq512IsT0z2l4A8NSJMz14qUEfpgx/6usx/0D\ngZKSAAlkhAAVcEYwsxASIAESIAESCCZARxzBPHhEAiRAAiRAAhkhQAWcEcwshARIgARIgASC\nCVABB/PgEQmQAAmQAAlkhAAVcEYwsxASIAESIAESCCZABRzMg0ckQAIkQAIkkBECVMAZwcxC\nSIAESIAESCCYABVwMA8ekQAJkAAJkEBGCFABZwQzCyEBEiABEiCBYAJUwME8eEQCJEACJEAC\nGSHg+80Y1P6sMn/+/KRhx9rOLukCXJgBZMb2i9jGz0+bFJgdPP0ms7nH/Sa3ucdd+BNMW5Ww\nXSXaOdoWnWkrPIsZm3s82Spga9fhw4fHzMb3Cnj58uXSvXt3vQNOTFpRIqgN7fU+swUL+gfp\nzp07BRsXlC1bVu9PHAWPpy5BZjycihUr5im5ogmzb98+wT1esmRJKV68eLSonrq2f/9+2bt3\nr94Zy1OCRREGSgg+0YsUKaJ3SIsS1VOX8KKFfcOx7WgyATunYWtTJ8E/2iIKDbytVKtWLUqM\n2JfwUCpVqpQUKlQodmSPxNixY4fs2rVL37D4sfol4MUDCrhEiRJ+EVkrIXOPQwn7JeDFY8+e\nPfol0y8yo/eL51jRokWlXLlyfhFb8OIBmStWrJiUzAcOHHCcnnPAjlExIgmQAAmQAAmkjgAV\ncOpYMicSIAESIAEScEyACtgxKkYkARIgARIggdQRoAJOHUvmRAIkQAIkQAKOCVABO0bFiCRA\nAiRAAiSQOgJUwKljyZxIgARIgARIwDEBKmDHqBiRBEiABEiABFJHgAo4dSyZEwmQAAmQAAk4\nJkAF7BgVI5IACZAACZBA6gjQE1bqWDInEvAEgbZt28Ytx9SpU+NOwwQk4HcC7AH7/Q6g/CRA\nAiRAAlkhQAWcFewslARIgARIwO8EqID9fgdQfhIgARIggawQoALOCnYWSgIkQAIk4HcCVMB+\nvwMoPwmQAAmQQFYIUAFnBTsLJQESIAES8DsBKmC/3wGUnwRIgARIICsEqICzgp2FkgAJkAAJ\n+J0AFbDf7wDKTwIkQAIkkBUCVMBZwc5CSYAESIAE/E6ACtjvdwDlJwESIAESyAoBKuCsYGeh\nJEACJEACfidABez3O4DykwAJkAAJZIWA73dDOnTokGzdulU2bdqUVAMcPHhQtm3bJvny5Usq\nn1xKDHYI27dvl/z5/fMuZ+Tes2dPLjVXWuua7O8nrZVLIvPDhw8L/rwqXzg0lmXp0/v27fOd\n3HiOJ9vWyAP3jJPgewUMxVG6dGkpV66cE14R40D5lihRQgoVKhQxjtcu7Ny5U3bv3i2lSpWS\nwoULe028iPLs2rVLv2gVL148Yhy/XUj29+NWXvv375e9e/fqZ4Rb65jqeuEFc/Pmzfo3XaZM\nmVRn79r8oDTRGUv2Xj5w4IDjDonvFTB6rAUKFNB/ydwZqconmTpkOq3p7eMlBgz9EiCvaW+/\nyBxLTq+2v5/b2m/3OORNhcxOe7/4Tfln3DDWE4TXSYAESIAESCCDBKiAMwibRZEACZAACZCA\nIUAFbEjwkwRIgARIgAQySIAKOIOwWRQJkAAJkAAJGAJUwIYEP0mABEiABEgggwR8bwWdQdYs\nigSSJtC2bdu48pg6dWpc8d0cOV7ZIYuX5Hdz27BuiRFgDzgxbkxFAiRAAiRAAkkRoAJOCh8T\nkwAJkAAJkEBiBKiAE+PGVCRAAiRAAiSQFAEq4KTwMTEJkAAJkAAJJEaACjgxbkxFAiRAAiRA\nAkkRoAJOCh8TkwAJkAAJkEBiBKiAE+PGVCRAAiRAAiSQFAEq4KTwMTEJkAAJkAAJJEaACjgx\nbkxFAiRAAiRAAkkRoAJOCh8TkwAJkAAJkEBiBKiAE+PGVCRAAiRAAiSQFAEq4KTwMTEJkAAJ\nkAAJJEaACjgxbkxFAiRAAiRAAkkRoAJOCh8TkwAJkAAJkEBiBKiAE+PGVCRAAiRAAiSQFAEq\n4KTwMTEJkAAJkAAJJEaACjgxbkxFAiRAAiRAAkkRoAJOCh8TkwAJkAAJkEBiBKiAE+PGVCRA\nAiRAAiSQFAEq4KTwMTEJkAAJkAAJJEaACjgxbkxFAiRAAiRAAkkRoAJOCh8TkwAJkAAJkEBi\nBKiAE+PGVCRAAiRAAiSQFAEq4KTwMTEJkAAJkAAJJEaACjgxbkxFAiRAAiRAAkkRoAJOCh8T\nkwAJkAAJkEBiBKiAE+PGVCRAAiRAAiSQFAEq4KTwMTEJkAAJkAAJJEaACjgxbkxFAiRAAiRA\nAkkRoAJOCh8TkwAJkAAJkEBiBKiAE+PGVCRAAiRAAiSQFAEq4KTwMTEJkAAJkAAJJEagYGLJ\n0p9q1apVMnPmTClfvry0bNlSSpYs6ajQX375RbZt2ybnnnuuo/iMRAIkQAIkQALZIODKHvC4\nceOkU6dOsnjxYpk0aZLcfvvtsnXr1ph8/vnnH3nooYfkiy++iBmXEUiABEiABEggmwRcp4DR\n8x07dqw8++yz8thjj8moUaOkSJEiMnHixKicDh8+LAMGDJB8+fJFjceLJEACJEACJOAGAq5T\nwLNmzZKjjjpKGjVqpPkULFhQ2rZtG7NXO2HCBK18W7du7QaurAMJkAAJkAAJRCXgujng9evX\nS9WqVYMqDYW8adMmQS83f/687wzLli0TKOBXXnlF3njjjaC09oOff/5ZHn74YfspKVasmGze\nvFlKly4ddD7eA9TNyTB5vPm6OT5kRsCcu59GHizL0nLv2rVLf7r534YNG1JaPfxWChUqJIUL\nF9Z/5veY6nJSVelk64W2xl+y+aRKnkzkY+7vvXv3+kpusD106FDSMh88eFDrKidt5ToF/Pff\nf+dRhqVKldICbd++XcqVKxck1759+/TQc/fu3aVKlSpB10IP9u/fLxs3bgw6Xa1aNf0DM8ok\n6GIcB7hpk80jjuJcEdX8UCG3HxVwLsgc7z25Zs2aiPcW8oKRowmQH7/HI444Qp/HqJVRyCZO\ntj/jlT9cff322za/a7BIBb9wTN18LlmZ40nvOgWMt2u8QdiDOS5evLj9tP7+wgsvSI0aNeSC\nCy7Icy30xGmnnRb0AMH1rl27SoUKFWIq79C8Qo+3bNkieFFA/f0SduzYIegFwlId8/R+CTt3\n7tQvHCVKlHC9yLFeSiEA2vD111+XESNGCEagmjdvHvaFCg/mY489VvDSi5dZpMN9j7+LL75Y\nfyY7kpRqoE7kj1YmZN2zZ4+ULVs2WjRPXTO9wKJFi+bp8HhK0BBhoDgxwlOxYsWQK/EdHjhw\nwPGLqOsUMJThypUrgyTGgx5v2qEPeVg9T5kyRRo0aCB9+vTRaX7//Xf9cMDxAw884KsfThA0\nHpBADAKY1hk8eLCeusH0Cewt2rdvr6cUMMQcGgoUKCC1a9cOOg0FhYdWhw4d8oxcBUXkAQmQ\nQB4CrlPAtWrVkqlTp+peMB4ICIsWLcozL4zzmL+9+eab8TUQ8DaON/MTTzzRV73RAAB+IQEH\nBMaMGSO9e/fWvVYMId9///2CaRxMycDo0WnASzFsNPr16+c0CeORAAn8j4DrFDAcaIwcOVLG\njx+v1wKjN/zJJ59I3759A42Ga5hvqlevntx4442B8/iCOV78hZ4PisQDEvA5gb/++ksPJT/5\n5JPSo0cP/TKbLiRvv/22dqQTj2JPV12YLwm4iUBek+Is1w5v1FjPi6Fl/GB79uwpl19+ufaG\nZaqGtcHz5s0zh/wkARKIkwCmZ5YuXaqnbjCSlK4AS1r0rGGjceedd+rpoXSVxXxJINcIuK4H\nDICNGzeW9957TzDHiwnxUMvKGTNmROTcq1eviNd4gQTSRSCR3h2mWrIVYGCD4eZ0B5SDEayO\nHTsKDCaxFHDy5MlSvXr1dBfN/EnA9QRc1wO2E6tcuXIe5Wu/zu8kQALuJ9CkSROteK+55hqB\nox28YGfz5cP9xFhDvxBwtQL2SyNQThJIBwFYOWOo2Q0Bm6nAWc5zzz0nWMbVrl27mN7t3FBv\n1oEE0kmACjiddJk3CWSJANbzzpkzR4YNGyZLlizJUi3yFot54GnTpul1w61atcobgWdIwEcE\nXDkH7CP+FJUEUk4AG5pA6WLd7scffyx169ZNeRnJZAiHOPhjIAG/E6AC9vsdQPk9ReDPP/+U\n5cuX6zXwmHvlvtieal4K4zECVMAea1CK418Cq1ev1soXXqyaNWsmueAq07+tRclJQIRzwLwL\nSMAjBLDmFr7ImzZtmrPKF3PXDCTgFwJUwH5pacrpeQLHHXecnluFxXEuBqzvP+GEE/SmELlY\nf9aZBOIlQAUcLzHGJwEXEwjdsMTFVc1TtUqVKmmXmPfcc49espQnAk+QgMcIUAF7rEEpDgnk\nKgH0fuGgA3PXXbp0kZ9++ilXRWG9ScARASpgR5gYiQRIIBME4CULm61gv+FLL71UYFjGQAJe\nJUAF7NWWpVyeJWBZlsDLlVfDxRdfLNil6e+//9YOO7C9KAMJeJEAFbAXW5UyeZoA1vnOnj1b\n1qxZ41k5sVdx586dBXPau3fv9qycFMzfBLgO2N/tT+lzjAB6hdgju3jx4gKjJS+H0aNHy+HD\nhwU7KjGQgBcJUAF7sVUpkycJrFixQhYuXKh3CGvYsKHA4YaXg9fl83LbUTZnBDgE7YwTY5FA\nVgnAyUaHDh3k0KFD2rdz6dKls1ofFk4CJJA8ASrg5BkyBxJIO4EePXrIvHnz5KijjpJq1aql\nvTwWQAIkkH4CHIJOP2OWQAJJEYDVM7xb1a9fX6pUqZJUXl5IjFEA7PTEQAK5ToA94FxvQdbf\n8wTy5cun9/WFYwq/K54NGzbIt99+K1u3bvV8u1NA7xOgAvZ+G1NCjxCA5bPfA15ADhw4IL/+\n+qscPHjQ7zgof44ToALO8QZk9UnATwSOOOIIqVGjhuzZs0eWLl3qJ9EpqwcJcA7Yg41KkUgg\nFwi0bds2oWoef/zxsnnzZlm7dq1UrFhRKleunFA+TEQC2SbAHnC2W4DlkwAJxEUgf/78ctJJ\nJwnmxhctWiT79u2LKz0jk4BbCFABu6UlWA8S+B+BXr16yX333aeHWQklPIFSpUoJesKYD/7j\njz/CR+JZEnA5ASpglzcQq+cvAj/88IMMHz5cpkyZot0w+kv6+KTFXHDdunUF2xgykEAuEuAc\ncC62GuvsSQLwdoV9cLHu95VXXtH74npS0BQJhSHo6tWrpyg3ZkMCmScQdw948ODBepeSr7/+\nWj8oMl9llkgC3iTwyCOPaMvebt26SevWrb0pJKUiARIIEIhbAcMN3nvvvacfEMccc4z079+f\nczABnPxCAokR+OWXX2TIkCFy9NFHy9NPP51YJkxFAiSQUwTiVsDXXXed3ij7rbfekhNPPFEG\nDRoktWvXllatWsmYMWPk33//zSkArCwJZJsAHErcdNNNeqOFl19+WWBgxEACJOB9AnErYCDB\n/pxXX321fPzxx3pT8KFDh2prxJtvvln7qr3hhhuEQ9Tev3koYWoIFCxYUB5++GHBJvRt2rRJ\nTaY+zmXXrl0+lp6i5xKBhBSwXUAsgu/Zs6e8+uqrcuedd+o1eePGjdND1HXq1NHWnPb4/E4C\nJJCXALYahH0FQ3IEsCTpu+++ky1btiSXEVOTQAYIJKWAV61aJU8++aTepaVevXoyevRoueyy\ny3TPeOrUqVKzZk254oor5LXXXsuAKCyCBEjA7wTKlSunEcBBx+HDh/2Og/K7nEDcy5C2b98u\nkydPljfeeEOmT5+uLaEbN24sI0aMEMwPw1erCeedd56gF4y54c6dO5vT/CQBEiCBtBCAAoah\n6Jo1a2gcmhbCzDSVBOJWwMOGDZPHHntMKlSoINgkHMYjDRs2DFsnuIw78sgj6as1LB2eJAES\nSAcBeMjCtoUYjl62bBkddaQDMvNMCYG4FXCTJk3knXfekXbt2knhwoVjVuKbb77RPltjRmQE\nEiABEkgBgUKFCumRtwULFsjtt98u06ZNS0GuzIIEUk8g7jngbdu2yY8//hhR+WKNsNkuDNWF\ntxoGEiCB/yOA39CsWbNk4cKF/3eS31JKACNvmA7DagxMlTGQgBsJOOoBb9y4Ufbv36/rP3fu\nXP3wwFZgoQFxPvnkE4FxFtzqFStWLDQKj0nA1wTgZnLJkiWyY8cOvZ6+fv36vuaRTuHhJ7pf\nv37aR0E6y2HeJJAoAUcKeOzYsdKnT5+gMmDoECk0atRIjDVipDhuOo+HIv6SDanKJ9l6ZDq9\n3+SGvBjZSeSegXEQlC+W751zzjkJ5RFP+yZSx3jyN3EzVY4pz8lniRIlBIagydTNpDWfTsrN\n9ThGVnya77kuk5P6G1nNp5M04eLEk96RAsY6X3jrwdZfGNL566+/wlo1w6EAFC/WNOZKOHTo\nkF4ziF5+MgH5gI+fhtzNMg9YxvtR7ngdPmCE6LfffpMCBQroOcpk7zkn92smykA9MlWOE5nt\ncZKtFx6m+DMjgPa8vfrdKBDss5wsv1xiBLnxTEtWZuhK82yMJb8jBQyjhr59++q8sKxo8eLF\n2gd0rMxz4ToehpgrqlSpUlLVxcJ/uBAEK78E9OSghMqWLStFihTxi9iyc+dO/cKBHlY8Yfny\n5folDVa68CaX7D3npOxMlIF6ZKocJzLb4yRbLyihPXv26Hvcnq+Xv6MzASty3KO5NJKZbJtA\naW7evFkqVqyYVFboiGEFkJPgSAHbM4ILSgYSIIH4CMDwCnYTUNowUmQgARIggZgKeN26dXL+\n+edLy5Yt5aWXXpIXXnhBRo4cGZMcLTxjImIEHxHAyAhGWmrVquX47dhHeDImKnp3GCL004hN\nxuCyoLgJxFTA6EqXLFlSD0cgd6z9xTEDCZCAcwLo+Z5yyinOEzBmyglg/h32KRdeeKHexS3l\nBTBDEoiTQEwFXKVKFb3u1+R7yy23CP4YSIAE3E+gbdu27q9khmqItcGY24Q3P+zchv3MGUgg\nmwSczRSHqSGGckzAkA68zYwfP567kBgo/CQBEnAVAYzcYfMYGFbde++9rqobK+NPAgkp4OHD\nh0vVqlW1sw1g69q1q17T2LFjR21ggp1IGEiABEjAbQQ6deokLVq0kPfff18+//xzt1WP9fEZ\ngbgV8IwZM+S+++7Tyw5gnj979mx5/fXXtbeZSZMm6S0IoYgZSIAESMBtBLBeHTu34fOee+7R\nBlluqyPr4x8CcStguJrEXMq8efP0GjH4fkYYMmSINnB44IEHZP78+fLvv//6hyIlJYEQAlhP\niGVHxqlByGUeZpFA06ZNtSMhuAR98cUXs1gTFu13AjGNsEIBwZkAliSZhcaffvqpXrhsLDzr\n1aunHzorV66UBg0ahCbnMQl4ngAW9OPhDiclpUuX1g5aPC90jgk4aNAg7a3o0ksvzbGas7pe\nIhB3D7h8+fJ6j01AWL9+vcyZM0evEzauCM3WX+glM5CAHwmsXr1aK1/YScA7GoP7CGB1x2uv\nvSbVq1d3X+VYI98QiFsBY1kDnGx0795drr32Wt3bvf766wVW0RiGHjhwoDRv3lwqVKjgG4gU\nlAQMAfgMXrFihfb3fNxxx5nT/CQBEiCBPATiHoK+7LLL5K677tIesTAM3bt3b7ngggu0An7o\noYe0NTSspBlIwI8Efv/9d23YA+VLb0t+vAMoMwk4JxC3AobSffbZZ+Xxxx/XpZghNmxq8OOP\nPwq2ImQgAT8SwCYNGH7GPtg1a9b0IwLKTAIkEAeBuBWwydsoXnOMTypfOw1+9xsBbMuIgN2O\njJGi3xhQXhIgAecEElLA77zzjgwdOlTvC4y1wOGWWmzdutV5LRiTBDxAAEZX2HAB27gx5BYB\nbCGHjWbKlCkjN910U25VnrXNWQJxK+CZM2cKtiTEMFvDhg21Qw5jAZ2zFFhxEkgRASrfFIHM\ncDabNm0S2LCg/WDngj2uGUgg3QTiVsCTJ0/WNymWH9HKM93Nw/xJgAQyQQDLJvv06SMPP/yw\nDBgwQI/wZaJcluFvAnEvQ8LaXzjdoPL1941D6UnAawR69eol1apVk+eff14vJfOafJTHfQTi\nVsBQvuj97t69233SsEYkQAIkkCABTKs98cQTgrXc//nPfxLMhclIwDmBuBVw586d5aijjpJH\nHnlE36jOi2JMEvAWgb///lu2bdvmLaF8Lg2cCsFX9JQpU+Tbb7/1OQ2Kn24CcSvgr7/+Wvt+\nfvrpp7Wf29q1a2tjLBhk2f/SXXHmTwLZJtCjRw/B/b9s2bJsV4Xlp4gADEqHDRumn22rVq1K\nUa7MhgTCE4jbCAvLi7ChNd4SGUjArwTgdAYGiY0bN6Y9hMdugtNPP107VMFGGnjWMZBAugjE\nrYC7desm+GMgAb8SwLr3+++/X4uP3hKdbnjvToDyZSCBdBOIWwHbK7RgwQLB9oTwitWmTRvt\nmKNGjRr2KPxOAp4j8Pbbb8svv/wil1xyiZx11lmek48CkQAJZIZA3HPAqNbixYulVatWes63\nQ4cOMnbsWF1bzAFjHR2HbTLTeCwl8wT27t0r/fv3l4IFC8rgwYMzXwGWSAIk4BkCcfeAd+zY\nIRdeeKHAddt9990n8IyFgO0IsVUhFrGvXbtWXn31Vc9A8qMgaMt4wtSpU+OJnrNxR40apecH\nb7/9du3zORlB4mWcTFnpTutWWTJVL7/c/+m+j/yWf9wK+KWXXhI4nZ8/f77ezPqqq67SzLAb\n0ltvvSXwhztixAj9V6JECb/xpLweJ3Drrbfq+79Lly4el5Ti2QlgVA+djOLFi9tP8zsJJEUg\n7iHouXPn6nmv6tWrhy34mmuu0fuhrly5Mux1niSBXCYAZw0Y+Slfvnwui8G6x0EATodmzJgh\nixYtiiMVo5JAbAJxK2C8AWIOOFIwHrKwKwwDCZAACeQ6ATzzsEvSli1bZMOGDbkuDuvvIgJx\nK+BmzZppy2d4igkNmB9+9NFHtaesKlWqhF7mMQmQAAnkJIETTjhB1xtOVw4fPpyTMrDS7iMQ\ntwLGXpnwB3355ZdLy5YtdW/4999/F7hwg9KFp6zhw4e7T1LWiARIgAQSJIB1wbBvwQjf6tWr\nE8yFyUggmEDcRlhYfvHJJ59oRwSvvfZa4G0Q6yKxpReMtIxhVnBRPCIBEiCB3CWAHeDg/xsd\nDvjDL1SoUO4Kw5q7gkDcPWDUumLFinqZ0ebNm2XWrFlaIS9dulQ74ujYsaMrBGMlSCAVBF5+\n+WXt+W3jxo2pyI555DCBIkWKSK1atfQSTPaCc7ghXVT1uHvA9rqXLVuWPqHtQPjdUwRg0/Dg\ngw/Krl27tIMZTwlHYRIiULNmTSlcuLAejk4oAyYiARuBmAoYTjXgnDze8Oeff8abhPFJwFUE\nHn/8cUHPF56vsFE7AwnA38HRRx9NECSQEgIxFTDmfLHlmj2sWLFCsM4Xa4HhfhJrItetW6fX\nymGx+tVXX22Pzu8kkHME/vjjD3n22Wd1T4ebs+dc87HCJJATBGIq4MqVK8sXX3wREAbKt3nz\n5vLUU09phwR4IzQBSrhdu3ZStGhRc4qfJJCTBHr37i379++XQYMG0ftRTrYgK00C7icQtxEW\nLJ+PP/54Qa/ArnwhKiwDhw4dqjdn2Llzp/ulZw1JIAyB6dOny7vvvquX23Xq1ClMDJ4iARIg\ngeQJxK2AMbeLXnGkAI8xGIbetGlTpCg8TwKuJoD9fvGSifXs+fLlc3VdWTl3EMA9w0AC8RKI\nWwG3bt1apk2bpr1hhSvs6aef1g8vWAsmE1atWqU3d/j888/FSW8aw98TJ04U7NWK7wwkkCiB\nM888U5YsWZKQ8WGiZTJdbhLAJg3z5s2Thx56KDcFYK2zSiBuBdy+fXttdAWXlL169ZJx48YJ\n3FLCYKVJkyYyadIk6dOnT1JCIU8M/cHnNPLD1m9bt26NmGe/fv2kc+fO+qUATkKQ9ocffogY\nnxdIIBaB/Pnj/mnEypLXPUgA03B4NmHqjSs/PNjAaRYpphFWaPmVKlUSeL267rrrZNiwYWIf\nesHQ9HvvvSdQ0okG9HzHjh2rFXqjRo30zkq33Xab7t3iMzTANyvm7CZPniyoGwL8UWNLxFNP\nPTU0Oo9JgARIIGUEsEoE0xULFy7UHZJ33nknZXkzI+8TiFsBA0mFChUEQ8NwVLBgwQKBRywo\nyxo1aiRNDJ61YMyF/BBwg2NT7QkTJkg4BYy3z65duwaUL9I0btxYvvnmG/1yYJ/Dg1Xrv//+\niyiBgPlq8xc4mcAXvIjASTvy8mPwk9xoZ9xXfpLZj/e0U5nxvIKXLBjuffnll3L22Wc7TerK\neOa+xjPNfHdlRVNcKcibCpnjYZaQAjZyw0F5Ik46TPpwn+vXr8/jZQY3OIy68OALHRps0aKF\n4M8evvrqK6lbt24eAxoMS3fr1s0eVerUqZOybcawXZlfgx+3aXNim+DX+8FPcuNl7OGHH5ZL\nLrlEevTooTsnoStEcpEH5rf9+LtOVuaDBw8G9kiI1e5JKeBYmSdyHc7OodjtoVSpUlqg7du3\nS7ly5eyX8nyHIdb8+fNl9OjRea5hj+Jzzjkn6Dw8HcG1HN5gkwkHDhzQvXV7jzuZ/HItbbL8\nsinvzz//rItv2rSpo2qYN1wvPGQdCcxIMQmgI3LFFVcIhqDxDMKucbka0AvEaCE6O37bcAJy\nQx8kE/BccKoHXKeA0eB4g7AHc4yNsaOFMWPGyPjx42XgwIFi9u+0x69fv768+OKL9lN6+BpL\np+DNK5mA3i9eFPx2wxpmyfIz+WT6E8oUxoQw+Fu+fLkce+yxMauAni9+YCVKlIgZlxH8QQD3\n/zPPPCP//POPtj3J1d8DWgu/CfQCoYhidXi81LoYYcV0arJth85YzipgzC/DzaU9YK4ZN0Kk\nXhbAwQoR8y9DhgzRc8D29PxOApEIjBw5UhvQwHLeifKNlA/PkwD8hc+YMYMgSMAxAdettcB2\nX9ja0PR6IcmiRYvyzAvbJRwwYIBedoSHKQywGEjACQHYFWDurmTJkvLkk086ScI4JEACJJAy\nAq5TwOeee64WDkPJ6NnCKb5Z22ukxjUoZYRPP/1U93yxDhgWzpj/NX9mrs6k4ycJ2An07dtX\nr+HEOnIY+jGQAAmQQCYJuG4OGMPM6NFiLS8UbbFixeTyyy+Xli1bBriMGjVKL0mqV6+e9nyF\nC/DAFRo+++wzOtIPhcJjTWD27Nny6quv6jWc99xzD6mQAAmQQMYJuE4BgwCGkeHQAwYNFStW\nzLP0yD7PgocoAwnES+CDDz7QIyzw4Jas1WO8ZTM+CZAACYCA64ag7c0Cz1qh637t1/mdBBIl\ngBGWOXPmaCcviebBdCQQi8DUqVPzrLyIlYbX/UPAlT1g/+CnpNkkQIO9bNL3ftl79uzR64Hh\nra9Nmza0svd+k8ctoat7wHFLwwQkQAIk4BICsF8ZPHiwwKPUnXfe6ZJasRpuIkAF7KbWYF1I\ngAQ8RQDry1u1aiUYioavaAYSsBOgArbT4HcSIAESSDEBeN/DpjKwtt+1a1eKc2d2uUyACjiX\nW491d0wA68a5LtwxLkZMIQEsl4TyXb16tV5imcKsmVWOE6ACzvEGZPVjE4B/1zPPPFPOOOOM\noP2rY6dkDBJIDYH+/fvL0UcfrbdwTU2OzMULBGgF7YVWpAxRCdx///3ayXr79u0dO0mPmiEv\nkkCcBODuFKMw2LCFgQQMAfaADQl+epIAnLbAWQv2fb7vvvs8KSOFyg0CVL650U6ZrCUVcCZp\ns6yMEsDenrfeeqsuE/tD0+NVRvGzMBIggRgEqIBjAOLl3CXw1FNPyZIlS6RLly56KUjuSsKa\nkwAJeJEAFbAXW5Uyye7du+W5556TSpUqaWcIREICJEACbiNABey2FmF9UkKgePHi2tfzpEmT\npHz58inJk5mQQCoJwEPW0KFDZe/evanMlnnlEAFaQedQY7Gq8RGoVq2a4I+BBNxIYNCgQfLY\nY4/Jli1bZODAgW6sIuuUZgLsAacZMLMnARIggXAEYJVftWpVPUUyb968cFF4zuMEqIA93sAU\njwRIwJ0ESpcuLaNGjZKDBw9qQ0F8MviLABWwv9qb0pIACbiIQLt27eS6666TuXPn0ljQRe2S\nqapQAWeKNMtJKwHMo61ZsyatZTBzEkgHgWeffVYqVqyo54OXLl2ajiKYp0sJ0AjLpQ2Ta9Vq\n27Zt3FXGFm3xhkjlLFiwQDZs2CDNmjUTDO3ZQyLl2NPzOwnEIhDpvoyUzn5PVqhQQS+Zu+OO\nO2TlypXaa1u4dPGWgTzs5YTLk+eyS4AKOLv8WXoKCEDxrl+/XvvZhc9dBhLINQJXX321nH/+\n+VKuXLlcqzrrmwQBDkEnAY9Js08A7ibh5D5fvnxSv359yZ+ft3T2W4U1SIQAlW8i1HI7DZ9W\nud1+vq/94sWLBUq4du3aeYaefQ+HAEiABFxNgArY1c3DykUjsG7dOvnnn3+kTJkyUqtWrWhR\neY0ESIAEXEeACth1TcIKOSWwdu1aPeTcoEED7vPrFBrjkQAJuIYAFbBrmoIViZdAkyZNpGnT\nplKiRIl4kzI+CbiewB9//KEddOzZs8f1dWUFEyNAK+jEuDGVCwjA4Kps2bIuqAmrQAKpJzB8\n+HAZO3asFC1aVF588cXUF8Acs06APeCsNwErQAIkQAJ5CWA/6zp16sjIkSPlgw8+yBuBZ3Ke\nABVwzjchBSABEvAiAWyp+eabb0rhwoWla9eugu0LGbxFgArYW+1JaUiABDxEoHHjxvLEE0/I\npk2bBN7eLMvykHQUhQqY90BOEPjwww85DJcTLcVKpppAz549tZcs+DuHq0oG7xCgAvZOW3pW\nEliDdurUSeCub+/evZ6Vk4KRQDgC8PL23//+V+AzunLlyuGi8FyOEqACztGG80u1Me/VoUMH\n2b59u4wYMUJbhPpFdspJAoZAlSpVBMvuMC/M4B0CVMDeaUtPSoLhtzlz5sj1118vt9xyiydl\npFAkQAL+JEAF7M92zwmpJ0yYoJdg1K1bV0aPHp0TdWYlSYAESMApASpgp6QYL6MEfvvtN+nW\nrZsecnv77bfp7Sqj9FkYCZBAJgjQE1YmKLOMuAnUrFlTOnbsKKeddpqceOKJcadnAhLwAwG4\nqcTSJM4N52ZrUwHnZrt5vtaFChXSw8+eF5QCkkCCBLAi4IcffpAiRYpI8+bNpWBBPs4TRJm1\nZL5vMbw9Yj/ZZL3MHD58WOeDTwZnBJJl7qwUSbptQ8s5ePCg3n0pU/UPLZ/HuU8gFfcOfERX\nqlRJsCvYwoULpVGjRnnAJFKOeYYdOnQo5b+dPBV00QnoAvwlwswuxoEDB+yHUb/7XgHjZsOb\nZLI7jpibNR74UVvGBxeTZe4UUarLQRtjbSZ+rAwkkAiBVN2TmJ7ZuXOn3hcb6+WPOeaYoOok\nUo65r/FMSyR9UAVy6AByQx8kKzNe0A3DWOL7XgEXKFBASpcunfSuOvBSU6pUKcHQKYMzApna\nySjV5eCBBwXMbRCdtTNj5SWQqnsSO4LBXeXMmTMFhouYC8aaYRMSKQeKF50S+KBOJL0pO9c+\noXw3b96ctMzmBd2J/L5XwE4g5Xqctm3bulIEUy+8ce7YsSNtXn5MOa6EwEqRQJIEMAd88skn\ny6xZs+TXX3/VzmqM4kzk3v/444+TrBGTOyXAZUhOSTFeWgjgbXH27Nkyb9487e0qLYUwUxLw\nOIEyZcrISSedpHvAUMgMuUGAPeDcaCdP1hJDPlC8u3btkmrVqgkeIgwkQAKJEYCf6IoVKwqG\npRlygwBbKjfayXO1hJHCokWLBHPncDLPtb6ea2IKlAUCVL5ZgJ5EkVTAScBj0sQJLFu2TNat\nW6cN1xo2bKiNmhLPjSlJgARIIPcIUAHnXpvlfI3Xr18vf/31l56vwg4vdCCQ801KAUiABBIg\nQAWcADQmSY4A5qqqV68up5xyivbik1xuTE0CJBCNAFYZ/Pnnn9Gi8FqWCNAIK0vg/Vws5qmw\nwxEDCZBA+gksWLBAtm3bpp1MHHvssekvkCU4JsAesGNUjEgCJEACuUegQYMGeqRpxYoVAm9Z\nDO4hQAXsnrZgTUiABEgg5QTgHatp06ZaCcNbFpVwyhEnnCEVcMLomNAJAWx0wUACJJBdAnCb\nSiWc3TYIVzoVcDgqPJcSAmvWrJHp06fLxo0bU5IfMyEBEkicgFHC8PG8atUq4cYxibNMVUoa\nYaWKJPMJIrBy5UrBWl8sMeIyoyA0PCCBrBGAEm7WrJnerYcbx2StGQIFUwEHUPBLqgiYeSa8\naWOdL3abYiABEnAHAe7i5Y52QC2ogN3TFjlfE/h2XrJkiWDoGZuFY50vf+w536wUgARIIE0E\nqIDTBNaP2WKxP5RvyZIldc8XSpiBBEiABEggPAEq4PBceDYBAjVr1tSGHbVr1+a8bwL8mIQE\nskkAKxawO9mcOXP07mTZrItfyqYVtF9aOgNyFihQQOrUqUPlmwHWLIIEUk1g06ZNsnXrVjnr\nrLPko48+SnX2zC8MASrgMFB4igRIgAT8RuCoo44S7EwGW45u3bpJ//795dChQ37DkFF5qYAz\nitsbhWEvXwYSIAHvEahSpYp8//33erOU5557Ts477zyu409jM1MBpxGuF7P+999/5YcffpAt\nW7Z4UTzKRAK+J4Be8NSpU7Xy/frrr6VDhw6+Z5IuADTCShdZD+aLPXyXL1+uh6gwX1S+fHkP\nSkmRSIAEypYtK2+99ZY8//zz0qZNGwJJEwEq4DSB9VK2+/btk4ULFwqULrxaYXcVDFUxkAAJ\neI/ARRddFCTUzJkzg45DD9BbZkiMABVwYtx8k2rdunWydOlSvbyoXLlyctJJJ2knG74BQEFJ\ngARIIE0EqIDTBNYr2W7evFlbQh5//PGCdb758uXzimiUgwRIIAkC27ZtEyw9ZEicAI2wEmfn\ni5QnnHCCtGzZUmrVqkXl64sWp5AkEJsAliotWLBAG2T26dNHYJzJED8BKuD4mfkqBTZUoD9n\nXzU5hSWBmATy58+vne7g+TB48GDBi/q4ceP0LksxEzNCgAAVcACFf7/g7fXgwYP+BUDJSYAE\n4iZQqVIlOf3006Vv3756WeINN9ygR8tmzJgRd15+TUAF7NeWV3Lv3r1bDyPByhEbdDOQAAmQ\nQDwEsCpi4MCBsnjxYrn44ovlxx9/lDfffDOeLHwdl0ZYPmx+KF7sXLR27Vo9ZIQh5lKlSvmQ\nBEUmARJIBYFjjjlG3n//fZk+fbocd9xxqcjSF3lQAfuimf+/kFjPiyVFf//9tz5RrFgxOfbY\nYwU+YGnd7KMbgaKSQJoItGrVKk05ezNbDkF7s13DSoUlA1hWhB5v/fr19fxN1apVqXzD0uJJ\nEiCBVBLYuXOn3mlp7NixsmfPnlRmnbN5UQHnbNPFX3HM1zRv3lxOO+00geKFJSMDCZAACWSC\nAPxKw0CrS5cuer/h3r176xG5TJTt1jL4BHZryyRQL8ztjh8/XpYtWxYxNXrtDqAqAAAV9ElE\nQVS/HG6OiIcXSIAE0kSgffv28ttvv0mvXr10CUOGDJG6detKixYt5O23305Tqe7OlgrY3e0T\ns3a7du2SyZMny1VXXSVYFtCxY0cZOXJkzHSMQAIkQAKZJgBjraefflrWrFkjr7/+urRu3Vpm\nzZqljUIzXRc3lEcjLDe0QgJ1wJ6dTz31lHz55ZeB+RS4iuzevbt07tw5gRyZhARIgAQyQwAG\noJ06ddJ/WAIZbRUG/BREu56ZGqenFCrg9HBNe64waPjwww+1FfMVV1whV155pTRt2jTt5bIA\nEiABEkglgerVq0fNDvsRYxvUc889V84++2xtyHXkkUdGTZMrF6mAXdhSmCfBpvf79++Xm2++\nOWwNcSNiSRFcwDGQAAmQgFcJlClTRjZu3Cgvv/yy/oOceO6deuqp0r9/f71JTK7K7loFjGEJ\neGjCpu/YDKBkyZJRGccbP2pmGby4detW+fjjj2Xu3LkyZ84cmTdvnmCXEQS8GUZSwPDBSuWb\nwYZiUSRAAlkhMHHiRL0d6s8//yywpJ42bZr2uAVjU3jhihTQgcFz0s3BlQoYTr1feeUVOfPM\nMwX70eJ4xIgRgv1ow4V444fLI13n4GMZCrVChQphi4B8mAsxAfO455xzjrYMxBseAwmQAAn4\nnUChQoV0RwydsQcffFD7rl+0aJF2IhSJTYMGDWTv3r1y4oknCrZThYcu/MH5EJ6zWJaZ7ZD9\nGoQQQE8WC7WfffZZadSokQZ92223Cd6C8Bka4o0fmj5Vx3DrCIMoeJnC99WrV8tff/2lrf0a\nN24seHsLF3BjDBs2TMsKeSO9ZIRLy3MkQAIk4EcCUJ4NGzaMKLplWdrXAbZMnDp1qv6zR8Z5\nKOj58+fLE088oeNiXhl+EipWrGiPmtbvrlPAMEmHa0QoIwSAbtu2rUyYMCGsAo43frporly5\nUu69996g7NGQJ598ckCWoIv/O8CbXc+ePcNd4jkSIAESIIEECMDXAYaqETB/DLsa8/fHH3/o\n/c1xDZtIoHNnwvDhw+WMM84wh2n/dJ0CXr9+vX4bsUsOhbxp0ybBJtCh3pviiY+GCO2Jwj8y\nhimSdY2G+VgMm1erVk3XH3WGqb0JyeZv8uEnCZBA7hPw0vPA7bLAfgijkPizB9T7oosu0koY\nU4EYuaxTp07SuuDAgQP2YqJ+d50CxhBu6dKlgyqNNWBQvtu3b88zRBtPfLzthPY2ARzrzIzh\nU1DBcRygjhdeeGEgBRQ7/hhIgARIIJRAss+b0PyyeZzrspQtW1bwh7lihGTlgd0P9JWT4DoF\njCHZ0M3hzXHx4sXzyBRPfEy+P/DAA0F5fP7559rCOlTpB0VycACPVEWLFhVseOC28M4776Sl\nShg5gKUh2sUNBg1pETJMpubFqkiRImGuevMU3urRY4DMfpIbzx7Ibh/NylYLp+t3HCoPlAf8\nDOA3He6ZGxrfK8eQG+58Y624iSUv7hen7n5dp4BhLYz5VHvYsWOH7vmG++HHEx/Dw6FeouAc\nHD8u+EhOJuChjHzwQuCXcOjQIa2A8eIRrm28ygEGHviBJXvP5BIfM02DZR1+khu/a7S3n2TG\n79ooYD/JDQWM+zxZmeNRwK7zBV2rVi3tYML0evGQgrk5du8JF+KNHy4PniMBEiABEiCBTBNw\nnQKGuzEE7OqDNxJYrH3yySdBa2VxDUoZwUl8HZH/SIAESIAESMBFBFyngDGUOWDAAJkyZYpe\nfgSjqcsvv1wvwjbcRo0apT1G4dhJfJOOnyRAAiRAAiTgFgKumwMGGJiLv/fee/LPP//oRdGh\nS48wb2sPseLb4/I7CZAACZAACbiBgCsVsAFTuXJl89XRZ7zxHWXKSCRAAiRAAiSQBgKuG4JO\ng4zMkgRIgARIgARcR4AK2HVNwgqRAAmQAAn4gQAVsB9amTKSAAmQAAm4joCr54AzRQv78MKl\nZTIB7izhiMNPHqHgNQYL1+GG008OSOARCo444IDELwEez+CcAfc4/vwS4FQBzjiS9Y6US7yw\n/BPuGOF0xW9y4zlepkyZpJoLjkycBt8rYOx8gS2p8JdMAHRYazt1QZZMWW5Ju3nzZv1DDd14\nwi31S1c98IBCCLXOT1d5bsgXyherEsqXL5/HH7sb6peuOsALFtrbjS5m0yUzXjqwzSs8QlWp\nUiVdxbgu31S2tX1fgGiC5lOFWtEi8BoJRCIwdOhQeemll+T111/X+2hGisfzuU/giy++kDvv\nvFN69+4tN998c+4LRAkiEsCLVqtWraRNmzYyYsSIiPF4IXkCnANOniFzIAESIAESIIG4CVAB\nx42MCUiABEiABEggeQJUwMkzZA4kQAIkQAIkEDcBzgHHjYwJDIGNGzcKDLGqV6/uq31Djfx+\n+oQR1po1a6RSpUraEMtPsvtNVuxEt2LFCsEe6TCwZEgfASrg9LFlziRAAiRAAiQQkQCHoCOi\n4QUSIAESIAESSB8BKuD0sWXOJEACJEACJBCRgO8dcUQkwwuOCaxbt06wRSScFbRs2ZLzRo7J\n5UZEOJmZN2+eLF68WOrUqSNNmzbNjYqzlgkR4O85IWwJJeIccELYmMgQ6Nevn/z0008Cj2J/\n/vmn/PXXX/L444/LqaeeaqLwM4cJQPnedtttsn79ejn99NPl+++/l7PPPlvuvffeHJaKVY9E\ngL/nSGTSc5494PRw9UWuy5Ytk+nTp8vkyZO1dSyEfvTRR7X3HCpgb9wCkyZN0j6gJ06cqF0T\n4gWrU6dOctFFF8kJJ5zgDSEphSbA33PmbwTOAWeeuWdK3Lp1q3Tt2jWgfCFY48aN9cYW9HDq\njWb+7rvv5LzzztPKFxLVqFFD6tevL3BNyeAtAvw9Z7492QPOPHPPlNiiRQvBnz189dVXUrdu\nXV9tSmGX32vfMfQcuhYUxxs2bPCaqL6Xh7/nzN8C7AFnnrlnS8QwJXaVuvvuuz0ro58Eg0OG\nTZs2aYcMdrnhoGHLli32U/zuQQL8Pae/UdkDTj9jT5Tw0Ucf6blAI8yll14atB/umDFjZPz4\n8TJw4EDODRpIOf4Jq3ZsuQhFbA84xlZ1DN4lwN9zZtqWCjgznHO+lC+//DJo2LFt27ZaAWOv\nVGxLiOtDhgzRc8A5LywF0ASwtzX2/8Um5fawY8cOX+0Ta5fd69/5e85sC1MBZ5Z3zpb2zDPP\nhK37gAED9LDzyJEj5ZhjjgkbhydzlwDadNGiRdrq2UiB9cBXXnmlOeSnhwjw95zZxuQccGZ5\ne6q0Tz/9VPd8O3furHtJmP81f1g/ypD7BKBoMboBpQvL9nfeeUf2798vF154Ye4LRwmCCPD3\nHIQjIwd0xJERzN4sBEuQli9fHla4zz77jDskhSWTeycxHzhu3DgpVKiQVK1aVbp37y6nnHJK\n7gnCGkclwN9zVDxpuUgFnBaszJQEvEUAvV7M/VaoUMFbglEaEsgiASrgLMJn0SRAAiRAAv4l\nwDlg/7Y9JScBEiABEsgiASrgLMJn0SRAAiRAAv4lQAXs37an5CRAAiRAAlkkQAWcRfgsmgRI\ngARIwL8EqID92/aUnARIgARIIIsEqICzCJ9FkwAJkAAJ+JcAFbB/256S2wjs2bNHsNk8Pv0S\n9u3bp2XetWtXxkRet26d3i86YwWmqSB4BVu4cKG8//77smTJkjSVwmy9ToAK2OstTPkcEcA+\nxjVr1pTPP//cUXwvRPrll1+0zG+//XbGxGnTpo1gJ61cDgcOHJAzzzxTGjRooGXBZiQMJJAI\nAW7GkAg1piEBEvAtAfjGnjFjhlx//fXSu3dvqVSpkm9ZUPDkCFABJ8ePqUmABHxGYO3atVri\nbt26ScOGDX0mPcVNJQEq4FTSZF6uJIB53VdeeUUw5IpdmvDQvOWWW6Rs2bJh6ztx4kTBzjBF\nixaVc845Rzp06BAUb/Xq1To/7BBUpkwZPRQJR/YlS5YMioeN68eOHSuzZs2S3bt3672SUS7S\nmIBtHNGDOvroo+WFF16QunXryhFHHCF4yD/00ENSsGDwT3T06NGCudsePXqYLOTXX3+VSZMm\n6bnI6tWrS7t27aR169aB6+YLem0ff/yxrF+/XsvlZPtIsMAcp5O6bNy4UV5//XVZunSpbN26\nVY499lhdlzPOOMNUIc8nZC5cuLBuD/tF5LNp0ya599577acdyeq0fYIy/t9BrHsF9f3kk090\n7DfeeEMwdfHggw9qGez5ffDBBzJ79mxdf3t7I86bb74pa9askf/85z/2JPzuRwLKmICBBDxL\nQCkyS83tWuohb5133nmWUk5WiRIlLLWrj6UUckDuDz/80FK/f6tRo0ZW6dKlrcsvv9xq1qyZ\nPqeGGgPxfvvtN0spSEspb0ttyWcp5WIpJWkpZWMphRGIt2HDBkvtGKTTH3/88Zaa99RpatSo\nYan9dQPxUN5ZZ51lVaxY0cqfP7/+GzJkiE730UcfBeLhyz///KPLuuuuuwLnR40apWWDfO3b\nt7dOPvlknbZXr16BOPgyePBgfV69fGjZIIOJ+9prrwXFtR+8+uqrjury3XffaRnUS4jmfPrp\np1sFChSw8uXLZyEPE+rXr281b97cHFqhx+bCueeea6mXCXOoP53I6rR9gjL+34GTe+Wmm26y\n0J64V9C+aDv1cpUnO7Vto47z8ssvB11TCt5SCtm68cYbg87zwJ8EsMcnAwl4lkDbtm2tYsWK\nWT/99FNAxmXLllmVK1fWD39lUKPPGwWseqPWihUrAnEfeOAB/SCdMmWKPqfm/LTCU72sQBzV\n+9RxnnvuucC5Ll266HPvvvtu4JyysraOPPJIrbTNSShgPMxVT08/yFXv0fr333/1S8I111xj\nounPZ555RsedM2eOPoaygeI9++yzLdX7DMRVPTIdT81V6nPTp0/XyvCOO+6wDh8+rM8pa2RL\n9YB1vGgK2GldWrVqZZUqVcr6+++/A/UAI7ycQNGbEKpwQ49NvFAF7FRWp+1jyrF/Or1X1KiF\n5qZ6uPbkQd/VKIWldo6ylLFW0Pm33npLp502bVrQeR74kwAVsD/b3RdSQwFAuanh2jzyDhs2\nTF8zSsoo4CeffDIoLno36O2i54zQp08fnW7ChAmWGmIOxFXDuoHvavhV9/xOPfXUwDnz5b77\n7tPp58+fr09BAaPni56RPXTu3NkqXry4pbYADJxu0qSJhR6sCT179tR5qb2XzSn9uWXLFkvt\n3Wtddtll+hjKHYp68+bNQfFeeuklnT6aAkaCWHWBUp86dar1zTffBOWPg9NOO81Sw+uB86EK\nN/TYRAxVwE5lddI+pgz7Zzz3ihMFjLzvvvtufR/gxcsEjJpgRMa8CJnz/PQnAS5DUk9oBm8S\nMOsz1VByHgHVMKg+h/lKe2jatKn9UFTvWdSQo6hesz6PuV5sSn/ttdfquVvVS9Wb1StlGUin\nemt4sdX751511VVi/5s5c6aOt3z58kB8zP9ivtke1FCnnjdWPW99GvPNmFPEeRNQJzXEK0qR\nBpVx6623CupjylDKXqpVqybly5c3SfWnUuhBx5EOYtUFdcDyopNOOknU0KsMGDBAOnbsKPXq\n1ZPvv/9esJdwssGprE7aJ1xdErlXwuVjPwduuA8w54ugphD0MrcbbrhBt5s9Lr/7kwAVsD/b\n3RdSqx6fllPN6eaR1xhMYU2nPUSKC+MchOOOO04rwscff1x/nzx5suCBivNQNggwHkKA8la9\n26A/GEldffXVooZrdRz8g9FVaFBDulK7dm2BoQ/CuHHjRPVq9dIXExflFClSRBtqhZajhlOl\nRYsWOio4GHlNWnyGKmT7Nft3J3VB/fAiceWVV8qYMWP0y4Oa5xSnSt5eHr7DWM4enMrqpH3s\n+ZrvidwrJm2kTxj7NW7cONCGUMSQC1wYSEAT8GfHn1L7gYCyPtZDrCNGjMgjrvJgpK99/fXX\n+poZgsZnaICBFYxtTMD8ngkY1oVxEIaRYXiEgHlc9eOyrrvuOhMt8GkftsZJDEHb50gDEdUX\npeT13C2Mr2CQBMMwe8AcMcrBnHZoMHPbOI8hZAyjhwb1wqDTxxqCRrpodcG8L+Z6YZRkH25F\nOtULtjCvbkLokDOG1MPJD+Z2IyynsqKcWO1j6mL/jOdecToEjfxx76GNYHinXoiC5v/t5fO7\nPwmwB6xfQ/jPiwSwpKdcuXKiFIweCrTLiF4aglKA9tOCHq09KOte+f3330UpYH0avVf09Iz7\nRvQiMeRbp06dgItFLO+pUqWKYPhYzeHas9M9WCx/gtvLWAE9JfVY0stcVq1aFTT8jLRqflVn\ngSU79rBgwQLd41VzkPo0htu3bdsmWBpjD1hi5DREq8vcuXMFS67UnLOgh28ChuIxDI5rkYJh\ngWVaJvzxxx95+DiV1Un7mHLsn4ncK/b0kb6rlzC9REm9pIkyBBT1MhQpKs/7kYA/3zsotV8I\nqHWbugeCZUBq/lUvPVIOFPS5QYMGBTCYHjCMl2BJO2/ePEu5aNRWy0cddZS2TEZkWFOr54Re\nVoRlQj/++KOl1nPqc7CYNkENyepzsFCGcRJ6WDCGQtp+/fqZaFF7wIik5lZ1GqXQLXuvFtfU\n3KqlFIe2mIaFtPJNbI0fP16fwzIjLIVCQI8QvXOcw3XEQ48W1uGoj5MeMPKJVBf1cqCNvmBV\nrdbIWitXrrRgpIYerBoi19eM0VFoDxhtgDqgdw+DOLVu2lJD73pJk70H7FRWp+0DeUKD03sl\nnh4wyrjiiiv0SEaoUV1o+Tz2HwG8YTOQgKcJqN6ufqDjQY8/rOOEFbQ9GAUMxYnhTxMXVrx/\n/vmnPapWXljvauJg3XDfvn0tNb8XFE/1MC0obxMPw7RYnmQfIo02BI3MzLKV0HW9piAoWeUo\nRA8Bm3LUPKilfFqbKPoTy5QuvvjigNJFvczyKacKOFpdkBdYYSge9cCwM6ysjbLCUiiEUAWs\nRhL0EDnWDCMdljI99dRT1j333BM0BI20TmXFy4WT9kGeocHJvWJkirYMyZ4vXtQgW6dOneyn\n+Z0ErHxgoG4OBhLwPAG11ETUg16U8okpq1K6AoOscAZSSAzLXuSHgCFnWAJHCmqOVGDkU1Nt\n9qCcgESKltR51EetX9ZetiBfpPrAmAw7EqmXjKTKi5QYlr4wbIPVdTwBQ9AYZocRFdooWnAi\nazztE66seO6VcOnt59QyMYFRnFr7K2pExH6J331OgArY5zcAxScBEkgfATX0LhdccIGoYXnt\nojPSi1H6asCc3Uwg2NGsm2vKupEACZBAjhDAwCIM9+B3GyMTWB9N5ZsjjZfBatIKOoOwWRQJ\nkIA/CEDZKnen2hoeFvewEGcggVACHIIOJcJjEiABEiABEsgAAfaAMwCZRZAACZAACZBAKAEq\n4FAiPCYBEiABEiCBDBCgAs4AZBZBAiRAAiRAAqEEqIBDifCYBEiABEiABDJAgAo4A5BZBAmQ\nAAmQAAmEEqACDiXCYxIgARIgARLIAAEq4AxAZhEkQAIkQAIkEEqACjiUCI9JgARIgARIIAME\n/h/IE9bkkaL65QAAAABJRU5ErkJggg==", "text/plain": [ "plot without title" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "p2 <- p1 +\n", " stat_function(fun = dnorm, args = list(mean = 0, sd = 1),\n", " linetype = \"dashed\") + ## let's add a normal distribution to compare our histogram to\n", " xlim(c(-3, 3)) + ## we'll set hard limits to the x-axis to make the visualization clearer\n", " xlab(\"observed values of y\") + ## labeling our axes never hurts\n", " ggtitle(\"Density Histogram of R.V. Y\") + ## titles are good too...\n", " theme_bw() ## let's get rid of that annoying gray background\n", "p2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Much nicer! Definitely a nicer looking plot than we saw before.\n", "\n", "* Consult the comments in the above block to know what each step of the layering does\n", "* Note that we simple constructed a new plot (`p2`) by adding layers to the first plot (`p1`) -- both are preserved!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, now that we've taken a look at `ggplot2` and some other core tools of the [Tidyverse](https://www.tidyverse.org/), let's resume talking about simulations:\n", "\n", "There are other distributions we might want to simulate from as well." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
variabletypestatlevelvalueformatted
unif numeric missing .all 0.000000000
unif numeric complete .all 100.00000000100
unif numeric n .all 100.00000000100
unif numeric mean .all 0.494018940.49
unif numeric sd .all 0.259922310.26
unif numeric p0 .all 0.022872950.023
unif numeric p25 .all 0.309162210.31
unif numeric median .all 0.470268680.47
unif numeric p75 .all 0.707357980.71
unif numeric p100 .all 0.999706501
unif numeric hist .all NA▆▃▇▇▆▅▅▅
bern integer missing .all 0.000000000
bern integer complete .all 100.00000000100
bern integer n .all 100.00000000100
bern integer mean .all 0.440000000.44
bern integer sd .all 0.498887650.5
bern integer p0 .all 0.000000000
bern integer p25 .all 0.000000000
bern integer median .all 0.000000000
bern integer p75 .all 1.000000001
bern integer p100 .all 1.000000001
bern integer hist .all NA▇▁▁▁▁▁▁▆
gamma numeric missing .all 0.000000000
gamma numeric complete .all 100.00000000100
gamma numeric n .all 100.00000000100
gamma numeric mean .all 1.557804041.56
gamma numeric sd .all 1.407034011.41
gamma numeric p0 .all 0.016384540.016
gamma numeric p25 .all 0.487684290.49
gamma numeric median .all 1.087180461.09
gamma numeric p75 .all 2.119529452.12
gamma numeric p100 .all 5.579254275.58
gamma numeric hist .all NA▇▆▃▂▂▁▁▁
\n" ], "text/latex": [ "\\begin{tabular}{r|llllll}\n", " variable & type & stat & level & value & formatted\\\\\n", "\\hline\n", "\t unif & numeric & missing & .all & 0.00000000 & 0 \\\\\n", "\t unif & numeric & complete & .all & 100.00000000 & 100 \\\\\n", "\t unif & numeric & n & .all & 100.00000000 & 100 \\\\\n", "\t unif & numeric & mean & .all & 0.49401894 & 0.49 \\\\\n", "\t unif & numeric & sd & .all & 0.25992231 & 0.26 \\\\\n", "\t unif & numeric & p0 & .all & 0.02287295 & 0.023 \\\\\n", "\t unif & numeric & p25 & .all & 0.30916221 & 0.31 \\\\\n", "\t unif & numeric & median & .all & 0.47026868 & 0.47 \\\\\n", "\t unif & numeric & p75 & .all & 0.70735798 & 0.71 \\\\\n", "\t unif & numeric & p100 & .all & 0.99970650 & 1 \\\\\n", "\t unif & numeric & hist & .all & NA & ▆▃▇▇▆▅▅▅ \\\\\n", "\t bern & integer & missing & .all & 0.00000000 & 0 \\\\\n", "\t bern & integer & complete & .all & 100.00000000 & 100 \\\\\n", "\t bern & integer & n & .all & 100.00000000 & 100 \\\\\n", "\t bern & integer & mean & .all & 0.44000000 & 0.44 \\\\\n", "\t bern & integer & sd & .all & 0.49888765 & 0.5 \\\\\n", "\t bern & integer & p0 & .all & 0.00000000 & 0 \\\\\n", "\t bern & integer & p25 & .all & 0.00000000 & 0 \\\\\n", "\t bern & integer & median & .all & 0.00000000 & 0 \\\\\n", "\t bern & integer & p75 & .all & 1.00000000 & 1 \\\\\n", "\t bern & integer & p100 & .all & 1.00000000 & 1 \\\\\n", "\t bern & integer & hist & .all & NA & ▇▁▁▁▁▁▁▆ \\\\\n", "\t gamma & numeric & missing & .all & 0.00000000 & 0 \\\\\n", "\t gamma & numeric & complete & .all & 100.00000000 & 100 \\\\\n", "\t gamma & numeric & n & .all & 100.00000000 & 100 \\\\\n", "\t gamma & numeric & mean & .all & 1.55780404 & 1.56 \\\\\n", "\t gamma & numeric & sd & .all & 1.40703401 & 1.41 \\\\\n", "\t gamma & numeric & p0 & .all & 0.01638454 & 0.016 \\\\\n", "\t gamma & numeric & p25 & .all & 0.48768429 & 0.49 \\\\\n", "\t gamma & numeric & median & .all & 1.08718046 & 1.09 \\\\\n", "\t gamma & numeric & p75 & .all & 2.11952945 & 2.12 \\\\\n", "\t gamma & numeric & p100 & .all & 5.57925427 & 5.58 \\\\\n", "\t gamma & numeric & hist & .all & NA & ▇▆▃▂▂▁▁▁ \\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "variable | type | stat | level | value | formatted | \n", "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "| unif | numeric | missing | .all | 0.00000000 | 0 | \n", "| unif | numeric | complete | .all | 100.00000000 | 100 | \n", "| unif | numeric | n | .all | 100.00000000 | 100 | \n", "| unif | numeric | mean | .all | 0.49401894 | 0.49 | \n", "| unif | numeric | sd | .all | 0.25992231 | 0.26 | \n", "| unif | numeric | p0 | .all | 0.02287295 | 0.023 | \n", "| unif | numeric | p25 | .all | 0.30916221 | 0.31 | \n", "| unif | numeric | median | .all | 0.47026868 | 0.47 | \n", "| unif | numeric | p75 | .all | 0.70735798 | 0.71 | \n", "| unif | numeric | p100 | .all | 0.99970650 | 1 | \n", "| unif | numeric | hist | .all | NA | ▆▃▇▇▆▅▅▅ | \n", "| bern | integer | missing | .all | 0.00000000 | 0 | \n", "| bern | integer | complete | .all | 100.00000000 | 100 | \n", "| bern | integer | n | .all | 100.00000000 | 100 | \n", "| bern | integer | mean | .all | 0.44000000 | 0.44 | \n", "| bern | integer | sd | .all | 0.49888765 | 0.5 | \n", "| bern | integer | p0 | .all | 0.00000000 | 0 | \n", "| bern | integer | p25 | .all | 0.00000000 | 0 | \n", "| bern | integer | median | .all | 0.00000000 | 0 | \n", "| bern | integer | p75 | .all | 1.00000000 | 1 | \n", "| bern | integer | p100 | .all | 1.00000000 | 1 | \n", "| bern | integer | hist | .all | NA | ▇▁▁▁▁▁▁▆ | \n", "| gamma | numeric | missing | .all | 0.00000000 | 0 | \n", "| gamma | numeric | complete | .all | 100.00000000 | 100 | \n", "| gamma | numeric | n | .all | 100.00000000 | 100 | \n", "| gamma | numeric | mean | .all | 1.55780404 | 1.56 | \n", "| gamma | numeric | sd | .all | 1.40703401 | 1.41 | \n", "| gamma | numeric | p0 | .all | 0.01638454 | 0.016 | \n", "| gamma | numeric | p25 | .all | 0.48768429 | 0.49 | \n", "| gamma | numeric | median | .all | 1.08718046 | 1.09 | \n", "| gamma | numeric | p75 | .all | 2.11952945 | 2.12 | \n", "| gamma | numeric | p100 | .all | 5.57925427 | 5.58 | \n", "| gamma | numeric | hist | .all | NA | ▇▆▃▂▂▁▁▁ | \n", "\n", "\n" ], "text/plain": [ " variable type stat level value formatted\n", "1 unif numeric missing .all 0.00000000 0 \n", "2 unif numeric complete .all 100.00000000 100 \n", "3 unif numeric n .all 100.00000000 100 \n", "4 unif numeric mean .all 0.49401894 0.49 \n", "5 unif numeric sd .all 0.25992231 0.26 \n", "6 unif numeric p0 .all 0.02287295 0.023 \n", "7 unif numeric p25 .all 0.30916221 0.31 \n", "8 unif numeric median .all 0.47026868 0.47 \n", "9 unif numeric p75 .all 0.70735798 0.71 \n", "10 unif numeric p100 .all 0.99970650 1 \n", "11 unif numeric hist .all NA ▆▃▇▇▆▅▅▅ \n", "12 bern integer missing .all 0.00000000 0 \n", "13 bern integer complete .all 100.00000000 100 \n", "14 bern integer n .all 100.00000000 100 \n", "15 bern integer mean .all 0.44000000 0.44 \n", "16 bern integer sd .all 0.49888765 0.5 \n", " [ reached getOption(\"max.print\") -- omitted 17 rows ]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# simulate 100 draws from a uniform(0,1) distribution\n", "Y_unif <- runif(n = 100, min = 0, max = 1)\n", "\n", "# simulate 100 draws from a bernoulli(0.5) distribution\n", "# recall that a bernoulli(p) distribution is just a binomial(n=1, p)\n", "Y_bern <- rbinom(n = 100, size = 1, prob = 0.5)\n", "\n", "# simulate 100 draws from a gamma(1,2) distribution\n", "Y_gamma <- rgamma(n = 100, shape = 1, scale = 2)\n", "\n", "# Let's skim these\n", "y_rv <- as_tibble(list(unif = Y_unif, bern = Y_bern, gamma = Y_gamma))\n", "skim(y_rv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`R` has several other functions for simulating data from various distributions. Additionally many packages have been constructed for simulating data from other distributions. Googling \"how to simulate from XYZ distribution\" will likely be sufficient to figure what package/functions you will need. \n", "\n", "Often, our experiment measures more than a single variable on each unit, so we might want to write a function that returns `n` copies of $O = (W,A,Y)$. Suppose that $O$ is distributed as follows\n", "\n", "$$ W \\sim Uniform(0,1) $$\n", "$$ A \\sim Bernoulli(expit[-1 + W]) $$\n", "$$ Y \\sim Normal(W + A, 1/2) $$\n", "\n", "We can now write a function that takes as input `n` and returns a `data.frame` object with rows corresponding to observations $O$ and columns corresponding to $W, A, Y$." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
    \n", "\t
  1. 'tbl_df'
  2. \n", "\t
  3. 'tbl'
  4. \n", "\t
  5. 'data.frame'
  6. \n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 'tbl\\_df'\n", "\\item 'tbl'\n", "\\item 'data.frame'\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 'tbl_df'\n", "2. 'tbl'\n", "3. 'data.frame'\n", "\n", "\n" ], "text/plain": [ "[1] \"tbl_df\" \"tbl\" \"data.frame\"" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# define a function that takes as input n (a numeric)\n", "make_data_O <- function(n) {\n", " W <- runif(n, min = 0, max = 1)\n", " # plogis is the pdf of the logistic distribution and \n", " # plogis(x) = expit(x) = exp(x)/(1+exp(x))\n", " A <- rbinom(n, size = 1, prob = plogis(-1 + W))\n", " Y <- rnorm(n, mean = W + A, sd = sqrt(1/2))\n", " \n", " # return a data.frame object with named columns W, A, Y\n", " return(as_tibble(list(W = W, A = A, Y = Y)))\n", "}\n", "\n", "# make a data set with 100 observations\n", "data_O_obs <- make_data_O(n = 100)\n", "\n", "# confirm that make_data_O() returned a tibble object\n", "class(data_O_obs)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
variabletypestatlevelvalueformatted
W numeric missing .all 0.0000000000
W numeric complete .all 100.000000000100
W numeric n .all 100.000000000100
W numeric mean .all 0.5080369090.51
W numeric sd .all 0.2878174700.29
W numeric p0 .all 0.0081458390.0081
W numeric p25 .all 0.2628543200.26
W numeric median .all 0.5195608660.52
W numeric p75 .all 0.7362414020.74
W numeric p100 .all 0.9976244221
W numeric hist .all NA▆▇▅▇▆▇▆▆
A integer missing .all 0.0000000000
A integer complete .all 100.000000000100
A integer n .all 100.000000000100
A integer mean .all 0.3500000000.35
A integer sd .all 0.4793724850.48
A integer p0 .all 0.0000000000
A integer p25 .all 0.0000000000
A integer median .all 0.0000000000
A integer p75 .all 1.0000000001
A integer p100 .all 1.0000000001
A integer hist .all NA▇▁▁▁▁▁▁▅
Y numeric missing .all 0.0000000000
Y numeric complete .all 100.000000000100
Y numeric n .all 100.000000000100
Y numeric mean .all 0.9594934840.96
Y numeric sd .all 0.9149685020.91
Y numeric p0 .all -1.223206222-1.22
Y numeric p25 .all 0.2955348560.3
Y numeric median .all 0.9239891970.92
Y numeric p75 .all 1.6185799971.62
Y numeric p100 .all 3.1650303083.17
Y numeric hist .all NA▁▃▅▇▅▆▂▁
\n" ], "text/latex": [ "\\begin{tabular}{r|llllll}\n", " variable & type & stat & level & value & formatted\\\\\n", "\\hline\n", "\t W & numeric & missing & .all & 0.000000000 & 0 \\\\\n", "\t W & numeric & complete & .all & 100.000000000 & 100 \\\\\n", "\t W & numeric & n & .all & 100.000000000 & 100 \\\\\n", "\t W & numeric & mean & .all & 0.508036909 & 0.51 \\\\\n", "\t W & numeric & sd & .all & 0.287817470 & 0.29 \\\\\n", "\t W & numeric & p0 & .all & 0.008145839 & 0.0081 \\\\\n", "\t W & numeric & p25 & .all & 0.262854320 & 0.26 \\\\\n", "\t W & numeric & median & .all & 0.519560866 & 0.52 \\\\\n", "\t W & numeric & p75 & .all & 0.736241402 & 0.74 \\\\\n", "\t W & numeric & p100 & .all & 0.997624422 & 1 \\\\\n", "\t W & numeric & hist & .all & NA & ▆▇▅▇▆▇▆▆ \\\\\n", "\t A & integer & missing & .all & 0.000000000 & 0 \\\\\n", "\t A & integer & complete & .all & 100.000000000 & 100 \\\\\n", "\t A & integer & n & .all & 100.000000000 & 100 \\\\\n", "\t A & integer & mean & .all & 0.350000000 & 0.35 \\\\\n", "\t A & integer & sd & .all & 0.479372485 & 0.48 \\\\\n", "\t A & integer & p0 & .all & 0.000000000 & 0 \\\\\n", "\t A & integer & p25 & .all & 0.000000000 & 0 \\\\\n", "\t A & integer & median & .all & 0.000000000 & 0 \\\\\n", "\t A & integer & p75 & .all & 1.000000000 & 1 \\\\\n", "\t A & integer & p100 & .all & 1.000000000 & 1 \\\\\n", "\t A & integer & hist & .all & NA & ▇▁▁▁▁▁▁▅ \\\\\n", "\t Y & numeric & missing & .all & 0.000000000 & 0 \\\\\n", "\t Y & numeric & complete & .all & 100.000000000 & 100 \\\\\n", "\t Y & numeric & n & .all & 100.000000000 & 100 \\\\\n", "\t Y & numeric & mean & .all & 0.959493484 & 0.96 \\\\\n", "\t Y & numeric & sd & .all & 0.914968502 & 0.91 \\\\\n", "\t Y & numeric & p0 & .all & -1.223206222 & -1.22 \\\\\n", "\t Y & numeric & p25 & .all & 0.295534856 & 0.3 \\\\\n", "\t Y & numeric & median & .all & 0.923989197 & 0.92 \\\\\n", "\t Y & numeric & p75 & .all & 1.618579997 & 1.62 \\\\\n", "\t Y & numeric & p100 & .all & 3.165030308 & 3.17 \\\\\n", "\t Y & numeric & hist & .all & NA & ▁▃▅▇▅▆▂▁ \\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "variable | type | stat | level | value | formatted | \n", "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "| W | numeric | missing | .all | 0.000000000 | 0 | \n", "| W | numeric | complete | .all | 100.000000000 | 100 | \n", "| W | numeric | n | .all | 100.000000000 | 100 | \n", "| W | numeric | mean | .all | 0.508036909 | 0.51 | \n", "| W | numeric | sd | .all | 0.287817470 | 0.29 | \n", "| W | numeric | p0 | .all | 0.008145839 | 0.0081 | \n", "| W | numeric | p25 | .all | 0.262854320 | 0.26 | \n", "| W | numeric | median | .all | 0.519560866 | 0.52 | \n", "| W | numeric | p75 | .all | 0.736241402 | 0.74 | \n", "| W | numeric | p100 | .all | 0.997624422 | 1 | \n", "| W | numeric | hist | .all | NA | ▆▇▅▇▆▇▆▆ | \n", "| A | integer | missing | .all | 0.000000000 | 0 | \n", "| A | integer | complete | .all | 100.000000000 | 100 | \n", "| A | integer | n | .all | 100.000000000 | 100 | \n", "| A | integer | mean | .all | 0.350000000 | 0.35 | \n", "| A | integer | sd | .all | 0.479372485 | 0.48 | \n", "| A | integer | p0 | .all | 0.000000000 | 0 | \n", "| A | integer | p25 | .all | 0.000000000 | 0 | \n", "| A | integer | median | .all | 0.000000000 | 0 | \n", "| A | integer | p75 | .all | 1.000000000 | 1 | \n", "| A | integer | p100 | .all | 1.000000000 | 1 | \n", "| A | integer | hist | .all | NA | ▇▁▁▁▁▁▁▅ | \n", "| Y | numeric | missing | .all | 0.000000000 | 0 | \n", "| Y | numeric | complete | .all | 100.000000000 | 100 | \n", "| Y | numeric | n | .all | 100.000000000 | 100 | \n", "| Y | numeric | mean | .all | 0.959493484 | 0.96 | \n", "| Y | numeric | sd | .all | 0.914968502 | 0.91 | \n", "| Y | numeric | p0 | .all | -1.223206222 | -1.22 | \n", "| Y | numeric | p25 | .all | 0.295534856 | 0.3 | \n", "| Y | numeric | median | .all | 0.923989197 | 0.92 | \n", "| Y | numeric | p75 | .all | 1.618579997 | 1.62 | \n", "| Y | numeric | p100 | .all | 3.165030308 | 3.17 | \n", "| Y | numeric | hist | .all | NA | ▁▃▅▇▅▆▂▁ | \n", "\n", "\n" ], "text/plain": [ " variable type stat level value formatted\n", "1 W numeric missing .all 0.000000000 0 \n", "2 W numeric complete .all 100.000000000 100 \n", "3 W numeric n .all 100.000000000 100 \n", "4 W numeric mean .all 0.508036909 0.51 \n", "5 W numeric sd .all 0.287817470 0.29 \n", "6 W numeric p0 .all 0.008145839 0.0081 \n", "7 W numeric p25 .all 0.262854320 0.26 \n", "8 W numeric median .all 0.519560866 0.52 \n", "9 W numeric p75 .all 0.736241402 0.74 \n", "10 W numeric p100 .all 0.997624422 1 \n", "11 W numeric hist .all NA ▆▇▅▇▆▇▆▆ \n", "12 A integer missing .all 0.000000000 0 \n", "13 A integer complete .all 100.000000000 100 \n", "14 A integer n .all 100.000000000 100 \n", "15 A integer mean .all 0.350000000 0.35 \n", "16 A integer sd .all 0.479372485 0.48 \n", " [ reached getOption(\"max.print\") -- omitted 17 rows ]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# let's skim over the observed data\n", "skim(data_O_obs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that in our function, we specified the vector `W` as above. However, when we generated `A`, rather than giving the `prob` argument of the `rbinom` function a numeric, we gave it a vector `plogis(-1 + W)`. The function is smart enough to know that when we give it a scalar argument for `size`, but a vector argument for `prob`, we mean that we want all `n` objects to be random draws from a binomimal distribution with `size = 1`, but whose probabilities vary according to the vector `plogis(-1 + W)`. \n", "\n", "Our function returned a `tibble` object. These objects in `R` can be useful for interfacing with certain functions like `glm`, that we'll meet in a minute (just like the canonical `data.frame`). These objects consist of named vectors of the same length that are stored as columns. Using the `$` operator accesses the a named attribute of the object. In addition, `tibble`s form the core object of the Tidyverse, a useful set of tools with which you should strive to be familiar." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We conclude this section with a quick statement of a fact that may be obvious but bears mention. We have not idiot-proofed the function `make_data_O`. That is, the function expects a numeric input `n` and if we give it something crazy, the results will be unexpected. We can see this like so" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Warning message in runif(n, min = 0, max = 1):\n", "“NAs introduced by coercion”" ] }, { "ename": "ERROR", "evalue": "Error in runif(n, min = 0, max = 1): invalid arguments\n", "output_type": "error", "traceback": [ "Error in runif(n, min = 0, max = 1): invalid arguments\nTraceback:\n", "1. make_data_O(n = \"something crazy\")", "2. runif(n, min = 0, max = 1) # at line 3 of file " ] } ], "source": [ "make_data_O(n = \"something crazy\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above, the function tried to pass the character object `n` into `runif`, which threw an error. If only you will ever use your function, it may not be worth your time to include sanity checks; however, if others will use your function, it's (usually) worth it to build in some checks of function inputs and throw errors if input is unexpected. Here's an example:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# define a function that takes as input n (a numeric)\n", "make_sane_data_O <- function(n){\n", " # check whether n is of class numeric\n", " if(class(n) != \"numeric\"){\n", " stop(\"n is not numeric. Can't simulate data.\")\n", " }\n", " W <- runif(n, min = 0, max = 1)\n", " # plogis is the pdf of the logistic distribution and \n", " # plogis(x) = expit(x) = exp(x)/(1+exp(x))\n", " A <- rbinom(n, size = 1, prob = plogis(-1 + W))\n", " Y <- rnorm(n, mean = W + A, sd = sqrt(1/2))\n", " \n", " # return a data.frame object with named columns W, A, Y\n", " return(as_tibble(list(W = W, A = A, Y = Y)))\n", "}" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "ename": "ERROR", "evalue": "Error in make_sane_data_O(n = \"something crazy\"): n is not numeric. Can't simulate data.\n", "output_type": "error", "traceback": [ "Error in make_sane_data_O(n = \"something crazy\"): n is not numeric. Can't simulate data.\nTraceback:\n", "1. make_sane_data_O(n = \"something crazy\")", "2. stop(\"n is not numeric. Can't simulate data.\") # at line 5 of file " ] } ], "source": [ "make_sane_data_O(n = \"something crazy\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Resources:\n", "* [_R for Reproducible Scientific Analysis_ (Software Carpentry)](https://swcarpentry.github.io/r-novice-gapminder/)\n", "* [Tools for Reproducible Research (Karl Broman, UW-Madison)](http://kbroman.org/Tools4RR/)\n", "* [_Version Control with git_ (Software Carpentry)](https://swcarpentry.github.io/git-novice/)\n", "* [_Happy git and GitHub for the userR_ (Jenny Bryan)](http://happygitwithr.com/)\n", "* [git for Humans (Alice Bartlett)](https://speakerdeck.com/alicebartlett/git-for-humans)\n", "* [Introduction to git (Berkeley Stat 259, KJ Millman)](http://www.jarrodmillman.com/rcsds/standard/git-intro.html)\n", "* [Interactive git Tutorial (GitHub)](https://try.github.io/levels/1/challenges/1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Appendix: Ceci n'est pas une pipe.\n", "\n", "* See http://r4ds.had.co.nz/pipes.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to GLMs in R\n", "## Lab 02 for PH 290: Targeted Learning in Biomedical Big Data\n", "### Author: [Nima Hejazi](https://nimahejazi.org)\n", "### Date: 17 January 2017\n", "### Attribution: adapted from source materials by [David Benkeser](https://github.com/benkeser)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## III. Fitting basic GLMs\n", "\n", "The first lab assignment asks you to fit several [generalized linear models](https://en.wikipedia.org/wiki/Generalized_linear_model). A generalized linear model for an observation $O=(W,A,Y)$ assumes that the conditional distribution of $Y$ given $A, W$ has some parameteric distribution (typically one from the Exponential Family) and that the conditional mean of $Y$ given $A,W$ can be written as\n", "\n", "$$ g[E(Y | A, W)] = \\beta_0 + \\beta_1 A + \\beta_2 W \\ . $$\n", "\n", "There is typically a choice of the link function $g()$ that is associated with a particular parametric family; this is known as the canonical link function. For example, when $Y \\in \\{0,1\\}$ we assume that $Y$ is Bernoulli distribution with $g(x) = log[x/(1-x)]$ -- also known as the logit link function. `R` has so-called `family` objects built in that can identify these relationships." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", "Family: binomial \n", "Link function: logit \n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "\n", "Family: gaussian \n", "Link function: identity \n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "\n", "Family: poisson \n", "Link function: log \n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "\n", "Family: Gamma \n", "Link function: inverse \n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# for logisitic regression\n", "binomial()\n", "# for linear regression\n", "gaussian()\n", "# for poisson regression\n", "poisson()\n", "# for Gamma regression (make sure G is capital!)\n", "Gamma()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The unknown parameters of a GLM are typically estimated via maximum likelihood, which can be achieved in `R` through the use of the `glm` function. Below, we illustrate a couple different ways to call `glm`. " ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
    \n", "\t
  1. 'glm'
  2. \n", "\t
  3. 'lm'
  4. \n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 'glm'\n", "\\item 'lm'\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 'glm'\n", "2. 'lm'\n", "\n", "\n" ], "text/plain": [ "[1] \"glm\" \"lm\" " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "\n", "Call:\n", "glm(formula = \"Y ~ A + W\", family = gaussian(), data = .)\n", "\n", "Deviance Residuals: \n", " Min 1Q Median 3Q Max \n", "-1.51390 -0.44471 -0.03176 0.39016 1.61826 \n", "\n", "Coefficients:\n", " Estimate Std. Error t value Pr(>|t|)\n", "(Intercept) 0.1817 0.1343 1.353 0.1792\n", "A 1.2030 0.1395 8.626 0.000000000000123\n", "W 0.7022 0.2323 3.023 0.0032\n", "\n", "(Dispersion parameter for gaussian family taken to be 0.4217676)\n", "\n", " Null deviance: 82.880 on 99 degrees of freedom\n", "Residual deviance: 40.911 on 97 degrees of freedom\n", "AIC: 202.41\n", "\n", "Number of Fisher Scoring iterations: 2\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# we'll use our data.frame object dat from before to fit the\n", "# linear regression Y = \\beta_0 + \\beta_1 A + \\beta_2 W\n", "\n", "# first we can specify a formula and a data.frame object\n", "mod_1 <- data_O_obs %>%\n", " glm(formula = \"Y ~ A + W\", data = ., family = gaussian())\n", "\n", "# check the class\n", "class(mod_1)\n", "\n", "# summarize the results\n", "summary(mod_1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How about logistic regression this time? We'll need new data though..." ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# define a function that takes as input n (a numeric)\n", "# and returns n copies of O=(W, A, Y) where Y \\in \\{0,1\\}\n", "make_binary_O <- function(n) {\n", " W <- runif(n, min = 0, max = 1)\n", " # plogis is the pdf of the logistic distribution and \n", " # plogis(x) = expit(x) = exp(x)/(1+exp(x))\n", " A <- rbinom(n, size = 1, prob = plogis(-1 + W))\n", " # now Y is binary\n", " Y <- rbinom(n, size = 1, prob = plogis(-1 + W + A))\n", " \n", " # return a data.frame object with named columns W, A, Y\n", " return(as_tibble(list(W = W, A = A, Y = Y)))\n", "}" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", "Call:\n", "glm(formula = \"Y ~ A + W\", family = binomial(), data = .)\n", "\n", "Deviance Residuals: \n", " Min 1Q Median 3Q Max \n", "-1.3282 -0.8670 -0.7749 1.1608 1.7023 \n", "\n", "Coefficients:\n", " Estimate Std. Error z value Pr(>|z|)\n", "(Intercept) -1.2185 0.4602 -2.648 0.0081\n", "A 0.9839 0.4310 2.282 0.0225\n", "W 0.6116 0.7424 0.824 0.4101\n", "\n", "(Dispersion parameter for binomial family taken to be 1)\n", "\n", " Null deviance: 132.81 on 99 degrees of freedom\n", "Residual deviance: 126.20 on 97 degrees of freedom\n", "AIC: 132.2\n", "\n", "Number of Fisher Scoring iterations: 4\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# generate the data and fit logistic regression\n", "binary_O <- make_binary_O(n = 100)\n", "\n", "# remember, we need family = binomial for logistic regression\n", "mod_2 <- binary_O %>%\n", " glm(\"Y ~ A + W\", data = ., family = binomial())\n", "\n", "# summarize the model output\n", "summary(mod_2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The method `summary.glm` returns Wald-style tests that the estimated regression parameters are equal to zero. Let\\'s figure out how to access those values.\n", "\n", "To access these values, we can look inside the model object like so" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
    \n", "\t
  1. 'aic'
  2. \n", "\t
  3. 'boundary'
  4. \n", "\t
  5. 'call'
  6. \n", "\t
  7. 'coefficients'
  8. \n", "\t
  9. 'contrasts'
  10. \n", "\t
  11. 'control'
  12. \n", "\t
  13. 'converged'
  14. \n", "\t
  15. 'data'
  16. \n", "\t
  17. 'deviance'
  18. \n", "\t
  19. 'df.null'
  20. \n", "\t
  21. 'df.residual'
  22. \n", "\t
  23. 'effects'
  24. \n", "\t
  25. 'family'
  26. \n", "\t
  27. 'fitted.values'
  28. \n", "\t
  29. 'formula'
  30. \n", "\t
  31. 'iter'
  32. \n", "\t
  33. 'linear.predictors'
  34. \n", "\t
  35. 'method'
  36. \n", "\t
  37. 'model'
  38. \n", "\t
  39. 'null.deviance'
  40. \n", "\t
  41. 'offset'
  42. \n", "\t
  43. 'prior.weights'
  44. \n", "\t
  45. 'qr'
  46. \n", "\t
  47. 'R'
  48. \n", "\t
  49. 'rank'
  50. \n", "\t
  51. 'residuals'
  52. \n", "\t
  53. 'terms'
  54. \n", "\t
  55. 'weights'
  56. \n", "\t
  57. 'xlevels'
  58. \n", "\t
  59. 'y'
  60. \n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 'aic'\n", "\\item 'boundary'\n", "\\item 'call'\n", "\\item 'coefficients'\n", "\\item 'contrasts'\n", "\\item 'control'\n", "\\item 'converged'\n", "\\item 'data'\n", "\\item 'deviance'\n", "\\item 'df.null'\n", "\\item 'df.residual'\n", "\\item 'effects'\n", "\\item 'family'\n", "\\item 'fitted.values'\n", "\\item 'formula'\n", "\\item 'iter'\n", "\\item 'linear.predictors'\n", "\\item 'method'\n", "\\item 'model'\n", "\\item 'null.deviance'\n", "\\item 'offset'\n", "\\item 'prior.weights'\n", "\\item 'qr'\n", "\\item 'R'\n", "\\item 'rank'\n", "\\item 'residuals'\n", "\\item 'terms'\n", "\\item 'weights'\n", "\\item 'xlevels'\n", "\\item 'y'\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 'aic'\n", "2. 'boundary'\n", "3. 'call'\n", "4. 'coefficients'\n", "5. 'contrasts'\n", "6. 'control'\n", "7. 'converged'\n", "8. 'data'\n", "9. 'deviance'\n", "10. 'df.null'\n", "11. 'df.residual'\n", "12. 'effects'\n", "13. 'family'\n", "14. 'fitted.values'\n", "15. 'formula'\n", "16. 'iter'\n", "17. 'linear.predictors'\n", "18. 'method'\n", "19. 'model'\n", "20. 'null.deviance'\n", "21. 'offset'\n", "22. 'prior.weights'\n", "23. 'qr'\n", "24. 'R'\n", "25. 'rank'\n", "26. 'residuals'\n", "27. 'terms'\n", "28. 'weights'\n", "29. 'xlevels'\n", "30. 'y'\n", "\n", "\n" ], "text/plain": [ " [1] \"aic\" \"boundary\" \"call\" \n", " [4] \"coefficients\" \"contrasts\" \"control\" \n", " [7] \"converged\" \"data\" \"deviance\" \n", "[10] \"df.null\" \"df.residual\" \"effects\" \n", "[13] \"family\" \"fitted.values\" \"formula\" \n", "[16] \"iter\" \"linear.predictors\" \"method\" \n", "[19] \"model\" \"null.deviance\" \"offset\" \n", "[22] \"prior.weights\" \"qr\" \"R\" \n", "[25] \"rank\" \"residuals\" \"terms\" \n", "[28] \"weights\" \"xlevels\" \"y\" " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "objects(mod_2)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\t
(Intercept)
\n", "\t\t
-1.21847334364193
\n", "\t
A
\n", "\t\t
0.98385648844088
\n", "\t
W
\n", "\t\t
0.611577019152868
\n", "
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[(Intercept)] -1.21847334364193\n", "\\item[A] 0.98385648844088\n", "\\item[W] 0.611577019152868\n", "\\end{description*}\n" ], "text/markdown": [ "(Intercept)\n", ": -1.21847334364193A\n", ": 0.98385648844088W\n", ": 0.611577019152868\n", "\n" ], "text/plain": [ "(Intercept) A W \n", " -1.2184733 0.9838565 0.6115770 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "## looks like we might want \"coefficients\"\n", "mod_2$coefficients" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## IV. Simulation basics\n", "We now have all the components we need to execute a basic simulation. In frequentist statistics, we care a lot about what happens to a given summary measure of data (i.e., a statistic) over repeated experiments. Whereas in real life, we typically only get data from one or two repeated experiments where we don't know the real answer, in simulations we can generate many, many experiments where we do know the answer. \n", "\n", "Suppose we are interested in the power of a statistical test. That is, if the null hypothesis is not true, what is the probability (i.e., proportion of times over repeated experiments) that we correctly reject the null hypothesis. Below, we define a couple functions that we will combine into a function that executes one simulation (i.e., one analysis for one data set)." ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/html": [ "TRUE" ], "text/latex": [ "TRUE" ], "text/markdown": [ "TRUE" ], "text/plain": [ "[1] TRUE" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# a function that takes as input 'fm', a glm object and returns whether\n", "# or not the p-value for the coefficient associated with A is \n", "# rejected at the 0.05 level\n", "is_null_rejected <- function(fm) {\n", " sumFm <- summary(fm)\n", " p_val <- sumFm$coefficients[\"A\",\"Pr(>|z|)\"]\n", " return(p_val < 0.05)\n", "}\n", "\n", "\n", "# a function that takes as input 'n', a scalar, simulates a data set\n", "# using makeBinaryO, estimates a main-terms glm and determines whether \n", "# the coefficient associated with A is rejected at the 0.05 level\n", "run_one_sim <- function(n) {\n", " # make data set\n", " dat <- make_binary_O(n = n)\n", " # fit glm\n", " fm <- glm(\"Y ~ A + W\", data = dat, family = binomial())\n", " # get hypothesis test result\n", " out <- is_null_rejected(fm)\n", " # return\n", " return(out)\n", "}\n", "\n", "# try it out\n", "run_one_sim(n = 100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we need to do this a large number of times. There are many ways to do this, a couple illustrated below. First, we start with a basic `for` loop." ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
    \n", "\t
  1. TRUE
  2. \n", "\t
  3. FALSE
  4. \n", "\t
  5. TRUE
  6. \n", "\t
  7. TRUE
  8. \n", "\t
  9. FALSE
  10. \n", "\t
  11. TRUE
  12. \n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item TRUE\n", "\\item FALSE\n", "\\item TRUE\n", "\\item TRUE\n", "\\item FALSE\n", "\\item TRUE\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. TRUE\n", "2. FALSE\n", "3. TRUE\n", "4. TRUE\n", "5. FALSE\n", "6. TRUE\n", "\n", "\n" ], "text/plain": [ "[1] TRUE FALSE TRUE TRUE FALSE TRUE" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
    \n", "\t
  1. FALSE
  2. \n", "\t
  3. TRUE
  4. \n", "\t
  5. TRUE
  6. \n", "\t
  7. TRUE
  8. \n", "\t
  9. FALSE
  10. \n", "\t
  11. TRUE
  12. \n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item FALSE\n", "\\item TRUE\n", "\\item TRUE\n", "\\item TRUE\n", "\\item FALSE\n", "\\item TRUE\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. FALSE\n", "2. TRUE\n", "3. TRUE\n", "4. TRUE\n", "5. FALSE\n", "6. TRUE\n", "\n", "\n" ], "text/plain": [ "[1] FALSE TRUE TRUE TRUE FALSE TRUE" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "The power of the test is 0.58" ] } ], "source": [ "# number of simulations, in general more is better to decrease\n", "# the Monte Carlo error in your answer (i.e., error due to random seed used)\n", "n_sim <- 100\n", "\n", "# let's set a seed so the results are reproducible\n", "set.seed(1234)\n", "\n", "## get results using a for() loop\n", "# create empty vector of length nSim that results will go into\n", "results_vector <- rep(NA, n_sim)\n", "for(i in seq_len(n_sim)){\n", " results_vector[i] <- run_one_sim(n = 100)\n", "}\n", "\n", "# check out first and last results\n", "head(results_vector) \n", "tail(results_vector)\n", "\n", "# power is the average number of times you reject the null\n", "cat(\"The power of the test is \" , mean(results_vector))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we use `R`'s built in function `replicate`. This function repeatedly evaluates `expr` a total of `n` times and stores the output. It's exactly like a `for` loop, but with shorter syntax. " ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The power of the test is 0.58" ] } ], "source": [ "# again set a seed so reproducible\n", "set.seed(1234)\n", "# replicate\n", "results_rep <- replicate(n = n_sim, expr = run_one_sim(n = 100))\n", "\n", "cat(\"The power of the test is \" , mean(results_rep))" ] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.4.3" } }, "nbformat": 4, "nbformat_minor": 2 }