---
title: 'Literate Programming: Creating reproducible reports using R Markdown'
author: "Zena Lapp"
date: "4/20/2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Intro

- How do you usually share data analyses with your collaborators? Many people share them through Word documents, spreadsheets, attachments, etc.
- In R Markdown, you can incorporate ordinary text (e.g. experimental methods, analysis and discussion of results) alongside code and figures! (Some people write entire manuscripts in R Markdown.)
- This is useful for writing reproducible analysis reports and publications, sharing work with collaborators, writing up homework, and keeping a bioinformatics notebook.
- Because the code is embedded in the document, the figures are *reproducible*.
    - If you find an error or want to add more to the report, you can just re-run the document and you'll have updated figures!
- This concept of combining text and code is called "literate programming".
- To do this we use R Markdown, which combines Markdown (a lightweight syntax for formatting plain text) with R.

# Creating an R Markdown file

- Open RStudio
- File → New File → R Markdown
- Give your document a title
- Keep the default output format (HTML). You need TeX installed if you want to generate PDF documents.
- R Markdown files always end in `.Rmd`

# Basic components of R Markdown

- Header: instructions that tell R what type of document to create and which options to use (e.g. title, author, date)
    - Key-value pairs
    - At the start of the file
    - Between lines of `---`

```
---
title: 'Literate Programming: Creating reproducible reports using R Markdown'
author: "Zena Lapp"
date: "4/20/2019"
output: html_document
---
```

- Setup (technically R code): sets up options for all code chunks (e.g.
show code in output file)
- Text: narration formatted with Markdown
- Code chunks: embedded code

```{r, comment=NA, echo=FALSE, collapse=TRUE}
# Print a literal code chunk so that the chunk syntax itself appears in the output
cat('```{r}\n# Your code here\n```')
```

- R Markdown uses the location of the `.Rmd` file as the working directory (where you are on your computer)

# Markdown

## Text formatting and bullet points

- Bold text with double asterisks **like this**
- Italicize text with underscores _like this_
- Add code-type font with backticks `like this`
- Make a list using dashes (`-`) or asterisks (`*`)
- You can make a numbered list using `1.` for each item in your list (like the example below)

## Lists

Steps to generate an R Markdown file:

1. Open a `.Rmd` file in RStudio
1. Write text and code
1. Compile using Knit!

## Hyperlinks to useful information about R Markdown

- You can make a hyperlink to a website like this: `[text to show](http://the-web-page.com)`.
- [Here](https://www.markdownguide.org) is a useful website on Markdown
- [Here](https://www.rstudio.com/wp-content/uploads/2016/03/rmarkdown-cheatsheet-2.0.pdf) is a useful R Markdown cheat sheet
- [Here](https://swcarpentry.github.io/r-novice-gapminder/15-knitr-markdown/) is the Software Carpentry lesson on R Markdown

## Including images

- You can include an image file like this: `![caption](path/to/image.png)`

## Subscripts and superscripts

- Subscript: `F~2~` renders as F~2~
- Superscript: `F^2^` renders as F^2^

## Math

You can use `$ $` and `$$ $$` to insert math equations. Here is an inline example: $E = mc^2$. Here is a block example:

$$y = \mu + \sum_{i=1}^p \beta_i x_i + \epsilon$$

# R code chunks and inline code

The output of each code chunk is embedded in the document.

```{r function}
# Draw n random numbers from a standard normal distribution,
# plot their histogram, and return the numbers
hist_rnorm = function(n){
  nums = rnorm(n)
  hist(nums)
  return(nums)
}
```

```{r n10, echo=TRUE}
set.seed(0)  # fix the seed so the random numbers are reproducible
n = 10
nums = hist_rnorm(n)
```

Here I generated `r n` random numbers from a $N(\mu=0,\sigma=1)$ distribution. The mean of the random numbers is `r mean(nums)`. The histogram also doesn't look like a normal distribution.
It seems like I need a larger sample size to get a good estimate of the mean for this distribution.

```{r n1000, echo=TRUE}
n = 1000
nums = hist_rnorm(n)
```

Next I generated `r n` random numbers from the same distribution. The mean is `r mean(nums)`, which is much closer to zero, and the histogram looks much more normal.

# To compile

- Click "Knit" at the top!
- The R code is executed and the output document is produced!

# Exercise

- Create a new R Markdown document
    - Delete everything except the header
- At the top, summarize what you did today using Markdown syntax
- Use a built-in dataset to make a plot: run `data()` to check out what datasets are available.
    - OR download the gapminder data:

```{r}
gapminder = read.csv("https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder_data.csv")
```

- Plot GDP per capita vs. life expectancy

```{r}
plot(gapminder$gdpPercap, gapminder$lifeExp,
     xlab = "GDP per capita", ylab = "Life expectancy")
```

- Plot life expectancy by continent

```{r life_exp, warning=FALSE}
library(ggplot2)
# One histogram of life expectancy per continent
ggplot(data = gapminder, aes(x = lifeExp, fill = continent)) +
  geom_histogram(bins = 100) +
  facet_wrap(~continent)
```
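
As an optional follow-up to the random-number example in the section on R code chunks, here is a minimal sketch of why a larger sample gives a better estimate of the mean. The sample sizes and the variable names (`sample_sizes`, `sample_means`, `standard_errors`) are assumptions chosen for illustration and are not part of the original lesson. For draws from $N(\mu=0,\sigma=1)$, the standard error of the sample mean is $\sigma/\sqrt{n} = 1/\sqrt{n}$, so it shrinks as $n$ grows.

```{r sample_size_sketch}
set.seed(0)  # fix the seed so the random numbers are reproducible

# Hypothetical sample sizes to compare (not from the original lesson)
sample_sizes = c(10, 100, 1000, 10000)

# Mean of n draws from N(0, 1), one value per sample size
sample_means = sapply(sample_sizes, function(n) mean(rnorm(n)))

# Theoretical standard error of the mean for N(0, 1): sigma / sqrt(n) = 1 / sqrt(n)
standard_errors = 1 / sqrt(sample_sizes)

# Show the results side by side
data.frame(n = sample_sizes,
           sample_mean = sample_means,
           standard_error = standard_errors)
```

The standard error always decreases with $n$, but the sample mean from a single run does not have to get closer to zero at every step, because each mean is itself a random quantity.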