# Introduction
## Motivation: How to create documents?
* Types and distinctions
* Formal Documents: Journal articles, books, book chapters, theses,
consulting reports, etc.
* Informal documents: preliminary analyses, statistical homework,
* Online content: web pages, blog posts, forum posts
* Browser metaphor versus page/slide-based metaphor
* Context
* When to use reproducible analysis?
* When to use knitr with R Markdown or LaTeX?
## What is *reproducible analysis*?
* Reproducibility varies on a continuum
* One particular form:
* code transforms raw data and meta-data into processed data,
* code runs analyses on the data, and
* code incorporates analyses into a report
* Ideally, the process involves a one-click build
* Public sharing of document, code, and data is optional, but forms part of gold
standard of scientific openness
* Goes by many names, particularly "reproducible research", but I prefer "reproducible analysis".
\tiny{
See also: \url{http://stats.stackexchange.com/a/15006/183}
\url{https://github.com/jeromyanglim/rmarkdown-rmeetup-2012/issues/11}}
## Aims of reproducible analysis
* Ability to reproduce analysis
* Increase accuracy
* Ability to verify analyses are consistent with intentions
* Ability to review analysis choices
* Increase clarity of communication
* Increased trustworthiness
* Increased accuracy +
* Ability for others to verify
* Extensibility
* Ability to easily modify or re-use existing analyses
## Reproducible analysis in R
### Typically:
* Combine R and plain text file format to produce documents (e.g., pdfs,
HTML documents, etc.)
### Popular Instances
* Sweave
* brew
* knitr
\tiny{see also \url{http://cran.r-project.org/web/views/ReproducibleResearch.html}}
## Installation of software used in this talk
* R:
* R Studio:
* In R:
* `install.packages("knitr)`
* `install.packages("markdown")`
* `install.packages("xtable")`
* `install.packages("ggplot2")`
* `install.packages("lattice")`
* pandoc:
*
* LaTeX distribution:
* E.g., TeXLive, MikTeX
# Markdown
## What is markdown?
* Simple, readable, intuitive, light-weight markup
* Convert to HTML
* Raw HTML can be interspersed to add functionality
* Various extensions and flaours of markdown
* Popular on websites: e.g., StackOverflow, GitHub, Reddit
\tiny{see also: \url{http://daringfireball.net/projects/markdown/ }}
## Headings
\includegraphics[width=4in]{figures/headings.png}
## Basic formatting
\includegraphics[width=4in]{figures/basic-formatting.png}
## Paragraphs
\includegraphics[width=4in]{figures/paragraphs.png}
## Dot points
\includegraphics[width=4in]{figures/dot-points.png}
## Equations
\includegraphics[width=4in]{figures/equations.png}
* Uses MathJaX to render LaTeX (and other) equations
* Inserts MathJaX script reference into HTML header
\tiny{getting started: \url{http://jeromyanglim.blogspot.com.au/2010/10/getting-started-with-writing.html}}
## Hyperlinks
\includegraphics[width=4in]{figures/links.png}
## Images
\includegraphics[width=4in]{figures/images.png}
## Code
\includegraphics[width=4in]{figures/code.png}
## Quotes
\includegraphics[width=4in]{figures/quote.png}
## Tables
\includegraphics[width=4in]{figures/tables.png}
## Raw HTML
\includegraphics[width=4in]{figures/html.png}
# knitr and R Markdown
## knitr, R Markdown, and R Studio
* knitr: R Package developed by Yihui Xie for weaving R (and other languages) with various markup languages
* R Markdown: A file format that combines R code chunks and markdown text which is converted by knitr into markdown, and other formats (e.g., HTML, pdf, etc.).
* R Studio: Open source, cross-platform IDE for R.
## Benefits of knitr
* knitr supports many markups: LaTeX, Markdown, HTML, reStructuredText
* knitr has really nice defaults
* Tidy placement of generated files
* Simplified figure production
* automatically print ggplot2 and lattice figures
* print figures by default
* permit interspersing of figures and console output
* Greater extensibility:
* output options
* supports languages other than R
* Simplified caching
* And more:
## Rstudio
* Benefits of Rstudio as IDE for R
* Open source
* Works on Linux, Mac, and Windows
* Many useful features
* It just works
* Tight integration with knitr
* But many other options
* Emacs with ESS
* Vim with R plugin
* Eclipse with StatET
* etc.
## RMarkdown Examples
* *Introduction to R Markdown*
* *Statistics homework example*
* *Analysis of Winter Olympic Medals Example*
## Rstudio screenshot
\includegraphics[width=3in]{figures/rstudio-screenshot.png}
## R Code chunks
see http://yihui.name/knitr/options
```{r my_chunk_name, some_option='some_value'}
some_r_code
```
## R Code chunks options
### Global options:
`r opts_chunk$set(opt = value)` # general form
`r opts_chunk$set(cache=TRUE)` # e.g, global cache
### Some useful local options
* Hide console input: `echo=FALSE`
* Hide assorted messages: `warning=FALSE, error=FALSE, message=FALSE`
* Hide console output: `results="hide"`
* Display console input as is: `tidy=FALSE`
* Output raw markup: `results="asis"`
## Inline R Code
R Markdown `r 2 + 2` `r I(2+2)`
Markdown `4` 4
HTML 4
4
## Figures
* Support for multiple figures in a code block
* also see e.g., `par(mfrow=c(2,2))` or `grid.arrange`
* Figures and console output can be interspersed in a code chunk
* Various code chunk options
* see http://yihui.name/knitr/options
* `fig.width` and `fig.height`
* `dev` defaults to pdf for LaTeX and png for HTML/markdown
## Tables
* Many options for creating HTML Tables:
* R packages: `xtable`, `googleVis`, `R2HTML`, `hwriter`
* markdown extentions: github, pandoc
* Custom R code
* `xtable` is a reasonable option
* For informal reports just use console output
* css can be added later to control table appearance
* If you require sophisticated tables, you may want to switch to LaTeX
## `xtable` example
print(xtable(my_data_frame, caption = "My Caption",
digits = 3), type = "html",
caption.placement = "top",
html.table.attributes =
"style=\"border: 1px solid black;\"")
\centerline{\includegraphics[height=1.5in]{figures/simple_table.png}}
## Caching
Basic workflow:
* If knitting is quick, don't cache.
* If knitting takes more than ten seconds add \texttt{\`}`r
opts_chunk$set(cache=TRUE)`\texttt{\`} to the top of R Markdown file.
* If caching is causing problems, delete contents of `cache` folder,
* But if caching is causing problems and knitting takes a long time, name R code
chunks and use the `dependson` option in knitr
(see http://yihui.name/knitr/options).
Naming also permits selective deletion of named R code chunks in the
cache directory.
## R package: `markdown`
* Maintained by Jeffrey Horner; Developed by devloped JJ Allaire, Jeffrey
Horner, Vicent Marti, and Natacha Porte
* R Package that creates more options for converting Markdown to HTML
* `markdownToHTML("file.md", "file.html", options=c(...))`
* The default options are `"hard_wrap", "use_xhml", "smartypants", "base64_images"`
## Replicating R Studio's `Knit to HTML`
require(knitr) # for knitting from rmd to md
require(markdown) # for md to html
knit('test.rmd', 'test.md') # creates md
markdownToHTML('test.md', 'test.html') # create html
browseURL(paste('file://',
file.path(getwd(),'test.html'),
sep='')) # open file in browser
see `?markdownHTMLOptions` for more options. E.g.,
markdownToHTML('test.md', 'test.html',
options='fragment_only')
## pandoc
* pandoc is a library and command-line tool for converting between many
document formats (e.g., HTML, markdown, pdf, LaTeX, docx; also supports multiple plain text slide formats such as beamer)
* Lots of options
* Often requires thought in order to minimise conversion issues
### Example
pandoc -s file.html -o file.pdf
## One-click build
* For simple documents, click `knit to HTML` in RStudio
* For complex documents use a command-line option:
* e.g., `makefile`, `Rscript`, etc.
* combine with `pandoc`, `knitr` options, `markdown` options, text manipulation tools (e.g., sed, awk, scripting languages) etc. to flexibly produce a varity of documents
# LaTeX
## Example of LaTeX
*If time permits, show example of knitr with LaTeX*
# Conclusion
## Final thoughts
* knitr and R Markdown
* It makes reproducible analysis as simple as one click
* Great tool for:
* quick analyses for self and colleagues
* doing homework
* creating teaching resources
* blog posts, websites, etc.
* Scope to make more complex documents, but at a certain point it may be worth exploring other tools
* knitr and R LaTeX
* Great for journal articles, theses, books (e.g., citations, cross-references, printed works, equations)
* As your needs get more complex
* pandoc, makefiles, knitr options, markdown package options, scripts, etc.
## Links
* knitr:
* R Studio:
* R Markdown with R Studio:
* My Posts
### Places to ask questions
* R on StackOverflow:
* LaTeX:
* knitr:
## Thank You
\begin{center}
\LARGE{Questions?}
\end{center}