---
title: R Markdown Tutorial
author: "Author: Your Name"
date: "Last update: `r format(Sys.time(), '%d %B, %Y')`"
output:
  BiocStyle::html_document:
    toc: true
    toc_float:
      collapsed: true
      smooth_scroll: true
    toc_depth: 3
    fig_caption: yes
    code_folding: show
    number_sections: false
fontsize: 14pt
bibliography: bibtex.bib
type: docs
weight: 11
---
Source code downloads:
[ [.Rmd](https://raw.githubusercontent.com/tgirke/GEN242//main/content/en/tutorials/rmarkdown/rmarkdown.Rmd) ]
[ [.html](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/rmarkdown/rmarkdown.html) ]
[ [.R](https://raw.githubusercontent.com/tgirke/GEN242//main/content/en/tutorials/rmarkdown/rmarkdown.R) ]
## R Markdown Overview
R Markdown combines markdown (an easy-to-write plain text format) with embedded
R code chunks. When compiling R Markdown documents, the code components can be
evaluated so that both the code and its output can be included in the final
document. This makes analysis reports highly reproducible because they can be
regenerated automatically when the underlying R code or data change. R Markdown
documents (`.Rmd` files) can be rendered to various formats including HTML and
PDF. The R code in an `.Rmd` document is processed by `knitr`, while the
resulting `.md` file is rendered by `pandoc` to the final output formats
(_e.g._ HTML or PDF). Historically, R Markdown is an extension of the older
`Sweave/Latex` environment. Rendering of mathematical expressions and reference
management are also supported by R Markdown using embedded Latex syntax and
Bibtex, respectively. A new and related publishing environment is [Quarto](https://quarto.org/docs/tools/neovim.html) (not covered here).
## Quick Start
### Install R Markdown
To work with this tutorial, the `rmarkdown` package needs to be installed.
```{r install_rmarkdown, eval=FALSE}
install.packages("rmarkdown")
```
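When setup code is run repeatedly, it can be convenient to install the package only if it is missing. The following is a minimal sketch of this pattern. It assumes that a CRAN mirror is configured for the R session.

```{r sketch_install_if_missing, eval=FALSE}
## Install rmarkdown only if it is not already available
if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    install.packages("rmarkdown")
}
## Report the installed version
packageVersion("rmarkdown")
```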
### Initialize a new R Markdown (`Rmd`) script
To minimize typing, it can be helpful to start with an R Markdown template and
then modify it as needed. Note that the file name of an R Markdown script needs to
have the extension `.Rmd`. Template files for the following examples are available
here:
+ R Markdown sample script: [`sample.Rmd`](https://raw.githubusercontent.com/tgirke/GEN242/main/static/custom/rmarkdown/sample.Rmd)
+ Bibtex file for handling citations and reference section: [`bibtex.bib`](https://raw.githubusercontent.com/tgirke/GEN242/main/content/en/tutorials/rmarkdown/bibtex.bib)
Users should download these files, open the `sample.Rmd` file with their preferred R IDE
(_e.g._ RStudio, vim or emacs), initialize an R session and then direct their R session to
the location of these two files.
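The download and directory steps can also be performed from within R. The following is a sketch under the assumption that the two template URLs listed above are reachable. The directory name `rmarkdown_demo` is only an illustrative choice.

```{r sketch_download_templates, eval=FALSE}
## Hypothetical working directory for this tutorial (the name is an assumption)
dir.create("rmarkdown_demo", showWarnings = FALSE)
setwd("rmarkdown_demo")
## Download the two template files listed above
base_url <- "https://raw.githubusercontent.com/tgirke/GEN242/main"
download.file(file.path(base_url, "static/custom/rmarkdown/sample.Rmd"), "sample.Rmd")
download.file(file.path(base_url, "content/en/tutorials/rmarkdown/bibtex.bib"), "bibtex.bib")
## Confirm that both files are present in the current directory
file.exists(c("sample.Rmd", "bibtex.bib"))
```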
### Metadata section
The metadata section (YAML header) in an R Markdown script defines how it will be processed and
rendered. The metadata section also includes title, author, and date information as well as
options for customizing the output format. For instance, PDF and HTML output can be defined
with `pdf_document` and `html_document`, respectively. The `BiocStyle::` prefix will use the
formatting style of the [`BiocStyle`](http://bioconductor.org/packages/release/bioc/html/BiocStyle.html)
package from Bioconductor.
```
---
title: "My First R Markdown Document"
author: "Author: First Last"
date: "Last update: `r format(Sys.time(), '%d %B, %Y')`"
output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 3
    fig_caption: yes
fontsize: 14pt
bibliography: bibtex.bib
---
```
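Because the YAML header is nested by indentation, it can help to check how R interprets it. The following sketch writes a small stand-in `.Rmd` file to a temporary location and parses its header with `yaml_front_matter()` from the `rmarkdown` package. The file content is made up for illustration.

```{r sketch_read_yaml, eval=FALSE}
library(rmarkdown)
## Write a small Rmd file with a YAML header to a temporary location
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
    "---",
    "title: \"My First R Markdown Document\"",
    "output:",
    "  html_document:",
    "    toc: true",
    "    toc_depth: 3",
    "bibliography: bibtex.bib",
    "---",
    "",
    "Some text."
), rmd_file)
## Parse the metadata section into a nested R list
meta <- yaml_front_matter(rmd_file)
meta$title
meta$output$html_document$toc_depth
```

Options that are indented under an output format end up as elements of that format in the returned list, whereas top-level fields such as `bibliography` are stored directly in `meta`.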
### Render `Rmd` script
An R Markdown script can be evaluated and rendered with the following `render` command or by pressing the `knit` button in RStudio.
The `output_format` argument defines the format of the output (_e.g._ `html_document` or `pdf_document`). The setting `output_format="all"` will generate
all output formats that are defined in the metadata section of the script, where several output formats can be listed.
```{r render_rmarkdown, eval=FALSE, message=FALSE}
rmarkdown::render("sample.Rmd", clean=TRUE, output_format="BiocStyle::html_document")
```
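Several output formats can also be requested in a single call by passing a vector of format names to `output_format`. The following sketch uses a small stand-in file (`mini.Rmd`, a made-up name) written to a temporary directory, and the formats `html_document` and `md_document`, so that no LaTeX installation is required.

```{r sketch_render_multiple, eval=FALSE}
## Create a minimal Rmd file in a temporary directory
rmd_file <- file.path(tempdir(), "mini.Rmd")
writeLines(c("---", "title: \"Mini\"", "---", "", "Some text."), rmd_file)
## Render the same source to two formats in one call
out_files <- rmarkdown::render(rmd_file,
                               output_format = c("html_document", "md_document"),
                               quiet = TRUE)
## Names of the generated output files
basename(out_files)
```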
The following shows how to run the rendering from the command line. To render to PDF format, use the argument setting: `output_format="pdf_document"`.
```{sh render_commandline, eval=FALSE, message=FALSE}
$ Rscript -e "rmarkdown::render('sample.Rmd', output_format='BiocStyle::html_document', clean=TRUE)"
```
Alternatively, one can use a Makefile to evaluate and render an R Markdown
script. A sample Makefile for rendering the above `sample.Rmd` can be
downloaded [`here`](https://raw.githubusercontent.com/tgirke/GEN242-2018/gh-pages/_vignettes/07_Rbasics/Makefile).
To apply it to a custom `Rmd` file, one needs to open the Makefile in a text
editor and change the value assigned to `MAIN` (line 13) to the base name of
the corresponding `.Rmd` file (_e.g._ assign `systemPipeRNAseq` if the file
name is `systemPipeRNAseq.Rmd`). To execute the `Makefile`, run the following
command from the command-line.
```{sh render_makefile, eval=FALSE, message=FALSE}
$ make -B
```
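The general idea behind a make-based workflow is to rebuild an output only when its source is newer. The following sketch illustrates this timestamp check in R. It does not reproduce the contents of the sample Makefile, and `needs_render` is a hypothetical helper name.

```{r sketch_render_if_outdated, eval=FALSE}
## Return TRUE if the output is missing or older than its source file
needs_render <- function(rmd, out) {
    !file.exists(out) || file.mtime(rmd) > file.mtime(out)
}
## Demonstration with temporary files standing in for sample.Rmd and sample.html
rmd <- tempfile(fileext = ".Rmd")
writeLines("Some text.", rmd)
out <- tempfile(fileext = ".html")
needs_render(rmd, out)   # the output file does not exist yet
```

In practice, the result could guard a call to `rmarkdown::render()`. Note that `make -B` rebuilds all targets regardless of their timestamps.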
### R code chunks
R code chunks can be embedded in an R Markdown script by using three backticks
at the beginning of a new line along with arguments enclosed in curly braces
controlling the behavior of the code. The lines after this opening line contain the
plain R code. A code chunk is terminated by a new line starting with three backticks.
The following shows an example of such a code chunk. Note that the backslashes are
not part of it. They have been added to print the code chunk syntax in this document.
```
```\{r code_chunk_name, eval=FALSE\}
x <- 1:10
```
```
The following lists the most important arguments to control the behavior of R code chunks:
+ `r`: specifies language for code chunk, here R
+ `code_chunk_name`: name of code chunk; this name needs to be unique within an Rmd
+ `eval`: if assigned `TRUE` the code will be evaluated
+ `warning`: if assigned `FALSE` warnings will not be shown
+ `message`: if assigned `FALSE` messages will not be shown
+ `cache`: if assigned `TRUE` results will be cached to reuse in future rendering instances
+ `fig.height`: specifies height of figures in inches
+ `fig.width`: specifies width of figures in inches
For more details on code chunk options see [here](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf).
If document rendering of code chunk sections becomes time consuming due to long computations, one can enable caching to improve performance.
The corresponding [cache options](https://yihui.org/knitr/options/#cache) of the `knitr` package describe how caching works and the cache examples [here](https://yihui.org/knitr/demo/cache/) provide additional details.
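Rather than repeating the same arguments in every chunk, default values can be set once for the whole document with `opts_chunk$set()` from `knitr`, typically in a setup chunk near the top. The values below are illustrative assumptions and not recommendations.

```{r sketch_global_chunk_options, eval=FALSE}
library(knitr)
## Set document-wide defaults; individual chunks can still override them
opts_chunk$set(eval = TRUE, warning = FALSE, message = FALSE,
               cache = FALSE, fig.height = 4, fig.width = 6)
## Inspect one of the current default values
opts_chunk$get("fig.width")
```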
### Learning Markdown
The basic syntax of Markdown and derivatives like kramdown is extremely easy to learn. Rather
than providing another introduction on this topic, here are some useful sites for learning Markdown:
+ [R Markdown Online Book](https://bookdown.org/yihui/rmarkdown/)
+ [Markdown Intro on GitHub](https://guides.github.com/features/mastering-markdown/)
+ [Markdown Cheat Sheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
+ [Markdown Basics from RStudio](http://rmarkdown.rstudio.com/authoring_basics.html)
+ [R Markdown Cheat Sheet](http://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf)
+ [kramdown Syntax](http://kramdown.gettalong.org/syntax.html)
### Tables
There are several ways to render tables. First, they can be printed within the R code chunks. Second,
much nicer formatted tables can be generated with functions from packages such as `knitr` (`kable`), `kableExtra`, `pander` or `xtable`. The following
example uses `kable` from the `knitr` package.
#### With `knitr::kable`
```{r kable}
library(knitr)
kable(iris[1:12,])
```
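The appearance of a `kable` table can be adjusted with additional arguments. The following sketch rounds numeric values, hides row names, renames the columns and adds a caption. The column labels and caption text are made up for illustration.

```{r sketch_kable_options, eval=FALSE}
library(knitr)
## Round numeric columns, rename the columns and add a caption
tbl <- kable(iris[1:6, ], digits = 1, row.names = FALSE,
             col.names = c("Sepal L", "Sepal W", "Petal L", "Petal W", "Species"),
             caption = "First six rows of the iris data set.")
tbl
```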
A much more elegant and powerful solution is to create fully interactive tables with the [`DT` package](https://rstudio.github.io/DT/).
This JavaScript-based environment provides an R interface to the `DataTables` library, which is built on jQuery. The resulting tables can be sorted, queried and resized by the
user. Note, R Markdown source files containing JavaScript components can only be rendered into HTML and not PDF.
#### With `DT::datatable`
```{r dt}
library(DT)
datatable(iris)
```
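The behavior of the interactive table can be customized through arguments of `datatable`. The following sketch shows five rows per page, adds a filter box above each column and hides the row names. The chosen settings are only examples.

```{r sketch_dt_options, eval=FALSE}
library(DT)
## Show 5 rows per page, add per-column filters and hide row names
dt <- datatable(iris, filter = "top", rownames = FALSE,
                options = list(pageLength = 5))
dt
```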
### Figures
Plots generated by the R code chunks in an R Markdown document can be automatically
inserted in the output file. The size of the figure can be controlled with the `fig.height`
and `fig.width` arguments.
```{r some_jitter_plot, eval=TRUE}
library(ggplot2)
dsmall <- diamonds[sample(nrow(diamonds), 1000), ]
ggplot(dsmall, aes(color, price/carat)) + geom_jitter(alpha = I(1 / 2), aes(color=color))
```
Sometimes it can be useful to explicitly write an image to a file and then insert that
image into the final document by referencing its file name in the R Markdown source. For
instance, this can be useful for time consuming analyses. The following code will generate a
file named `myplot.png`. To insert the file in the final document, one can use standard
Markdown or HTML syntax, _e.g._: `![myplot](myplot.png)`.
```{r some_custom_inserted_plot, eval=TRUE, warning=FALSE, message=FALSE}
png("myplot.png")
ggplot(dsmall, aes(color, price/carat)) + geom_jitter(alpha = I(1 / 2), aes(color=color))
dev.off()
```
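As an alternative to opening a graphics device manually, a `ggplot2` plot can be written to a file with `ggsave()` and then inserted from within a code chunk with `knitr::include_graphics()`. The following is a sketch of this approach. The file name `myplot_ggsave.png` and the dimensions are illustrative assumptions.

```{r sketch_ggsave_plot, eval=FALSE}
library(ggplot2)
## Recreate the subsample and store the plot in an object
dsmall <- diamonds[sample(nrow(diamonds), 1000), ]
p <- ggplot(dsmall, aes(color, price/carat)) +
    geom_jitter(alpha = 0.5, aes(color = color))
## Write the plot to a file with explicit dimensions in inches
ggsave("myplot_ggsave.png", plot = p, width = 6, height = 4, dpi = 150)
## Insert the saved image from within a code chunk
knitr::include_graphics("myplot_ggsave.png")
```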
### Custom functions
Custom functions can be kept in a separate R file (here [`custom_Fct.R`](https://raw.githubusercontent.com/tgirke/GEN242/main/content/en/tutorials/rmarkdown/custom_Fct.R)) and then imported
with the `source()` command. In the following example, the `custom_Fct.R` file is located on GitHub.
```{r import_custom_fct, eval=TRUE}
source("https://raw.githubusercontent.com/tgirke/GEN242/main/content/en/tutorials/rmarkdown/custom_Fct.R")
```
Now the imported function (here `myMAcomp`) can be used.
```{r use_custom_fct, eval=TRUE}
myMA <- matrix(rnorm(100000), 10000, 10, dimnames=list(1:10000, paste("C", 1:10, sep="")))
resultDF <- myMAcomp(myMA=myMA, group=c(1,1,1,2,2,2,3,3,4,4), myfct=mean)
kable(resultDF[1:12,])
```
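The same approach works for function files stored locally. The following sketch writes a small function definition to a temporary file and imports it with `source()`. The function `scale_to_unit` and the file name `my_Fct.R` are hypothetical and not part of `custom_Fct.R`.

```{r sketch_local_custom_fct, eval=FALSE}
## Write a small function definition to a file in a temporary directory
fct_file <- file.path(tempdir(), "my_Fct.R")
writeLines(c(
    "scale_to_unit <- function(x) {",
    "    (x - min(x)) / (max(x) - min(x))",
    "}"
), fct_file)
## Import the function and apply it to a numeric vector
source(fct_file)
scale_to_unit(c(2, 4, 6))
```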
### Inline R code
To evaluate R code inline, one can enclose an R expression with a single back-tick
followed by `r` and then the actual expression. For instance, the back-ticked version
of 'r 1 + 1' evaluates to `r 1 + 1` and 'r pi' evaluates to `r pi`.
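
To experiment with inline expressions outside of a full document, one can pass a line of text to `knit()` via its `text` argument. The following is a small sketch of this. The sentence used as input is made up.

```{r sketch_inline_code, eval=FALSE}
library(knitr)
## A line of R Markdown text containing an inline R expression
txt <- "The mean of 1 to 10 is `r mean(1:10)`."
## Knit the text and return the processed string
out <- knit(text = txt, quiet = TRUE)
out
```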
### Mathematical equations
To render mathematical equations, one can use standard Latex syntax. When expressions are
enclosed with single `$` signs then they will be shown inline, while
enclosing them with double `$$` signs will show them in display mode. For instance, the following
Latex syntax `d(X,Y) = \sqrt[]{ \sum_{i=1}^{n}{(x_{i}-y_{i})^2} }` renders in display mode as follows:
$$d(X,Y) = \sqrt[]{ \sum_{i=1}^{n}{(x_{i}-y_{i})^2} }$$
To learn LaTeX syntax for mathematical equations, one can consult various online manuals, such as
this [Wikibooks tutorial](https://en.wikibooks.org/wiki/LaTeX/Mathematics), or use an online
equation rendering and checking tool, such as this [one](https://arachnoid.com/latex/).
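
The equation shown above is the Euclidean distance between two vectors. As a small numeric check, the following sketch translates it directly into R and compares the result with the base R `dist()` function. The example vectors are arbitrary.

```{r sketch_euclidean_distance, eval=FALSE}
## Two example vectors of equal length
x <- c(1, 2, 3)
y <- c(4, 6, 8)
## Direct translation: square root of the summed squared differences
d_manual <- sqrt(sum((x - y)^2))
## Same distance computed with dist()
d_dist <- as.numeric(dist(rbind(x, y), method = "euclidean"))
all.equal(d_manual, d_dist)
```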
### Citations and bibliographies
Citations and bibliographies can be autogenerated in R Markdown in a similar
way as in Latex/Bibtex. Reference collections should be stored in a separate
file in Bibtex or other supported formats. To cite a publication in an R Markdown
script, one uses the syntax `[@reference_id]` where `reference_id` needs to be replaced with a
reference identifier present in the Bibtex database listed in the metadata section
of the R Markdown script (_e.g._ `bibtex.bib`). For instance, to cite Lawrence et al.
(2013), one uses its reference identifier (_e.g._ `Lawrence2013-kt`) as `[@Lawrence2013-kt]` [@Lawrence2013-kt].
This will place the citation inline in the text and add the corresponding
reference to a reference list at the end of the output document. For the latter, a
special section called `References` needs to be specified at the end of the R Markdown script.
To fine-tune the formatting of citations and reference lists, users should consult this
[R Markdown page](http://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html).
Also, for general reference management and obtaining references in Bibtex format, [Paperpile](https://paperpile.com/features)
can be very helpful.
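
Bibtex entries for R packages can also be generated from within R. The following sketch uses `write_bib()` from `knitr` to write entries for two packages to a file. The file name `packages.bib` is an assumption, and the file would still need to be listed under `bibliography` in the metadata section before its entries can be cited.

```{r sketch_write_bibtex, eval=FALSE}
library(knitr)
## Write Bibtex entries for the selected packages to a file
write_bib(c("base", "knitr"), file = "packages.bib")
## Show the reference identifiers that were generated
grep("^@", readLines("packages.bib"), value = TRUE)
```

By default, the generated identifiers carry the prefix `R-`, so the `knitr` package could then be cited as `[@R-knitr]`.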
### Viewing R Markdown report on HPCC cluster
R Markdown reports located on UCR's HPCC Cluster can be viewed locally in a web browser (without moving
the source HTML) by creating a symbolic link from a user's `.html` directory. This way any updates to
the report will show up immediately without creating another copy of the HTML file. For instance, if user
`ttest` has generated an R Markdown report under `~/bigdata/today/rmarkdown/sample.html`, then the
symbolic link can be created as follows:
```{sh rmarkdown_symbolic_link, eval=FALSE}
cd ~/.html
ln -s ~/bigdata/today/rmarkdown/sample.html sample.html
```
After this, one can view the report in a web browser using this URL [https://cluster.hpcc.ucr.edu/~ttest/rmarkdown/sample.html](https://cluster.hpcc.ucr.edu/~ttest/rmarkdown/sample.html).
If necessary, access to the URL can be restricted with a password following the instructions [here](http://hpcc.ucr.edu/manuals_linux-cluster_sharing.html#sharing-files-on-the-web).
Important: to set up accounts for HTML viewing, users have to apply the configuration settings for their accounts that are outlined on the HPCC website [here](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/rmarkdown/rmarkdown/).
### Viewing R Markdown report on GitHub
To host and view static HTML files on GitHub, follow the instructions [here](https://bit.ly/3MFARYY). Note, this works
only with public GitHub repos.
## Session Info
```{r sessionInfo}
sessionInfo()
```
## References