---
title: "Introduction to R"
subtitle: "GEN242: Data Analysis in Genome Biology"
author: "Thomas Girke"
date: today
format:
  revealjs:
    theme: [default, revealjs_custom.scss]
    slide-number: true
    progress: true
    chalkboard: false # true not compatible with full HTML download option specified in next line
    embed-resources: true # for HTML download
    scrollable: true
    smaller: true
    highlight-style: github
    code-block-height: 380px
    transition: slide
    footer: "GEN242 · UC Riverside · [Tutorial source](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/rbasics/rbasics_index.html)"
    logo: "https://girke.bioinformatics.ucr.edu/GEN242/assets/logo_gen242.png"
execute:
  echo: true
  eval: false
---

## Overview

Topics covered in this tutorial:

- What is R and why use it?
- R working environments (RStudio, Nvim-R-Tmux)
- Installation of R, RStudio and packages
- Navigating directories and basic syntax
- Data types and data objects
- Subsetting, utilities, and calculations
- Reading and writing external data
- Graphics in R (base graphics)
- Analysis routine: data import, merging, filtering, plotting

::: {.callout-note}
**Homework:** HW02 tasks are linked throughout these slides at the relevant sections. All tasks are assembled into a single R script `HW2.R` submitted via GitHub.
:::

---

## What is R?

[R](http://cran.at.r-project.org) is a powerful statistical environment and programming language for data analysis and visualization, widely used in bioinformatics and data science.

### Why use R?
- Complete statistical environment and programming language
- Efficient functions and data structures for data analysis
- Powerful, publication-quality graphics
- Access to a fast-growing number of analysis packages
- One of the most widely used languages in bioinformatics
- Standard for data mining and biostatistical analysis
- Free, open-source, available for all operating systems

### Key package repositories

| Repository | Packages | Focus |
|---|---|---|
| [CRAN](http://cran.at.r-project.org/) | >14,000 | General data analysis |
| [Bioconductor](http://www.bioconductor.org/) | >2,000 | Bioscience data analysis |
| [Omegahat](https://github.com/omegahat) | >90 | Programming interfaces |

---

## R Working Environments {.scrollable}

Several IDEs support syntax highlighting and sending code to the R console:

### RStudio / Posit

- [RStudio Desktop](https://www.rstudio.com/products/rstudio/features) — local installation
- [RStudio Server / OnDemand](https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/ondemand/#rstudio-on-ondemand) — web-based, available at UCR HPCC
- [Posit Cloud](https://rstudio.cloud/) — cloud-based, no local install needed

Key shortcuts in RStudio:

| Shortcut | Action |
|---|---|
| `Ctrl+Enter` | Send code to R console |
| `Ctrl+Shift+C` | Comment / uncomment |
| `Ctrl+1` / `Ctrl+2` | Switch between editor and console |

### Nvim-R-Tmux

Terminal-based environment combining Neovim + R + Tmux. Ideal for working on the HPCC cluster.

- Start R session: `\rf`
- Send line to R console: `Enter`
- Full instructions: [Nvim-R-Tmux tutorial](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/linux/linux.html#nvim-r-tmux-essentials)

### Other editors

Emacs (ESS), VS Code, gedit, Notepad++, Eclipse — all support R to varying degrees.

---

## Installation of R and Packages {.scrollable}

### Install R and RStudio

1. Install R from [CRAN](http://cran.at.r-project.org/)
2. Install RStudio from [posit.co](http://www.rstudio.com/ide/download)

### Install CRAN packages

```{r rinstall}
install.packages(c("pkg1", "pkg2"))
install.packages("pkg.zip", repos=NULL) # install from local file
```

### Install Bioconductor packages

```{r installbioc}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager") # install BiocManager if not available
BiocManager::version() # check Bioconductor version
BiocManager::install(c("pkg1", "pkg2")) # install Bioc packages
```

### Load packages

```{r loadpackages}
library("my_library") # single package
lapply(c("lib1", "lib2"), require, character.only=TRUE) # multiple packages
```

### Explore a package

```{r packagehelp}
library(help="my_library") # list functions
vignette("my_library") # open manual (PDF or HTML)
```

::: {.callout-tip}
For detailed Bioconductor install instructions see the [Bioc Install page](http://www.bioconductor.org/install/) and the [BiocManager vignette](https://cran.r-project.org/web/packages/BiocManager/vignettes/BiocManager.html).
:::

---

## Working Routine for Tutorials

When working in R, it is good practice to write all commands into an R script rather than typing them directly into the R console, and then to send them to the console for execution with the `Ctrl+Enter` shortcut in RStudio/Posit, or a similar shortcut in other R coding environments such as [Nvim-R](../linux/index.qmd#nvim-r-tmux-essentials). This way all work is preserved and can be reused in the future. The steps below give a short overview of the standard routine for loading R-based tutorials into an R IDE.

**Step 1.** Download the `*.qmd`, `*.Rmd` or `*.R` file. These so-called source files are always linked in the top right corner of each tutorial or slide show. The download can be done from within R via `download.file` (see below), with `wget` from the command-line, or with the save function of a web browser.
The following downloads the `qmd` file of this tutorial via `download.file` from the R console.

```{r download_file}
#| eval: false
download.file("https://raw.githubusercontent.com/tgirke/GEN242/main/slides/rbasics/rbasics_slides.qmd", "rbasics.qmd")
```

**Step 2.** Load the `*.qmd`, `*.Rmd` or `*.R` file in Nvim-R or RStudio.

**Step 3.** Send code from the code editor to the R console by pressing `Ctrl + Enter` in RStudio or `Enter` in Nvim-R. In `*.qmd` and `*.Rmd` files the code lines are in so-called [code chunks](../rmarkdown/index.qmd#r-code-chunks), and only those can be sent to the console. In Neovim, a connected R session has to be started first by pressing the `\rf` key combination. For details see [here](https://girke.bioinformatics.ucr.edu/GEN242/custom/slides/R_for_HPC/NvimR.html#11).

---

## Getting Around {.scrollable}

### Starting and closing R

```{r quitr}
q() # quit R
# Save workspace image? [y/n/c]:
```

::: {.callout-warning}
Answer **n** when asked to save the workspace. Saving `.RData` creates large files. Better practice: save your analysis as an R script and re-run it to restore your session.
:::

### Navigating directories

```{r navigatedirs}
ls() # list objects in current R session
dir() # list files in current working directory
getwd() # print path of current working directory
setwd("/home/user") # change working directory
```

### File information

```{r fileinfo}
list.files(path="./", pattern="\\.txt$", full.names=TRUE) # list files by pattern (a regular expression)
file.exists(c("file1", "file2")) # check if files exist
file.info(list.files(path="./", pattern="\\.txt$", full.names=TRUE)) # file details
```

---

## Basic Syntax

### Assignment and general syntax

```{r objects}
object <- ... # assignment operator (preferred over =)
object <- function_name(arguments) # call a function
object <- object[arguments] # subset an object
assign("object", function_name(arguments)) # alternative: assign()
```

### Pipes

The `%>%` pipe from `dplyr`/`magrittr` chains operations left-to-right.
R (version 4.1 and later) also provides a native pipe, `|>`.

```{r rpipes}
x %>% f(y) # equivalent to f(x, y)
```

Pipes make code more readable by avoiding deeply nested calls. Details in the [dplyr tutorial](../dplyr/index.qmd).

### Getting help

```{r functionhelp}
?function_name # open help page for a function
```

### Run scripts

Preferred version

```{r rscriptrun}
Rscript my_script.R # execute from command-line (preferred)
```

Older alternatives

```bash
source("my_script.R") # execute R script from within an R session
R CMD BATCH my_script.R # execute from command-line (older alternative)
```

---

## Data Types

### Numeric

```{r numeric}
#| eval: true
x <- c(1, 2, 3)
x
is.numeric(x)
as.character(x) # convert to character
```

### Character

```{r character}
#| eval: true
x <- c("1", "2", "3")
x
is.character(x)
as.numeric(x) # convert to numeric
```

### Complex (mixed types — coerced to character)

```{r complex}
#| eval: true
c(1, "b", 3) # numeric values coerced to character
```

### Logical

```{r logical}
#| eval: true
x <- 1:10 < 5
x # TRUE/FALSE vector
!x # negate
which(x) # indices of TRUE values
```

---

## Data Objects — Overview

### Common object types

| Type | Dimensions | Data types | Example |
|---|---|---|---|
| `vector` | 1D | uniform | `c(1, 2, 3)` |
| `factor` | 1D | grouping labels | `factor(c("a","b","a"))` |
| `matrix` | 2D | uniform | `matrix(1:9, 3, 3)` |
| `data.frame` | 2D | mixed | `data.frame(x=1:3, y=c("a","b","c"))` |
| `tibble` | 2D | mixed | modern `data.frame` |
| `list` | any | any | `list(name="Fred", age=30)` |
| `function` | — | code | `function(x) x^2` |

### Naming rules

- Object names should **not** start with a number
- Avoid spaces and special characters like `#` in names

---

## Vectors and Factors

### Vectors (1D, uniform type)

```{r vectors}
#| eval: true
myVec <- setNames(1:10, letters[1:10]) # named numeric vector
myVec[1:5] # subset by position
myVec[c(2,4,6,8)] # subset by multiple positions
myVec[c("b", "d", "f")] # subset by name
```

### Factors (1D, grouping information)

```{r factors}
#| eval: true
factor(c("dog", "cat", "mouse", "dog", "dog", "cat"))
# Levels: cat dog mouse
```

Factors encode categorical variables with defined levels — essential for statistical modeling.

---

## Matrices and Data Frames

### Matrices (2D, uniform type)

```{r matrices}
#| eval: true
myMA <- matrix(1:30, 3, 10, byrow=TRUE)
class(myMA)
myMA[1:2, ] # first two rows
myMA[1, , drop=FALSE] # first row, keep matrix structure
class(as.data.frame(myMA)) # convert to data.frame
```

### Data Frames (2D, mixed types)

```{r dataframes}
#| eval: true
myDF <- data.frame(Col1=1:10, Col2=10:1)
myDF[1:2, ]
class(as.matrix(myDF)) # convert to matrix
```

### Tibbles — modern data frames

```{r tibbles}
#| eval: true
library(tidyverse)
as_tibble(iris) # nicer printing, same structure as data.frame
```

::: {.callout-tip}
The `iris` dataset is built into R — no import needed. It is used throughout these examples.
:::

---

## Lists and Functions

### Lists (containers for any object type)

```{r lists}
#| eval: true
myL <- list(name="Fred", wife="Mary", no.children=3, child.ages=c(4,7,9))
myL
myL[[4]][1:2] # access fourth element, first two values
```

Lists are the most flexible R object — they can hold vectors, data frames, other lists, and functions all at once.

### Functions (reusable pieces of code)

```{r fctsyntax}
myfct <- function(arg1, arg2, ...) {
    function_body
}
```

---

## Subsetting Data Objects {.scrollable}

### 1. By position

```{r subsetpos}
#| eval: true
myVec <- 1:26; names(myVec) <- LETTERS
myVec[1:4] # first four elements
myVec[-(1:4)] # everything except first four
```

### 2. By logical vector

```{r subsetlog}
#| eval: true
myLog <- myVec > 10
myVec[myLog] # elements where condition is TRUE
```

### 3. By name

```{r subsetname}
#| eval: true
myVec[c("B", "K", "M")]
```

### 4. By `$` sign (single column or list component)

```{r subsetdollar}
#| eval: true
iris$Species[1:8]
```

### Subsetting 2D objects

```{r subset2d}
#| eval: true
iris[1:4, ] # first 4 rows, all columns
iris[1:4, 1:2] # first 4 rows, first 2 columns
iris[iris$Species=="setosa", ] # rows matching a condition
```

---

## Important Utilities {.scrollable}

### Combining objects

```{r combining}
#| eval: true
c(1, 2, 3)
x <- 1:3; y <- 101:103
c(x, y) # concatenate vectors
ma <- cbind(x, y) # bind as columns
rbind(ma, ma) # bind as rows
```

### Dimensions and names

```{r dimensions}
#| eval: true
length(iris$Species) # number of elements
dim(iris) # rows x columns
rownames(iris)[1:8]
colnames(iris)
names(myL) # names of list components
```

### Sorting

```{r sorting}
sort(10:1)
sortindex <- order(iris[,1], decreasing=FALSE)
iris[sortindex, ][1:2, ]
iris[order(iris$Sepal.Length, iris$Sepal.Width), ][1:2, ] # sort by multiple columns
```

### Checking identity

```{r identical}
#| eval: true
myma <- iris[1:2,]
all(myma == iris[1:2,]) # all values equal?
identical(myma, iris[1:2,]) # strict identity?
```

---

## Operators and Calculations

### Comparison operators

```{r equal}
#| eval: true
1 == 1 # equal
1 != 2 # not equal
# also: <, >, <=, >=
```

### Logical operators

```{r logicoperator}
#| eval: true
x <- 1:10; y <- 10:1
x > y & x > 5 # AND
x > y | x > 5 # OR
!x # NOT
```

### Basic calculations

```{r basiccalcul}
#| eval: true
x + y
sum(x)
mean(x)
apply(iris[1:6, 1:3], 1, mean) # row means (margin=1)
apply(iris[1:6, 1:3], 2, mean) # column means (margin=2)
```

---

## Reading and Writing Data {.scrollable}

### Import tabular data

The widely used base R import functions are `read.table` and `read.delim`:

```{r importmydf}
myDF <- read.delim("myData.tsv", sep="\t") # tab-delimited file
```

The `readr` package offers an alternative with more convenient default arguments and better performance. For details see [here](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/dplyr/dplyr_index.html#import-with-readr).
```{r readrimport}
myTibble <- readr::read_tsv("myData.tsv")
```

Import directly from a Google Sheet

```{r googlesheetimport}
library(googlesheets4)
gs4_deauth() # for public sheets
mysheet <- read_sheet("1U-32UcwZP1k3saKeaH1mbvEAOfZRdNHNkWK2GI1rpPM", skip=4)
myDF <- as.data.frame(mysheet)
```

```{r importexcel}
library(readxl)
mysheet <- read_excel(targets_path, sheet="Sheet1") # Excel files; targets_path holds the file path
```

### Export tabular data

```{r exporttable}
write.table(myDF, file="myfile.xls", sep="\t", quote=FALSE, col.names=NA)
```

### Line-wise import/export

```{r linewiseimport}
myDF <- readLines("myData.txt") # import line by line
writeLines(month.name, "myData.txt") # export line by line
```

### Save and load R objects

```{r saveobject}
mylist <- list(C1=iris[,1], C2=iris[,2])
saveRDS(mylist, "mylist.rds") # save
mylist <- readRDS("mylist.rds") # load
```

::: {.callout-note}
**HW02 — Task A:** Sort `iris` by first column, subset first 12 rows, export to file, modify column names in a spreadsheet program, re-import with `read.table`.
[→ HW02 instructions](https://girke.bioinformatics.ucr.edu/GEN242/assignments/homework/hw02/hw02.html)
:::

---

## Useful R Functions

### Unique entries

```{r uniqueentries}
#| eval: true
length(iris$Sepal.Length) # 150 total entries
length(unique(iris$Sepal.Length)) # number of unique values
```

### Count occurrences

```{r countoccurences}
#| eval: true
table(iris$Species) # frequency table per group
```

### Aggregate statistics

```{r aggregate}
#| eval: true
aggregate(iris[,1:4], by=list(iris$Species), FUN=mean, na.rm=TRUE)
```

### Set operations

```{r intersects}
#| eval: true
month.name %in% c("May", "July") # logical: which elements are in set
```

### Merge data frames

```{r mergefct}
frame1 <- iris[sample(1:nrow(iris), 30), ]
my_result <- merge(frame1, iris, by.x=0, by.y=0, all=TRUE)
# all=TRUE: outer join (keep all rows)
# all=FALSE: inner join (keep only common rows)
```

---

## Graphics in R — Overview

### Why R graphics?
- Powerful environment for scientific visualization
- Integrated with statistics infrastructure
- Publication-quality, fully reproducible output
- Supports LaTeX and Markdown via `knitr`

### Four main graphics systems

| System | Level | Package |
|---|---|---|
| Base R graphics | Low + high | built-in |
| grid | Low-level | built-in |
| lattice | High-level | `lattice` |
| [ggplot2](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/rgraphics/rgraphics_index.html#ggplot2-graphics) | High-level | `ggplot2` |

### Key base graphics functions

`plot`, `barplot`, `boxplot`, `hist`, `pie`, `pairs`, `image`, `heatmap`

::: {.callout-tip}
For new code, [**ggplot2**](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/rgraphics/rgraphics_index.html#ggplot2-graphics) is generally recommended. Base R graphics remain useful for quick exploration and highly customized plots.
:::

---

## Scatter Plots {.scrollable}

### Sample dataset

```{r sampledataset}
#| eval: true
set.seed(1410)
y <- matrix(runif(30), ncol=3, dimnames=list(letters[1:10], LETTERS[1:3]))
```

### Basic scatter plot

```{r scatterplot}
#| eval: true
plot(y[,1], y[,2])
```

### All pairs

```{r pairsplot}
#| eval: true
pairs(y)
```

### With color and labels

```{r addcolor}
#| eval: true
plot(y[,1], y[,2], pch=20, col="red", main="Symbols and Labels")
text(y[,1]+0.03, y[,2], rownames(y))
```

### Add regression line

```{r regressionline}
#| eval: true
plot(y[,1], y[,2])
myline <- lm(y[,2] ~ y[,1])
abline(myline, lwd=2)
summary(myline)
```

### Important plot parameters

| Argument | Description |
|---|---|
| `col` | color of symbols |
| `pch` | symbol type (`example(points)` to see options) |
| `lwd` | line/symbol width |
| `cex.*` | font size controls |
| `mar` | margin sizes `c(bottom, left, top, right)` |
| `log="xy"` | log scale on both axes |

::: {.callout-note}
**HW02 — Task B:** Generate a scatter plot of `iris` columns 1 and 2, colored by Species. Use `xlim`/`ylim` to restrict data to the bottom-left quadrant.
[→ HW02 instructions](https://girke.bioinformatics.ucr.edu/GEN242/assignments/homework/hw02/hw02.html#b.-scatter-plots)
:::

---

## Bar Plots, Histograms and More {.scrollable}

### Bar plot with legend

```{r barplotexample}
#| eval: true
barplot(y[1:4,], ylim=c(0, max(y[1:4,])+0.3), beside=TRUE, legend=letters[1:4])
```

::: {.callout-tip}
When input is a **matrix**, `barplot` uses column names as group labels and row names as within-group labels. Convert `data.frame` input with `as.matrix()` first.
:::

### Bar plot with error bars

```{r barwitherror}
#| eval: true
m <- rowMeans(y) * 10
stdev <- apply(y, 1, sd) * 10 # per-row standard deviation, on the same scale as m
bar <- barplot(m, ylim=c(0, max(m + stdev) * 1.1))
arrows(bar, m, bar, m + stdev, length=0.15, angle=90)
```

### Histogram and density plot

```{r histogram}
#| eval: true
hist(y, freq=TRUE, breaks=10)
plot(density(y), col="red")
```

### Save graphics to file

```{r savegraphics}
pdf("test.pdf")
plot(1:10, 1:10)
dev.off() # always close the device!
```

Works the same for `jpeg()`, `png()`, `svg()`, `tiff()`.

::: {.callout-note}
**HW02 — Task C:** Calculate mean values per Species for first four `iris` columns. Organize as a matrix. Generate stacked and horizontally arranged bar plots.
[→ HW02 instructions](https://girke.bioinformatics.ucr.edu/GEN242/assignments/homework/hw02/hw02.html#b.-scatter-plots)
:::

---

## Analysis Routine — Data Import {.scrollable}

A step-by-step workflow using two sample biological datasets. This analysis routine is used by [Homework 2D-H](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/rbasics/rbasics_index.html#analysis-routine).
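Before turning to the real files, the overall pattern of the routine (merge two tables on a shared ID column, then filter rows) can be tried on toy data. The sketch below is only an illustration: the data frames `toy_mw` and `toy_target`, their column names, and all values are invented and merely mimic the layout of the sample tables.

```r
# Hypothetical toy tables; names and values are invented for illustration
toy_mw <- data.frame(ID = c("g1", "g2", "g3", "g4"),
                     MW = c(120000, 4500, 98000, 150000))
toy_target <- data.frame(ID = c("g1", "g2", "g4"),
                         Loc = c("C", "M", "C"))

# all.x=TRUE keeps every row of toy_mw (left join); g3 has no match, so its Loc is NA
toy_left <- merge(toy_mw, toy_target, by = "ID", all.x = TRUE)

# all=FALSE (the default) keeps only IDs present in both tables (inner join)
toy_inner <- merge(toy_mw, toy_target, by = "ID", all = FALSE)

# Filter the inner join, which contains no NAs: MW above 100,000 and Loc equal to "C"
toy_query <- toy_inner[toy_inner$MW > 100000 & toy_inner$Loc == "C", ]
toy_query
```

Filtering is done on the inner join here because a logical index containing `NA` returns extra rows filled with `NA`.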
### Step 1 — Download sample data

- [MolecularWeight_tair7.xls](https://cluster.hpcc.ucr.edu/~tgirke/Documents/R_BioCond/Samples/MolecularWeight_tair7.xls)
- [TargetP_analysis_tair7.xls](https://cluster.hpcc.ucr.edu/~tgirke/Documents/R_BioCond/Samples/TargetP_analysis_tair7.xls)

Open in Excel, save as tab-delimited text, then import:

```{r importsampletable}
my_mw <- read.delim(file="MolecularWeight_tair7.xls", header=TRUE, sep="\t")
my_mw[1:2,]
my_target <- read.delim(file="TargetP_analysis_tair7.xls", header=TRUE, sep="\t")
my_target[1:2,]
```

Or import directly from the web:

```{r readfromurl}
my_mw <- read.delim("https://faculty.ucr.edu/~tgirke/Documents/R_BioCond/Samples/MolecularWeight_tair7.xls", header=TRUE, sep="\t")
my_target <- read.delim("https://faculty.ucr.edu/~tgirke/Documents/R_BioCond/Samples/TargetP_analysis_tair7.xls", header=TRUE, sep="\t")
```

---

## Analysis Routine — Merging Data Frames {.scrollable}

### Step 2 — Assign uniform ID column names

```{r changecoltitle}
colnames(my_target)[1] <- "ID"
colnames(my_mw)[1] <- "ID"
```

### Step 3 — Merge on common ID field (left join: `all.x=TRUE` keeps all rows of `my_mw`)

```{r merge1}
my_mw_target <- merge(my_mw, my_target, by.x="ID", by.y="ID", all.x=TRUE)
```

### Step 4 — Merge shortened table, then remove non-matching rows

```{r merge2}
my_mw_target2a <- merge(my_mw, my_target[1:40,], by.x="ID", by.y="ID", all.x=TRUE)
my_mw_target2 <- na.omit(my_mw_target2a) # remove rows with NAs
```

::: {.callout-note}
**HW02 — Task D:** Execute `merge` to return only common rows directly (without `na.omit`). Prove both methods return identical results.

**HW02 — Task E:** Replace all `NA` values in `my_mw_target2a` with zeros.
:::

---

## Analysis Routine — Filtering and String Operations {.scrollable}

### Step 5 — Filter rows by conditions

```{r filterdf}
# Proteins with MW > 100,000 AND targeted to chloroplast (Loc == "C")
query <- my_mw_target[my_mw_target[,2] > 100000 & my_mw_target[,4] == "C", ]
query[1:4, ]
dim(query)
```

::: {.callout-note}
**HW02 — Task F:** How many proteins have MW > 4,000 and < 5,000? Subset and sort by MW to verify.
:::

### Step 6 — Remove gene model extensions with regex

```{r regexpr}
# AT1G01010.1 → AT1G01010 (remove everything from . onward)
my_mw_target3 <- data.frame(
  loci = gsub("\\..*", "", as.character(my_mw_target[,1]), perl=TRUE),
  my_mw_target
)
my_mw_target3[1:3, 1:8]
```

::: {.callout-note}
**HW02 — Task G:** Retrieve rows where second column contains specific IDs using `%in%`. Also use the second column as a row index and repeat. Explain the difference between the two approaches.
:::

---

## Analysis Routine — Calculations and Export {.scrollable}

### Step 7 — Count duplicates

```{r counttxs}
mycounts <- table(my_mw_target3[,1])[my_mw_target3[,1]]
my_mw_target4 <- cbind(my_mw_target3, Freq=mycounts[as.character(my_mw_target3[,1])])
```

### Step 8 — Vectorized calculation (average AA weight)

```{r vectorizedcal}
data.frame(my_mw_target4, avg_AA_WT=(my_mw_target4[,3] / my_mw_target4[,4]))[1:2,]
```

### Step 9 — Row-wise mean and standard deviation

```{r meansddev}
mymean <- apply(my_mw_target4[,6:9], 1, mean)
mystdev <- apply(my_mw_target4[,6:9], 1, sd, na.rm=TRUE)
data.frame(my_mw_target4, mean=mymean, stdev=mystdev)[1:2, 5:12]
```

### Step 10 — Scatter plot

```{r scatterplot2}
plot(my_mw_target4[1:500, 3:4], col="red")
```

### Step 11 — Export results

```{r exportresults}
write.table(my_mw_target4, file="my_file.xls", quote=FALSE, sep="\t", col.names=NA)
```

::: {.callout-note}
**HW02 — Task H:** Assemble all commands from this exercise into `HW2.R` and run it:

```{r runwithsource}
source("HW2.R") # from within R
```

```bash
Rscript HW2.R # from command-line
```
:::

---

## HW02 Summary

Assemble all solutions into a single R script `HW2.R` and submit via GitHub.

| Task | Topic | Key functions |
|---|---|---|
| **A** | Sort `iris`, export, modify columns, re-import | `order`, `write.table`, `read.table` |
| **B** | Scatter plot `iris` col 1-2, colored by Species | `plot`, `xlim`, `ylim` |
| **C** | Mean matrix by Species, stacked & horizontal bars | `aggregate`, `barplot` |
| **D** | Merge returning only common rows; prove equivalence | `merge(all=FALSE)`, `all()` |
| **E** | Replace NAs with zeros | `is.na`, indexing |
| **F** | Filter proteins by MW range 4,000–5,000 | boolean indexing |
| **G** | Subset rows by ID using `%in%` and row index | `%in%`, `rownames` |
| **H** | Assemble all code into `HW2.R`, run with `source()` | `source`, `Rscript` |

### Submission path

```
Homework/HW2/HW2.R
```

**Due: Thu, April 16th at 6:00 PM**

::: {.callout-note}
The preassembled workflow script for Task H is available [here](https://girke.bioinformatics.ucr.edu/GEN242/tutorials/rbasics/rbasics/#export-results-and-run-entire-exercise-as-script) — it does **not** include solutions for Tasks A–C.
:::
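To see what `source()` does with a script file, the following minimal sketch writes a small throwaway script to a temporary file and runs it. The script contents and the object name `toy_result` are placeholders chosen for illustration and are not part of the homework.

```r
# Write a two-line throwaway script to a temporary file (placeholder content)
script_path <- tempfile(fileext = ".R")
writeLines(c("toy_result <- mean(iris$Sepal.Length)",
             "print(toy_result)"),
           script_path)

# source() evaluates the expressions of the file in the current R session,
# so objects created by the script remain available afterwards
source(script_path)
exists("toy_result")

unlink(script_path) # remove the temporary file
```

Running the same file with `Rscript` from the command-line would instead start a fresh R process, so no objects would carry over into an interactive session.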