Welcome to the Software Carpentry Etherpad for the May 1st workshop at the University of Connecticut
This pad is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents.

Use of this service is restricted to members of The Carpentries community; this is not for general purpose use (for that, try https://etherpad.wikimedia.org/).
Users are expected to follow our code of conduct: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html
All content is publicly available under the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/
We will use this Etherpad during the workshop for chatting, asking questions, taking notes collaboratively, and sharing URLs or bits of code.

To-do list for participants:
- Go to the workshop website: https://carpentries-uconn.github.io/2020-05-01-UConn-online (link in chat, too)
- Click the link under the Collaborative Notes section to get to this page
- Name yourself in this page in the top right corner where it says Enter your name
- Add your name, university, & operating system (try to match the helper's OS) under a breakout room.
- Open up RStudio. In the Console window (bottom left quarter) run the following command:
install.packages(c("ggplot2", "gapminder", "cowplot", "plotly"))
- Open a tab with https://b.socrative.com/login/student/, and join room: SWCUCONN
- Take the pre-workshop survey on the workshop website if you haven't already: 
- Introduce yourselves in the chat (on the right), so we know who you are

* James Mickley - Ecology and Evolutionary Biology (james.mickley@uconn.edu)
* Dyanna Louyakis - Molecular and Cell Biology (artemis.louyakis@uconn.edu)
* Timothy Moore - COR2E Statistical Consulting Services & UConn Carpentries (timothy.e.moore@uconn.edu)
* Kendra Maas - COR2E MARS (kendra.maas@uconn.edu)
* Jeremy Teitelbaum - Math (jeremy.teitelbaum@uconn.edu)

For participants - Choose your breakout rooms:

Breakout room Tim
Helper: Timothy Moore - Statistical Consulting Services & UConn Carpentries (Windows)
1. Dennis-UConn, Psychological Sciences, OSX
2. Nikola Vukovic (OSX)

Breakout room Jeremy
Helper: Jeremy Teitelbaum - Math (Linux & OSX) 
1. Siliva - UConn Psychological Sciences - OSX
2. Matt- UCSF-OSX
3. Oliver- UConn, Psychological Sciences, OSX

Breakout room Kendra & Megan
Helper: Kendra Maas MARS (Windows) & Megan Chiovaro - Psychological Sciences - PAC-E (OSX)
1. Leah - UConn- OSX
2. Rebecca - UMich - OSX

Breakout room Eliza
Helper: Eliza Grames - Ecology and Evolutionary Bio (Linux or Windows or OSX)
1. Olga Kepinska - UCSF/UConn (OSX)
2. Shaan Kamal (OSX)
3. Florence Bouhali UCSF (OSX)

Breakout room Michael
Helper: Michael LaScaleia - Ecology and Evolutionary Bio (Windows) 
1. Natasza Marrouch, UConn (OSX)
2. Jieyin - UConn -  Windows

Breakout room Jie
Helper: Jie Chen- Nursing (Linux or OSX)
1. Jocelyn Caballero (OSX)
2. Chloe Jones UConn (OSX)


Workshop Website: https://carpentries-uconn.github.io/2020-05-01-UConn-online

Socrative Login (for quizzes): https://b.socrative.com/login/student/

Download gapminder_data.csv here (Click download button at top right, and choose Direct Download)

Follow along with Dropbox script:


- getting involved
- etherpad export
- resources


Beginning of Workshop

# use etherpad for collaborative note taking

# Socrative is a way to give you all a chance to test what you've learned so far.

# In Zoom, you can raise your hand if you have a question. Kendra will also monitor the etherpad chat if you have questions there.

# If you only have one screen, we suggest you put Zoom and RStudio side by side and change Zoom to either "fit to screen" or 150%

Check your R version and/or package versions
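
# A minimal sketch of one way to check versions from the R console. It assumes the four packages from the to-do list above are already installed; packageVersion() stops with an error for a package that is missing.

```r
# Print the version of R that is currently running
print(R.version.string)

# Print the installed version of each workshop package
# (assumes the packages from the to-do list are installed)
for (pkg in c("ggplot2", "gapminder", "cowplot", "plotly")) {
  cat(pkg, as.character(packageVersion(pkg)), "\n")
}
```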



# Creating a project will help you organize your analysis for yourself and enable you to share a project (code and data) with a collaborator

# we're going to create 'data' and 'figures' folders. Also create a new R script and name it 'ggplot.R'
### move the gapminder_data.csv into the 'data' folder
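
# The folders can be made in the RStudio Files pane, or from the console. A small sketch of the console route, assuming your working directory is the project folder (check with getwd()):

```r
# Assumes the working directory is the project folder
for (folder in c("data", "figures")) {
  # showWarnings = FALSE keeps this quiet if the folder already exists
  dir.create(folder, showWarnings = FALSE)
}

# Confirm that both folders are there
dir.exists(c("data", "figures"))
```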

"#" starts a comment in R. Leave yourself and your collaborators lots of comments explaining what you are doing!

?read.csv # bring up help on a specific function

# check your data when you read it in
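
# A sketch of reading the data in and checking it. The object name gap matches the plotting code later in these notes; the file path assumes the CSV was moved into the 'data' folder. The fallback to the gapminder package is our assumption (not something from the workshop) so that the example still runs without the CSV.

```r
# Assumes the CSV was moved into the 'data' folder as described above
data_path <- "data/gapminder_data.csv"

if (file.exists(data_path)) {
  gap <- read.csv(data_path)
} else {
  # Fallback (an assumption): the gapminder package ships the same columns
  gap <- as.data.frame(gapminder::gapminder)
}

str(gap)      # column names and types
head(gap)     # first six rows
summary(gap)  # quick summaries, useful for spotting odd values
```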

# ggplot: Grammar of Graphics
### ggplot uses slightly different syntax from base R. It takes a bit of getting used to, but it is very powerful once you get it.
### just like you can structure a sentence in many ways, you can structure a ggplot command in many ways. We're going to put the "noun", the data, within the ggplot() function. RStudio has really handy cheatsheets for some major packages like ggplot2; you can get to them from the Help menu.

ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp))

# this gives an empty plot because you haven't told ggplot the "verb", or what you want ggplot to do with that data. geoms are the main type of verb in ggplot

ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point()
# You can map more than x & y position, add color to your mapping   

ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point()

# maybe we can see the data better as lines rather than points. To do that we need to tell ggplot how to group the data.

ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp, color = continent, group = country)) +
  geom_line()
# you can also put more than one geom (or layer) on a plot

ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp, color = continent, group = country)) +
  geom_line(mapping = aes(color = continent)) +
  geom_point(color = "blue")
** You can think of ggplot as taking on layers:

Hadley Wickham quote:
NOTE 'gg' in ggplot stands for grammar of graphics.

So far we've seen the noun and verb of our grammar. Now we can add in the adjectives and adverbs.

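ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(size = 2, alpha = 0.5)  # example modifiers; the geom used here was not recorded in the notes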
# since ggplot is a grammar, there is often more than one way to build the graph you want. You can specify mapping = aes(???) in the main ggplot() call or in a specific geom_X(). For example, if you want to color the points by continent and fit a linear model for each continent, you can do that in a few different ways.

ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp))+
   geom_point(aes(color = continent)) +
   geom_smooth(aes(group = continent), method = "lm")

# the order of the geoms controls which layer is on top: later geoms are drawn over earlier ones
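
# To see the effect of layer order, here is a small sketch that builds the same plot twice with the geoms swapped. It uses the data set from the gapminder package as a stand-in for gap (an assumption: it has the same column names as the CSV); the object names are ours.

```r
library(ggplot2)

# Stand-in for the workshop's gap data frame (assumed to have the same columns)
gap <- as.data.frame(gapminder::gapminder)

# Points first, then smooths: the fitted lines are drawn over the points
lines_on_top <- ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(aes(color = continent)) +
  geom_smooth(aes(group = continent), method = "lm")

# Smooths first, then points: the points are drawn over the fitted lines
points_on_top <- ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_smooth(aes(group = continent), method = "lm") +
  geom_point(aes(color = continent))

lines_on_top
points_on_top
```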

# you can add more than one mapping to a geom
ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp))+
   geom_point(aes(color = continent, shape = continent), size = 2, alpha = 0.5)+
   geom_smooth(aes(group = continent), method = "lm")
# Now to clean this figure up for publication. Control the axis labels and breaks, change the background and guide lines, add nicer title and guide (legend)
ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
      geom_point(mapping = aes(shape = continent), size = 2) +
      scale_x_log10() +
      geom_smooth(method = "lm") +
      scale_y_continuous(limits = c(0, 100), breaks = seq(0, 100, by = 10)) +
      theme_minimal() +
      labs(title = "Effects of per-capita GDP", x = "GDP per Capita ($)", y = "Life Expectancy (yrs)", color = "Continents", shape = "Continents")

# exporting your plots. Best practice is to not use the "Export" button because that isn't reproducible

ggsave(filename = "figures/life_expectancy.png")
ggsave(filename = "figures/life_expectancy.pdf")
ggsave(filename = "figures/life_expectancy.pdf", width = 10, height = 6, dpi = 300)

# when you specify the width and height, you change the ratio between the plot and the text size; you may need to adjust width and height if your text comes out too big or too small
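
# A sketch of that effect: the same plot saved at two sizes. The font size stays the same in points, so the text looks relatively larger in the small file. The file names and the temporary output folder are hypothetical, and the gapminder package data stands in for gap (assumed to have the same columns).

```r
library(ggplot2)

# Stand-in for the workshop's gap data frame (assumed to have the same columns)
gap <- as.data.frame(gapminder::gapminder)

p <- ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  scale_x_log10()

# Hypothetical file names, written to a temporary folder
out_dir <- tempdir()
small_file <- file.path(out_dir, "life_expectancy_small.png")
large_file <- file.path(out_dir, "life_expectancy_large.png")

# Same plot and font size: text takes up more of the small canvas
ggsave(filename = small_file, plot = p, width = 4, height = 3, units = "in", dpi = 300)
ggsave(filename = large_file, plot = p, width = 12, height = 9, units = "in", dpi = 300)
```

# Open the two files side by side to compare how the axis text scales.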

# you can save a plot to a variable, then explicitly name that plot in ggsave()
lifeExp_plot <- ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
      geom_point(mapping = aes(shape = continent), size = 2) +
      scale_x_log10() +
      geom_smooth(method = "lm") +
      scale_y_continuous(limits = c(0, 100), breaks = seq(0, 100, by = 10)) +
      theme_minimal() +
      labs(title = "Effects of per-capita GDP", x = "GDP per Capita ($)", y = "Life Expectancy (yrs)", color = "Continents", shape = "Continents")

ggsave(filename = "figures/life_expectancy.pdf", plot = lifeExp_plot, width = 10, height = 6, dpi = 300)

# split your data into different panels within a larger plot by "facet"ing on data

ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
      facet_wrap(~ continent, ncol = 2, scales = "free") +      
      geom_point(alpha = 0.5) +      
      scale_x_log10() +      
      geom_smooth(method = "lm")

# within facet_wrap you can set whether the axis scales are the same for every panel or "free" to differ between panels. I use scales = "free" carefully because I often use facets to compare the different subsets of data easily.
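
# A sketch comparing two scales settings side by side in code. "fixed" is the facet_wrap default and "free_y" frees only the y axis; the object names are ours, and the gapminder package data stands in for gap (assumed to have the same columns).

```r
library(ggplot2)

# Stand-in for the workshop's gap data frame (assumed to have the same columns)
gap <- as.data.frame(gapminder::gapminder)

base_plot <- ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5) +
  scale_x_log10()

# Every panel shares the same x and y axes, so panels are directly comparable
fixed_panels <- base_plot + facet_wrap(~ continent, ncol = 2, scales = "fixed")

# Only the y axis is allowed to differ between panels
free_y_panels <- base_plot + facet_wrap(~ continent, ncol = 2, scales = "free_y")

fixed_panels
free_y_panels
```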

Please give feedback for your instructors: https://forms.gle/6JaT94UxFLqn4eo76
Take the post-workshop survey for Software Carpentry: https://carpentries.typeform.com/to/UgVdRQ?slug=2020-05-01-UConn-online

- R Graph catalog: http://shiny.stat.ubc.ca/r-graph-catalog/
- ggplot2 online help: http://ggplot2.tidyverse.org
- R Graph Cookbook: http://www.cookbook-r.com/Graphs/
- ggplot2 essentials: http://www.sthda.com/english/wiki/ggplot2-essentials
- RStudio cheatsheets: https://www.rstudio.com/resources/cheatsheets/
- Cowplot: https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html

- Rather than using theme_minimal() or theme_cowplot() by itself, we can customize that, too 
- http://ggplot2.tidyverse.org./reference/ggtheme.html
- http://ggplot2.tidyverse.org./reference/theme.html
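
# A sketch of customizing a built-in theme by adding theme() after it. The particular settings are only examples we picked, not the ones used in the workshop; the gapminder package data stands in for gap (assumed to have the same columns).

```r
library(ggplot2)

# Stand-in for the workshop's gap data frame (assumed to have the same columns)
gap <- as.data.frame(gapminder::gapminder)

custom_plot <- ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10() +
  theme_minimal() +
  # theme() comes after theme_minimal() so these settings are not overwritten
  theme(
    legend.position = "bottom",            # move the legend under the plot
    panel.grid.minor = element_blank(),    # drop the minor grid lines
    axis.title = element_text(size = 14),  # larger axis titles
    plot.title = element_text(face = "bold")
  )

custom_plot
```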

Most of what we covered is here: http://swcarpentry.github.io/r-novice-gapminder/08-plot-ggplot2/index.html
Full Reproducible Research lesson: http://swcarpentry.github.io/r-novice-gapminder/


Becoming a UConn Carpentries Instructor/Helper
Also see: https://carpentries-uconn.github.io/

### bonus more plots

# The cowplot package lets you easily make multipanel plots with different data. Make your plots and save them to objects.

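library(cowplot)  # provides plot_grid(), ggdraw(), and draw_plot()
plotA <- ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
     geom_point()  # assumed geom; this line was not recorded in the notes
plotB <- ggplot(data = gap, mapping = aes(x = continent, y = lifeExp)) +
     geom_boxplot()  # assumed geom; this line was not recorded in the notes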

# now combine the plots
plot_grid(plotA, plotB, labels = c("A", "B"))
ggsave(filename = "figures/combined_plot.pdf", width = 10, height = 4, units = "in")

# if you want to make one subplot bigger than the other

ggdraw() +
     draw_plot(plotA, x = 0, y = 0, width = 0.3, height = 1) +
     draw_plot(plotB, x = 0.3, width = 0.7, height = 1)
# your plot space is 0 - 1 in both x and y. 
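
# Another route to unequal panel sizes is the rel_widths argument of plot_grid(). This is a sketch of an alternative, not necessarily what was shown in the workshop. The geoms are assumptions, and the gapminder package data stands in for gap (assumed to have the same columns).

```r
library(ggplot2)
library(cowplot)

# Stand-in for the workshop's gap data frame (assumed to have the same columns)
gap <- as.data.frame(gapminder::gapminder)

# The geoms here are assumptions chosen for illustration
plotA <- ggplot(data = gap, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  scale_x_log10()
plotB <- ggplot(data = gap, mapping = aes(x = continent, y = lifeExp)) +
  geom_boxplot()

# rel_widths gives plotA 30% and plotB 70% of the total width
combined <- plot_grid(plotA, plotB, labels = c("A", "B"), rel_widths = c(0.3, 0.7))
combined
```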

# vignettes are a great way to learn how to use a new-to-you package
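
# A small sketch of finding a package's vignettes from the console, using cowplot as the example (assumes cowplot is installed):

```r
# List the vignettes that ship with cowplot
cowplot_vignettes <- vignette(package = "cowplot")

# Show the short name and title of each vignette
cowplot_vignettes$results[, c("Item", "Title"), drop = FALSE]

# In an interactive session, open the same list in a web browser
if (interactive()) {
  browseVignettes("cowplot")
}
```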

# interactive graphs
yearLifeExp <- ggplot(data = gap, mapping = aes(x = year, y = lifeExp, group = country)) +
     facet_wrap(~ continent) +
     geom_line()
yearLifeExp # view the plot object


# Now when you hover over your plot, a pop-up window will show up with data. The data that is in the aes() is shown.
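
# The notes do not record the call that made the plot interactive. A likely candidate, given that plotly was on the install list, is ggplotly(); the sketch below rests on that assumption. The gapminder package data stands in for gap (assumed to have the same columns).

```r
library(ggplot2)
library(plotly)

# Stand-in for the workshop's gap data frame (assumed to have the same columns)
gap <- as.data.frame(gapminder::gapminder)

yearLifeExp <- ggplot(data = gap, mapping = aes(x = year, y = lifeExp, group = country)) +
  facet_wrap(~ continent) +
  geom_line()

# Assumption: ggplotly() converts the static ggplot into an interactive widget
interactive_plot <- ggplotly(yearLifeExp)
interactive_plot
```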