4.3 A brief style guide: Commenting and spacing
Like all programming languages, R isn’t just meant to be read by a computer; it’s also meant to be read by other humans – or very well-trained dolphins. For this reason, it’s important that your code looks nice and is understandable to other people and your future self. To keep things brief, I won’t provide a complete style guide – instead I’ll focus on the two most critical aspects of good style: commenting and spacing.
4.3.1 Commenting code with the # (pound) sign
Comments are completely ignored by R and are just there for whoever is reading the code. You can use comments to explain what a certain line of code is doing, or just to visually separate meaningful chunks of code from each other. Comments in R are designated by a # (pound) sign. Whenever R encounters a # sign (outside of a quoted string), it will ignore everything after the # sign on that line. Additionally, most code editors (like RStudio) display comments in a different color from standard R code to remind you that they’re comments.
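To make that rule concrete, here is a minimal sketch of the three places a # sign can show up. The object names (booty, flag) are made up for illustration and don't come from any dataset.

```r
# A whole-line comment: R skips this line entirely
booty <- 10 + 5   # An end-of-line comment: R only runs the code before the #

# The next line is "commented out", so booty is NOT overwritten
# booty <- 0

# A # inside quotes is part of the text, not the start of a comment
flag <- "Ship #1"
```

After running this, booty is still 15, and flag contains the full text including the # sign.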
Here is an example of a short script that is nicely commented. Try to make your scripts look like this!
# Author: Pirate Jack
# Title: My nicely commented R Script
# Date: None today :(
# Step 1: Load the yarrr package
library(yarrr)
# Step 2: See the column names in the movies dataset
names(movies)
# Step 3: Calculations
# What percent of movies are sequels?
mean(movies$sequel, na.rm = TRUE)
# How much did Pirates of the Caribbean: On Stranger Tides make?
movies$revenue.all[movies$name == 'Pirates of the Caribbean: On Stranger Tides']
I cannot stress enough how important it is to comment your code! Trust me: even if you don’t plan on sharing your code with anyone else, your future self will be reading it.
4.3.2 Spacing
Howwouldyouliketoreadabookiftherewerenospacesbetweenwords? I’mguessingyouwouldn’t. Soeverytimeyouwritecodewithoutproperspacing,rememberthissentence.
Commenting isn’t the only way to make your code legible. It’s also important to make appropriate use of spaces and line breaks. For example, I include spaces around operators (like <-, =, + and -) and after commas (which we’ll get to later). To see why, look at the following code:
# Shitty looking code
a<-(100+3)-2
mean(c(a/100,642564624.34))
t.test(formula=revenue.all~sequel,data=movies)
plot(x=movies$budget,y=movies$dvd.usa,main="myplot")
That code looks like shit. Don’t write code like that. It makes my eyes hurt. Now, let’s use some liberal amounts of commenting and spacing to make it look less shitty.
# Some meaningless calculations. Not important
a <- (100 + 3) - 2
mean(c(a / 100, 642564624.34))
# t.test comparing revenue of sequels v non-sequels
t.test(formula = revenue.all ~ sequel,
       data = movies)
# A scatterplot of budget and dvd revenue.
# Hard to see a relationship
plot(x = movies$budget,
     y = movies$dvd.usa,
     main = "myplot")
See how much better that second chunk of code looks? Not only do the comments tell us the purpose behind the code, but there are spaces and line-breaks separating distinct elements.
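If you're wondering whether all that extra spacing changes what R does, it doesn't. Here is a small sketch showing that the cramped and spaced versions of the same calculation give identical results, plus one case where spacing helps the reader tell two very different commands apart. The object names (cramped, spaced, x, is.small) are hypothetical and just for illustration.

```r
# Cramped and spaced versions compute exactly the same value
cramped<-(100+3)-2
spaced <- (100 + 3) - 2
identical(cramped, spaced)   # TRUE

# Without spaces, this is easy to misread. R treats <- as assignment,
# so x becomes 3
x <- 10
x<-3

# With a space between < and -, it is a comparison instead:
# "is x less than -3?"
is.small <- x < -3           # FALSE, because x is now 3
```

Consistent spacing around <- makes it obvious at a glance which of the two you meant.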
There are many more aspects of good code formatting. For a list of recommendations on how to make your code easier to follow, check out Google’s R Style Guide at https://google.github.io/styleguide/Rguide.xml