10/22/2018
for loop
for loop: motivation
You need to generate sequences whose lower bounds all equal 1 and whose upper bounds are given by the vector sizes:
sizes <- c(3, 5, 7, 10)
# we need to generate:
# 1 2 3
# 1 2 3 4 5
# 1 2 3 4 5 6 7
# 1 2 3 4 5 6 7 8 9 10
for loop: motivation
The function seq is not vectorized over from and to, i.e. one cannot pass vectors of length greater than 1 to these arguments.
sizes <- c(3, 5, 7, 10)
seq(from = c(1, 1, 1, 1), to = sizes)
## Error in seq.default(from = c(1, 1, 1, 1), to = sizes) :
##   'from' must be of length 1
for loop: motivation
A crude solution:
sizes <- c(3, 5, 7, 10)
seq(from = 1, to = sizes[1])
## [1] 1 2 3
seq(from = 1, to = sizes[2])
## [1] 1 2 3 4 5
seq(from = 1, to = sizes[3])
## [1] 1 2 3 4 5 6 7
seq(from = 1, to = sizes[4])
## [1]  1  2  3  4  5  6  7  8  9 10
Basically, the same line is repeated 4 times. We need a tool that executes the same code chunk once for each element of sizes.
Typically, good developers have an allergy to manual labor :)
for loop
We have such a tool, and it is called the for loop:
for(element in vector) {
  expression
}
The for loop executes expression as many times (iterations) as there are elements in vector. On each iteration, the iterator variable element is assigned the next value from vector. Typically, expression does something with element.
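To make the iteration count concrete, here is a minimal sketch that counts the passes through the loop body and records each value the iterator takes. The names values, n_iterations and seen are illustrative and do not come from the slides.

```r
# Illustrative data; any vector would do.
values <- c(10, 20, 30)

n_iterations <- 0   # how many times the body has run
seen <- c()         # values taken by the iterator, in order

for (element in values) {
  n_iterations <- n_iterations + 1   # one more pass through the body
  seen <- c(seen, element)           # value of `element` on this pass
}

n_iterations == length(values)  # TRUE
identical(seen, values)         # TRUE
```

Growing seen with c() is fine for a three-element illustration; for long vectors one would normally preallocate the result.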
for loop
sizes <- c(3, 5, 7, 10)
for(element in sizes) {
  print(seq(from = 1, to = element))
}
## [1] 1 2 3
## [1] 1 2 3 4 5
## [1] 1 2 3 4 5 6 7
## [1]  1  2  3  4  5  6  7  8  9 10
Note: there is nothing special about the name element; it could be any valid variable name in R.
for loop
for loops work with all types of vectors, i.e. with atomic vectors…
colors_set <- c("red", "black", "green", "yellow")
for(color in colors_set) {
  print(color)
}
## [1] "red"
## [1] "black"
## [1] "green"
## [1] "yellow"
for loop
…and lists:
my_list <- list("string", 2:3, TRUE)
for(element in my_list) {
  print(element)
}
## [1] "string"
## [1] 2 3
## [1] TRUE
for loop
It is also possible to iterate over indices instead of elements:
colors_set <- c("red", "black", "green", "yellow")
for(i in 1:length(colors_set)) { # or use seq_along(colors_set)
  print(colors_set[i])
}
## [1] "red"
## [1] "black"
## [1] "green"
## [1] "yellow"
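The seq_along() alternative mentioned in the comment matters when the vector may be empty. The sketch below, using an illustrative empty vector named empty, shows that 1:length(x) then counts down from 1 to 0, while seq_along(x) yields no indices at all.

```r
# An empty character vector (illustrative name).
empty <- character(0)

1:length(empty)    # 1 0: two indices, so a loop over them would run twice
seq_along(empty)   # integer(0): a loop over it would not run at all

n_runs <- 0
for (i in seq_along(empty)) {
  n_runs <- n_runs + 1
}
n_runs  # 0
```

For non-empty vectors both forms produce the same indices, so seq_along() is the safer default.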
for loop: application
Get the mean of each element in the list:
grades <- list(
  jack = c(6.0, 5.5, 3.5, 5.5),
  bob = c(2.0, 5.5, 4.5, 3.5),
  stacy = c(5.5, 5.5, 5.0, 5.5, 5.5)
)
for(person in grades) {
  print(mean(person))
}
## [1] 5.125
## [1] 3.875
## [1] 5.4
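Printing inside the loop loses the link between a mean and the person it belongs to. One possible variation, sketched here under the assumption that the results should be kept for later use, iterates over the names of the list and stores each mean in a preallocated named vector. The name average_grades is illustrative.

```r
grades <- list(
  jack  = c(6.0, 5.5, 3.5, 5.5),
  bob   = c(2.0, 5.5, 4.5, 3.5),
  stacy = c(5.5, 5.5, 5.0, 5.5, 5.5)
)

# Preallocate one slot per person and label the slots.
average_grades <- numeric(length(grades))
names(average_grades) <- names(grades)

# Iterate over names so each result can be stored under its owner.
for (name in names(grades)) {
  average_grades[name] <- mean(grades[[name]])
}

average_grades  # jack = 5.125, bob = 3.875, stacy = 5.4
```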
for loop
Application: multiply each numeric element of the list by 10:
messy_list <- list("some", 3.14, TRUE)
for(i in 1:length(messy_list)) {
  if(is.numeric(messy_list[[i]])) messy_list[[i]] <- messy_list[[i]] * 10
}
messy_list
## [[1]]
## [1] "some"
##
## [[2]]
## [1] 31.4
##
## [[3]]
## [1] TRUE
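This example iterates over indices for a reason: assigning to the iterator variable changes only a copy, not the list itself. The following sketch contrasts the two approaches on the same messy_list.

```r
messy_list <- list("some", 3.14, TRUE)

# Attempt 1: modify the iterator variable. Only the copy changes.
for (element in messy_list) {
  if (is.numeric(element)) element <- element * 10
}
messy_list[[2]]  # still 3.14

# Attempt 2: assign back through the index, as on the slide.
for (i in seq_along(messy_list)) {
  if (is.numeric(messy_list[[i]])) messy_list[[i]] <- messy_list[[i]] * 10
}
messy_list[[2]]  # prints 31.4
```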
for loop
for loops in R can be nested:
matrix_1999 <- matrix(data = 1:4, nrow = 2, ncol = 2, byrow = TRUE)
for(i in 1:nrow(matrix_1999)) {
  for(j in 1:ncol(matrix_1999)) {
    print(matrix_1999[i, j])
  }
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
for loop
Next time: lapply?
%>%
%>%
Problem: convert messy character prices to clean numeric values.
prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560", "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326", "CHF 1,430")
We need to remove the first four characters (i.e., "CHF "); remove "," from each element; explicitly coerce the result to numeric.
%>%
prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560", "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326", "CHF 1,430")
# drop the leading "CHF " (characters 1 to 4)
prices_without_CHF <- substring(text = prices, first = 5)
# drop the thousands separator
prices_without_comma <- gsub(",", "", prices_without_CHF)
prices_numeric <- as.numeric(prices_without_comma)
prices_numeric
## [1]  850  790 1390 1560 1950  670 1850 1326 1430
Too many intermediate variables; too easy to make a mistake.
%>%
prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560", "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326", "CHF 1,430")
as.numeric(
  gsub(
    x = substring(
      text = prices,
      first = 5
    ),
    pattern = ",",
    replacement = ""
  )
)
## [1]  850  790 1390 1560 1950  670 1850 1326 1430
Too long, too ugly, too confusing, has to be read inside out, etc.
%>%
The pipe %>% takes the output of one statement and makes it the input of the next statement (by default, as its first argument). It allows you to write a sequence of function calls from left to right:
library(magrittr)
prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560", "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326", "CHF 1,430")
# assign the result; the pipe does not modify prices itself
prices_numeric <- prices %>%
  substring(first = 5) %>%
  gsub(pattern = ",", replacement = "") %>%
  as.numeric()
prices_numeric
## [1]  850  790 1390 1560 1950  670 1850 1326 1430
%>%
A simpler example:
library(magrittr)
# for any functions f and g, these two calls are equivalent:
f(g(x))
g(x) %>% f()
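The schematic f and g can be replaced by concrete functions to check the equivalence. In this sketch, sqrt stands in for g and sum for f; these choices, along with the vector x and the names nested, piped and chained, are assumptions made only for illustration.

```r
library(magrittr)

x <- c(1, 4, 9)

nested  <- sum(sqrt(x))            # f(g(x)) with g = sqrt, f = sum
piped   <- sqrt(x) %>% sum()       # g(x) %>% f()
chained <- x %>% sqrt() %>% sum()  # the same, with x piped in as well

identical(nested, piped)    # TRUE
identical(nested, chained)  # TRUE
```

The fully chained form reads in the order the operations are applied: take x, take square roots, then sum.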