10/22/2018
for loop
for loop: motivation
You need to generate sequences whose lower bounds all equal 1 and whose upper bounds are given by the vector sizes:
sizes <- c(3, 5, 7, 10)
# we need to generate:
# 1 2 3
# 1 2 3 4 5
# 1 2 3 4 5 6 7
# 1 2 3 4 5 6 7 8 9 10
The function seq is not vectorized over from and to, i.e., one cannot pass vectors of length greater than 1 as the from and to arguments.
sizes <- c(3, 5, 7, 10)
seq(from = c(1, 1, 1, 1), to = sizes)
## Error in seq.default(from = c(1, 1, 1, 1), to = sizes) :
##   'from' must be of length 1
A crude solution:
sizes <- c(3, 5, 7, 10)
seq(from = 1, to = sizes[1])
## [1] 1 2 3
seq(from = 1, to = sizes[2])
## [1] 1 2 3 4 5
seq(from = 1, to = sizes[3])
## [1] 1 2 3 4 5 6 7
seq(from = 1, to = sizes[4])
## [1] 1 2 3 4 5 6 7 8 9 10
Basically, the same line is repeated 4 times. We need a tool that will execute the same code chunk a certain number of times using each element from sizes.
Typically, good developers have an allergy to manual labor :)
We have such a tool, and it is called a for loop:
for(element in vector) {
  expression
}
A for loop executes expression once for every element of vector, so the number of iterations equals the length of vector. On each iteration, the iterator variable element is assigned the next value from vector. Typically, expression does something with element.
sizes <- c(3, 5, 7, 10)
for(element in sizes) {
  print(seq(from = 1, to = element))
}
## [1] 1 2 3
## [1] 1 2 3 4 5
## [1] 1 2 3 4 5 6 7
## [1] 1 2 3 4 5 6 7 8 9 10
Note: there is nothing specific about the name element, it could be any valid variable name in R.
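Printing is fine for inspection, but the sequences are often needed later on. A minimal sketch, assuming we want to keep them in a list (the name sequences is made up for this illustration and does not come from the slides):

```r
sizes <- c(3, 5, 7, 10)

# Preallocate a list with one slot per element of sizes
# (sequences is a hypothetical name)
sequences <- vector("list", length(sizes))

# Iterate over positions so that each result can be stored in its own slot
for (i in seq_along(sizes)) {
  sequences[[i]] <- seq(from = 1, to = sizes[i])
}

sequences[[2]]
## [1] 1 2 3 4 5
```

Preallocating the list avoids growing it on every iteration, which is one common way to write such a loop, not the only one.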
for loops work with all types of vectors, i.e., with atomic vectors…
colors_set <- c("red", "black", "green", "yellow")
for(color in colors_set) {
  print(color)
}
## [1] "red"
## [1] "black"
## [1] "green"
## [1] "yellow"
…and lists:
my_list <- list("string", 2:3, TRUE)
for(element in my_list) {
  print(element)
}
## [1] "string"
## [1] 2 3
## [1] TRUE
It is also possible to iterate over indices:
colors_set <- c("red", "black", "green", "yellow")
for(i in seq_along(colors_set)) { # safer than 1:length(colors_set) if the vector is empty
  print(colors_set[i])
}
## [1] "red"
## [1] "black"
## [1] "green"
## [1] "yellow"
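seq_along is usually the safer way to build the index vector. A minimal sketch of the difference on an empty vector (the names empty, bad_indices and iterations are made up for this illustration):

```r
empty <- character(0)

# 1:length(empty) is 1:0, i.e. the vector c(1, 0),
# so a loop over it would run twice instead of not at all
bad_indices <- 1:length(empty)
bad_indices
## [1] 1 0

# seq_along(empty) is integer(0), so the loop body is skipped entirely
iterations <- 0
for (i in seq_along(empty)) {
  iterations <- iterations + 1
}
iterations
## [1] 0
```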
for loop: application
Get the mean of each element in the list:
grades <- list(
  jack = c(6.0, 5.5, 3.5, 5.5),
  bob = c(2.0, 5.5, 4.5, 3.5),
  stacy = c(5.5, 5.5, 5.0, 5.5, 5.5)
)
for(person in grades) {
  print(mean(person))
}
## [1] 5.125
## [1] 3.875
## [1] 5.4
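If the means are needed afterwards, they can be stored instead of printed. A minimal sketch, assuming a named numeric vector is a convenient container (average is a hypothetical name, not from the slides):

```r
grades <- list(
  jack = c(6.0, 5.5, 3.5, 5.5),
  bob = c(2.0, 5.5, 4.5, 3.5),
  stacy = c(5.5, 5.5, 5.0, 5.5, 5.5)
)

# Preallocate a numeric vector that reuses the names of the list
# (average is a hypothetical name)
average <- numeric(length(grades))
names(average) <- names(grades)

# Iterate over the names so that each mean lands under the right person
for (name in names(grades)) {
  average[name] <- mean(grades[[name]])
}

average
##  jack   bob stacy
## 5.125 3.875 5.400
```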
Application: multiply each numeric element of the list by 10:
messy_list <- list("some", 3.14, TRUE)
for(i in seq_along(messy_list)) {
  if(is.numeric(messy_list[[i]])) {
    messy_list[[i]] <- messy_list[[i]] * 10
  }
}
messy_list
## [[1]]
## [1] "some"
##
## [[2]]
## [1] 31.4
##
## [[3]]
## [1] TRUE
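The loop above iterates over indices for a reason: the iterator variable holds a copy of each element, so assigning to it leaves the list untouched. A minimal sketch of what happens when the elements are iterated over directly:

```r
messy_list <- list("some", 3.14, TRUE)

# Assigning to the iterator variable changes only a copy,
# not the corresponding element of messy_list
for (element in messy_list) {
  if (is.numeric(element)) {
    element <- element * 10
  }
}

# The list still holds the original value
messy_list[[2]]
## [1] 3.14
```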
for loops in R can be nested:
matrix_1999 <- matrix(data = 1:4, nrow = 2, ncol = 2, byrow = TRUE)
for(i in 1:nrow(matrix_1999)) {
  for(j in 1:ncol(matrix_1999)) {
    print(matrix_1999[i, j])
  }
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
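Nested loops can also write to a matrix, not only read from it. A minimal sketch that fills a small multiplication table (times_table is a hypothetical name chosen for this illustration):

```r
# Preallocate a 3 x 4 matrix of zeros (times_table is a hypothetical name)
times_table <- matrix(0, nrow = 3, ncol = 4)

# The outer loop walks over rows, the inner loop over columns
for (i in seq_len(nrow(times_table))) {
  for (j in seq_len(ncol(times_table))) {
    times_table[i, j] <- i * j
  }
}

times_table[3, 4]
## [1] 12
```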
Next time: lapply?
%>%
Problem: convert messy character prices to clean numeric values.
prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560",
"CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326",
"CHF 1,430")
We need to remove the first four characters (i.e., "CHF "), remove "," from each element, and explicitly coerce the result to numeric.
prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560",
            "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326",
            "CHF 1,430")
prices_without_CHF <- substring(text = prices, first = 5)
prices_without_comma <- gsub(",", "", prices_without_CHF)
prices_numeric <- as.numeric(prices_without_comma)
prices_numeric
## [1] 850 790 1390 1560 1950 670 1850 1326 1430
Too many intermediate variables; too easy to make a mistake.
prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560",
            "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326",
            "CHF 1,430")
as.numeric(
  gsub(
    x = substring(
      text = prices,
      first = 5
    ),
    pattern = ",",
    replacement = ""
  )
)
## [1] 850 790 1390 1560 1950 670 1850 1326 1430
Too long, too ugly, too confusing, has to be read inside out, etc.
The pipe %>% takes the output of one expression and makes it the input of the next; by default, it is passed as the first argument. This lets you write a sequence of function calls from left to right:
library(magrittr)
prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560",
            "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326",
            "CHF 1,430")
# assign the result; the pipeline does not modify prices itself
prices_numeric <- prices %>%
  substring(first = 5) %>%
  gsub(pattern = ",", replacement = "") %>%
  as.numeric()
prices_numeric
## [1] 850 790 1390 1560 1950 670 1850 1326 1430
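In the pipeline above, gsub receives the piped value as x only because pattern and replacement are named. When arguments are passed by position, magrittr's dot placeholder can mark where the left-hand side should go. A minimal sketch on a shortened price vector (cleaned is a hypothetical name):

```r
library(magrittr)

prices <- c("CHF 1,390", "CHF 670")

# The dot marks where the left-hand side goes:
# here it becomes the third argument (x) of gsub
cleaned <- prices %>%
  substring(first = 5) %>%
  gsub(",", "", .) %>%
  as.numeric()

cleaned
## [1] 1390  670
```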
A simpler example:
library(magrittr)
f(g(x))       # nested call, read inside out
g(x) %>% f()  # the same call written with the pipe
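To make the pattern concrete, here is a minimal sketch with sqrt standing in for f and sum standing in for g (the choice of functions and the names x, nested and piped are assumptions made for this illustration):

```r
library(magrittr)

x <- c(9, 16)

# f(g(x)) with f = sqrt and g = sum
nested <- sqrt(sum(x))

# the same computation, read left to right
piped <- x %>% sum() %>% sqrt()

nested
## [1] 5
identical(nested, piped)
## [1] TRUE
```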