10/22/2018

for loop

for loop: motivation

You need to generate sequences whose lower bounds all equal 1 and whose upper bounds are given by the vector sizes:

sizes <- c(3, 5, 7, 10)

# we need to generate:
# 1 2 3
# 1 2 3 4 5 
# 1 2 3 4 5 6 7
# 1 2 3 4 5 6 7 8 9 10

for loop: motivation

The function seq is not vectorized, i.e. one cannot pass vectors of length greater than 1 to its from and to arguments.

sizes <- c(3, 5, 7, 10)
seq(from = c(1, 1, 1, 1), to = sizes)

## Error in seq.default(from = c(1, 1, 1, 1), to = sizes) : 
##  'from' must be of length 1

for loop: motivation

A crude solution:

sizes <- c(3, 5, 7, 10)

seq(from = 1, to = sizes[1])
## [1] 1 2 3
seq(from = 1, to = sizes[2])
## [1] 1 2 3 4 5
seq(from = 1, to = sizes[3])
## [1] 1 2 3 4 5 6 7
seq(from = 1, to = sizes[4])
## [1]  1  2  3  4  5  6  7  8  9 10

Basically, the same line is repeated 4 times. We need a tool that executes the same chunk of code once for each element of sizes.

Typically, good developers have an allergy to manual labor :)

for loop

Such a tool exists, and it is called a for loop:

for(element in vector) {
    expression
}

A for loop executes expression once per element of vector, so the number of iterations equals the length of vector. On each iteration, the iterator variable element is assigned the next value from vector. Typically, expression does something with element.

for loop

sizes <- c(3, 5, 7, 10)

for(element in sizes) {
    print(seq(from = 1, to = element))
}

## [1] 1 2 3
## [1] 1 2 3 4 5
## [1] 1 2 3 4 5 6 7
## [1]  1  2  3  4  5  6  7  8  9 10

Note: there is nothing special about the name element; any valid R variable name would work.

for loop

for loops work with all types of vectors, i.e. with atomic vectors…

colors_set <- c("red", "black", "green", "yellow")
for(color in colors_set) {
    print(color)
}
## [1] "red"
## [1] "black"
## [1] "green"
## [1] "yellow"

for loop

…and lists:

my_list <- list("string", 2:3, TRUE)
for(element in my_list) {
    print(element)
}
## [1] "string"
## [1] 2 3
## [1] TRUE

for loop

It is also possible to iterate over indices instead of elements:

colors_set <- c("red", "black", "green", "yellow")
for(i in 1:length(colors_set)) { # or use seq_along(colors_set)
    print(colors_set[i])
}
## [1] "red"
## [1] "black"
## [1] "green"
## [1] "yellow"
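
One reason seq_along(), mentioned in the comment above, is often preferred over 1:length(): for an empty vector, 1:length(x) evaluates to 1:0, i.e. the two values 1 and 0, so the loop body would run twice with invalid indices. seq_along(x) is empty in that case, and the loop is skipped. A minimal sketch (the variable names are made up for this illustration):

```r
empty_set <- character(0)   # a vector with no elements

# 1:length() counts down from 1 to 0 when the vector is empty
bad_indices <- 1:length(empty_set)
print(bad_indices)
## [1] 1 0

# seq_along() yields an empty index vector instead
good_indices <- seq_along(empty_set)
print(good_indices)
## integer(0)

iterations <- 0
for (i in good_indices) {
    iterations <- iterations + 1   # never reached for an empty vector
}
print(iterations)
## [1] 0
```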

for loop: application

Compute the mean of each element of the list:

grades <- list(
    jack = c(6.0, 5.5, 3.5, 5.5),
    bob = c(2.0, 5.5, 4.5, 3.5),
    stacy = c(5.5, 5.5, 5.0, 5.5, 5.5)
)

for(person in grades) {
    print(mean(person))
}

## [1] 5.125
## [1] 3.875
## [1] 5.4
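
The loop above only prints the means. If the results are needed later, one option is to iterate over the names of the list and store each mean in a pre-allocated vector. This is only a sketch; the name average_grades is chosen here for illustration:

```r
grades <- list(
    jack = c(6.0, 5.5, 3.5, 5.5),
    bob = c(2.0, 5.5, 4.5, 3.5),
    stacy = c(5.5, 5.5, 5.0, 5.5, 5.5)
)

# Pre-allocate a numeric vector with one named slot per person
average_grades <- numeric(length(grades))
names(average_grades) <- names(grades)

for (person in names(grades)) {
    # grades[[person]] extracts the numeric vector stored under that name
    average_grades[person] <- mean(grades[[person]])
}

average_grades
##  jack   bob stacy
## 5.125 3.875 5.400
```

Iterating over names rather than elements keeps the link between each result and the person it belongs to.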

for loop

Application: multiply each numeric element of the list by 10:

messy_list <- list("some", 3.14, TRUE)

for(i in 1:length(messy_list)) {
    if(is.numeric(messy_list[[i]]))
        messy_list[[i]] <- messy_list[[i]] * 10
}
messy_list

## [[1]]
## [1] "some"
##
## [[2]]
## [1] 31.4
## 
## [[3]]
## [1] TRUE

for loop

for loops in R can be nested:

matrix_1999 <- matrix(data = 1:4, nrow = 2, ncol = 2, byrow = TRUE)

for(i in 1:nrow(matrix_1999)) {
    for(j in 1:ncol(matrix_1999)) {
        print(matrix_1999[i,j])
    }
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
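
Nested loops can also write to a matrix, not only read from it. The sketch below fills a small multiplication table cell by cell; the name times_table and the 3 x 3 size are arbitrary choices for this example:

```r
# Pre-allocate a 3 x 3 matrix of zeros
times_table <- matrix(data = 0, nrow = 3, ncol = 3)

for (i in seq_len(nrow(times_table))) {        # outer loop: rows
    for (j in seq_len(ncol(times_table))) {    # inner loop: columns
        times_table[i, j] <- i * j
    }
}

times_table
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    2    4    6
## [3,]    3    6    9
```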

for loop

Next time lapply?
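
As a small preview, and only as a sketch of what may be covered next time, the motivating problem from the beginning could be written with lapply(), which applies a function to every element of a vector and returns the results as a list:

```r
sizes <- c(3, 5, 7, 10)

# Apply an anonymous function to each element of sizes;
# the result is a list with one sequence per element
sequences <- lapply(sizes, function(size) seq(from = 1, to = size))

length(sequences)
## [1] 4
sequences[[1]]
## [1] 1 2 3
```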

Pipe %>%

Pipe %>%

Problem: convert messy character prices to clean numeric values.

prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560",
            "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326",
            "CHF 1,430")

We need to remove the first four characters (i.e., "CHF "), remove "," from each element, and explicitly coerce the result to numeric.

Pipe %>%

prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560",
            "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326",
            "CHF 1,430")

prices_without_CHF <- substring(text = prices, first = 5)
prices_without_comma <- gsub(",", "", prices_without_CHF)
prices_numeric <- as.numeric(prices_without_comma)

prices_numeric

## [1]  850  790 1390 1560 1950  670 1850 1326 1430

Too many intermediate variables, too easy to make a mistake.

Pipe %>%

prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560",
            "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326",
            "CHF 1,430")

as.numeric(
    gsub(
        x = substring(
            text = prices,
            first = 5
        ),
        pattern = ",",
        replacement = ""
    )
)

## [1]  850  790 1390 1560 1950  670 1850 1326 1430

Too long, too ugly, too confusing, has to be read inside out, etc.

Pipe %>%

The pipe %>% takes the output of one expression and makes it the input of the next function call. It allows you to write a sequence of function calls from left to right:

library(magrittr)

prices <- c("CHF 850", "CHF 790", "CHF 1,390", "CHF 1,560",
            "CHF 1,950", "CHF 670", "CHF 1,850", "CHF 1,326",
            "CHF 1,430")

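# Assign the result of the chain; otherwise prices stays a character vector
prices_numeric <- prices %>% 
    substring(first = 5) %>%
    gsub(pattern = ",", replacement = "") %>%
    as.numeric()

prices_numeric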

## [1]  850  790 1390 1560 1950  670 1850 1326 1430
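
In a chain like the one above, %>% passes its left-hand side as the first unnamed argument of the function on the right, so prices %>% substring(first = 5) is the same as substring(prices, first = 5). When the value belongs in a different position, magrittr lets you mark that position with a dot. A minimal sketch, using a made-up one-element vector:

```r
library(magrittr)

price <- "CHF 1,390"

# The left-hand side becomes the first argument of substring()
without_chf <- price %>% substring(first = 5)

# The dot marks where the left-hand side goes;
# here it is the third argument of gsub()
without_comma <- without_chf %>% gsub(",", "", .)

without_comma
## [1] "1390"
```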

Pipe %>%

A simpler example:

library(magrittr)

f(g(x))         # nested call, read inside out

g(x) %>% f()    # the same call written with a pipe
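
To make this concrete, here is a sketch with two hypothetical functions, g (adds one) and f (doubles), defined only for this illustration. Both spellings give the same result, and the chain can also start from the bare value:

```r
library(magrittr)

g <- function(x) x + 1   # hypothetical inner function
f <- function(x) x * 2   # hypothetical outer function

x <- 3

nested <- f(g(x))              # read inside out: g first, then f
piped <- g(x) %>% f()          # same computation with one pipe
chained <- x %>% g() %>% f()   # fully left to right

c(nested, piped, chained)
## [1] 8 8 8
```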