Chapter 7 Indexing Vectors with [ ]

boat.names  boat.colors  boat.ages  boat.prices  boat.costs
a           black        143        53           52
b           green        53         87           80
c           pink         356        54           20
d           blue         23         66           100
e           blue         647        264          189
f           green        24         32           12
g           green        532        532          520
h           yellow       43         58           68
i           black        66         99           80
j           black        86         132          100
# Boat sale. Creating the data vectors
boat.names <- c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j")
boat.colors <- c("black", "green", "pink", "blue", "blue", 
                "green", "green", "yellow", "black", "black")
boat.ages <- c(143, 53, 356, 23, 647, 24, 532, 43, 66, 86)
boat.prices <- c(53, 87, 54, 66, 264, 32, 532, 58, 99, 132)
boat.costs <- c(52, 80, 20, 100, 189, 12, 520, 68, 80, 100)

# What was the price of the first boat?
boat.prices[1]
## [1] 53

# What were the ages of the first 5 boats?
boat.ages[1:5]
## [1] 143  53 356  23 647

# What were the names of the black boats?
boat.names[boat.colors == "black"]
## [1] "a" "i" "j"

# What were the prices of either green or yellow boats?
boat.prices[boat.colors == "green" | boat.colors == "yellow"]
## [1]  87  32 532  58

# Change the price of boat "s" to 100
# (no boat is named "s" here, so boat.prices is left unchanged)
boat.prices[boat.names == "s"] <- 100

# What was the median price of black boats less than 100 years old?
median(boat.prices[boat.colors == "black" & boat.ages < 100])
## [1] 115.5

# How many pink boats were there?
sum(boat.colors == "pink")
## [1] 1

# What proportion of boats were older than 100 years?
mean(boat.ages > 100)
## [1] 0.4

By now you should be a whiz at applying functions like mean() and table() to vectors. However, in many analyses, you won’t want to calculate statistics on an entire vector. Instead, you will want to access specific subsets of values based on some criteria. For example, you may want to access values in a specific location in the vector (e.g., the first 10 elements), values that meet a criterion within that vector (e.g., all values greater than 0), or values that meet a criterion in a different vector (e.g., all values of age where sex is Female).

To access specific values of a vector in R, we use indexing with brackets []. In general, whatever you put inside the brackets tells R which values of the vector you want. There are two main ways to index a vector: numerical indexing and logical indexing.
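As a minimal sketch of the two approaches, the snippet below uses a small made-up vector called scores (it is not part of the boat data) and pulls values out of it first by position and then by a logical test. The object names by.position and by.test are just illustrative.

```r
# Hypothetical example vector (not part of the boat data)
scores <- c(4, -2, 7, 0, 9)

# Numerical indexing: ask for values by their position
by.position <- scores[c(1, 3)]
by.position
## [1] 4 7

# Logical indexing: ask for values that pass a test
by.test <- scores[scores > 0]
by.test
## [1] 4 7 9
```

In both cases the brackets do the same job: they tell R which values of scores to return.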

## Numerical Indexing

With numerical indexing, you enter a vector of integers corresponding to the positions of the values you want to access, in the form a[index], where a is the vector and index is a vector of positions. For example, let’s use numerical indexing to get values from our boat vectors.

# What is the first boat name?
boat.names[1]
## [1] "a"

# What are the first five boat colors?
boat.colors[1:5]
## [1] "black" "green" "pink"  "blue"  "blue"

# What is every second boat age among the first five boats?
boat.ages[seq(1, 5, by = 2)]
## [1] 143 356 647

For numerical indexing, you can use any indexing vector as long as it contains positive whole numbers. You can even access the same elements multiple times:

# What is the first boat age (3 times)
boat.ages[c(1, 1, 1)]
## [1] 143 143 143
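
The order of the index values also matters, and R will not stop you from asking for a position that does not exist. The sketch below re-creates boat.ages so that it runs on its own and shows both behaviours. The object names reversed.ages and missing.age are just illustrative.

```r
boat.ages <- c(143, 53, 356, 23, 647, 24, 532, 43, 66, 86)

# Values come back in the order you ask for them
reversed.ages <- boat.ages[c(10, 1)]
reversed.ages
## [1]  86 143

# Asking for a position past the end of the vector returns NA
missing.age <- boat.ages[11]
missing.age
## [1] NA
```

Because an out-of-range index gives NA rather than an error, it is worth checking the length of a vector before indexing it by position.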

If it makes your code clearer, you can define an indexing object before doing your actual indexing. For example, let’s define an object called my.index and use this object to index our data vector:

my.index <- 3:5
boat.names[my.index]
## [1] "c" "d" "e"
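
An indexing object does not have to be typed in by hand. As a rough sketch, you can build it from the length of the vector, which keeps the code working if boats are added later. The names n.boats, last.three, and every.second are hypothetical helpers introduced only for this example, and the boat vectors are re-created so the snippet runs on its own.

```r
boat.names <- c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j")
boat.ages <- c(143, 53, 356, 23, 647, 24, 532, 43, 66, 86)

# Positions of the last three boats, computed from the vector length
n.boats <- length(boat.names)
last.three <- (n.boats - 2):n.boats
boat.names[last.three]
## [1] "h" "i" "j"

# Every second position across the whole vector
every.second <- seq(1, n.boats, by = 2)
boat.ages[every.second]
## [1] 143 356 647 532  66
```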