7.2 Changing values of a vector
Now that you know how to index a vector, you can easily change specific values in a vector using the assignment (<-) operation. To do this, just assign a vector of new values to the indexed values of the original vector.
Let’s create a vector a which contains 10 1s:
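Here is a minimal sketch of that step, assuming rep() is used to repeat the value 1 ten times (any other way of building the vector would work just as well):

```r
# Create a vector a containing 10 1s
a <- rep(1, 10)
a
## [1] 1 1 1 1 1 1 1 1 1 1
```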
Now, let’s change the first 5 values in the vector to 9s by indexing the first five values, and assigning the value of 9:
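A sketch of that assignment, assuming a starts out as the vector of 10 1s described above:

```r
# a starts as 10 1s
a <- rep(1, 10)
# Index the first 5 values and assign 9 to them
a[1:5] <- 9
a
## [1] 9 9 9 9 9 1 1 1 1 1
```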
Now let’s change the last 5 values to 0s. We’ll index the values 6 through 10, and assign a value of 0.
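Continuing the same sketch (the first two lines only rebuild a so the example runs on its own):

```r
# Rebuild a: first 5 values are 9s, last 5 are 1s
a <- rep(1, 10)
a[1:5] <- 9
# Index values 6 through 10 and assign 0 to them
a[6:10] <- 0
a
## [1] 9 9 9 9 9 0 0 0 0 0
```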
Of course, you can also change values of a vector using a logical indexing vector. For example, let’s say you have a vector of numbers that should be from 1 to 10. If values are outside of this range, you want to set them to either the minimum (1) or maximum (10) value:
# x is a vector of numbers that should be from 1 to 10
x <- c(5, -5, 7, 4, 11, 5, -2)
# Assign values less than 1 to 1
x[x < 1] <- 1
# Assign values greater than 10 to 10
x[x > 10] <- 10
# Print the result!
x
## [1] 5 1 7 4 10 5 1
As you can see, our new values of x are now never less than 1 or greater than 10!
A note on indexing…
Technically, when you assign new values to a vector, you should always assign a vector of the same length as the number of values that you are updating. For example, given a vector a with 10 1s, to update the first 5 values with 5 9s, we should assign a new vector of 5 9s:
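A minimal sketch of this "same length" assignment, assuming a is again built with rep():

```r
# a is a vector of 10 1s
a <- rep(1, 10)
# Replace the first 5 values with a vector of exactly 5 9s
a[1:5] <- c(9, 9, 9, 9, 9)
a
## [1] 9 9 9 9 9 1 1 1 1 1
```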
However, if we repeat this code but just assign a single 9, R will repeat the value as many times as necessary to fill the indexed value of the vector. That’s why the following code still works:
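The shorter version, sketched under the same assumption that a starts as 10 1s:

```r
# a is a vector of 10 1s
a <- rep(1, 10)
# Assign a single 9; R repeats it to fill all 5 indexed values
a[1:5] <- 9
a
## [1] 9 9 9 9 9 1 1 1 1 1
```

The result is the same as when a full vector of 5 9s is assigned.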
In many other languages this code wouldn’t work because we’re trying to replace 5 values with just 1. However, this is a case where R bends the rules a bit: it recycles the single value until it has filled every indexed position.
7.2.1 Ex: Fixing invalid responses to a Happiness survey
Assigning and indexing is a particularly helpful tool when, for example, you want to remove invalid values in a vector before performing an analysis. For example, let’s say you asked 10 people how happy they were on a scale of 1 to 5 and received the following responses:
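The responses themselves are not shown here. A vector consistent with the output further down could look like the following sketch; the valid values and the positions of the three invalid ones follow from that output, but which invalid slot holds 999 and which holds -2 is an assumption:

```r
# Happiness responses from 10 people (scale of 1 to 5)
# NOTE: the placement of 999 vs. -2 at positions 4, 7 and 10 is assumed
happy <- c(1, 4, 2, 999, 2, 3, -2, 3, 2, 999)
```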
As you can see, we have some invalid values (999 and -2) in this vector. To remove them, we’ll use logical indexing to change the invalid values (999 and -2) to NA. We’ll create a logical vector indicating which values of happy are invalid using the %in% operation. Because we want to see which values are invalid, we’ll add the == FALSE condition (if we don’t, the index will tell us which values are valid).
# Which values of happy are NOT in the set 1:5?
invalid <- (happy %in% 1:5) == FALSE
invalid
## [1] FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
Now that we have a logical index invalid telling us which values are invalid (that is, not in the set 1 through 5), we’ll index happy with invalid, and assign the invalid values as NA:
# Convert any invalid values in happy to NA
happy[invalid] <- NA
happy
## [1] 1 4 2 NA 2 3 NA 3 2 NA
We can also recode all the invalid values of happy in one line as follows:
# Convert all values of happy that are NOT integers from 1 to 5 to NA
happy[(happy %in% 1:5) == FALSE] <- NA
As you can see, happy now has NAs for previously invalid values. Now we can take the mean() of the vector, using the na.rm = TRUE argument so that the NAs are ignored, and see the mean of the valid responses.
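As a sketch of that last step, using the recoded values of happy printed above: without na.rm = TRUE, mean() returns NA whenever the vector contains an NA, so the argument is needed to average only the valid responses.

```r
# happy after recoding the invalid responses to NA
happy <- c(1, 4, 2, NA, 2, 3, NA, 3, 2, NA)
# Mean of the valid responses only (NAs are dropped)
mean(happy, na.rm = TRUE)
## [1] 2.428571
```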