4.4 Objects and functions

To understand how R works, you need to know that R revolves around two things: objects and functions. Almost everything in R is either an object or a function. In the following code chunk, I’ll define a simple object called tattoos using the c() function, then apply the mean() function to it:

# 1: Create a vector object called tattoos
tattoos <- c(4, 67, 23, 4, 10, 35)

# 2: Apply the mean() function to the tattoos object
mean(tattoos)
## [1] 24

What is an object? An object is a thing – like a number, a dataset, a summary statistic like a mean or standard deviation, or a statistical test. Objects come in many different shapes and sizes in R. There are simple objects like scalars which represent single numbers, vectors (like our tattoos object above) which represent several numbers, more complex objects like dataframes which represent tables of data, and even more complex objects like hypothesis tests or regressions which contain all sorts of statistical information.

Different types of objects have different attributes. For example, a vector of data has a length attribute (i.e., how many numbers are in the vector), while a hypothesis test has many attributes such as a test statistic and a p-value. Don’t worry if this is a bit confusing now – it will all become clearer when you meet these new objects in person in later chapters. For now, just know that objects in R are things, and different objects have different attributes.
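As a rough illustration of what attributes look like in practice, the sketch below asks for the length of the tattoos vector and then peeks inside a hypothesis test object. The t.test() call, the object name tattoo.test, and the comparison value of 10 are assumptions made purely for demonstration.

```r
# A vector has a length
tattoos <- c(4, 67, 23, 4, 10, 35)
length(tattoos)
## [1] 6

# A hypothesis test object stores many pieces of information
# (mu = 10 is an arbitrary value chosen for this illustration)
tattoo.test <- t.test(tattoos, mu = 10)
names(tattoo.test)      # List everything stored in the object
tattoo.test$statistic   # The test statistic
tattoo.test$p.value     # The p-value
```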

What is a function? A function is a procedure that typically takes one or more objects as arguments (aka, inputs), does something with those objects, then returns a new object. For example, the mean() function we used above takes a vector object, like tattoos, of numeric data as an argument, calculates the arithmetic mean of those data, then returns a single number (a scalar) as a result. A great thing about R is that you can easily create your own functions that do whatever you want – but we’ll get to that much later in the book. Thankfully, R has hundreds (thousands?) of built-in functions that perform most of the basic analysis tasks you can think of.
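To make the "object in, new object out" idea concrete, here is a small sketch using a few other built-in functions on the same tattoos vector. The name tattoos.mean is just an illustrative choice.

```r
tattoos <- c(4, 67, 23, 4, 10, 35)

# Each function takes the vector as its argument and returns a new object
median(tattoos)
## [1] 16.5
max(tattoos)
## [1] 67

# The returned object can itself be stored and reused
tattoos.mean <- mean(tattoos)
length(tattoos.mean)   # The mean is a single number
## [1] 1
```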

99% of the time you are using R, you will do the following: 1) Define objects. 2) Apply functions to those objects. 3) Repeat! Seriously, that’s about it. However, as you’ll soon learn, the hard part is knowing how to define objects the way you want them, and knowing which function(s) will accomplish the task you want for your objects.

4.4.1 Numbers versus characters

For the most part, objects in R come in one of two flavors: numeric and character. It is very important to keep these two separate, as certain functions, like mean() and sum(), only work for numeric objects, while functions like grep() and strtrim() are meant for character objects.

A numeric object is just a number like 1, 10 or 3.14. You don’t have to do anything special to create a numeric object, just type it like you were using a calculator.

# These are all numeric objects
1
10
3.14

A character object is a name like "Madisen", "Brian", or "University of Konstanz". To specify a character object, you need to include quotation marks "" around the text.

# These are all character objects
"Madisen"
"Brian"
"10"
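If you are ever unsure which flavor an object is, one way to check is the built-in class() function. The short sketch below shows that the quotation marks, not the digits, decide the answer.

```r
# Without quotation marks, 10 is numeric
class(10)
## [1] "numeric"

# With quotation marks, "10" is a character
class("10")
## [1] "character"

# is.numeric() gives a simple TRUE / FALSE answer
is.numeric("10")
## [1] FALSE
```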

If you try to perform a function or operation meant for a numeric object on a character object (and vice-versa), R will yell at you. For example, here’s what happens when I try to take the mean of the two character objects "1" and "10":

# This will return NA with a warning because the arguments are not numeric!
mean(c("1", "10"))
## [1] NA
Warning message: argument is not numeric or logical: returning NA

If I make sure that the arguments are numeric (by not including the quotation marks), I won’t receive the warning:

# This is ok!
mean(c(1, 10))
## [1] 5.5
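If your numbers are already stored as characters (this often happens when reading data from a file), one option is to convert them with the built-in as.numeric() function before doing any math. This is a minimal sketch; char.numbers is just an illustrative name.

```r
# Numbers that were accidentally stored as characters
char.numbers <- c("1", "10")

# Convert to numeric first, then take the mean
mean(as.numeric(char.numbers))
## [1] 5.5
```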

4.4.2 Creating new objects with <-

By now you know that you can use R to do simple calculations. But to really take advantage of R, you need to know how to create and manipulate objects. All of the data, analyses, and even plots you use and create are, or can be, saved as objects in R. For example, the movies dataset which we’ve used before is an object stored in the yarrr package. This object was defined in the yarrr package with the name movies. When you loaded the yarrr package with the library(yarrr) command, you told R to give you access to the movies object. Once the object was loaded, we could use it to calculate descriptive statistics, run hypothesis tests, and create plots.

To create new objects in R, you need to do object assignment. Object assignment is our way of storing information, such as a number or a statistical test, into something we can easily refer to later. This is a pretty big deal. Object assignment allows us to store data objects under relevant names which we can then use to slice and dice specific data objects anytime we’d like to.

To do an assignment, we use the almighty <- operator, called assign. To assign something to a new object (or to change an existing object), use the notation object <- ..., where object is the new (or updated) object, and ... is whatever you want to store in object.

Let’s start by creating a very simple object called a and assigning the value of 100 to it:

# Create a new object called a with a value of 100
a <- 100

Once you run this code, you’ll notice that R doesn’t tell you anything. However, as long as you didn’t type something wrong, R should now have a new object called a which contains the number 100. If you want to see the value, you need to call the object by just executing its name. This will print the value of the object to the console:

# Print the object a
a
## [1] 100

If you try to evaluate an object that is not yet defined, R will return an error. For example, let’s try to print the object b which we haven’t yet defined:

b
Error: object ‘b’ not found

As you can see, R yelled at us because the object b hasn’t been defined yet.
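If you would rather check whether an object has been defined without triggering an error, the built-in exists() function is one option. Note that it takes the object name in quotation marks. The object names below are hypothetical and chosen only for this sketch.

```r
# Has this object been defined yet? (The name goes in quotation marks)
exists("my.undefined.object")
## [1] FALSE

# After assignment, the object exists
my.new.object <- 5
exists("my.new.object")
## [1] TRUE
```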

Once you’ve defined an object, you can combine it with other objects using basic arithmetic. Let’s create objects a and b and play around with them.

a <- 1
b <- 100

# What is a + b?
a + b
## [1] 101

# Assign a + b to a new object (c)
c <- a + b

# What is c?
c
## [1] 101
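One thing worth noticing, sketched below with the same objects, is that c stores the result of a + b at the moment of assignment. It does not keep tracking a and b afterwards.

```r
a <- 1
b <- 100
c <- a + b

# Change a after c was created
a <- 50

# c still holds the old result
c
## [1] 101

# To update c, compute and assign it again
c <- a + b
c
## [1] 150
```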

4.4.2.1 To change an object, you must assign it again!

Normally I try to avoid excessive emphasis, but because this next sentence is so important, I have to just go for it. Here it goes…

To change an object, you must assign it again!

No matter what you do with an object, if you don’t assign it again, it won’t change. For example, let’s say you have an object z with a value of 0. You’d like to add 1 to z in order to make it 1. To do this, you might want to just enter z + 1 – but that won’t do the job. Here’s what happens if you don’t assign it again:

z <- 0
z + 1
## [1] 1

Ok! Now let’s see the value of z:

z
## [1] 0

Damn! As you can see, the value of z is still 0! What went wrong? Oh yeah…

To change an object, you must assign it again!

The problem is that when we wrote z + 1 on the second line, R thought we just wanted it to calculate and print the value of z + 1, without storing the result as a new z object. If we want to actually update the value of z, we need to reassign the result back to z as follows:

z <- 0
z <- z + 1  # Now I'm REALLY changing z
z
## [1] 1

Phew, z is now 1. Because we used assignment, z has been updated. About freaking time.
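The same rule applies when you use functions, not just arithmetic. As a small sketch (the scores vector is made up for illustration), sorting a vector with sort() prints a sorted copy but leaves the original untouched until you assign the result back:

```r
scores <- c(3, 1, 2)

# This prints a sorted copy, but does not change scores
sort(scores)
## [1] 1 2 3
scores
## [1] 3 1 2

# Assign the result back to really change scores
scores <- sort(scores)
scores
## [1] 1 2 3
```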

4.4.3 How to name objects

Good object names strike a balance between being easy to type (i.e., short) and easy to interpret. If you have several datasets, it’s probably not a good idea to name them a, b, c because you’ll forget which is which. However, using long names like March2015Group1OnlyFemales will give you carpal tunnel syndrome.

You can name objects using any combination of letters, numbers, and a few special characters (like . and _), as long as the name doesn’t start with a number. Here are some valid object names:

# Valid object names
group.mean <- 10.21
my.age <- 32
FavoritePirate <- "Jack Sparrow"
sum.1.to.5 <- 1 + 2 + 3 + 4 + 5

All the object names above are perfectly valid. Now, let’s look at some examples of invalid object names. These object names are all invalid because they either contain spaces, start with numbers, or have invalid characters:

# Invalid object names!
female ages <- 50 # spaces
5experiment <- 50 # starts with a number
a! <- 50 # has an invalid character

If you try running the code above in R, you will receive an error message starting with
Error: unexpected symbol

Anytime you see this error in R, it usually means that you have a naming error or a typo of some kind.
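If you are not sure whether a name is valid, one way to check is the built-in make.names() function, which turns any text into a syntactically valid name. If it hands your name back unchanged, the name was valid to begin with. This is only a sketch, and candidate.names is an illustrative name.

```r
# Some names to check (as character objects, so R doesn't try to run them)
candidate.names <- c("group.mean", "female ages", "5experiment", "a!")

# make.names() repairs anything that isn't a valid name
make.names(candidate.names)
## [1] "group.mean"   "female.ages"  "X5experiment" "a."

# A name is valid if make.names() leaves it unchanged
candidate.names == make.names(candidate.names)
## [1]  TRUE FALSE FALSE FALSE
```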

4.4.3.1 R is case-sensitive!

Figure 4.9: Like a text message, you should probably watch your use of capitalization in R.

Like English, R is case-sensitive – it treats capital letters differently from lower-case letters. For example, the following three objects Plunder, plunder and PLUNDER are totally different objects in R:

# These are all different objects
Plunder <- 1
plunder <- 100
PLUNDER <- 5

I try to avoid using too many capital letters in object names because they require me to hold the shift key. This may sound silly, but you’d be surprised how much easier it is to type mydata than MyData 100 times.
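To convince yourself that these really are three separate objects, here is a quick sketch that uses them together:

```r
Plunder <- 1
plunder <- 100
PLUNDER <- 5

# Each name refers to its own value
Plunder + plunder + PLUNDER
## [1] 106

# Plunder and plunder are not the same object
identical(Plunder, plunder)
## [1] FALSE
```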

Example: Pirates of The Caribbean

Let’s do a more practical example – we’ll define an object called blackpearl.usd which has the global revenue of Pirates of the Caribbean: Curse of the Black Pearl in U.S. dollars. A quick Google search showed me that the revenue was $634,954,103. I’ll create the new object using assignment:

blackpearl.usd <- 634954103

Now, my fellow European pirates might want to know how much this is in Euros. Let’s create a new object called blackpearl.eur which converts our original value to Euros by multiplying the original amount by 0.88 (assuming 1 USD = 0.88 EUR):

blackpearl.eur <- blackpearl.usd * 0.88
blackpearl.eur
## [1] 5.6e+08
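The output above is rounded and shown in scientific notation. If you would like to see the full amount, one option is the built-in format() function. This is a sketch that reuses the assumed exchange rate of 0.88.

```r
blackpearl.usd <- 634954103
blackpearl.eur <- blackpearl.usd * 0.88   # Assumes 1 USD = 0.88 EUR

# Round to whole Euros and print with commas instead of scientific notation
format(round(blackpearl.eur), big.mark = ",", scientific = FALSE)
## [1] "558,759,611"
```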

It looks like the movie made about 558,759,611 Euros (R prints a rounded value in scientific notation). Not bad. Now, let’s see how much more Pirates of the Caribbean 2: Dead Man’s Chest made compared to “Curse of the Black Pearl.” Another Google search uncovered that Dead Man’s Chest made $1,066,215,812 (that wasn’t a typo, the freaking movie made over a billion dollars).

deadman.usd <- 1066215812

Now, I’ll divide deadman.usd by blackpearl.usd:

deadman.usd / blackpearl.usd
## [1] 1.7

It looks like “Dead Man’s Chest” made 168% as much as “Curse of the Black Pearl”. Not bad for two movies based on a ride at Disneyland.
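To get the percentage and the raw difference directly from R, you could store the ratio as its own object and go from there. This is a small sketch, and deadman.ratio is just an illustrative name.

```r
blackpearl.usd <- 634954103
deadman.usd <- 1066215812

# Store the ratio as a new object
deadman.ratio <- deadman.usd / blackpearl.usd

# Express the ratio as a whole-number percentage
round(deadman.ratio * 100)
## [1] 168

# How many more dollars did Dead Man's Chest make?
deadman.usd - blackpearl.usd
## [1] 431261709
```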