8.3 Matrix and dataframe functions
R has lots of functions for viewing matrices and dataframes and returning information about them. Table 8.3 shows some of the most common:
Function | Description |
---|---|
head(x), tail(x) | Print the first few rows (or last few rows) |
View(x) | Open the entire object in a new window |
nrow(x), ncol(x), dim(x) | Count the number of rows and columns |
rownames(), colnames(), names() | Show the row (or column) names |
str(x), summary(x) | Show the structure of the dataframe (i.e., dimensions and classes) and summary statistics |
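The counting and naming functions in Table 8.3 are not demonstrated elsewhere in this section, so here is a minimal sketch of them using the built-in ToothGrowth dataframe. The object names (n_rows, n_cols, and so on) are made up for illustration only.

```r
# ToothGrowth ships with R's built-in datasets package
n_rows <- nrow(ToothGrowth)         # number of rows
n_cols <- ncol(ToothGrowth)         # number of columns
dims <- dim(ToothGrowth)            # rows and columns together
col_names <- names(ToothGrowth)     # column names (same as colnames() for a dataframe)
row_names <- rownames(ToothGrowth)  # row names, stored as character strings

print(dims)
print(col_names)
print(head(row_names))
```

For a dataframe, names() and colnames() return the same thing; for a matrix, use colnames() and rownames() instead.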
8.3.1 head(), tail(), View()
To see the first few rows of a dataframe, use head(); to see the last few rows, use tail().
# head() shows the first few rows
head(ChickWeight)
## Grouped Data: weight ~ Time | Chick
## weight Time Chick Diet
## 1 42 0 1 1
## 2 51 2 1 1
## 3 59 4 1 1
## 4 64 6 1 1
## 5 76 8 1 1
## 6 93 10 1 1
# tail() shows the last few rows
tail(ChickWeight)
## Grouped Data: weight ~ Time | Chick
## weight Time Chick Diet
## 573 155 12 50 4
## 574 175 14 50 4
## 575 205 16 50 4
## 576 234 18 50 4
## 577 264 20 50 4
## 578 264 21 50 4
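By default, head() and tail() show six rows. Both functions also accept an n argument if you want a different number. The sketch below asks for three rows instead; the object names first_three and last_three are just illustrative.

```r
# Ask for 3 rows instead of the default 6
first_three <- head(ChickWeight, n = 3)
last_three <- tail(ChickWeight, n = 3)

print(first_three)
print(last_three)
```

Both results are themselves dataframes, so you can pass them on to other functions such as nrow() or summary().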
To see an entire dataframe in a separate window that looks like a spreadsheet, use View(). When you run View(), you’ll see a new window like the one in Figure 8.3.
8.3.2 summary(), str()
To get summary statistics on all columns in a dataframe, use the summary() function:
# Print summary statistics of ToothGrowth to the console
summary(ToothGrowth)
## len supp dose
## Min. : 4 OJ:30 Min. :0.50
## 1st Qu.:13 VC:30 1st Qu.:0.50
## Median :19 Median :1.00
## Mean :19 Mean :1.17
## 3rd Qu.:25 3rd Qu.:2.00
## Max. :34 Max. :2.00
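summary() also works on a single column, and what it returns depends on the column's class. As a rough sketch (the object names are made up for illustration): for the numeric column len you get the minimum, quartiles, mean and maximum, while for the factor column supp you get a count of each level.

```r
# Numeric column: minimum, quartiles, mean and maximum
len_summary <- summary(ToothGrowth$len)

# Factor column: number of rows at each level
supp_counts <- summary(ToothGrowth$supp)

print(len_summary)
print(supp_counts)
```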
To learn about the classes of columns in a dataframe, in addition to some other summary information, use the str() (structure) function. Its output is aimed at more advanced R users, so don’t worry if it looks confusing.
# Print additional information about ToothGrowth to the console
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
Here, we can see that ToothGrowth is a dataframe with 60 observations (i.e., rows) and 3 variables (i.e., columns). We can also see that the column names are len, supp, and dose.
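If the str() output feels like too much at once, one alternative is to ask for just the class of each column. The sketch below uses sapply() to apply class() to every column of ToothGrowth; this is only one of several ways to do it, and col_classes is an illustrative name.

```r
# Apply class() to each column and collect the results
col_classes <- sapply(ToothGrowth, class)
print(col_classes)

# For a factor column, levels() lists the possible values
supp_levels <- levels(ToothGrowth$supp)
print(supp_levels)
```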