8.3 Matrix and dataframe functions
R has lots of functions for viewing matrices and dataframes and returning information about them. Table 8.3 shows some of the most common:
Function | Description |
---|---|
head(x), tail(x) | Print the first few rows (or last few rows) |
View(x) | Open the entire object in a new window |
nrow(x), ncol(x), dim(x) | Count the number of rows and columns |
rownames(x), colnames(x), names(x) | Show the row (or column) names |
str(x), summary(x) | Show the structure of the dataframe (i.e., dimensions and classes) and summary statistics |
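As a quick illustration of the counting and naming functions in Table 8.3, the sketch below applies them to the built-in ToothGrowth dataframe. The variable names n_rows, n_cols, dims, and col_names are only illustrative.

```r
# Count rows and columns of the built-in ToothGrowth dataframe
n_rows <- nrow(ToothGrowth)   # number of rows
n_cols <- ncol(ToothGrowth)   # number of columns
dims <- dim(ToothGrowth)      # rows and columns together, as one vector

# Column names; for a dataframe, names() and colnames() agree
col_names <- names(ToothGrowth)

print(dims)
print(col_names)
```

For a dataframe, dim(x) is simply nrow(x) followed by ncol(x), so you rarely need all three.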
8.3.1 head(), tail(), View()
To see the first few rows of a dataframe, use head(); to see the last few rows, use tail().
# head() shows the first few rows
head(ChickWeight)
## Grouped Data: weight ~ Time | Chick
## weight Time Chick Diet
## 1 42 0 1 1
## 2 51 2 1 1
## 3 59 4 1 1
## 4 64 6 1 1
## 5 76 8 1 1
## 6 93 10 1 1
# tail() shows the last few rows
tail(ChickWeight)
## Grouped Data: weight ~ Time | Chick
## weight Time Chick Diet
## 573 155 12 50 4
## 574 175 14 50 4
## 575 205 16 50 4
## 576 234 18 50 4
## 577 264 20 50 4
## 578 264 21 50 4
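Both functions show six rows by default. If you want a different number, head() and tail() accept an n argument. The sketch below assumes you only want a quick peek at three rows from the top and two from the bottom; the names first_three and last_two are illustrative.

```r
# First 3 rows instead of the default 6
first_three <- head(ChickWeight, n = 3)

# Last 2 rows instead of the default 6
last_two <- tail(ChickWeight, n = 2)

# Print just the weight column of each result
print(first_three$weight)
print(last_two$weight)
```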
To see an entire dataframe in a separate window that looks like a spreadsheet, use View(). When you run View(), you’ll see a new window like the one in Figure 8.3.
Figure 8.3: Screenshot of the window from View(ChickWeight). You can use this window to visually sort and filter the data to get an idea of how it looks, but you can’t add or remove data and nothing you do will actually change the dataframe.
8.3.2 summary(), str()
To get summary statistics on all columns in a dataframe, use the summary() function:
# Print summary statistics of ToothGrowth to the console
summary(ToothGrowth)
## len supp dose
## Min. : 4 OJ:30 Min. :0.50
## 1st Qu.:13 VC:30 1st Qu.:0.50
## Median :19 Median :1.00
## Mean :19 Mean :1.17
## 3rd Qu.:25 3rd Qu.:2.00
## Max. :34 Max. :2.00
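summary() also works on a single column, which can be handy when a dataframe has many columns. A minimal sketch, assuming you only care about the len column of ToothGrowth (len_summary is an illustrative name):

```r
# Summary statistics for one numeric column
len_summary <- summary(ToothGrowth$len)
print(len_summary)

# Individual statistics can be pulled out by name
len_mean <- len_summary[["Mean"]]
len_median <- len_summary[["Median"]]
```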
To learn about the classes of columns in a dataframe, in addition to some other summary information, use the str() (structure) function. This function returns information for more advanced R users, so don’t worry if the output looks confusing.
# Print additional information about ToothGrowth to the console
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
Here, we can see that ToothGrowth is a dataframe with 60 observations (i.e., rows) and 3 variables (i.e., columns). We can also see that the column names are len, supp, and dose, that len and dose are numeric, and that supp is a factor with two levels.
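If you only want the column classes rather than the full str() output, one option is to apply class() to each column with sapply(). This is a sketch of an alternative check, not something the text above prescribes; col_classes and supp_levels are illustrative names.

```r
# Class of every column in ToothGrowth, as a named character vector
col_classes <- sapply(ToothGrowth, class)
print(col_classes)

# The levels of the factor column reported by str()
supp_levels <- levels(ToothGrowth$supp)
print(supp_levels)
```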