{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" }, "toc": "true" }, "source": [ "

Table of Contents

\n", "
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "![Imgur](https://i.imgur.com/VXYtqs9.jpg)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**Objects**\n", "> Objects are simply a definition for a type of data to be stored, \n", "e.g., vector, matrix, array, data frame, list, function" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Data structures\n", "A particular way of organizing data " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "\t* Atomic vectors (the basic data structure in R and pretty much its workhorse)\n", "\t* Recursive lists (a list is a special vector: each element can be of a different class)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Atomic vectors" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "* A vector can hold character, logical, integer or numeric values. \n", "You can create vectors by concatenating values with the `c()` function. \n", "Various examples: \n", "`x <- c(1, 2, 3)` \n", "`x` is a numeric vector. These are the most common kind: the values are stored as double-precision real numbers. To explicitly create integers, add an `L` suffix. \n", "`x1 <- c(1L, 2L, 3L)` \n", "You can also have logical vectors. \n", "`y <- c(TRUE, TRUE, FALSE, FALSE)` \n", "Finally, you can have character vectors: \n", "`z <- c(\"Alec\", \"Dan\", \"Rob\", \"Rich\")` " ] }, { "cell_type": "markdown", "metadata": { "run_control": { "marked": false }, "slideshow": { "slide_type": "subslide" } }, "source": [ "```{r}\n", "#The general pattern is vector(class of object, length). 
\n", "# Create an empty vector with vector()\n", "x <- vector() # defaults to logical(0)\n", "# with a pre-defined length (the default mode is logical)\n", "x <- vector(length = 10) # ten FALSE values\n", "# with a type and a length\n", "vector(mode = \"character\", length = 10) # is equivalent to character(10)\n", "vector(\"numeric\", length = 5) # is equivalent to numeric(5)\n", "vector(\"integer\", 10)\n", "vector(mode = \"logical\", 8)\n", "vector(\"complex\", 5)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Vectors may only have one type** \n", "* Atomic vectors are homogeneous: every element within a single atomic vector has to be of the same type.\n", "* When values of different types are combined, R coerces all of them to the most flexible type present (logical, then integer, then numeric, then complex, then character)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**Some Functions for Vectors**\n", "\n", "| Functions | Description |\n", "|----------------|---------------------------------------------------------------------------------------------------------|\n", "| `c()` | combines values, vectors, and/or lists to create new objects |\n", "| `unique()` | returns a vector containing one element for each unique value in the vector |\n", "| `duplicated()` | returns a logical vector marking the elements that duplicate an earlier element |\n", "| `rev()` | reverses the order of the elements in a vector |\n", "| `sort()` | sorts the elements in a vector |\n", "| `append()` | appends or inserts elements in a vector 
|\n", "| `sum()` | sum of the elements of a vector |\n", "| `min()` | minimum value in a vector |\n", "| `max()` | maximum value in a vector |\n", "| `cumsum()` | cumulative sum | \n", "| `diff()` | lagged differences, `x[i+1] - x[i]` | \n", "| `prod()` | product of the elements | \n", "| `cumprod()` | cumulative product | \n", "| `mean()` | average | \n", "| `median()` | median | \n", "| `range()` | range (minimum and maximum) | \n", "| `order()` | indices that would sort the vector | \n", "| `rank()` | rank of each element | \n", "| `sample()` | random sample | \n", "| `quantile()` | sample quantiles (percentiles) | \n", "| `var()` | variance, covariance | \n", "| `sd()` | standard deviation | \n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**Other Functions** \n", "```{r}\n", "(x <- c(sort(sample(1:20, 9)), NA))\n", "(y <- c(sort(sample(3:23, 7)), NA))\n", "union(x, y) # values in x or y\n", "intersect(x, y) # values in both x and y\n", "setdiff(x, y) # values in x but not in y\n", "setdiff(y, x) # values in y but not in x\n", "setequal(x, y) # TRUE if both hold the same set of values\n", "which.min(x) # index of the first minimum\n", "which.max(x) # index of the first maximum\n", "match(x, y) # position of each element of x in y (NA if absent)\n", "``` " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Lists (recursive vectors)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* A list is a special vector: each element can be of a different class.\n", "* Lists act as containers.\n", "* Unlike atomic vectors, their contents are not restricted to a single type.\n", "* A list element can be anything, and two elements within a list can be of different types.\n", "* Lists are sometimes called recursive vectors, because a list can contain other lists." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Create a list using the `list()` function** \n", "`x <- list(1, \"a\", TRUE, 1+4i)` " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "R also has other data structures. 
These include\n", "\t* matrices and arrays\n", "\t* data frames" ] }, { "cell_type": "markdown", "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "![Imgur](https://i.imgur.com/c73sZAd.png?1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Matrices and arrays" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* An array can be thought of as multiple matrices stacked together\n", "* Matrices in R can be thought of as vectors indexed using two indices instead of one" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
\n" ], "text/latex": [ "\\begin{tabular}{lll}\n", "\t 47 & 36 & 1\\\\\n", "\t 48 & 89 & 2\\\\\n", "\t 52 & 45 & -25\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "| 47 | 36 | 1 | \n", "| 48 | 89 | 2 | \n", "| 52 | 45 | -25 | \n", "\n", "\n" ], "text/plain": [ " [,1] [,2] [,3]\n", "[1,] 47 36 1 \n", "[2,] 48 89 2 \n", "[3,] 52 45 -25 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "m <- matrix(c(47,48,52,36,89,45,1,2,-25), nrow = 3, ncol = 3)\n", "m\n", "# this code creates a 3 by 3 matrix" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "##### using cbind and rbind\n", "(to create a matrix)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "run_control": { "frozen": false, "read_only": false }, "scrolled": true, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "'matrix'" ], "text/latex": [ "'matrix'" ], "text/markdown": [ "'matrix'" ], "text/plain": [ "[1] \"matrix\"" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
\n" ], "text/latex": [ "\\begin{tabular}{lll}\n", " Index & Age & Salary\\\\\n", "\\hline\n", "\t 1 & 30 & 500\\\\\n", "\t 2 & 45 & 600\\\\\n", "\t 3 & 34 & 550\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "Index | Age | Salary | \n", "|---|---|---|\n", "| 1 | 30 | 500 | \n", "| 2 | 45 | 600 | \n", "| 3 | 34 | 550 | \n", "\n", "\n" ], "text/plain": [ " Index Age Salary\n", "[1,] 1 30 500 \n", "[2,] 2 45 600 \n", "[3,] 3 34 550 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# 3 column matrix with column dimnames\n", "m <- cbind(Index = c(1:3), Age = c(30, 45, 34), Salary = c(500, 600, 550)) \n", "class(m)\n", "m" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "* If an object with more than two dimensions is needed, use the `array()` function to create an n-dimensional array. A 3x3x3 array can be created as follows:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "run_control": { "frozen": false, "read_only": false }, "scrolled": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ", , 1\n", "\n", " [,1] [,2] [,3]\n", "[1,] 1 4 7\n", "[2,] 2 5 8\n", "[3,] 3 6 9\n", "\n", ", , 2\n", "\n", " [,1] [,2] [,3]\n", "[1,] 10 13 16\n", "[2,] 11 14 17\n", "[3,] 12 15 18\n", "\n", ", , 3\n", "\n", " [,1] [,2] [,3]\n", "[1,] 19 22 25\n", "[2,] 20 23 26\n", "[3,] 21 24 27\n", "\n" ] } ], "source": [ "a <- array(1:27, dim=c(3,3,3))\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**To check the dimensions of an array or matrix** \n", "`dim()` gives the number of rows, columns and slices \n", "`length(dim())` gives the number of dimensions as a single number " ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 3\n", "\\item 3\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 3\n", "2. 3\n", "\n", "\n" ], "text/plain": [ "[1] 3 3" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "2" ], "text/latex": [ "2" ], "text/markdown": [ "2" ], "text/plain": [ "[1] 2" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 3\n", "2. 3\n", "3. 3\n", "\n", "\n" ], "text/plain": [ "[1] 3 3 3" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "3" ], "text/latex": [ "3" ], "text/markdown": [ "3" ], "text/plain": [ "[1] 3" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# for matrix\n", "dim(m)\n", "length(dim(m))\n", "# for array\n", "dim(a)\n", "length(dim(a))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Creating arrays with more than three dimensions (4, 5, and so on)**" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 3\n", "\\item 2\n", "\\item 2\n", "\\item 3\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 3\n", "2. 2\n", "3. 2\n", "4. 3\n", "\n", "\n" ], "text/plain": [ "[1] 3 2 2 3" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "4" ], "text/latex": [ "4" ], "text/markdown": [ "4" ], "text/plain": [ "[1] 4" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 3\n", "\\item 2\n", "\\item 2\n", "\\item 3\n", "\\item 1\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 3\n", "2. 2\n", "3. 2\n", "4. 3\n", "5. 1\n", "\n", "\n" ], "text/plain": [ "[1] 3 2 2 3 1" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "5" ], "text/latex": [ "5" ], "text/markdown": [ "5" ], "text/plain": [ "[1] 5" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ar <- array(1:36, dim = c(3,2,2,3)) # 4 dimensional array\n", "dim(ar)\n", "length(dim(ar))\n", "arr <- array(1:36, dim = c(3,2,2,3,1)) # 5 dimensional array\n", "dim(arr)\n", "length(dim(arr))" ] }, { "cell_type": "markdown", "metadata": { "run_control": { "marked": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "##### Vector and Matrix Operations" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "![Imgur](https://i.imgur.com/HmnOp1F.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![Imgur](https://i.imgur.com/xe79KhH.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### data frame " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* unlike a matrix, a data frame can mix data types across columns" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "run_control": { "frozen": false, "read_only": false }, "scrolled": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
\n" ], "text/latex": [ "\\begin{tabular}{r|lll}\n", " Name & Value1 & Value2\\\\\n", "\\hline\n", "\t a1 & 23 & 1\\\\\n", "\t a2 & 4 & 45\\\\\n", "\t b3 & 12 & 5\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "Name | Value1 | Value2 | \n", "|---|---|---|\n", "| a1 | 23 | 1 | \n", "| a2 | 4 | 45 | \n", "| b3 | 12 | 5 | \n", "\n", "\n" ], "text/plain": [ " Name Value1 Value2\n", "1 a1 23 1 \n", "2 a2 4 45 \n", "3 b3 12 5 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "Name <- c(\"a1\", \"a2\", \"b3\")\n", "Value1 <- c(23, 4, 12)\n", "Value2 <- c(1,45,5)\n", "dat <- data.frame(Name, Value1, Value2)\n", "dat" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "##### using cbind and rbind\n", "(to use for a data frame)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
\n" ], "text/latex": [ "\\begin{tabular}{r|ll}\n", " x & y\\\\\n", "\\hline\n", "\t 1 & a\\\\\n", "\t 2 & b\\\\\n", "\t 3 & c\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "x | y | \n", "|---|---|---|\n", "| 1 | a | \n", "| 2 | b | \n", "| 3 | c | \n", "\n", "\n" ], "text/plain": [ " x y\n", "1 1 a\n", "2 2 b\n", "3 3 c" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "'data.frame'" ], "text/latex": [ "'data.frame'" ], "text/markdown": [ "'data.frame'" ], "text/plain": [ "[1] \"data.frame\"" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df <- data.frame(x = 1:3, y = c(\"a\", \"b\", \"c\"))\n", "df\n", "df1 <- cbind(1, df) \n", "class(df1)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "run_control": { "frozen": false, "read_only": false }, "scrolled": true, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
\n" ], "text/latex": [ "\\begin{tabular}{r|ll}\n", " a & b\\\\\n", "\\hline\n", "\t 1 & 1\\\\\n", "\t 2 & 4\\\\\n", "\t 3 & 9\\\\\n", "\t 4 & 16\\\\\n", "\t 5 & 25\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "a | b | \n", "|---|---|---|---|---|\n", "| 1 | 1 | \n", "| 2 | 4 | \n", "| 3 | 9 | \n", "| 4 | 16 | \n", "| 5 | 25 | \n", "\n", "\n" ], "text/plain": [ " a b \n", "1 1 1\n", "2 2 4\n", "3 3 9\n", "4 4 16\n", "5 5 25" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
\n" ], "text/latex": [ "\\begin{tabular}{r|ll}\n", " a & b\\\\\n", "\\hline\n", "\t 1 & 1\\\\\n", "\t 2 & 4\\\\\n", "\t 3 & 9\\\\\n", "\t 4 & 16\\\\\n", "\t 5 & 25\\\\\n", "\t 2 & 3\\\\\n", "\t 5 & 6\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "a | b | \n", "|---|---|---|---|---|---|---|\n", "| 1 | 1 | \n", "| 2 | 4 | \n", "| 3 | 9 | \n", "| 4 | 16 | \n", "| 5 | 25 | \n", "| 2 | 3 | \n", "| 5 | 6 | \n", "\n", "\n" ], "text/plain": [ " a b \n", "1 1 1\n", "2 2 4\n", "3 3 9\n", "4 4 16\n", "5 5 25\n", "6 2 3\n", "7 5 6" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df <- data.frame(a = c(1:5), b = (1:5)^2)\n", "df\n", "rbind(df, c(2, 3), c(5, 6))" ] }, { "cell_type": "markdown", "metadata": { "run_control": { "marked": false }, "slideshow": { "slide_type": "slide" } }, "source": [ "##### using expand.grid() \n", "(to create a data frame)\n", "* The function `expand.grid` gives us all the combinations of entries of two vectors." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "For example: \n", "all combinations of `blue and black pants` and `white, grey and plaid shirts`" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "run_control": { "frozen": false, "read_only": false }, "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "'data.frame'" ], "text/latex": [ "'data.frame'" ], "text/markdown": [ "'data.frame'" ], "text/plain": [ "[1] \"data.frame\"" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "eg <- expand.grid(pants = c(\"blue\", \"black\"), shirt = c(\"white\", \"grey\", \"plaid\"))\n", "class(eg)" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "run_control": { "frozen": false, "read_only": false }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "'data.frame'" ], "text/latex": [ "'data.frame'" ], "text/markdown": [ "'data.frame'" ], 
"text/plain": [ "[1] \"data.frame\"" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " number suit\n", "1 Ace Diamonds\n", "2 Deuce Diamonds\n", "3 Three Diamonds\n", "4 Four Diamonds\n", "5 Five Diamonds\n", "6 Six Diamonds\n", "7 Seven Diamonds\n", "8 Eight Diamonds\n", "9 Nine Diamonds\n", "10 Ten Diamonds\n", "11 Jack Diamonds\n", "12 Queen Diamonds\n", "13 King Diamonds\n", "14 Ace Clubs\n", "15 Deuce Clubs\n", "16 Three Clubs\n", "17 Four Clubs\n", "18 Five Clubs\n", "19 Six Clubs\n", "20 Seven Clubs\n", "21 Eight Clubs\n", "22 Nine Clubs\n", "23 Ten Clubs\n", "24 Jack Clubs\n", "25 Queen Clubs\n", "26 King Clubs\n", "27 Ace Hearts\n", "28 Deuce Hearts\n", "29 Three Hearts\n", "30 Four Hearts\n", "31 Five Hearts\n", "32 Six Hearts\n", "33 Seven Hearts\n", "34 Eight Hearts\n", "35 Nine Hearts\n", "36 Ten Hearts\n", "37 Jack Hearts\n", "38 Queen Hearts\n", "39 King Hearts\n", "40 Ace Spades\n", "41 Deuce Spades\n", "42 Three Spades\n", "43 Four Spades\n", "44 Five Spades\n", "45 Six Spades\n", "46 Seven Spades\n", "47 Eight Spades\n", "48 Nine Spades\n", "49 Ten Spades\n", "50 Jack Spades\n", "51 Queen Spades\n", "52 King Spades\n" ] }, { "data": { "text/html": [ "'character'" ], "text/latex": [ "'character'" ], "text/markdown": [ "'character'" ], "text/plain": [ "[1] \"character\"" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ " [1] \"Ace Diamonds\" \"Deuce Diamonds\" \"Three Diamonds\" \"Four Diamonds\" \n", " [5] \"Five Diamonds\" \"Six Diamonds\" \"Seven Diamonds\" \"Eight Diamonds\"\n", " [9] \"Nine Diamonds\" \"Ten Diamonds\" \"Jack Diamonds\" \"Queen Diamonds\"\n", "[13] \"King Diamonds\" \"Ace Clubs\" \"Deuce Clubs\" \"Three Clubs\" \n", "[17] \"Four Clubs\" \"Five Clubs\" \"Six Clubs\" \"Seven Clubs\" \n", "[21] \"Eight Clubs\" \"Nine Clubs\" \"Ten Clubs\" \"Jack Clubs\" \n", "[25] \"Queen Clubs\" \"King Clubs\" \"Ace Hearts\" 
\"Deuce Hearts\" \n", "[29] \"Three Hearts\" \"Four Hearts\" \"Five Hearts\" \"Six Hearts\" \n", "[33] \"Seven Hearts\" \"Eight Hearts\" \"Nine Hearts\" \"Ten Hearts\" \n", "[37] \"Jack Hearts\" \"Queen Hearts\" \"King Hearts\" \"Ace Spades\" \n", "[41] \"Deuce Spades\" \"Three Spades\" \"Four Spades\" \"Five Spades\" \n", "[45] \"Six Spades\" \"Seven Spades\" \"Eight Spades\" \"Nine Spades\" \n", "[49] \"Ten Spades\" \"Jack Spades\" \"Queen Spades\" \"King Spades\" \n" ] } ], "source": [ "# generate a deck of cards\n", "suits <- c(\"Diamonds\", \"Clubs\", \"Hearts\", \"Spades\")\n", "numbers <- c(\"Ace\", \"Deuce\", \"Three\", \"Four\", \"Five\", \"Six\", \"Seven\", \"Eight\", \"Nine\", \"Ten\", \"Jack\", \"Queen\", \"King\")\n", "deck <- expand.grid(number=numbers, suit=suits)\n", "class(deck)\n", "print(deck)\n", "deck <- paste(deck$number, deck$suit)\n", "class(deck)\n", "print(deck)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Types, Class and Attributes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**R has five basic classes of objects:**\n", " \n", "1). Numeric: real numbers \n", "2). Character \n", "3). Integer: whole numbers \n", "4). Logical \n", "5). Complex \n", "\n", "**These objects can have attributes (metadata) such as:**\n", " \n", "1). Dimensions \n", "2). Class \n", "3). Names \n", "4). Length \n", " \n", "These attributes are accessed with the `attributes()` function. 
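
A minimal sketch of inspecting attributes in base R, assuming two illustrative objects `m` and `v` that are not part of the material above: `attributes()` lists everything stored on an object, `attr()` reads or sets one attribute by name, and `dim()`, `names()`, `class()` and `length()` query individual properties.

```{r}
# Illustrative objects; the names m and v are arbitrary.
m <- matrix(1:6, nrow = 2, dimnames = list(c('r1', 'r2'), c('a', 'b', 'c')))
v <- c(first = 1.5, second = 2.5)

attributes(m)   # a list holding dim and dimnames
attributes(v)   # a list holding names
dim(m)          # number of rows and columns
names(v)        # element names
class(m)        # implicit class, derived from the dim attribute
length(v)       # length is a property of the object, not a stored attribute

# attr() reads or sets a single attribute by name
attr(v, 'unit') <- 'cm'
attr(v, 'unit')
```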
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Numeric (e.g., 9, 7.2, pi)\n", "* Character (e.g., \"apple\", \"red\")\n", "* Integer (e.g., 3L, as.integer(5))\n", "* Logical (e.g., TRUE, FALSE)\n", "* Complex (e.g., -3-87i, 1 + 0i, 1 + 4i)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Special values \n", "`NA, NULL, ±Inf and NaN` " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### NA\n", "* NA stands for not available\n", "* NA is a placeholder for a missing value\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "<NA>" ], "text/latex": [ "" ], "text/markdown": [ "<NA>" ], "text/plain": [ "[1] NA" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "<NA>" ], "text/latex": [ "" ], "text/markdown": [ "<NA>" ], "text/plain": [ "[1] NA" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "4" ], "text/latex": [ "4" ], "text/markdown": [ "4" ], "text/plain": [ "[1] 4" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "4" ], "text/latex": [ "4" ], "text/markdown": [ "4" ], "text/plain": [ "[1] 4" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "<NA>" ], "text/latex": [ "" ], "text/markdown": [ "<NA>" ], "text/plain": [ "[1] NA" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "<NA>" ], "text/latex": [ "" ], "text/markdown": [ "<NA>" ], "text/plain": [ "[1] NA" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "TRUE" ], "text/latex": [ "TRUE" ], "text/markdown": [ "TRUE" ], "text/plain": [ "[1] TRUE" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "NA + 2\n", "sum(c(NA, 4, 6))\n", "median(c(NA, 4, 8, 4), na.rm = TRUE)\n", "length(c(NA, 2, 3, 
4))\n", "5 == NA\n", "NA == NA\n", "TRUE | NA" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item FALSE\n", "\\item TRUE\n", "\\item FALSE\n", "\\item FALSE\n", "\\item FALSE\n", "\\item FALSE\n", "\\item FALSE\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. FALSE\n", "2. TRUE\n", "3. FALSE\n", "4. FALSE\n", "5. FALSE\n", "6. FALSE\n", "7. FALSE\n", "\n", "\n" ], "text/plain": [ "[1] FALSE TRUE FALSE FALSE FALSE FALSE FALSE" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "x <- c(2,NA,5,4.89,10,TRUE,6/7)\n", "is.na(x)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### NULL\n", "* `NULL` has class `\"NULL\"` and length 0\n", "* It does not take up any space in a vector\n", "* The function `is.null()` can be used to detect `NULL` values." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "3" ], "text/latex": [ "3" ], "text/markdown": [ "3" ], "text/plain": [ "[1] 3" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "10" ], "text/latex": [ "10" ], "text/markdown": [ "10" ], "text/plain": [ "[1] 10" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "5" ], "text/latex": [ "5" ], "text/markdown": [ "5" ], "text/plain": [ "[1] 5" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "length(c(3, 4, NULL, 1))\n", "sum(c(5, 1, NULL, 4))\n", "\n", "x <- NULL\n", "c(x, 5)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "### Inf\n", "* `Inf` is a valid `numeric` value that results from calculations such as dividing a nonzero number by zero.\n", "* Because `Inf` is numeric, operations between `Inf` and a finite number are well defined, and comparison operators work as expected."
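, "\n", "\n", "
To test for infinite values programmatically, base R provides `is.finite()` and `is.infinite()`. A minimal sketch, assuming an illustrative input vector `x` that is not part of the material above:

```{r}
x <- c(1, Inf, -Inf, NA, NaN)   # illustrative input
is.finite(x)     # TRUE only for the ordinary number 1
is.infinite(x)   # TRUE for Inf and -Inf only
-1 / 0           # dividing a negative number by zero gives -Inf
1 / Inf          # a finite number divided by Inf gives 0
```
"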
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "Inf" ], "text/latex": [ "Inf" ], "text/markdown": [ "Inf" ], "text/plain": [ "[1] Inf" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Inf" ], "text/latex": [ "Inf" ], "text/markdown": [ "Inf" ], "text/plain": [ "[1] Inf" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Inf" ], "text/latex": [ "Inf" ], "text/markdown": [ "Inf" ], "text/plain": [ "[1] Inf" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Inf" ], "text/latex": [ "Inf" ], "text/markdown": [ "Inf" ], "text/plain": [ "[1] Inf" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "FALSE" ], "text/latex": [ "FALSE" ], "text/markdown": [ "FALSE" ], "text/plain": [ "[1] FALSE" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "TRUE" ], "text/latex": [ "TRUE" ], "text/markdown": [ "TRUE" ], "text/plain": [ "[1] TRUE" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "32/0\n", "5 * Inf\n", "Inf - 2e+10\n", "Inf + Inf\n", "8 < -Inf\n", "Inf == Inf" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### NaN\n", "* Stands for not a number. 
\n", "* The result is undefined, but it is certainly not a number\n", "* For example, `0/0`, `Inf - Inf` and `Inf/Inf` all result in `NaN`\n", "* Computations involving numbers and `NaN` generally result in `NaN`" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "NaN" ], "text/latex": [ "NaN" ], "text/markdown": [ "NaN" ], "text/plain": [ "[1] NaN" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "NaN" ], "text/latex": [ "NaN" ], "text/markdown": [ "NaN" ], "text/plain": [ "[1] NaN" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "NaN + 1\n", "exp(NaN)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Coercion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Internal (implicit) coercion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Guess what the following do without running them first \n", "```{r}\n", "xx <- c(1.7, \"a\")\n", "xx <- c(TRUE, 2)\n", "xx <- c(\"a\", TRUE)\n", "```\n", "This is called implicit coercion." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "![Imgur](https://i.imgur.com/6uz95Br.jpg)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c(1, FALSE) # numeric: 1 0\n", "\n", "c(\"a\", 1) # character: \"a\" \"1\"\n", "\n", "c(list(1), \"a\") # list: 1, \"a\"\n", "\n", "c(TRUE, 1L) # integer (mode numeric): 1 1" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1] 5 7 8 9 45 
 -87\n" ] }, { "data": { "text/html": [ "'numeric'" ], "text/latex": [ "'numeric'" ], "text/markdown": [ "'numeric'" ], "text/plain": [ "[1] \"numeric\"" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "[1] \"5\" \"7\" \"8\" \"9\" \"45\" \"-87\" \"hello\"\n" ] }, { "data": { "text/html": [ "'character'" ], "text/latex": [ "'character'" ], "text/markdown": [ "'character'" ], "text/plain": [ "[1] \"character\"" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "vec <- c(5,7,8,9,45,-87)\n", "print(vec)\n", "mode(vec)\n", "vec <- append(vec, \"hello\")\n", "print(vec)\n", "mode(vec)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Similarly** \n", "\n", "`1 == \"1\"` returns `TRUE`: 1 is coerced to the character \"1\", because character is the more flexible type. \n", "`-1 < FALSE` returns `TRUE`: FALSE is coerced to the number 0, because numeric is the more flexible type. \n", "`\"one\" < 2` returns `FALSE`: 2 is coerced to the character \"2\", because character is the more flexible type, and the two values are then compared as strings.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### External coercion and testing objects" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![Imgur](https://i.imgur.com/hYYdi9O.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**When testing objects** \n", "Use `is.atomic()` to test whether an object is an atomic vector, and `is.recursive()` or `is.list()` to test for a recursive list. \n", "*Note:* `is.vector()` does not test whether an object is a vector. Instead, it returns TRUE only if the object is a vector with no attributes apart from names. \n", "`is.atomic()` is more suitable for testing whether an object is an atomic vector. \n", "`is.list()` tests whether an object is truly a list. 
\n", "`is.numeric()`, similarly, is TRUE for either integer or double vectors, but not for lists. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example: \n", "# create a vector of double-precision values\n", "dbl_var <- c(1, 2.5, 4.5) \n", "dbl_var\n", "## [1] 1.0 2.5 4.5\n", "\n", "# placing an L after the values creates a vector of integers\n", "int_var <- c(1L, 6L, 10L)\n", "int_var\n", "## [1] 1 6 10" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# To check whether a vector is made up of integer or double values:\n", "\n", "# identifies the vector type (double, integer, logical, or character)\n", "typeof(dbl_var)\n", "## [1] \"double\"\n", "\n", "typeof(int_var)\n", "## [1] \"integer\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Reshaping R Objects" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### for vectors and matrices" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " [1] 1 2 3 4 5 6 7 8 9 10 11 12\n" ] } ], "source": [ "vec <- 1:12 # a vector\n", "print(vec)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " [,1] [,2] [,3] [,4] [,5] [,6]\n", "[1,] 1 3 5 7 9 11\n", "[2,] 2 4 6 8 10 12\n" ] } ], "source": [ "mat <- matrix( vec, nrow=2) # a matrix\n", "print(mat)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " [1] 1 2 3 4 5 6 7 8 9 10 11 12\n" ] } ], "source": [ "dim(mat) <- NULL\n", "print(mat) # back to vector" ] }, { "cell_type": "markdown", "metadata": { 
"slideshow": { "slide_type": "subslide" } }, "source": [ "### for dataframes " ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " mpg cyl disp hp drat wt qsec vs am gear carb\n", "Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4\n", "Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4\n", "Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1\n", "Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1\n", "Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2\n", "Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1\n", "Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4\n", "Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2\n", "Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2\n", "Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4\n", "Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4\n", "Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3\n", "Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3\n", "Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3\n", "Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4\n", "Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4\n", "Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4\n", "Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1\n", "Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2\n", "Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1\n", "Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1\n", "Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2\n", "AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2\n", "Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4\n", "Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2\n", "Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1\n", "Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2\n", "Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2\n", "Ford 
Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4\n", "Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6\n", "Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8\n", "Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2\n" ] } ], "source": [ "print(mtcars)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " mpg1 mpg2 mpg3 mpg4 mpg5 mpg6 mpg7 mpg8 mpg9 mpg10 \n", " 21.000 21.000 22.800 21.400 18.700 18.100 14.300 24.400 22.800 19.200 \n", " mpg11 mpg12 mpg13 mpg14 mpg15 mpg16 mpg17 mpg18 mpg19 mpg20 \n", " 17.800 16.400 17.300 15.200 10.400 10.400 14.700 32.400 30.400 33.900 \n", " mpg21 mpg22 mpg23 mpg24 mpg25 mpg26 mpg27 mpg28 mpg29 mpg30 \n", " 21.500 15.500 15.200 13.300 19.200 27.300 26.000 30.400 15.800 19.700 \n", " mpg31 mpg32 cyl1 cyl2 cyl3 cyl4 cyl5 cyl6 cyl7 cyl8 \n", " 15.000 21.400 6.000 6.000 4.000 6.000 8.000 6.000 8.000 4.000 \n", " cyl9 cyl10 cyl11 cyl12 cyl13 cyl14 cyl15 cyl16 cyl17 cyl18 \n", " 4.000 6.000 6.000 8.000 8.000 8.000 8.000 8.000 8.000 4.000 \n", " cyl19 cyl20 cyl21 cyl22 cyl23 cyl24 cyl25 cyl26 cyl27 cyl28 \n", " 4.000 4.000 4.000 8.000 8.000 8.000 8.000 4.000 4.000 4.000 \n", " cyl29 cyl30 cyl31 cyl32 disp1 disp2 disp3 disp4 disp5 disp6 \n", " 8.000 6.000 8.000 4.000 160.000 160.000 108.000 258.000 360.000 225.000 \n", " disp7 disp8 disp9 disp10 disp11 disp12 disp13 disp14 disp15 disp16 \n", "360.000 146.700 140.800 167.600 167.600 275.800 275.800 275.800 472.000 460.000 \n", " disp17 disp18 disp19 disp20 disp21 disp22 disp23 disp24 disp25 disp26 \n", "440.000 78.700 75.700 71.100 120.100 318.000 304.000 350.000 400.000 79.000 \n", " disp27 disp28 disp29 disp30 disp31 disp32 hp1 hp2 hp3 hp4 \n", "120.300 95.100 351.000 145.000 301.000 121.000 110.000 110.000 93.000 110.000 \n", " hp5 hp6 hp7 hp8 hp9 hp10 hp11 hp12 hp13 hp14 \n", "175.000 105.000 245.000 62.000 95.000 123.000 123.000 180.000 180.000 180.000 \n", " hp15 hp16 hp17 hp18 hp19 
hp20 hp21 hp22 hp23 hp24 \n", "205.000 215.000 230.000 66.000 52.000 65.000 97.000 150.000 150.000 245.000 \n", " hp25 hp26 hp27 hp28 hp29 hp30 hp31 hp32 drat1 drat2 \n", "175.000 66.000 91.000 113.000 264.000 175.000 335.000 109.000 3.900 3.900 \n", " drat3 drat4 drat5 drat6 drat7 drat8 drat9 drat10 drat11 drat12 \n", " 3.850 3.080 3.150 2.760 3.210 3.690 3.920 3.920 3.920 3.070 \n", " drat13 drat14 drat15 drat16 drat17 drat18 drat19 drat20 drat21 drat22 \n", " 3.070 3.070 2.930 3.000 3.230 4.080 4.930 4.220 3.700 2.760 \n", " drat23 drat24 drat25 drat26 drat27 drat28 drat29 drat30 drat31 drat32 \n", " 3.150 3.730 3.080 4.080 4.430 3.770 4.220 3.620 3.540 4.110 \n", " wt1 wt2 wt3 wt4 wt5 wt6 wt7 wt8 wt9 wt10 \n", " 2.620 2.875 2.320 3.215 3.440 3.460 3.570 3.190 3.150 3.440 \n", " wt11 wt12 wt13 wt14 wt15 wt16 wt17 wt18 wt19 wt20 \n", " 3.440 4.070 3.730 3.780 5.250 5.424 5.345 2.200 1.615 1.835 \n", " wt21 wt22 wt23 wt24 wt25 wt26 wt27 wt28 wt29 wt30 \n", " 2.465 3.520 3.435 3.840 3.845 1.935 2.140 1.513 3.170 2.770 \n", " wt31 wt32 qsec1 qsec2 qsec3 qsec4 qsec5 qsec6 qsec7 qsec8 \n", " 3.570 2.780 16.460 17.020 18.610 19.440 17.020 20.220 15.840 20.000 \n", " qsec9 qsec10 qsec11 qsec12 qsec13 qsec14 qsec15 qsec16 qsec17 qsec18 \n", " 22.900 18.300 18.900 17.400 17.600 18.000 17.980 17.820 17.420 19.470 \n", " qsec19 qsec20 qsec21 qsec22 qsec23 qsec24 qsec25 qsec26 qsec27 qsec28 \n", " 18.520 19.900 20.010 16.870 17.300 15.410 17.050 18.900 16.700 16.900 \n", " qsec29 qsec30 qsec31 qsec32 vs1 vs2 vs3 vs4 vs5 vs6 \n", " 14.500 15.500 14.600 18.600 0.000 0.000 1.000 1.000 0.000 1.000 \n", " vs7 vs8 vs9 vs10 vs11 vs12 vs13 vs14 vs15 vs16 \n", " 0.000 1.000 1.000 1.000 1.000 0.000 0.000 0.000 0.000 0.000 \n", " vs17 vs18 vs19 vs20 vs21 vs22 vs23 vs24 vs25 vs26 \n", " 0.000 1.000 1.000 1.000 1.000 0.000 0.000 0.000 0.000 1.000 \n", " vs27 vs28 vs29 vs30 vs31 vs32 am1 am2 am3 am4 \n", " 0.000 1.000 0.000 0.000 0.000 1.000 1.000 1.000 1.000 0.000 \n", " am5 am6 am7 am8 
am9 am10 am11 am12 am13 am14 \n", " 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 \n", " am15 am16 am17 am18 am19 am20 am21 am22 am23 am24 \n", " 0.000 0.000 0.000 1.000 1.000 1.000 0.000 0.000 0.000 0.000 \n", " am25 am26 am27 am28 am29 am30 am31 am32 gear1 gear2 \n", " 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 4.000 4.000 \n", " gear3 gear4 gear5 gear6 gear7 gear8 gear9 gear10 gear11 gear12 \n", " 4.000 3.000 3.000 3.000 3.000 4.000 4.000 4.000 4.000 3.000 \n", " gear13 gear14 gear15 gear16 gear17 gear18 gear19 gear20 gear21 gear22 \n", " 3.000 3.000 3.000 3.000 3.000 4.000 4.000 4.000 3.000 3.000 \n", " gear23 gear24 gear25 gear26 gear27 gear28 gear29 gear30 gear31 gear32 \n", " 3.000 3.000 3.000 4.000 5.000 5.000 5.000 5.000 5.000 4.000 \n", " carb1 carb2 carb3 carb4 carb5 carb6 carb7 carb8 carb9 carb10 \n", " 4.000 4.000 1.000 1.000 2.000 1.000 4.000 2.000 2.000 4.000 \n", " carb11 carb12 carb13 carb14 carb15 carb16 carb17 carb18 carb19 carb20 \n", " 4.000 3.000 3.000 3.000 4.000 4.000 4.000 1.000 2.000 1.000 \n", " carb21 carb22 carb23 carb24 carb25 carb26 carb27 carb28 carb29 carb30 \n", " 1.000 2.000 2.000 4.000 2.000 1.000 2.000 2.000 4.000 6.000 \n", " carb31 carb32 \n", " 8.000 2.000 \n" ] } ], "source": [ "ULmtcars <- unlist(mtcars) # flattens the data frame into one named numeric vector\n", "# (a data frame is stored internally as a list of columns)\n", "print(ULmtcars)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "$mpg\n", " [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4\n", "[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7\n", "[31] 15.0 21.4\n", "\n", "$cyl\n", " [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4\n", "\n", "$disp\n", " [1] 160.0 160.0 108.0 258.0 360.0 225.0 360.0 146.7 140.8 167.6 167.6 275.8\n", "[13] 275.8 275.8 472.0 460.0 440.0 
78.7 75.7 71.1 120.1 318.0 304.0 350.0\n", "[25] 400.0 79.0 120.3 95.1 351.0 145.0 301.0 121.0\n", "\n", "$hp\n", " [1] 110 110 93 110 175 105 245 62 95 123 123 180 180 180 205 215 230 66 52\n", "[20] 65 97 150 150 245 175 66 91 113 264 175 335 109\n", "\n", "$drat\n", " [1] 3.90 3.90 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 3.92 3.07 3.07 3.07 2.93\n", "[16] 3.00 3.23 4.08 4.93 4.22 3.70 2.76 3.15 3.73 3.08 4.08 4.43 3.77 4.22 3.62\n", "[31] 3.54 4.11\n", "\n", "$wt\n", " [1] 2.620 2.875 2.320 3.215 3.440 3.460 3.570 3.190 3.150 3.440 3.440 4.070\n", "[13] 3.730 3.780 5.250 5.424 5.345 2.200 1.615 1.835 2.465 3.520 3.435 3.840\n", "[25] 3.845 1.935 2.140 1.513 3.170 2.770 3.570 2.780\n", "\n", "$qsec\n", " [1] 16.46 17.02 18.61 19.44 17.02 20.22 15.84 20.00 22.90 18.30 18.90 17.40\n", "[13] 17.60 18.00 17.98 17.82 17.42 19.47 18.52 19.90 20.01 16.87 17.30 15.41\n", "[25] 17.05 18.90 16.70 16.90 14.50 15.50 14.60 18.60\n", "\n", "$vs\n", " [1] 0 0 1 1 0 1 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 1 0 1 0 0 0 1\n", "\n", "$am\n", " [1] 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1\n", "\n", "$gear\n", " [1] 4 4 4 3 3 3 3 4 4 4 4 3 3 3 3 3 3 4 4 4 3 3 3 3 3 4 5 5 5 5 5 4\n", "\n", "$carb\n", " [1] 4 4 1 1 2 1 4 2 2 4 4 3 3 3 4 4 4 1 2 1 1 2 2 4 2 1 2 2 4 6 8 2\n", "\n", "attr(,\"row.names\")\n", " [1] \"Mazda RX4\" \"Mazda RX4 Wag\" \"Datsun 710\" \n", " [4] \"Hornet 4 Drive\" \"Hornet Sportabout\" \"Valiant\" \n", " [7] \"Duster 360\" \"Merc 240D\" \"Merc 230\" \n", "[10] \"Merc 280\" \"Merc 280C\" \"Merc 450SE\" \n", "[13] \"Merc 450SL\" \"Merc 450SLC\" \"Cadillac Fleetwood\" \n", "[16] \"Lincoln Continental\" \"Chrysler Imperial\" \"Fiat 128\" \n", "[19] \"Honda Civic\" \"Toyota Corolla\" \"Toyota Corona\" \n", "[22] \"Dodge Challenger\" \"AMC Javelin\" \"Camaro Z28\" \n", "[25] \"Pontiac Firebird\" \"Fiat X1-9\" \"Porsche 914-2\" \n", "[28] \"Lotus Europa\" \"Ford Pantera L\" \"Ferrari Dino\" \n", "[31] \"Maserati Bora\" \"Volvo 142E\" \n" ] } ], 
"source": [ "UCmtcars <- unclass(mtcars) # removes the class attribute, turning the dataframe into a\n", "# series of vectors plus any names attributes, same as setting\n", "# class(mtcars) <- NULL\n", "print(UCmtcars)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$mpg] \\begin{enumerate*}\n", "\\item 21\n", "\\item 21\n", "\\item 22.8\n", "\\item 21.4\n", "\\item 18.7\n", "\\item 18.1\n", "\\item 14.3\n", "\\item 24.4\n", "\\item 22.8\n", "\\item 19.2\n", "\\item 17.8\n", "\\item 16.4\n", "\\item 17.3\n", "\\item 15.2\n", "\\item 10.4\n", "\\item 10.4\n", "\\item 14.7\n", "\\item 32.4\n", "\\item 30.4\n", "\\item 33.9\n", "\\item 21.5\n", "\\item 15.5\n", "\\item 15.2\n", "\\item 13.3\n", "\\item 19.2\n", "\\item 27.3\n", "\\item 26\n", "\\item 30.4\n", "\\item 15.8\n", "\\item 19.7\n", "\\item 15\n", "\\item 21.4\n", "\\end{enumerate*}\n", "\n", "\\item[\\$cyl] \\begin{enumerate*}\n", "\\item 6\n", "\\item 6\n", "\\item 4\n", "\\item 6\n", "\\item 8\n", "\\item 6\n", "\\item 8\n", "\\item 4\n", "\\item 4\n", "\\item 6\n", "\\item 6\n", "\\item 8\n", "\\item 8\n", "\\item 8\n", "\\item 8\n", "\\item 8\n", "\\item 8\n", "\\item 4\n", "\\item 4\n", "\\item 4\n", "\\item 4\n", "\\item 8\n", "\\item 8\n", "\\item 8\n", "\\item 8\n", "\\item 4\n", "\\item 4\n", "\\item 4\n", "\\item 8\n", "\\item 6\n", "\\item 8\n", "\\item 4\n", "\\end{enumerate*}\n", "\n", "\\item[\\$disp] \\begin{enumerate*}\n", "\\item 160\n", "\\item 160\n", "\\item 108\n", "\\item 258\n", "\\item 360\n", "\\item 225\n", "\\item 360\n", "\\item 146.7\n", "\\item 140.8\n", "\\item 167.6\n", "\\item 167.6\n", "\\item 275.8\n", "\\item 275.8\n", "\\item 275.8\n", "\\item 472\n", "\\item 460\n", "\\item 440\n", "\\item 78.7\n", "\\item 75.7\n", "\\item 71.1\n", "\\item 120.1\n", "\\item 318\n", "\\item 304\n", "\\item 350\n", "\\item 400\n", "\\item 79\n", "\\item 120.3\n", "\\item 95.1\n", "\\item 351\n", "\\item 145\n", "\\item 301\n", "\\item 121\n", "\\end{enumerate*}\n", "\n", "\\item[\\$hp] \\begin{enumerate*}\n", "\\item 110\n", "\\item 110\n", "\\item 93\n", "\\item 110\n", "\\item 175\n", "\\item 105\n", "\\item 245\n", "\\item 62\n", "\\item 95\n", "\\item 123\n", "\\item 123\n", "\\item 
180\n", "\\item 180\n", "\\item 180\n", "\\item 205\n", "\\item 215\n", "\\item 230\n", "\\item 66\n", "\\item 52\n", "\\item 65\n", "\\item 97\n", "\\item 150\n", "\\item 150\n", "\\item 245\n", "\\item 175\n", "\\item 66\n", "\\item 91\n", "\\item 113\n", "\\item 264\n", "\\item 175\n", "\\item 335\n", "\\item 109\n", "\\end{enumerate*}\n", "\n", "\\item[\\$drat] \\begin{enumerate*}\n", "\\item 3.9\n", "\\item 3.9\n", "\\item 3.85\n", "\\item 3.08\n", "\\item 3.15\n", "\\item 2.76\n", "\\item 3.21\n", "\\item 3.69\n", "\\item 3.92\n", "\\item 3.92\n", "\\item 3.92\n", "\\item 3.07\n", "\\item 3.07\n", "\\item 3.07\n", "\\item 2.93\n", "\\item 3\n", "\\item 3.23\n", "\\item 4.08\n", "\\item 4.93\n", "\\item 4.22\n", "\\item 3.7\n", "\\item 2.76\n", "\\item 3.15\n", "\\item 3.73\n", "\\item 3.08\n", "\\item 4.08\n", "\\item 4.43\n", "\\item 3.77\n", "\\item 4.22\n", "\\item 3.62\n", "\\item 3.54\n", "\\item 4.11\n", "\\end{enumerate*}\n", "\n", "\\item[\\$wt] \\begin{enumerate*}\n", "\\item 2.62\n", "\\item 2.875\n", "\\item 2.32\n", "\\item 3.215\n", "\\item 3.44\n", "\\item 3.46\n", "\\item 3.57\n", "\\item 3.19\n", "\\item 3.15\n", "\\item 3.44\n", "\\item 3.44\n", "\\item 4.07\n", "\\item 3.73\n", "\\item 3.78\n", "\\item 5.25\n", "\\item 5.424\n", "\\item 5.345\n", "\\item 2.2\n", "\\item 1.615\n", "\\item 1.835\n", "\\item 2.465\n", "\\item 3.52\n", "\\item 3.435\n", "\\item 3.84\n", "\\item 3.845\n", "\\item 1.935\n", "\\item 2.14\n", "\\item 1.513\n", "\\item 3.17\n", "\\item 2.77\n", "\\item 3.57\n", "\\item 2.78\n", "\\end{enumerate*}\n", "\n", "\\item[\\$qsec] \\begin{enumerate*}\n", "\\item 16.46\n", "\\item 17.02\n", "\\item 18.61\n", "\\item 19.44\n", "\\item 17.02\n", "\\item 20.22\n", "\\item 15.84\n", "\\item 20\n", "\\item 22.9\n", "\\item 18.3\n", "\\item 18.9\n", "\\item 17.4\n", "\\item 17.6\n", "\\item 18\n", "\\item 17.98\n", "\\item 17.82\n", "\\item 17.42\n", "\\item 19.47\n", "\\item 18.52\n", "\\item 19.9\n", "\\item 20.01\n", "\\item 
16.87\n", "\\item 17.3\n", "\\item 15.41\n", "\\item 17.05\n", "\\item 18.9\n", "\\item 16.7\n", "\\item 16.9\n", "\\item 14.5\n", "\\item 15.5\n", "\\item 14.6\n", "\\item 18.6\n", "\\end{enumerate*}\n", "\n", "\\item[\\$vs] \\begin{enumerate*}\n", "\\item 0\n", "\\item 0\n", "\\item 1\n", "\\item 1\n", "\\item 0\n", "\\item 1\n", "\\item 0\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 1\n", "\\item 0\n", "\\item 1\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 1\n", "\\end{enumerate*}\n", "\n", "\\item[\\$am] \\begin{enumerate*}\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 0\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\item 1\n", "\\end{enumerate*}\n", "\n", "\\item[\\$gear] \\begin{enumerate*}\n", "\\item 4\n", "\\item 4\n", "\\item 4\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 4\n", "\\item 4\n", "\\item 4\n", "\\item 4\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 4\n", "\\item 4\n", "\\item 4\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", "\\item 4\n", "\\item 5\n", "\\item 5\n", "\\item 5\n", "\\item 5\n", "\\item 5\n", "\\item 4\n", "\\end{enumerate*}\n", "\n", "\\item[\\$carb] \\begin{enumerate*}\n", "\\item 4\n", "\\item 4\n", "\\item 1\n", "\\item 1\n", "\\item 2\n", "\\item 1\n", "\\item 4\n", "\\item 2\n", "\\item 2\n", "\\item 4\n", "\\item 4\n", "\\item 3\n", "\\item 3\n", "\\item 3\n", 
"\\item 4\n", "\\item 4\n", "\\item 4\n", "\\item 1\n", "\\item 2\n", "\\item 1\n", "\\item 1\n", "\\item 2\n", "\\item 2\n", "\\item 4\n", "\\item 2\n", "\\item 1\n", "\\item 2\n", "\\item 2\n", "\\item 4\n", "\\item 6\n", "\\item 8\n", "\\item 2\n", "\\end{enumerate*}\n", "\n", "\\end{description}\n" ], "text/markdown": [ "$mpg\n", ": 1. 21\n", "2. 21\n", "3. 22.8\n", "4. 21.4\n", "5. 18.7\n", "6. 18.1\n", "7. 14.3\n", "8. 24.4\n", "9. 22.8\n", "10. 19.2\n", "11. 17.8\n", "12. 16.4\n", "13. 17.3\n", "14. 15.2\n", "15. 10.4\n", "16. 10.4\n", "17. 14.7\n", "18. 32.4\n", "19. 30.4\n", "20. 33.9\n", "21. 21.5\n", "22. 15.5\n", "23. 15.2\n", "24. 13.3\n", "25. 19.2\n", "26. 27.3\n", "27. 26\n", "28. 30.4\n", "29. 15.8\n", "30. 19.7\n", "31. 15\n", "32. 21.4\n", "\n", "\n", "\n", "$cyl\n", ": 1. 6\n", "2. 6\n", "3. 4\n", "4. 6\n", "5. 8\n", "6. 6\n", "7. 8\n", "8. 4\n", "9. 4\n", "10. 6\n", "11. 6\n", "12. 8\n", "13. 8\n", "14. 8\n", "15. 8\n", "16. 8\n", "17. 8\n", "18. 4\n", "19. 4\n", "20. 4\n", "21. 4\n", "22. 8\n", "23. 8\n", "24. 8\n", "25. 8\n", "26. 4\n", "27. 4\n", "28. 4\n", "29. 8\n", "30. 6\n", "31. 8\n", "32. 4\n", "\n", "\n", "\n", "$disp\n", ": 1. 160\n", "2. 160\n", "3. 108\n", "4. 258\n", "5. 360\n", "6. 225\n", "7. 360\n", "8. 146.7\n", "9. 140.8\n", "10. 167.6\n", "11. 167.6\n", "12. 275.8\n", "13. 275.8\n", "14. 275.8\n", "15. 472\n", "16. 460\n", "17. 440\n", "18. 78.7\n", "19. 75.7\n", "20. 71.1\n", "21. 120.1\n", "22. 318\n", "23. 304\n", "24. 350\n", "25. 400\n", "26. 79\n", "27. 120.3\n", "28. 95.1\n", "29. 351\n", "30. 145\n", "31. 301\n", "32. 121\n", "\n", "\n", "\n", "$hp\n", ": 1. 110\n", "2. 110\n", "3. 93\n", "4. 110\n", "5. 175\n", "6. 105\n", "7. 245\n", "8. 62\n", "9. 95\n", "10. 123\n", "11. 123\n", "12. 180\n", "13. 180\n", "14. 180\n", "15. 205\n", "16. 215\n", "17. 230\n", "18. 66\n", "19. 52\n", "20. 65\n", "21. 97\n", "22. 150\n", "23. 150\n", "24. 245\n", "25. 175\n", "26. 66\n", "27. 91\n", "28. 113\n", "29. 264\n", "30. 
175\n", "31. 335\n", "32. 109\n", "\n", "\n", "\n", "$drat\n", ": 1. 3.9\n", "2. 3.9\n", "3. 3.85\n", "4. 3.08\n", "5. 3.15\n", "6. 2.76\n", "7. 3.21\n", "8. 3.69\n", "9. 3.92\n", "10. 3.92\n", "11. 3.92\n", "12. 3.07\n", "13. 3.07\n", "14. 3.07\n", "15. 2.93\n", "16. 3\n", "17. 3.23\n", "18. 4.08\n", "19. 4.93\n", "20. 4.22\n", "21. 3.7\n", "22. 2.76\n", "23. 3.15\n", "24. 3.73\n", "25. 3.08\n", "26. 4.08\n", "27. 4.43\n", "28. 3.77\n", "29. 4.22\n", "30. 3.62\n", "31. 3.54\n", "32. 4.11\n", "\n", "\n", "\n", "$wt\n", ": 1. 2.62\n", "2. 2.875\n", "3. 2.32\n", "4. 3.215\n", "5. 3.44\n", "6. 3.46\n", "7. 3.57\n", "8. 3.19\n", "9. 3.15\n", "10. 3.44\n", "11. 3.44\n", "12. 4.07\n", "13. 3.73\n", "14. 3.78\n", "15. 5.25\n", "16. 5.424\n", "17. 5.345\n", "18. 2.2\n", "19. 1.615\n", "20. 1.835\n", "21. 2.465\n", "22. 3.52\n", "23. 3.435\n", "24. 3.84\n", "25. 3.845\n", "26. 1.935\n", "27. 2.14\n", "28. 1.513\n", "29. 3.17\n", "30. 2.77\n", "31. 3.57\n", "32. 2.78\n", "\n", "\n", "\n", "$qsec\n", ": 1. 16.46\n", "2. 17.02\n", "3. 18.61\n", "4. 19.44\n", "5. 17.02\n", "6. 20.22\n", "7. 15.84\n", "8. 20\n", "9. 22.9\n", "10. 18.3\n", "11. 18.9\n", "12. 17.4\n", "13. 17.6\n", "14. 18\n", "15. 17.98\n", "16. 17.82\n", "17. 17.42\n", "18. 19.47\n", "19. 18.52\n", "20. 19.9\n", "21. 20.01\n", "22. 16.87\n", "23. 17.3\n", "24. 15.41\n", "25. 17.05\n", "26. 18.9\n", "27. 16.7\n", "28. 16.9\n", "29. 14.5\n", "30. 15.5\n", "31. 14.6\n", "32. 18.6\n", "\n", "\n", "\n", "$vs\n", ": 1. 0\n", "2. 0\n", "3. 1\n", "4. 1\n", "5. 0\n", "6. 1\n", "7. 0\n", "8. 1\n", "9. 1\n", "10. 1\n", "11. 1\n", "12. 0\n", "13. 0\n", "14. 0\n", "15. 0\n", "16. 0\n", "17. 0\n", "18. 1\n", "19. 1\n", "20. 1\n", "21. 1\n", "22. 0\n", "23. 0\n", "24. 0\n", "25. 0\n", "26. 1\n", "27. 0\n", "28. 1\n", "29. 0\n", "30. 0\n", "31. 0\n", "32. 1\n", "\n", "\n", "\n", "$am\n", ": 1. 1\n", "2. 1\n", "3. 1\n", "4. 0\n", "5. 0\n", "6. 0\n", "7. 0\n", "8. 0\n", "9. 0\n", "10. 0\n", "11. 0\n", "12. 0\n", "13. 0\n", "14. 
0\n", "15. 0\n", "16. 0\n", "17. 0\n", "18. 1\n", "19. 1\n", "20. 1\n", "21. 0\n", "22. 0\n", "23. 0\n", "24. 0\n", "25. 0\n", "26. 1\n", "27. 1\n", "28. 1\n", "29. 1\n", "30. 1\n", "31. 1\n", "32. 1\n", "\n", "\n", "\n", "$gear\n", ": 1. 4\n", "2. 4\n", "3. 4\n", "4. 3\n", "5. 3\n", "6. 3\n", "7. 3\n", "8. 4\n", "9. 4\n", "10. 4\n", "11. 4\n", "12. 3\n", "13. 3\n", "14. 3\n", "15. 3\n", "16. 3\n", "17. 3\n", "18. 4\n", "19. 4\n", "20. 4\n", "21. 3\n", "22. 3\n", "23. 3\n", "24. 3\n", "25. 3\n", "26. 4\n", "27. 5\n", "28. 5\n", "29. 5\n", "30. 5\n", "31. 5\n", "32. 4\n", "\n", "\n", "\n", "$carb\n", ": 1. 4\n", "2. 4\n", "3. 1\n", "4. 1\n", "5. 2\n", "6. 1\n", "7. 4\n", "8. 2\n", "9. 2\n", "10. 4\n", "11. 4\n", "12. 3\n", "13. 3\n", "14. 3\n", "15. 4\n", "16. 4\n", "17. 4\n", "18. 1\n", "19. 2\n", "20. 1\n", "21. 1\n", "22. 2\n", "23. 2\n", "24. 4\n", "25. 2\n", "26. 1\n", "27. 2\n", "28. 2\n", "29. 4\n", "30. 6\n", "31. 8\n", "32. 2\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "$mpg\n", " [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4\n", "[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7\n", "[31] 15.0 21.4\n", "\n", "$cyl\n", " [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4\n", "\n", "$disp\n", " [1] 160.0 160.0 108.0 258.0 360.0 225.0 360.0 146.7 140.8 167.6 167.6 275.8\n", "[13] 275.8 275.8 472.0 460.0 440.0 78.7 75.7 71.1 120.1 318.0 304.0 350.0\n", "[25] 400.0 79.0 120.3 95.1 351.0 145.0 301.0 121.0\n", "\n", "$hp\n", " [1] 110 110 93 110 175 105 245 62 95 123 123 180 180 180 205 215 230 66 52\n", "[20] 65 97 150 150 245 175 66 91 113 264 175 335 109\n", "\n", "$drat\n", " [1] 3.90 3.90 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 3.92 3.07 3.07 3.07 2.93\n", "[16] 3.00 3.23 4.08 4.93 4.22 3.70 2.76 3.15 3.73 3.08 4.08 4.43 3.77 4.22 3.62\n", "[31] 3.54 4.11\n", "\n", "$wt\n", " [1] 2.620 2.875 2.320 3.215 3.440 3.460 3.570 3.190 3.150 3.440 3.440 4.070\n", "[13] 3.730 3.780 
5.250 5.424 5.345 2.200 1.615 1.835 2.465 3.520 3.435 3.840\n", "[25] 3.845 1.935 2.140 1.513 3.170 2.770 3.570 2.780\n", "\n", "$qsec\n", " [1] 16.46 17.02 18.61 19.44 17.02 20.22 15.84 20.00 22.90 18.30 18.90 17.40\n", "[13] 17.60 18.00 17.98 17.82 17.42 19.47 18.52 19.90 20.01 16.87 17.30 15.41\n", "[25] 17.05 18.90 16.70 16.90 14.50 15.50 14.60 18.60\n", "\n", "$vs\n", " [1] 0 0 1 1 0 1 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 1 0 1 0 0 0 1\n", "\n", "$am\n", " [1] 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1\n", "\n", "$gear\n", " [1] 4 4 4 3 3 3 3 4 4 4 4 3 3 3 3 3 3 4 4 4 3 3 3 3 3 4 5 5 5 5 5 4\n", "\n", "$carb\n", " [1] 4 4 1 1 2 1 4 2 2 4 4 3 3 3 4 4 4 1 2 1 1 2 2 4 2 1 2 2 4 6 8 2\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "c(mtcars) # like unclass(), but also drops row.names; only the column names are kept" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## swirl package" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "also installing the dependencies 'bitops', 'RCurl'\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", " There is a binary version available but the source version is later:\n", " binary source needs_compilation\n", "RCurl 1.98-1.3 1.98-1.5 TRUE\n", "\n", " Binaries will be installed\n", "package 'bitops' successfully unpacked and MD5 sums checked\n", "package 'RCurl' successfully unpacked and MD5 sums checked\n", "package 'swirl' successfully unpacked and MD5 sums checked\n", "\n", "The downloaded binary packages are in\n", "\tC:\\Users\\Administrator\\AppData\\Local\\Temp\\Rtmp44AWmj\\downloaded_packages\n" ] } ], "source": [ "install.packages(\"swirl\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Warning message:\n", "\"package 'swirl' was built under R version 3.6.3\"\n", "| Hi! 
I see that you have some variables saved in your workspace. To keep\n", "| things running smoothly, I recommend you clean up before starting swirl.\n", "\n", "| Type ls() to see a list of the variables in your workspace. Then, type\n", "| rm(list=ls()) to clear your workspace.\n", "\n", "| Type swirl() when you are ready to begin.\n", "\n" ] } ], "source": [ "library(swirl)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Sources & References \n", "http://eriqande.github.io/rep-res-web/ \n", "http://www.pitt.edu/~njc23/ \n", "http://adv-r.had.co.nz/Data-structures.html \n", "https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf \n", "https://github.com/aammd/UBCadv-r/wiki/01:-Data-Structures \n", "\n" ] } ], "metadata": { "celltoolbar": "Slideshow", "hide_input": false, "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.6.1" }, "nav_menu": {}, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "256px" }, "toc_section_display": "block", "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 2 }