{ "cells": [ { "cell_type": "markdown", "id": "ac93161f", "metadata": {}, "source": [ "In this tutorial, you'll learn how to use R to describe your data, both numerically and visually." ] }, { "cell_type": "markdown", "id": "4e55267f", "metadata": {}, "source": [ "Any time that you get a new dataset, one of the first tasks that you have to do is to find ways of\n", "summarizing the data in a compact, easily understood fashion. This is what **descriptive statistics** (as\n", "opposed to **inferential statistics**) is all about." ] }, { "cell_type": "markdown", "id": "74be9b37", "metadata": {}, "source": [ "# Measures of descriptive statistics" ] }, { "cell_type": "code", "execution_count": 2, "id": "37f43b2f", "metadata": {}, "outputs": [], "source": [ "# Let's use Student's sleep data, which ships with R, for this demonstration\n", "?sleep" ] }, { "cell_type": "markdown", "id": "2a0a18c0", "metadata": {}, "source": [ "## Measures of central tendency" ] }, { "cell_type": "markdown", "id": "d61b661d", "metadata": {}, "source": [ "### The mean\n", "\n", "The mean of a set of observations is just an average, i.e., add all of the values up, and\n", "then divide by the total number of values. Writing $\\bar{X}$ for the mean of the observations $X_1, \\dots, X_N$:\n", "\n", "$\\bar{X} = \\frac{X_1 + X_2 + \\dots + X_N}{N} = \\frac{1}{N}\\sum_{i=1}^N X_i$" ] }, { "cell_type": "code", "execution_count": 3, "id": "96f2227b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.54" ], "text/latex": [ "1.54" ], "text/markdown": [ "1.54" ], "text/plain": [ "[1] 1.54" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Mean by hand: the sum of the values divided by how many there are\n", "sum(sleep$extra)/length(sleep$extra)" ] }, { "cell_type": "markdown", "id": "787e4ea7", "metadata": {}, "source": [ "R also has a built-in function, `mean`, that computes this directly:" ] }, { "cell_type": "code", "execution_count": 4, "id": "d19b87fe", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.54" ], "text/latex": [ "1.54" ], "text/markdown": [ "1.54" ], "text/plain": [ "[1] 1.54" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mean(sleep$extra)" ] }, { "cell_type": "markdown", "id": "b6264945", "metadata": {}, "source": [ "### The median\n", "\n", "The second measure of central tendency that people use a lot is the median, and it’s even easier to\n", "describe than the mean. The median of a set of observations is just the middle value." ] }, { "cell_type": "code", "execution_count": 5, "id": "7a7f3c46", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
  1. -1.6
  2. -1.2
  3. -0.2
  4. -0.1
  5. -0.1
  6. 0
  7. 0.1
  8. 0.7
  9. 0.8
  10. 0.8
  11. 1.1
  12. 1.6
  13. 1.9
  14. 2
  15. 3.4
  16. 3.7
  17. 4.4
  18. 4.6
  19. 5.5
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item -1.6\n", "\\item -1.2\n", "\\item -0.2\n", "\\item -0.1\n", "\\item -0.1\n", "\\item 0\n", "\\item 0.1\n", "\\item 0.7\n", "\\item 0.8\n", "\\item 0.8\n", "\\item 1.1\n", "\\item 1.6\n", "\\item 1.9\n", "\\item 2\n", "\\item 3.4\n", "\\item 3.7\n", "\\item 4.4\n", "\\item 4.6\n", "\\item 5.5\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. -1.6\n", "2. -1.2\n", "3. -0.2\n", "4. -0.1\n", "5. -0.1\n", "6. 0\n", "7. 0.1\n", "8. 0.7\n", "9. 0.8\n", "10. 0.8\n", "11. 1.1\n", "12. 1.6\n", "13. 1.9\n", "14. 2\n", "15. 3.4\n", "16. 3.7\n", "17. 4.4\n", "18. 4.6\n", "19. 5.5\n", "\n", "\n" ], "text/plain": [ " [1] -1.6 -1.2 -0.2 -0.1 -0.1 0.0 0.1 0.7 0.8 0.8 1.1 1.6 1.9 2.0 3.4\n", "[16] 3.7 4.4 4.6 5.5" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Drop the last observation so that we have an odd number (19) of values, then sort\n", "sort(sleep$extra[-length(sleep$extra)])" ] }, { "cell_type": "markdown", "id": "819817ae", "metadata": {}, "source": [ "Here there are 19 observations, so the value in the middle is the tenth one, i.e. 0.8. Unlike the mean, the median will often coincide with a value that was actually observed in the dataset." ] }, { "cell_type": "markdown", "id": "e573e675", "metadata": {}, "source": [ "That example had an odd number of observations. What happens if we use all 20 observations?" ] }, { "cell_type": "code", "execution_count": 6, "id": "74ba568c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
  1. -1.6
  2. -1.2
  3. -0.2
  4. -0.1
  5. -0.1
  6. 0
  7. 0.1
  8. 0.7
  9. 0.8
  10. 0.8
  11. 1.1
  12. 1.6
  13. 1.9
  14. 2
  15. 3.4
  16. 3.4
  17. 3.7
  18. 4.4
  19. 4.6
  20. 5.5
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item -1.6\n", "\\item -1.2\n", "\\item -0.2\n", "\\item -0.1\n", "\\item -0.1\n", "\\item 0\n", "\\item 0.1\n", "\\item 0.7\n", "\\item 0.8\n", "\\item 0.8\n", "\\item 1.1\n", "\\item 1.6\n", "\\item 1.9\n", "\\item 2\n", "\\item 3.4\n", "\\item 3.4\n", "\\item 3.7\n", "\\item 4.4\n", "\\item 4.6\n", "\\item 5.5\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. -1.6\n", "2. -1.2\n", "3. -0.2\n", "4. -0.1\n", "5. -0.1\n", "6. 0\n", "7. 0.1\n", "8. 0.7\n", "9. 0.8\n", "10. 0.8\n", "11. 1.1\n", "12. 1.6\n", "13. 1.9\n", "14. 2\n", "15. 3.4\n", "16. 3.4\n", "17. 3.7\n", "18. 4.4\n", "19. 4.6\n", "20. 5.5\n", "\n", "\n" ], "text/plain": [ " [1] -1.6 -1.2 -0.2 -0.1 -0.1 0.0 0.1 0.7 0.8 0.8 1.1 1.6 1.9 2.0 3.4\n", "[16] 3.4 3.7 4.4 4.6 5.5" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sort(sleep$extra)" ] }, { "cell_type": "markdown", "id": "801cd042", "metadata": {}, "source": [ "Now there are two middle points, at the tenth and eleventh positions. In this case, the median is the average of these two values, i.e. (0.8 + 1.1)/2 = 0.95." ] }, { "cell_type": "markdown", "id": "0cef4c94", "metadata": {}, "source": [ "Again, so that we don't have to do this by hand every time, R provides a built-in function called `median`, which essentially sorts the input values and takes the corresponding middle point (or the average of the two middle points)."
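, "\n", "\n", "As a rough sketch of that logic (a hypothetical helper named `manual_median`, written here only for illustration and not how R implements `median` internally), the sort-then-pick-the-middle idea might look like this:\n", "\n", "```r\n", "# Hypothetical helper: sort the values, then take the middle one (odd n)\n", "# or the average of the two middle ones (even n)\n", "manual_median <- function(x) {\n", "  s <- sort(x)\n", "  n <- length(s)\n", "  if (n %% 2 == 1) {\n", "    s[(n + 1) / 2]\n", "  } else {\n", "    mean(s[c(n / 2, n / 2 + 1)])\n", "  }\n", "}\n", "\n", "manual_median(sleep$extra)  # should agree with median(sleep$extra)\n", "```"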
] }, { "cell_type": "code", "execution_count": 8, "id": "1206ba87", "metadata": {}, "outputs": [ { "data": { "text/html": [ "0.8" ], "text/latex": [ "0.8" ], "text/markdown": [ "0.8" ], "text/plain": [ "[1] 0.8" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "0.95" ], "text/latex": [ "0.95" ], "text/markdown": [ "0.95" ], "text/plain": [ "[1] 0.95" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Median of the first 19 observations (odd number of values)\n", "median(sleep$extra[-length(sleep$extra)])\n", "\n", "# Median of all 20 observations (even number of values)\n", "median(sleep$extra)" ] }, { "cell_type": "markdown", "id": "a502c875", "metadata": {}, "source": [ "### Trimmed mean\n", "\n", "Something that you will often find when working with data is that they are messy and, sometimes, noisy.\n", "\n", "For example, let's replace the first element of our data with a value that looks suspicious compared to the rest of the values." ] }, { "cell_type": "code", "execution_count": 10, "id": "d7fe74c4", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
  1. 1000
  2. -1.6
  3. -0.2
  4. -1.2
  5. -0.1
  6. 3.4
  7. 3.7
  8. 0.8
  9. 0
  10. 2
  11. 1.9
  12. 0.8
  13. 1.1
  14. 0.1
  15. -0.1
  16. 4.4
  17. 5.5
  18. 1.6
  19. 4.6
  20. 3.4
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 1000\n", "\\item -1.6\n", "\\item -0.2\n", "\\item -1.2\n", "\\item -0.1\n", "\\item 3.4\n", "\\item 3.7\n", "\\item 0.8\n", "\\item 0\n", "\\item 2\n", "\\item 1.9\n", "\\item 0.8\n", "\\item 1.1\n", "\\item 0.1\n", "\\item -0.1\n", "\\item 4.4\n", "\\item 5.5\n", "\\item 1.6\n", "\\item 4.6\n", "\\item 3.4\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 1000\n", "2. -1.6\n", "3. -0.2\n", "4. -1.2\n", "5. -0.1\n", "6. 3.4\n", "7. 3.7\n", "8. 0.8\n", "9. 0\n", "10. 2\n", "11. 1.9\n", "12. 0.8\n", "13. 1.1\n", "14. 0.1\n", "15. -0.1\n", "16. 4.4\n", "17. 5.5\n", "18. 1.6\n", "19. 4.6\n", "20. 3.4\n", "\n", "\n" ], "text/plain": [ " [1] 1000.0 -1.6 -0.2 -1.2 -0.1 3.4 3.7 0.8 0.0 2.0\n", "[11] 1.9 0.8 1.1 0.1 -0.1 4.4 5.5 1.6 4.6 3.4" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Copy the data and overwrite the first value with an obviously suspicious one\n", "extra_noisy <- sleep$extra\n", "extra_noisy[1] <- 1000\n", "extra_noisy" ] }, { "cell_type": "markdown", "id": "7721f956", "metadata": {}, "source": [ "This is an example of an **outlier**, a value that does not appear to belong with the others. In a case as clear as this one, it is often reasonable to remove the point from the data. However, in real life one does not always get such cut-and-dried examples. For instance, you might get this instead:" ] }, { "cell_type": "code", "execution_count": 11, "id": "107131d5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
  1. -20
  2. -1.6
  3. -0.2
  4. -1.2
  5. -0.1
  6. 3.4
  7. 3.7
  8. 0.8
  9. 0
  10. 2
  11. 1.9
  12. 0.8
  13. 1.1
  14. 0.1
  15. -0.1
  16. 4.4
  17. 5.5
  18. 1.6
  19. 4.6
  20. 18
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item -20\n", "\\item -1.6\n", "\\item -0.2\n", "\\item -1.2\n", "\\item -0.1\n", "\\item 3.4\n", "\\item 3.7\n", "\\item 0.8\n", "\\item 0\n", "\\item 2\n", "\\item 1.9\n", "\\item 0.8\n", "\\item 1.1\n", "\\item 0.1\n", "\\item -0.1\n", "\\item 4.4\n", "\\item 5.5\n", "\\item 1.6\n", "\\item 4.6\n", "\\item 18\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. -20\n", "2. -1.6\n", "3. -0.2\n", "4. -1.2\n", "5. -0.1\n", "6. 3.4\n", "7. 3.7\n", "8. 0.8\n", "9. 0\n", "10. 2\n", "11. 1.9\n", "12. 0.8\n", "13. 1.1\n", "14. 0.1\n", "15. -0.1\n", "16. 4.4\n", "17. 5.5\n", "18. 1.6\n", "19. 4.6\n", "20. 18\n", "\n", "\n" ], "text/plain": [ " [1] -20.0 -1.6 -0.2 -1.2 -0.1 3.4 3.7 0.8 0.0 2.0 1.9 0.8\n", "[13] 1.1 0.1 -0.1 4.4 5.5 1.6 4.6 18.0" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Overwrite the first and the last values with less extreme, but still suspicious, ones\n", "extra_noisy[1] <- -20\n", "extra_noisy[length(extra_noisy)] <- 18\n", "extra_noisy" ] }, { "cell_type": "markdown", "id": "7bdd090e", "metadata": {}, "source": [ "Here, -20 and 18 both look suspicious, but it's harder to say whether these values are legitimate or not.\n", "\n", "When faced with a situation where some of the most extreme-valued observations might not be quite\n", "trustworthy, the mean is not necessarily a good measure of central tendency. It is highly sensitive to extreme values and thus is not considered a **robust** measure.\n", "\n", "One general solution would be to use a \"trimmed mean\", that is, a mean calculated after discarding the most extreme observations on both ends (e.g., the largest and the smallest). The goal is to preserve the best characteristics of the mean while not being highly influenced by extreme outliers." ] }, { "cell_type": "markdown", "id": "c4e914ca", "metadata": {}, "source": [ "For example, suppose we want to calculate the 5% trimmed mean. 
In our example, this would correspond to discarding the largest 5% and the smallest 5% of the observations (with 20 observations, that is one value at each end), and then taking the mean of the remaining 90% of the observations." ] }, { "cell_type": "code", "execution_count": 12, "id": "3d538c6f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.48333333333333" ], "text/latex": [ "1.48333333333333" ], "text/markdown": [ "1.48333333333333" ], "text/plain": [ "[1] 1.483333" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# The two extremes happen to sit at the first and last positions here,\n", "# so we can drop them by index; in general you would sort the values first\n", "mean(extra_noisy[2:(length(extra_noisy) - 1)])" ] }, { "cell_type": "markdown", "id": "d85835d1", "metadata": {}, "source": [ "Now, compare this mean to the one using all the observations:" ] }, { "cell_type": "code", "execution_count": 13, "id": "364e26ab", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.235" ], "text/latex": [ "1.235" ], "text/markdown": [ "1.235" ], "text/plain": [ "[1] 1.235" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mean(extra_noisy)" ] }, { "cell_type": "markdown", "id": "7d2c952a", "metadata": {}, "source": [ "There's a difference, right? \n", "\n", "As usual, to make our lives easier, R allows you to specify the amount of trimming when calling the function `mean`. Let's look at this in the help page (opened by prepending the `?` operator or by calling `help`):" ] }, { "cell_type": "code", "execution_count": 14, "id": "02f497c7", "metadata": {}, "outputs": [], "source": [ "?mean" ] }, { "cell_type": "code", "execution_count": 15, "id": "02d44ce2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.48333333333333" ], "text/latex": [ "1.48333333333333" ], "text/markdown": [ "1.48333333333333" ], "text/plain": [ "[1] 1.483333" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# trim = 0.05 drops 5% of the observations from each end before averaging\n", "mean(extra_noisy, trim=0.05)" ] }, { "cell_type": "markdown", "id": "d9ef4881", "metadata": {}, "source": [ "
Question: Since we are talking about outliers, do you think the median would also be a robust measure against them?
" ] }, { "cell_type": "markdown", "id": "1ad2e93b", "metadata": {}, "source": [ "### Mode\n", "\n", "The mode of a sample is very simple: it is the value that occurs most frequently. In this case, R does not have a built-in function for this (the built-in `mode` function reports how an object is stored, not the statistical mode). You'll create one in the assignments (extra credit; it can be challenging, I know!)" ] }, { "cell_type": "markdown", "id": "fdd2f941", "metadata": {}, "source": [ "## Measures of variability" ] }, { "cell_type": "markdown", "id": "851d3410", "metadata": {}, "source": [ "### Variance\n", "\n", "The formula for the variance is the following, where $\\bar{X}$ denotes the mean:\n", "\n", "$s^2 = \\frac{1}{N}\\sum_{i=1}^N (X_i - \\bar{X})^2$" ] }, { "cell_type": "code", "execution_count": 16, "id": "82a99c5c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "3.8684" ], "text/latex": [ "3.8684" ], "text/markdown": [ "3.8684" ], "text/plain": [ "[1] 3.8684" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Let's do it by hand: the average squared deviation from the mean\n", "mean.sleep.extra <- mean(sleep$extra)\n", "N <- length(sleep$extra)\n", "sum((sleep$extra - mean.sleep.extra)^2)/N" ] }, { "cell_type": "markdown", "id": "6693a767", "metadata": {}, "source": [ "Fortunately, R has a built-in function for calculating this, `var`:" ] }, { "cell_type": "code", "execution_count": 17, "id": "4425552c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "4.072" ], "text/latex": [ "4.072" ], "text/markdown": [ "4.072" ], "text/plain": [ "[1] 4.072" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "var(sleep$extra)" ] }, { "cell_type": "markdown", "id": "b885d780", "metadata": {}, "source": [ "Ohhh, what happened?? What R is actually calculating is a slightly different formula, which divides by N-1 instead of N. 
Let's see this:" ] }, { "cell_type": "code", "execution_count": 18, "id": "801ed685", "metadata": {}, "outputs": [ { "data": { "text/html": [ "4.072" ], "text/latex": [ "4.072" ], "text/markdown": [ "4.072" ], "text/plain": [ "[1] 4.072" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Same sum of squared deviations, now divided by N-1\n", "sum((sleep$extra - mean.sleep.extra)^2)/(N-1)" ] }, { "cell_type": "markdown", "id": "dc8d60e9", "metadata": {}, "source": [ "**WHY??** We'll see this in the near future..." ] }, { "cell_type": "markdown", "id": "fdfaf405", "metadata": {}, "source": [ "### Standard Deviation\n", "\n", "The Standard Deviation, also called the “root mean squared deviation” or RMSD, is the square root of the variance, and it describes the dispersion of the data in the same units as the data themselves:\n", "\n", "$s = \\sqrt{\\frac{1}{N}\\sum_{i=1}^N (X_i - \\bar{X})^2}$" ] }, { "cell_type": "markdown", "id": "b7499d39", "metadata": {}, "source": [ "We could calculate this manually, but again (and thankfully), R has a built-in function for doing this, `sd` (which, like `var`, divides by N-1 rather than N):" ] }, { "cell_type": "code", "execution_count": 19, "id": "d6f0827e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "2.01791972090071" ], "text/latex": [ "2.01791972090071" ], "text/markdown": [ "2.01791972090071" ], "text/plain": [ "[1] 2.01792" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sd(sleep$extra)" ] }, { "cell_type": "markdown", "id": "79b5e1ac", "metadata": {}, "source": [ "### Average Absolute Deviation (AAD)\n", "\n", "R does not have a built-in function for this, so you can either create one yourself or make use of a very useful library called **lsr**, which has this function implemented."
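, "\n", "\n", "For reference, a common definition (stated here as an assumption, since the formula is not spelled out in this section) takes the mean of the absolute deviations of the observations from their mean $\\bar{X}$:\n", "\n", "$AAD(X) = \\frac{1}{N}\\sum_{i=1}^N |X_i - \\bar{X}|$"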
] }, { "cell_type": "code", "execution_count": 20, "id": "cb2163d4", "metadata": {}, "outputs": [], "source": [ "# Install it first if needed: install.packages('lsr')\n", "library(lsr)" ] }, { "cell_type": "code", "execution_count": 21, "id": "2e57c4b3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.664" ], "text/latex": [ "1.664" ], "text/markdown": [ "1.664" ], "text/plain": [ "[1] 1.664" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# AAD by hand: the mean of the absolute deviations from the mean\n", "mean(abs(sleep$extra - mean(sleep$extra)))" ] }, { "cell_type": "code", "execution_count": 22, "id": "1139fbd5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.664" ], "text/latex": [ "1.664" ], "text/markdown": [ "1.664" ], "text/plain": [ "[1] 1.664" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# The same quantity using aad() from lsr\n", "aad(sleep$extra)" ] }, { "cell_type": "markdown", "id": "476ae79a", "metadata": {}, "source": [ "### Median Absolute Deviation (MAD)" ] }, { "cell_type": "code", "execution_count": 23, "id": "8bf597f4", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.05" ], "text/latex": [ "1.05" ], "text/markdown": [ "1.05" ], "text/plain": [ "[1] 1.05" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# MAD by hand: the median of the absolute deviations from the median\n", "median(abs(sleep$extra - median(sleep$extra)))" ] }, { "cell_type": "markdown", "id": "6f4361ca", "metadata": {}, "source": [ "R has a built-in function for this: `mad`" ] }, { "cell_type": "code", "execution_count": 24, "id": "0e8d659b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "1.05" ], "text/latex": [ "1.05" ], "text/markdown": [ "1.05" ], "text/plain": [ "[1] 1.05" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mad(sleep$extra, constant = 1)\n", "# We had to set the `constant` argument to 1 to get the expected behaviour. Why? Look at the documentation (?mad) for details." ] }, { "cell_type": "markdown", "id": "f0b83c5f", "metadata": {}, "source": [ "### Range\n", "\n", "This is just the maximum minus the minimum, so it's very easy to compute using the R built-in functions `max` and `min`."
] }, { "cell_type": "code", "execution_count": 25, "id": "e2871cc9", "metadata": {}, "outputs": [ { "data": { "text/html": [ "7.1" ], "text/latex": [ "7.1" ], "text/markdown": [ "7.1" ], "text/plain": [ "[1] 7.1" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "max( sleep$extra ) - min( sleep$extra )" ] }, { "cell_type": "markdown", "id": "caec387b", "metadata": {}, "source": [ "The R built-in function `range` returns the minimum and the maximum themselves, from which the same difference follows:" ] }, { "cell_type": "code", "execution_count": 26, "id": "7cffbd18", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
  1. -1.6
  2. 5.5
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item -1.6\n", "\\item 5.5\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. -1.6\n", "2. 5.5\n", "\n", "\n" ], "text/plain": [ "[1] -1.6 5.5" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "range( sleep$extra )" ] }, { "cell_type": "markdown", "id": "3cf1eab3", "metadata": {}, "source": [ "### Interquartile range (IQR)\n", "\n", "The IQR is the difference between the 75th and the 25th percentiles (that is, the 0.75 and 0.25 quantiles). What is a quantile? We'll learn about this in more detail later in the course." ] }, { "cell_type": "code", "execution_count": 29, "id": "96736095", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
25%
-0.025
75%
3.4
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[25\\textbackslash{}\\%] -0.025\n", "\\item[75\\textbackslash{}\\%] 3.4\n", "\\end{description*}\n" ], "text/markdown": [ "25%\n", ": -0.02575%\n", ": 3.4\n", "\n" ], "text/plain": [ " 25% 75% \n", "-0.025 3.400 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "quantile( x = sleep$extra, probs = c(.25, .75))" ] }, { "cell_type": "markdown", "id": "c5a2c65f", "metadata": {}, "source": [ "Again, R has a built-in function `IQR` to make your life easier :-)" ] }, { "cell_type": "code", "execution_count": 30, "id": "cc2f0509", "metadata": {}, "outputs": [ { "data": { "text/html": [ "3.425" ], "text/latex": [ "3.425" ], "text/markdown": [ "3.425" ], "text/plain": [ "[1] 3.425" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "IQR(sleep$extra)" ] }, { "cell_type": "markdown", "id": "00209bdf", "metadata": {}, "source": [ "# Measures of shape" ] }, { "cell_type": "markdown", "id": "74004d91", "metadata": {}, "source": [ "$skewness(X) = \\frac{1}{N\\hat{\\sigma}^3}\\sum_{i=1}^N(X_i-\\bar{X})^3$\n", "\n", "$kurtosis(X) = \\frac{1}{N\\hat{\\sigma}^4}\\sum_{i=1}^N(X_i-\\bar{X})^4 -3$" ] }, { "cell_type": "markdown", "id": "1e2b0a49", "metadata": {}, "source": [ "These are less often used, but in case you want to calculate them, you can do so with the library **psych**, which is another useful package:" ] }, { "cell_type": "code", "execution_count": 31, "id": "2f9ca629", "metadata": {}, "outputs": [], "source": [ "# Install it first if needed: install.packages('psych')\n", "library(psych)" ] }, { "cell_type": "code", "execution_count": 32, "id": "d35179cb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "0.386338071862147" ], "text/latex": [ "0.386338071862147" ], "text/markdown": [ "0.386338071862147" ], "text/plain": [ 
"[1] -1.08659" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Skewness\n", "skew(sleep$extra)\n", "\n", "# Kurtosis\n", "kurtosi(sleep$extra)" ] }, { "cell_type": "code", "execution_count": 33, "id": "928849e7", "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAIAAAByhViMAAAACXBIWXMAABJ0AAASdAHeZh94\nAAAgAElEQVR4nOzdZ2AUdd7A8Qk9EHrvUqSj0puKXVEUT+wgKHqiPqfY9RQVRcVy9rNzZ0E9\nCxYQDzwFFZQi2EVFRUQBkd7BkGSfF0kgYCAJJFny5/N5BbOT2d9mNuTL7M5sQiwWiwAAKPqK\nxXsAAADyh7ADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLAD\nAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISw\nAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiE\nsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADYAcWTnnqscfGfLU+3nMAuSXsAMhOytpf33no\nkgsvHPbSnPVp8R4GyB1hB3uMDy+tlZCu1qUf/vnmcf3LZNzcZtjXGcsmnJeUsezEF1N28W7/\n+OWT2ct2dea90bovn72yT6dG1ZJKl06qUq9lj5P/+elubS8/dmI+S1s8acQZB9Su2uCcV9ZF\n0ee3dahSo9WJN41fsLnA79mzEXaTsIO92Kaf37r1pDYtT//PgnhPUnQs+PepBw28Z+ysn5ev\nT05ev3Lhd1PnLC8d76Hy14px5/c4+roXv1i2teOSl3875pZjO5z2/G8Fd7eejZAfhB3srVa8\neV7rlr1veP3HDbF4j1KELBszavya9D8mNTu6/8AzTuh16iEt4jtT/kr+4KbB//opJYqi4vWP\nOaFjYhTVb9+pVrEoiqIlr194+avrCuRePRshn5SI9wDAbjj07u9/HZYWRVGUWDWvP81r5nz+\n06YCmClsCxdkHk9qfcWrE4a1ieswBWLa6NGLoiiKEtrdNH5ct4eqjp3V4KzR/+zVq91t30TR\n2rdeeye5719K5fu9ejZCPhF2UJSVrlynXuV4D7FX2bQpMz9q1aoV10kKSOrChYujKIqiyt0O\nbF08NX1hsQMue+ndwzfWadKkUb0q+V91QP7xUiwUZTt4333KoskPX3ryQa3rV0kqXaJE6aSq\n9VsfdNKlD01amPmmqTf6l0lodNUnGX+be1u7hISEhAPvz/LmptQlM5+/7dyj2zetWzmxdLkq\ndVv1PPXyRz9YkJzNELFVX4y6+uSujauVK1O2SsOOfa565vO10YSzMybr+I+fM9Zb/M9D0heV\n6f/G+i+fGNi5flLpxEp1W5868of0od97aMhferSqW7lsqRIlSpWtVLtZp97n/+PdLO/Y/3xo\ni/QttBj6eWz5jMeGHN+uQeWyZcrXbHHIoHs/WBqLomj1p09dcXz7BpUSy1So2eyg/sPH/5K7\n8xFyfrzrRh6TkJDQdcs3aeKF1RMSEhKaXvv5jrea447YudVfv3hD/0Na1qmUWCqxUq19u534\nf/f/b/6fj2vluNq6p3tnPE16P70ped6428/u2aJWhcTEyg3a9b74kRnLsp7yWrxixaQoiqJo\nxaT/Tt+6kaptDj+0U8sGVcps+0tj53e95JXTamTccULNM19blbE4bc4/emSeCFSx179+2cmz\ncfefM7D
XiQF7iClDamb8XNYcMuXPN7/ZL/M9+q1v+ipj2fhzy2Us6/OfzRnLNn/7yBHVErL7\ncU+o22fk98mxWCz2er9s3u/f475fMzaxeOL1PWtkt4lyB1z0+q8p24z162vnNN3+GE6pVhcN\n6Z0xWYe752Ws+dtDPdMXlT6iX9+aW1aufN6EzbHUX188vWH2/9GsfMTjP6Smb+Gz65unL2ty\n+uUn1Su+7XrFm1886qkBTbd7GaJY7X5jluf0nc/V41375NHZjdfkms92sNXc7Igd7cRYLGXu\n82c2Lvnnr01qf9nbS2J5W23tU8dlLD58yE0HVtxu1eIN+j43d+tOXfbvYzP3Z+UOPVqXyfrE\n2EZu7nrZq2dsOa5Zb/A7a2OxWNqPDxyYmHkHxz21cKfPxt1/zsDeRtjBHmNr2EXFS5X+k5Jb\nfoftNOyWjTymTBRFUZRQvVP/q2+754H777x2QKfMwCjT61+/x2Kxjx84o2+vtpkv4pZrcUTf\nvn37Dh23PBaLxVJ/uL9n+cy7SihTfd+OndrUKbflzku2ufrD9Vtm/uWJw5IybylVp3OfAeec\ndkTzSllrJpuw20alAf9Njq1+6ZSMzSRUanPsgMEXnnfmUa0qZm6m+gXvpm9hS9hFURRFSc2P\n7H/+X09pv208Fa/W7sRBg07rUmdL9+13yzc7+7bn9vFueu+ufv36Hd2ibMYNtTqf3K9fv35X\nvDg/++3makfsYCfGkj8dtl9mXlVofsSZ55575mH7Zq5W8binFuRpta1hl/6QqrXs2bt3z5bV\ntkRwmc53fpeWOfiGKRc32aaXStfpfPKl9437dnXa1keX27uOrXizf52MpcX2vXz6xnkPHpK5\nVtUTXvgttvNn4+4/Z2BvI+xgj5El7HZup2E34eyMX3etsvTMHx/fcmSHw/5yzqU33T9hbsah\njHl3d8j42ibXZznotGJUnzKZv54PvH7i4vS1V31yf++6mb+f97/tm4xf8TOvaZqxsHjbKz/M\nODKW/NOLp9bfMmz2YZfY8Yq3561Zt+SL19/6/I9Y8rR7Tjp4v4aVShbb/4bP/siceeKFmfd4\n4P2LY7HYNmGX2POeb9PX3DTxwq1vdqt9yujf0mKxWCz1pxGdM37JJxz/zKYdf9fz9HhjsWlD\nMhcf/ujSne7OXO+IbHbiqudOzPjiKsc9+XPGwuTv/3lkRoHue81nablfbZuwK9fthhkrYrFY\nLJa27P0r22e2WbW/vv3H1im//ffJjf50FK1YtW6XjVmQmqcJY7FYbOV/B2V+z0q06tw+8z8C\n1U55+fct97ijZ+PuP2dgbyPsYI+RP2G3tTxK1ely+hV3j5owa/6alGzuLvtfpUseOSzjOE6p\nrv/4Kev6y148MfNFvH3//kksFovFfrq9fcaSiqe9si7LuosfPTzzmE+2YVfjwombY3+WvG5d\n5q/o1LU/j7+4dcbq+w//IRaLZQ278v3HbdnAmn8flbndptfO2rKxDy6unbH0oAd+j+1Inh5v\nLC9hl+sd8eed+Merp2TEZtK547McJksZc0bGi5j73f59rlfbJuwaXTUjy3d+06TzM79JDS6f\nus1Ua79/Y8S5hzYpv+1rySXaXD9zc+4nzLBmwl8bRtuqdcary7LeXc5ht4vPGdjbOHkC9kA5\nvMdupzoPPLdNeqokL5rx4j1XnXVMx4aVqzbp3nfIPa/PXpXjJ0N99dlnGWcbtD7++EZZb6na\n+/juGX/8YdasNVEURT/88EPGkhYdO5bLsm7N7t2b7OxeuhzYI5tT8jevmjd19EM3/u3MYzo3\nrVpln14Pzc64IS1t+7kbN2++ZQPlym2561atWm1ZJSkp89hQcnJ253yky9PjzZPd2BFzZ8/O\nOAFh3b96FUvYokSf/2xMX/7tN9/kerWsSnbq1jHLd7509wM7ZZTbL99//0fWNZP27XPtyEmP\nnVkhiuof9
JeutdK/KuXr++8auyGvd13+6HufuqBRlkSsc9bIh0+qurNvwZ/t9nMG9g7CDgJT\nrN2N41+96pA6Wd/Vnrr6p2mvPXjlSW2b9rjh/RU7++q0FSsyz12sU6fOtreVq1Mn8xDWqlWr\noiiKrVmzNmNBtWrVtll3a1Vlp2KNGttF6obZT53XuXadtkf1u2T4w/95e+bcVWkV6teukPmY\nim3/T1WWmMtya/Hy5ROzWbwTeXq8ebMbO2LNmpw6cvOaNRtzu1pWVapX3/btc7VqZb63bd26\nHV16uMFJD3702XMnpe/h9dOnf53rCbdIOvCM3g22/K1M0w5tKuXw9dvb/ecM7B088yE4xeud\ncNd7Py/64s3HbvzrCV0bV9paFrHl02/td+OU1B1/bbEaNTILbdGiRdvetn7hwtUZf6xSpUoU\nRQnlymWeSbBy5cpt1l24cOFOJsyaZVEURevHX3L0oH/NXJ4SRZXannrVfc//79NfVq745IaO\nmVP96Zd0QkJ2p5uWKJHXS3Pm6fHm1S7viLJlM7+v9S/9YG22XjojMberZbVy6dJtLv+StmZN\nZs5VqFAhiqK1X7x0721Dr7jo7AH3fZxlvGK1Tj3r6PQXX9evX5/rCTMlfzr8/x6dv+WvmyZf\nP+iRn/L2ERO7/5yBvYNnPgQndf3iHz6d+klKp8E3PzFm2tzlqxZ98b8nz9s/IysWTZ4yd7sv\niMW2/o5t3qpVxj8LX7859qesay1/881pGX9s1aVLUhRFUfPmme95+2zy5LVbV0374n8Tl+xk\nwu0CbOObjz+b3oGVznpm2kt3XXrmke3qly++9ahQAf6SztPjzaM874hMTba80Lxw9nebkrZI\n+X7iWx9+NX9VWumkMiVyvVpWyZMnfpjlGm+xz6bPyHiZunGLFiWjKCrz02vXD73t3kefGTX8\nvjeyxvrSX35Jf/G1Xr16ebzr5M9uPnvE19sE5fpJ15zzyNxs0y7rszGLPeg5A3s0z3wISsrE\nIfXLlq/drMMhx5x086Q1URRFxcrW3u/IQecfnflW/qSk9EMfpUplnhG5atmylChKS0lJi6Ka\np5xxWPqFQjZPv3Xg0Em/p79Tac2nD5595Zj0A1glOp5zVtsoiqKo8bHHZpTdxjHXD37+h+Qo\niqLYkvdvGHx/5ludsrXd8bZlv/22NTYyb9n4+VMvZl76NyUld1cZ3gV5erx5kIcd8Wdljz7+\nsPSdk/bOndf+b1l66Wz+8oHzTjq9V/c29SvWPOONtblebRuLn7z4kvEL04/FrZl5++WPz0tf\nXqtXrwOiKIpKHnlS7/SIXfniBafcNWNFLIqitNWfPXzeiClRFEVRs97H7Zunu06edfPAu75K\niaIoKrbvlc/c1iX9FdX1k68554GtaZf9szGLPeg5A3u2wj5bA9iRfLlAcfKsa1pmXsEtad/e\nF91w5z13D79qQLfaGceJyhx4/8/pX7vxP30yVyxecZ/W+1brPPzbWCwW2/Tpje0zLwASJZSp\n3qxj5zZ1k7ZcFa50+xs+3rhlqN+e6bXlGnBRmRqtunRrm+UacFGU/VmxDa+YmfWRbX77nMxX\nOhNqHPh/dz326D+uPLF5lmNkja/+NBaLZT0rtsd9v239+lf6Zs428M2tS7eu3OXubC+wmyFP\njzf3Z8Xmfkdkdx27DZMv3XKyQcm63c782xVDTu9UNeMbW6z5tR+n5GW17a5jV6zCPl2O6nVI\nmxpbLixdtsd9P2YOnvLlrQdse8XpEiW3vIhc+6w3luVpwk0zr2+deait0f9NWh9L+eLmdpkL\nEg+6b07G5VN28Gzc/ecM7G2EHewx8umTJ5K/+/eJDbJ/r1nJhieP2voZA4sePnSbX+BVB7+T\nfkPqL28O6Vo1uzexle946VuL0rYZa/GYQY22/7SHeqcPHdQs4y8d/5HRLzv+JR1L+/7hw7b/\nPIQoKtugQcb
730qe+OKGWKyAwi5vjzf3lzvJ9Y7I/pMn/vj6n71qZPOaSkLdE//1Yx5X2xp2\nlQ7vf0Ld7R5piUZnvDw/64NM/Xn0uS0S/7TJMo1PeezLLI2bi7v+Y9a1W7Ku7jlvr05fOPPq\nlplfltjt3oy0y/7ZuPvPGdjbCDvYY+RT2MVisdjq2a/cdn7vLs1qVSxToniJMhVqNO14zKCb\n//Plqm22mPrrhBv6dqhXoXTJxEq1W/Q49ZFPtv563/zbtGduGXR0+ya1K5YplVipTsuDT7vi\n8SmLsruSWGz5jCcuPu6AepUSSyfVaHHooPs+WJz61U1bEuyhzATb8S/pWCy24ZsXrz2pc6Mq\nZUqUKF2hdosDT7vu+a/WfHx5xpmU5U55aXWswMIuL483D2EXi+VuR+xoJ8bSls965rr+R7Rt\nUC2pdKmkGo3aHnTa0OdmLdv+Wng5r7Y17KoOfi/lt4l3n33QvtXLlipdsf4BvYc8PnN5Np++\ntfHniQ9ffdbRbasXi6KKrXqff+OT7/3851Da+V3/8fHf22Qehqve7/Wtn+y2fvKQLUf7Ervd\nlZ522T4bd/85A3ubhFj271MFyNnahXOWFq9Vr2bFUtsdBZp5TcPOd/0SRVHCCS9sHHNGrq7A\nR0FZ93Tv8ue8FUVRVHXwe8seOyT3X/nuBZWOfLzNfb9+eGm9ApoNyF9OngB23Vf3HNmkdqUy\npSvUatTh2ve2XOA2+cu33l2Q/semrVurOoDCkteLPgFsdcDBB1e47/k1m9f+/vOnd5/U/ovj\nujcos2nxN5PfnfZLWhRFUcJ+/fvtF+8h2XVHPLYq9li8hwDyQNgBu67siXc8duqU/i//khZF\naau+mfD8Np9gVbHLsJFXtd7R1wKQ77wUC+yOeme8+PnM54YNOq5ri3pVkkqXKFaiTPlqDfY7\n5NTL/znxq8k3dvrzyZUAFBgnTwAABMIROwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAI\nOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBA\nCDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCA\nQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsA\ngEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7\nAIBACDsAgEAIOwCAQAQQdmsXfffdd4vWxnsMAIA4CyDs3r6kZcuWl7wV7zEAAOKsRLwHyI0V\nP0z/fvkOb/1hRRRFK36cPn16FEVRVLVZ132rFNJgAAB7kIRYLBbvGXI0+uSEU17N7cp9X4mN\nPrkgpwEA2DMViSN23c654uAp909eklpqn6POOa1dpW1vnfPGnW/MaX7CNSe2jKIoitq2yMum\n09LSJk+enJKSspN1YrHYkiVL+vXrl9e5iaLot99+mz17drynKDwrVqyIoqhKlb3lqPHe9nij\nKGrdunXt2rXjPQVA9orEEbsoimLLP3700kHXPjcn6bArH37ypr80LrPlptEnJ5zyat//xEaf\nvgvbnTdvXpcuXXYedikpKWvXrk1OTi5ZsuQu3MVe7rzzznv22WcTExPjPUghWbduXfHixT3e\nUG3cuHHAgAEjR46M9yAA2SsSR+yiKEqo2vmiUZ/2PuO2wYNHnNT2tb63PPnPSw+uVXy3t9uo\nUaMlS5bsfJ2pU6f26NGjiBTwHic1NfXYY4+99dZb4z1IITnuuOPatWvn8YZq6NChqamp8Z4C\nYIeK1FmxpRoce/P42bOe7l9+0lWHtOp6wb+/XBPvkQAA9hhFKuyiKIqiCvsNfHzGNxNHHLLy\nmXM7tjpi6Nif/
oj3SAAAe4KiF3ZRFEXFax16zWtffvHy3xp9NaLPNePiPQ4AwJ6gaIZdFEVR\nVLbZKfdO/uajB/96TM+ePVvXiPc4AABxVlROntiBhKpdL35i/MXxHgMAYA9QhI/YAQCQlbAD\nAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISw\nAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiE\nsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAI\nhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMA\nCISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLAD\nAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISw\nAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiE\nsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAI\nhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMA\nCISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLAD\nAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISw\nAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAiEsAMACISwAwAIhLADAAhE\niXgPsHvSNi3/9adfVyZUb9ysboXi8Z4GACCeis4Ru9i6H955+t4RIx78z0eLkqMoipZOGt67\nRfVq+7Ru165VvWq1Ova/f/qKeA8JABA/ReSI3eZvH+971P+9uSA1iqIourHL8IlPVb6s941T\nNlXYp9MRbaon//r5rE+ev6zn5/PfmXrfwRXiPCwAQFwUjSN239074G9vLm3c98aRz4/655VH\nlZ057IQjhk8p3uOmKT/M/fidN9/64POf50y4skuJ2fefd/vMWLynBQCIiyJxxO6bF56ZFety\n19ujr2oURdGZp3cu1qbzXXP2u3nSsB41MlYpWe/ou166cfI+145+9Ys7Oh0Qz2kBAOKjSITd\nvHnzogYX92yU8dcSnU7v2/iuO1u1brHNWgkNu3etEz32669RlPuwmzdvXpcuXVJSUnayTvqt\nsZhDgcDeZdiwYQ8++GC8pyhUl1xyybBhw+I9Bey6IhF2NWvWjBbPnbs+6lwufUHT4y772+LV\nSaujqHKW1VZ8992SqGrVqnnZdMOGDV9++eWdh93s2bMvvfTShISEvE8OUITNnz+/ZcuW55xz\nTrwHKSRPPfXU/Pnz4z0F7JYiEXZtjjqq9j3/uu68p9o8cXbb8glRVL773x7qvs0qacum3nPW\nDf9LqfnX3h3zsulixYodcsghO1+nbNmyeZ0YIAzVq1fv2rVrvKcoJOPGjYv3CLC7isTJE2WO\nuvmhM+svfHHQfjUOuOWrP9286JULOtRv0OPqCcvqnfHILUeVisOEAADxVyTCLopq931u1nsP\nXtS7Wat69f90Y9Ka+Z8uK9/+tOETpo86qVYcpgMA2BMUiZdioyiKEqr1uPjhNy/O7qYKZ7yw\nrF/FqmWKSKQCABSMIhN2O1O2cp5OmAAACJKjXAAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYA\nAIEQdgAAgRB2AAC
BEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2\nAACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQ\ndgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACB\nEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAA\ngRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYA\nAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2\nAACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQ\ndgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIHIMezeH9Hvuife\n/nZVamFMAwDALssx7FZ89p8Rg49pVatex5OveHDs50s2F8ZUAADkWY5h95en5k0ZddvgQ6vM\nG3PvkD7t6tber/ff7n5pxsJNhTEdAAC5lmPYJZRreGD/6x4bP3vxwk/feOCKvzRePvHhq0/v\n2qBm8yPPu/XZD35eFyuMMQEAyEnuT54oWaNdn0v+8fLHv/7+4wcv3nvFCdXnPHfDwEMa12x0\n8IAbnp78iyN4AADxldezYtNWz5314YcfTpky5fOFf0RRqRoNk+a+cus5PZs0PfyGSUsKZEQA\nAHIjt2G3adHM0fdd1rdTvZrNDj/7+n+O/6XGsZc/NO6rRQu++frXxd+N+Xv3dZNuPWHgv6Qd\nAEC8lMhphd+njXz48Rf+8/oHP65Ji6LE+t3PuHbgwIGnHdGiYvHMVSo2P+H2+we/1P6uyZM/\nic7tVbADAwCQvRzDbso9fx3+akK5RgeddenAgQNOObRJ+WwP8pWuWGeflm27Nsv/CQEAyJUc\nw67RCbc8c8mAvgc1LJews9VaXTdl3nX5NxYAAHmV43vsOgy4YcDBtVdPf+GOkR9uyFw4676z\nLrz12RlLXeoEAGCPkYuTJ1ZPu/nw/br3+/vd4+dkLFn/7bsvPHbDwB7tj3vgi40FOh4AALmV\nY9ilfHz72cM+TOlyyVPPXtQyY1m5s17//YsXL++0bvxlpw2flVbAIwIAkBs5ht3Xr736fekj\n73rzgbO71C2zZWmpavudds/rtx5aYs4LL35aoAMCAJA7OYbdwoULo9r77189m5tqdWhfJ1q0\naFEBjAUAQF7lGHa1a9eOfpk5M7srD6/88ssFUa1atQpgLAAA8irHsDvgpJMapX1wQ7+7P1m9\nzTmw6756qP9176TW69OnQ8FNBwBAruV4HbtiXa977LzXjx15dad6D3c6qEuLOhVKpqz5bc7M\nDz+et6ZYk0GjbupZPKdNAABQCHIMuyiqctTj06e0ve66h16ZMv7lj9OXFSvXoKAV78cAACAA\nSURBVMegW+6485Lu1Qp2PgAAcikXYRdFxap3u+TJ9y55dO3iXxYsXrmxWFKNfZrUq5CrLwUA\noJDkpc5KlK/VuKVTJQAA9ky5CbvUpZ+++uTI0VPnrtiQnJoW2+5zxHre/P7NPQtkNgAA8iDn\nsFv51uD2J/xrwQ4/X6La0nwdCACAXZNj2P0y8pZ/Lah6+K3P3H9Ot8bVkkr96fooxZwVCwCw\nJ8gx7L784ouo623/ur5Xw8IYBwCAXZXjBYrLli0bVaxYsTBmAQBgN+QYdp0PPzxp2muvLy6M\nYQAA2HU5vhSbdMqI+5/ucemxg1bfeN5RBzSqllRyuxYsXaFa+VIFNR4AALmVY9iNG3zg9TPX\nr1/+1GV/eSrbFfq+E
ht9cr7PBQBAHuUYdlX37dr1wJ2t0Llu/k0DAMAuyzHsul39xhuFMQgA\nALsnx5Mntkpdt+Dr6R+8M+Gz36MoZfXKdbGcvwQAgEKTq7BLXTxpxBnta1Wu37bbIUf1um1K\nFP340OF1W/S548OVBT0fAAC5lIuw+33s2V2PvO7FOeU6nXBcuyrpy1KSKpScO/bvRx1xx1c7\n/KwxAAAKU45ht3nisAueW9Dk/LFzfpw65o4TaqcvbXPp+3Pevbxtyqcjho9eW9AzAgCQCzmG\n3ayxY38r1/e2+4+vt915FlUPuX3YyUlrpk37pqBmAwAgD3IMu6VLl0Y1GjZMzOam0rVrV4mW\nLl1aAGMBAJBXOYZd3bp1o/kzZmT3kWLzp3z0a1S3ruvYAQDsCXIMu3Z9T26cNnlY/xHTlme9\nvknKb5Ou7zd8RmyfPn32L8DxAADIrRwvUFys6/UjL37rmIeuO7DxEx1alVoQRZse7d/7no8m\nT/95bfHGgx65vnseLoUHAECByUWVVTr0wY8+euT8gyotmTX9+9VRNHfS829NX5jY8cw7J059\nslfVgp8RAIBcyPGIXRRFUVS544WPv3/hQyt++vaHhas2FU+q2bhls1plHaoDANiD5C7s0pWq\n0nj/Lo0LbJRciaVsTi1RMvupY8nrVm9IKVWuUtmShTwVAMAeIMewm3bXiXdO3dkK3a954+pu\n+TfQjmyc88oNV97y7DuzlyaXrr3fUf0uvem6ge0rJ2RdZfkTvatf/EHfV2KjTy74eQAA9jQ5\nht3Cj8eMGZP9TcWSqteuWKpmIXxebNrcx4/vesHEVSUrN96/Q+LS774e+49z/vvK6/e88cIl\nB5Qr+LsHACgKcnyfXJ9nV25n+e8Lfvzkf09e2r1KsQb9np/9+LEFPuTGMTdeM3FV/dOe/mrh\n3M9mfb1g8ezXrz+yxoKxQw45avjMdQV+9wAARUKOR+xKlq1Uqex2yypVqVG3Sftu9Vbu2+uM\na474+bEjSxXQdBlmvvPO6hLH3vfEwObpH4CR1PzEWyd07nzRkac+fuMxJ1aY8t8hrXZ1gpUr\nVw4dOjQlJWUn6/z++++7uPXsvPXWW2PHjs3HDe7hpk6d2rJly3hPAZCz33///dtvvx08eHC8\nByk8J5xwwnHHHRfvKchPeTl5YjtJx5x8dLlnXn9jxmNHHpR/A2Vn5cqVUe3mzStkXVaszgmP\nTnw5+aCTnrrsmIG1p79wap2EHX35nmX06NHvvfdep06d4j1IIVm0aJGwA4qERYsWJSQkrFix\nIt6DFJKZM2cmJycLu8DsRthFa5cu/SNat67gXwutWbNmtPCLL5ZH3be5al5CrROemPDo793P\nf3FArzrVP7in7S5sunLlyg8//PDO15k6deoO32e4Sw444ICbbropHze4J/v444/jPQJAbu1V\n/z4PHTo03iOQ/3J8j13a5k3b27h+7crfvn33voE3TUgp3rFjuwIfssOxx9ZMmzRswH0zV6Rt\ne0uJJn995a1hXUp+ee/xPc9/4ftNBT4KAMAeK8ewe+2MxO2VTapQpU6rIy9//dfi+w65ZVCt\nAh+y5FHDHjqlzrL/Xt6lfu1mpz/96zY3lu140/jxQ7uV/PLJIQ/NKPBRAAD2WDm+FFur3dFH\nb/9ia0KxEqWSarQ4+ORzzurVolAuN1LnlP/MrN7lpuFPvv7R2rSK299a+cDhk2bud915Qx7+\n4LfkwhgHAGAPlGPYHXj9hAmFMUhOitc55IonD7niydTU1OLZ3Fym6Sn3vn/C1V9+NH1d00Kf\nDQBgT7A7J0/ERfHi2XVdutK19jvsxEKcBQBgT7L7HymWVSF9vBgAAH+WY9itnDtr1sz1yxat\n+iOKomKlK1StWHLDiuXrs7+gb4n++T0fAAC5lONZscc+9uFd3SvEah585ahpP69Zv3rJ78vW\nbdqwaNYL1x5eN6FKz3t
nLs/yaWPP9imMmQEAyEaOR+xWPTfk/NcrXf3pO8P32/KxXcUTa3c4\nY8S45qntOwy9+c2z3xxYuWCHBAAgZzkesZsxadL6tqf23+/PH8Zapv2Jx9TfMGnS9AIZDACA\nvMkx7EqXLh3Nnzs3u/fUrfnpp+VRUlJSAYwFAEBe5Rh2HQ87rMLKUZf+35sLt2279V8/cfY1\nYzbUOemkrgU2HAAAuZfje+yS+t561+ETL3jihH3Hdz3y8I771koq8cfKX7/+8O33Zi9PaDZ4\n3K2HlyyMOQEAyEHOFygu3mzw2KlVb77ihifeHvv0lvfTla7TfdCD9979ty5VCnQ8AAByK1ef\nPFG2+cl3jjv5tlU/z/5u/pLVm0tVrtuibYuaiQkFPRwAALmX43vstkooUaJ4FEWxqi06t6ya\nvGpdrMCGAgAg73IVdqmLJ404o32tyvXbdjvkqF63TYmiHx86vG6LPnd8uLKg5wMAIJdyEXa/\njz2765HXvTinXKcTjmuX8Za6lKQKJeeO/ftRR9zxVVrBDggAQO7kGHabJw674LkFTc4fO+fH\nqWPuOKF2+tI2l74/593L26Z8OmL46LUFPSMAALmQY9jNGjv2t3J9b7v/+HrbnWdR9ZDbh52c\ntGbatG8KajYAAPIgx7BbunRpVKNhw8Rsbipdu3aVaOnSpQUwFgAAeZVj2NWtWzeaP2PG4mxu\nmj/lo1+junXrFsBYAADkVY5h167vyY3TJg/rP2La8qzXN0n5bdL1/YbPiO3Tp8/+BTgeAAC5\nleMFiot1vX7kxW8d89B1BzZ+okOrUguiaNOj/Xvf89Hk6T+vLd540CPXd8/DpfAAACgwuaiy\nSoc++NFHj5x/UKUls6Z/vzqK5k56/q3pCxM7nnnnxKlP9qpa8DMCAJALufpIsahyxwsff//C\nh1b89O0PC1dtKp5Us3HLZrXKOlQHALAHyTHsFj562ulTm1996y3HNyxVpfH+XRoXxlQAAORZ\njkfdvpj25ofPT/+tUmEMAwDArssx7GrWrBnF1q1bVxjDAACw63J8KbbD358a+mHfYb3Pjl1/\nzhHtmtauXK7UtjFYqlylsiULbD4AAHIpx7AbP/TsUb/G/lj4zJWnPJPtCn1fiY0+Od/nAgAg\nj3IMuwoN2rQ5IGpzwA5X6Fg7XwcCAGDX5Bh2Pa4dN64wBgEAYPdke/LE4k/GjRs37efNhT0M\nAAC7Ltuw+3DE8ccf//dxq7cu2bzw8/fff3/20sIaCwCAvMrlp0esfv3SQw899KYPCnYYAAB2\nnY8FAwAIhLADAAiEsAMACISwAwAIhLADAAjEji9Q/NPY4dcuSMz4y8ZPf4qi6Ovnr7121vbr\nte1/R782BTQdAAC5tuOw+/WdB+98Z9tFc964884/rde3o7ADANgDZBt2nYeMGnVibrfQsHP+\nTQMAwC7LNuwaHNS//0GFPQkAALvFyRMAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYA\nAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2\nAACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQ\ndgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACB\nEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAA\ngRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYA\nAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2\nAACBEHYAAIEQdgAAgRB2A
ACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQ\ndgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACBEHYAAIEQdgAAgRB2AACB\nEHYAAIEoEe8BcuP7N/8xdk5uV25+wpXHNyvIaQAA9kxFIuwWvHXHNY8vT8vdyn33yUvYpaWl\nTZ48OSUlZSfrzJ49O9fbA4CiYdOmTYsWLXr33XfjPUjhad26de3ateM9RcEqEmF32CPfTa5z\ncp+bPlhe9cgbH76oXemdrVy3c142PX/+/FNPPXXnYZd+aywWy8uGAWCP9t133y1atGj69Onx\nHqSQbNy4ccCAASNHjoz3IAWrSIRdVKxajxsnTCx+WI+h74768LqrHjokKb+23KhRoyVLlux8\nnalTp/bo0SMhISG/7hQA4i4Wix177LG33nprvAcpJEOHDk1NTY33FAWu6Jw8UWb/61999LiK\n8x6+YPgnOzu+BgCwlyo6YRdFUe2zHrijb5sS77343sZ4jwIAsMcpGi/FbtFk8OgvB8d7CACA\nPVKROmIHAMCOCTsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBA\nCDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCA\nQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsA\ngEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7\nAIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAI\nOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBA\nCDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCA\nQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsA\ngEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7\nAIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAI\nOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBA\nCDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCA\nQAg7AIBACDsAgEAIOwCAQAg7AIBAlIj3ALsgbePK35etXr9+Y0qxUokVqtaoXimxeLxnAgCI\nuyJ0xG7jTxMeuuyUHs2qJyVVqdOg0b4tW7Vs3nSf2pXLJVVr1qPv5Y9O+nljvEcEAIijInLE\nbvP3/z6z1wWjf9oclarauFXX/evWqFi2TOniqX9s3LB6ycJ538967b6prz3ywICn/jvyjMYl\n4z0tAEA8FI2w+3LEKYNHL2h02v1P3vnXgxuWTdj+9tiG+ZOfvPb8q58deGabDtOuavanFQAA\nwlckXor99Lmnv4x1vmX8C0N6ZlN1URQllG3Yc8gLE27rFpvxr2dnF/p8AAB7goRYLBbvGXI0\nrn+Z4z+6ePq8u7vsfL2Z1zTp/GD3cRtHHZfrTc+bN69Lly4pKSk7WSclJWXt2rXJycklS+bD\nq7znnXfes88+m5iYuPubKhLWrVtXvHhxjzdUe9vjXb9+fYkSJcqWLRvvQQrJhg0b0tLS9p79\nu7c9n/e2x7tx48YBAwaMHDk
y3oMUrCLxUmzDhg2jl2fMWBB1qbeTtWLzJ3/4S1S7b+08bvrl\nl1/eedjFYrElS5bkS9VFUTR8+PDTTz89XzZVJKxYsSKKoipVqsR7kELi8YbN4w2bxxu81q1b\nx3uEAlckjtjFvhvRue11s1ucddeDN5/ds1HSn14/jm1a8NG//37+5c/NaXrDrK9uaec9dgDA\nXqhIhF0UJf84atCx5z3/Q3JUolLDli2aNqhVqVxi6eKpyZs2rFqy8Oc53/y47I+oZIM+D4x/\n6cJWpeM9LQBAPBSRsIuiKNow9+2R9z/+8sRpn81ZvCFt6/JiZWs0OeCgY/oO/L/BxzcvF7/5\nAADiqwiF3Rapm1avWLF6zdr1m4uVKVe+co2alUp77RUAoCiGHQAA2SgS17EDACBnwg4AIBDC\nDgAgEMIOACAQwg4AIBDCDgAgEMIOACAQwg4AIBDCDgAgEMIOACAQwg4AIBDCDgAgEMIOACAQ\nwg4AIBDCDgAgEMIOACAQJeI9wF6nW7du06dPj/cUALDX6dq167Rp0+I9RcESdoWtcePG1atX\nv+mmm+I9CAXi5ptvjqLI/g2V/Rs2+zdsN998c/ny5eM9RYETdoWtVKlSVatW7dChQ7wHoUBU\nrVo1iiL7N1T2b9js37Cl79/geY8dAEAghB0AQCCEHQBAIIQdAEAghB0AQCCEHQBAIIQdAEAg\nhB0AQCCEHQBAIHzyRGErVapUvEegANm/YbN/w2b/hm0v2b8JsVgs3jPsXVauXBlFUeXKleM9\nCAXC/g2b/Rs2+zdse8n+FXYAAIHwHjsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsA\ngEAIOwCAQAg7AIBACDsAgEAIOwCAQAg7AIBACDsAgEAIOwCAQAi7OEj97YP7Bh+zX/3KiaVK\nl6/VoudZt7w1LzneQ7HbNi989+6BB7WoWzGxXLWmPfrdNn5BSrxHIv/4sd1r/DH1yhbFE+pd\nOT3eg5Cf1s1+4e99uzSuWi6xfO1mnf9y7cvfrI33SAUlIRaLxXuGvcyiV07pfProhWWaHvaX\no/arsmHOxNfGf7OmyhGPTp9wwb7F4z0cu+z3NwZ07TtqYZ2DTu/budLiKS+/8vGS2qe99MmL\np9SM92TkAz+2e40/Pv77Ad3v+C617hXTFvyja7ynIX+snzq055G3fZLQ9KjT++yXuOijV1+Z\n9lvZQ/45693/C/LnN0ah+mPCeTWiqNxh936zMWPJ5p+f/kvVKCp3/DPL4zoZu2Pj/86vGUX1\nznpjWfrfUxe9dFqdKKr517fXx3cw8oMf273GH5/8vU2JKIqiqO4V0+I9DPkk+eNr9k2IKh58\nxxcb0hekLRs7sE4UJZ3x6h/xnaxgeCm2kE19/fUlUY2zbhzSskzGkhINB95wbuNo/f/e/sjB\n06Jq3Sv/fOb3qP0lw/pUTV9QrPapd17RKfr9+Sff3BDf0cgHfmz3Eimf3Tbo7u9aHHdUg3hP\nQj5KnvDPx39IaH/dyKv3S0xfklD1+Gtvvejsfs1KLInvaAWjRLwH2Muk7nPqnQ81Kdb+gG2K\nOjExMYqSN25MtUOKqBmTp/wRNTz00MZZljU89NDG0cz3358ZndYzboORH/zY7h1SvhwxaMRX\nza6cdl3szLf+lxrvccgv08ePXxW1P/W0fROyLGxxzsNPnRO3kQqWI3aFq/g+h537t6vO6V4x\ny7LYnDfGfhdF+7Vv59dDEbVi7tyVUdS0adNtljZq1CiKln3//co4TUV+8WO7N0j95o5Bt365\nz5CRN3UuFe9ZyE9Lvv56aVS5Xbvq3710zYkd61dKTKzYoOPJQ9/8KdiTn4RdvKX9/PjFd3yW\nWu7YIeftG+9Z2EXLly+PoqhSpYrbLK1YsWIURatXr47LTBQkP7ahSfvmH4Nu/bTu354c3q1M\nzmtTlCxatCiKyi9+9pgupz/8W
Yn9j+p9cOPNX716W5+uJz41Ly3ewxUIYRdXsSX/vajXJe+s\nrnrcQ4+dXSve07CrNm/eHEWlSpdO2GZpQunSJaNo06ZNcZqKAuLHNjhp399/7rAZtc5//Lae\n5eI9C/lt/fr1UfTLm8/PPfzJT+dMH/fyK29/9t2Hw3qUWTr+4ktGrYj3dAVB2MVP6q+vntfz\nL49/l9ht2H9fPKd+Qs5fwR4qMTExijYnb3dgP/bHH5ujqFw5vylC4sc2PLG5D5x7w/SqZz92\n15FJ8Z6F/FesWLEoikoefvPj5zXLOBxbsdPQ+y9qEq0f//JbIV7MzttDCkryuCHtr52YZUHP\nOz59uPeW925s+PLBU3td9taiqofdMX7MNR38c1KkVa5cOYpiq1eviaIKW5emvwib/oIsQfBj\nG6DYTw+fN/TD8meNueeYCjmvTdFTsWLFKFrUpFu36lkWFm/XpUPJaO7cufOjqE3cRisgwq6g\npK2aP3v27CwL9lm15dX8lVOu73X87TM2NDn96fHPDNzXO3WLukrNm9eIJs6bNy+K9t+6dN68\neVFUp2VLvy3C4Mc2TMv/O/r9DVE0qk/VUdvecE+3hHui1sO//Xpoi/hMRv5o0qxZ8ejb2PYf\nxhCLRVFUtmzZuMxUsIRdQSnT/41Y/+xu2PT57b173z4jtf3l48b/46gaXsoJQYcDD0x85LUP\nPlgY7V83c9kv77//U1Tm5O7t4jkY+cSPbbBKN+3Zt2+1bRat/HrCpDlpzQ49tm2VBv5jVuSV\nPvCgTgljZrz33sKozZZ/n9O+/OTzzVHFtm3rx3O0ghLvKyTvbTZOuaxZsSih8V/Hr4z3KOSj\nNWPOqhJFDQaMWZIWi8VisbTFr55ZL4pqDn5nU5wnIx/4sd27zLymiU+eCMlvT/dOiqLaJz07\nPyV9waY5Dx5VPorqXjgxOb6TFQyfFVu4Fj9xWMPB7yUnNe7Sof72J9V3uGrcPcd5105RtWDU\niR0HjFlep8cZpx9YfcmU//xn6uJ6/UfPGHWSz4ot8vzY7mVmXdu0052bfFZsOGK/PH/KgQNe\nXVixbe/Tjm6S8t3/Xhk3e0Pjc8fMHNmrSrxnKwBeii1UqdMnT02OomjdTzM++Gn7G0v0T4nD\nSOSTeme9NDXptmtHjBr3yP0plRq2Pf32p26/4mhVFwA/tlC0JTTo99LH9f457NYnxz7/0NsJ\nVZt0GfzATcMv7hFi1UVR5IgdAEAgXMcOACAQwg4AIBDCDgAgEMIOACAQwg4AIBDCDgAgEMIO\nACAQwg4AIBDCDgAgEMIOACAQwg4AIBDCDgAgEMIOACAQwg4AIBDCDgAgEMIOACAQwg4AIBDC\nDgAgEMIOACAQwg4AIBDCDgAgEMIOACAQwg4AIBDCDgAgEMIOACAQwg4AIBDCDgAgEMIOACAQ\nwg4AIBDCDgAgEMIOIC/Svn5x2Itfp8V7DIDslIj3AABFxcaf3nt1zLsvPnj77NYpi/v3PvXk\nrnX8GwrsURyxA4qidSOPSUhI6P3cpsK6w+RvHv1Ly1aHnXX5XW8viH4ZP+KyM7o173D5/1bl\n2x2kLZvx6L/eX59v2wP2SsIOIEexr0ac/rc3fq172hOfLnrquOj4kT+Ov7xD2pf3nfm3N1bn\nyx1snnBBi+4XvfLj5nzZGrDXEnYAOfryhee+Sqt0+v3//mu7yiWiKEqsf8xdD1/UMFr+2vMT\nNubHHaQuW7Lc+/aA3eb9IQA5WrhwYRTt26JF2ShKyVhUvNNlY6b0K9+iRZm4TgaQlSN2/H97\ndx4UxZXHAfzXzAzMMIicKocHV5RrwKghxuCxBAEJR4yu14qgMYIgUVBEXVRcLI3ZiCsqKLio\nGLWsJIJGyJpEUXQRBATdqCF4FASWgBEI9/n2Dw5nImbG3UrtpP1+/pvXP97r6amivv26+zWA\nuur54cvtAX9wsjSWamnrmzu6Be7
IetjxvOKGkiPrZr9maSjVEuuZy7xC9+XWMpULuo77c5xR\ncMbN5GA322FSsY6x5eT5W86WtfcXmJubE31/9WqNXIcapk5vOlsaiTkVhmjJWW2twXGj3v+m\nqb+4o2iDo4gT2ETkNKf/SSxZnEFE/1iuz3EOcfeIqO3I2xw3Zu2FK3/xsNIVaxtZzztaSURE\n9SWfbFo4w8HcQKopkgw1sZ26cEtG2XMPCwC8bBgAgDpqvLTaVkTiUdMWrYqKXhv8jkxfgziT\npefre7cmexCRd1orY4yxuq9X2YmJNEe7LghdF7Vy7gRjAYksFqdX93empKAzzY9Iy8hoiNRx\nyZ6zObkXT8Z6jRSQkfuh0p7ein9tHy8k0pEFJX61cyb5nex8Zn+VDNGcvcqaI+6ViGttjDHW\nlhdtLyCBXdQ/Wxl7cCExPkBGRLYLdyUknLhRxxhrTfUm0reyMtA0d/Hxd3eYHFPEGGu9HuMo\nIW6Ijfvi0Ii1HyybPWmYkIgzXXGh+Tf6GQDg9wXBDgDUUstJPyEJZuyv6W/ouL1VxpHAM6We\nMcVg13bhfRMiyZRthY19tT3VGUtGEhks+KyRqVLQmeZHRKQ/+8RP/cN13v1wkibp+qX1tXSW\nHV9ir913Rmwge2dl3JGrlR0Du6t0CMaaL4daciR0jC3pas2LHicgkeOmwva+6tY0PyLySK7r\n77A11ZuIyCzg7EATY48PuouIs9tY0Pq06fQ8faIhQef/y+MMAPyCYAcAaqnlhK+AyOq9rH8P\nTI41Vd2vqG/vnUKTC3btZ+ZpE41ek9ct//eVH08mEr59vFGFgr5gZxldkDn/ZgAABgxJREFU\nIF/ReNRHRMKZf38y0NJVU3BqZ/gMs/4rHjqysC97g6fyIRhjrPFSmAVHEtflAXYC0hwfV/I0\nGD4n2BmFZct3+PhmemrC4cs1Cm1J04nIO7WVAQAwPDwBAGpJ4h0cOOrc4RSvUZ+PmzLTy9PT\n03vWDAdz0bOVdwsLW4i0Hp7btjVTrrmsTYu6iou/pUViZQUuRETEOTo5yt92rOPoaEHnbt68\nTUFTe1sExhPmrXdiuXt1NhUv+3FDWFzWvsBI93vHfIcq3QcXIiKd6TtTQjLfOpB8jMSvf5gW\nLRvk2yiytraW/2jo7BfoTNTx5EHR7bvf3y8rvXO78OqFPCLq7u5W1hcAvAwQ7ABAPel6JV3/\nRhb318Offp19Kj77VHy0wNBp7pZDSateG6pQWF9fT0Sl6XGx6c90UldXRyRRVtDLYMQITYVt\nEomE6FHDIAvVaejb+4WcM2t6ZdLHnxw4k+AbqHQf+kjfeNfD5EBiFY1wecNSoOQIEJFUKlX4\n3FWeFRcZvffzW3U9RCQYYu44xdV6eF7FI8bY4D0AwMsFT8UCgLoSmswI33++pPrJD0VZqTtC\nvW3aS06Fe4dntSiW6ejoEGkvOtM9yEWJphRPFQp6NTU0KK4kV1tbS2RkZERUdXF31PKQlG8V\nNgsmuEwUUE95eaXKQ7Tl/zn0UJXEwEDyKOG9mLwXfW1Gd9FmT5/YT8vHrUj47FLh/cdNDRU3\ns+LfNX/BbgCAxxDsAEAdsQeZuzaGxZ2vIeKkZuM9A6P3fZF/wFeLHufk3FMstZXJRNRyLTu/\nS7616fLf1mzcfrSgQYWCXu3Xc4vkp73uX7lSRWIXFxlRW8Hxj1KSkjMrFAauKi/vJjI2NlZx\niPbrm4Pi72lM2px9ceOrGt/FB23OHVhPheM4UqrgZNrdbqHHR5kHwmZPf9XSUMwRsdLSMiLC\njB0AEBGCHQCoJ058P2Pn/m0x+0sGsk9X5cOKDhKMHv2LGSptn8C5BvQoMXxL7sCbW+uvxARH\n7tlx7DtNXRUK+jxKWh9/p+89Et0PU9fEF9PweUtnSYks/f1lHOXviTpd0T+px2q+iNl7nQQu\nPl7GKg3Rlr8laPcdziHqUKSjU9ShC
Fvu7u6lW/P7Zu2EIhERNQx23fcpLS0top7m5qdTlk3F\n2zccriaizk68jAwAiLCOHQCoqZ+z19iJiKRWbkvCo9ZHLPdx0ONIZBeR3cTYL9exq04PsBAR\nCU1d5qyIWBe+yNVMk0jiHJ1T39eZkoLep2L1DA0FQ8Z6LF0dFTbHWZ8jkWVARvXA3kSMExEJ\nhk36o5sN2bgvnGwiJJI4b85rUWmIthvr7ATEWUde63t6tflKuAVHAruo/DbGGGO5kWZEpGv1\n5lsLku6w/qdi3ZLlFjthXbdix2sRiS09Qjbv2BkbGeA6UkzSYcO0iWTbSn+rHwIAfk8Q7ABA\nXXVVX00KnzXBxlRPrKltYDHRb01y/k99CwYrBjvGemrzD672nzjGUKIp1jMbO3nOxpO35DPR\nrxf0Brtp+4ozN8yyN5aIdU3tZ4YkXPuxR76Dx7mJK72cTKQaRJxQOtzOPeTgjSeqDdGeF20v\nIBq94qump+U/Zy0zJxLYR+e1M8ZYZfoHU8cM1dLUGbHiXMfgwY6xrspLuwJcx5oOFUv0zGyc\n3RZvTS+tPeYrIu71PRX/29EGAF7gGG7MAICXXddxf9HijGmJtdnBRspqO074ap1e2Jk+H4sK\nAID6wT12AAAvQkMD/zcBQG3hlBMA4EUI56ez+f/vnQAAGBzOPAEAAAB4AvfYAQAAAPAEZuwA\nAAAAeALBDgAAAIAnEOwAAAAAeALBDgAAAIAnEOwAAAAAeALBDgAAAIAnEOwAAAAAeALBDgAA\nAIAnEOwAAAAAeALBDgAAAIAnEOwAAAAAeALBDgAAAIAnEOwAAAAAeALBDgAAAIAnEOwAAAAA\neALBDgAAAIAnEOwAAAAAeALBDgAAAIAnEOwAAAAAeALBDgAAAIAnEOwAAAAAeALBDgAAAIAn\nEOwAAAAAeALBDgAAAIAnEOwAAAAAeALBDgAAAIAn/gMCiU0VOcOZiwAAAABJRU5ErkJggg==", "text/plain": [ "Plot with title “Histogram of sleep$extra”" ] }, "metadata": { "image/png": { "height": 420, "width": 420 } }, "output_type": "display_data" } ], "source": [ "hist(sleep$extra)" ] }, { "cell_type": "markdown", "id": "9f692577", "metadata": {}, "source": [ "# Descriptive statistics by means of summarizing your data\n", "\n", "As we said earlier, descriptive statistics is about compressing the data just to give some relevant information about them. You can do this by summarizing them. Basically, you'll want to show a subset of the measures that we have just seen in a very compact and clean way." 
] }, { "cell_type": "code", "execution_count": 34, "id": "af48e62f", "metadata": {}, "outputs": [], "source": [ "# Let's use the ACT data again for this section\n", "sat.dat <- read.csv(\"https://vincentarelbundock.github.io/Rdatasets/csv/psych/sat.act.csv\")\n", "# Treat gender and education as categorical (factor) variables\n", "sat.dat$gender <- as.factor(sat.dat$gender)\n", "sat.dat$education <- as.factor(sat.dat$education)" ] }, { "cell_type": "markdown", "id": "c3f8377a", "metadata": {}, "source": [ "### `summary` \n", "This is a built-in R function that prints out some useful descriptive information. It accepts a vector or a data frame.\n", "\n", "- For a **continuous** variable, it gives the minimum and maximum values (i.e., the range), the first and third quartiles (whose difference is the IQR), the mean and the median." ] }, { "cell_type": "code", "execution_count": 35, "id": "d8917d40", "metadata": {}, "outputs": [ { "data": { "text/plain": [ " Min. 1st Qu. Median Mean 3rd Qu. Max. \n", " 3.00 25.00 29.00 28.55 32.00 36.00 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "summary( sat.dat$ACT )" ] }, { "cell_type": "markdown", "id": "5606b6ff", "metadata": {}, "source": [ "- For a **logical** variable, it gives the number of `TRUE` and `FALSE` values:" ] }, { "cell_type": "code", "execution_count": 56, "id": "f0203ac3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ " Mode FALSE TRUE \n", "logical 367 333 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "summary( sat.dat$ACT>29 )" ] }, { "cell_type": "markdown", "id": "e83dd441", "metadata": {}, "source": [ "- For a **factor** variable, it gives the number of occurrences of each level:" ] }, { "cell_type": "code", "execution_count": 81, "id": "7a19594e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
0: 57
1: 45
2: 44
3: 275
4: 138
5: 141
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[0] 57\n", "\\item[1] 45\n", "\\item[2] 44\n", "\\item[3] 275\n", "\\item[4] 138\n", "\\item[5] 141\n", "\\end{description*}\n" ], "text/markdown": [ "0\n", ": 571\n", ": 452\n", ": 443\n", ": 2754\n", ": 1385\n", ": 141\n", "\n" ], "text/plain": [ " 0 1 2 3 4 5 \n", " 57 45 44 275 138 141 " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
1: 247
2: 453
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[1] 247\n", "\\item[2] 453\n", "\\end{description*}\n" ], "text/markdown": [ "1\n", ": 2472\n", ": 453\n", "\n" ], "text/plain": [ " 1 2 \n", "247 453 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "summary(sat.dat$education)\n", "summary(sat.dat$gender)" ] }, { "cell_type": "markdown", "id": "488b8816", "metadata": {}, "source": [ "**N.B.** The same counts can also be obtained with the function `table`:" ] }, { "cell_type": "code", "execution_count": 58, "id": "e47ada5b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", " 0 1 2 3 4 5 \n", " 57 45 44 275 138 141 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "table(sat.dat$education)" ] }, { "cell_type": "markdown", "id": "ee3d3a5b", "metadata": {}, "source": [ "So far we have always passed a vector, but as mentioned before, you can also pass a data frame to `summary`. In that case, the same kind of summary is generated for every column in the data frame:" ] }, { "cell_type": "code", "execution_count": 36, "id": "2417e3bc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ " X gender education age ACT \n", " Min. :29442 1:247 0: 57 Min. :13.00 Min. : 3.00 \n", " 1st Qu.:32117 2:453 1: 45 1st Qu.:19.00 1st Qu.:25.00 \n", " Median :34896 2: 44 Median :22.00 Median :29.00 \n", " Mean :34731 3:275 Mean :25.59 Mean :28.55 \n", " 3rd Qu.:37250 4:138 3rd Qu.:29.00 3rd Qu.:32.00 \n", " Max. :39985 5:141 Max. :65.00 Max. :36.00 \n", " \n", " SATV SATQ \n", " Min. :200.0 Min. :200.0 \n", " 1st Qu.:550.0 1st Qu.:530.0 \n", " Median :620.0 Median :620.0 \n", " Mean :612.2 Mean :610.2 \n", " 3rd Qu.:700.0 3rd Qu.:700.0 \n", " Max. :800.0 Max. 
:800.0 \n", " NA's :13 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "summary(sat.dat)" ] }, { "cell_type": "markdown", "id": "9f209c7e", "metadata": {}, "source": [ "### `describe` \n", "This function is part of the **psych** package and returns a data frame with most of the relevant statistics that we have studied:\n", "\n", "- item name\n", "- item number\n", "- number of valid cases\n", "- mean\n", "- standard deviation\n", "- trimmed mean (with trim defaulting to 0.1)\n", "- median (standard or interpolated)\n", "- mad: median absolute deviation (from the median)\n", "- minimum\n", "- maximum\n", "- range\n", "- skew\n", "- kurtosis\n", "- standard error" ] }, { "cell_type": "code", "execution_count": 37, "id": "d9678e68", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A psych: 7 × 13
vars n mean sd median trimmed mad min max range skew kurtosis se
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
X 1 700 34731.085714 3026.8700206 34896 34739.841071 3809.5407 29442 39985 10543 -0.04625483 -1.18853714 114.40493322
gender* 2 700 1.647143 0.4782004 2 1.683929 0.0000 1 2 1 -0.61452329 -1.62467597 0.01807428
education* 3 700 4.164286 1.4253515 4 4.307143 1.4826 1 6 5 -0.68079989 -0.07489124 0.05387322
age 4 700 25.594286 9.4986466 22 23.862500 5.9304 13 65 52 1.64305735 2.42430535 0.35901510
ACT 5 700 28.547143 4.8235599 29 28.842857 4.4478 3 36 33 -0.65640259 0.53496913 0.18231343
SATV 6 700 612.234286 112.9025659 620 619.453571 118.6080 200 800 600 -0.64381115 0.32519458 4.26731588
SATQ 7 687 610.216885 115.6392972 620 617.254083 118.6080 200 800 600 -0.59292125 -0.01776025 4.41191437
\n" ], "text/latex": [ "A psych: 7 × 13\n", "\\begin{tabular}{r|lllllllllllll}\n", " & vars & n & mean & sd & median & trimmed & mad & min & max & range & skew & kurtosis & se\\\\\n", " & & & & & & & & & & & & & \\\\\n", "\\hline\n", "\tX & 1 & 700 & 34731.085714 & 3026.8700206 & 34896 & 34739.841071 & 3809.5407 & 29442 & 39985 & 10543 & -0.04625483 & -1.18853714 & 114.40493322\\\\\n", "\tgender* & 2 & 700 & 1.647143 & 0.4782004 & 2 & 1.683929 & 0.0000 & 1 & 2 & 1 & -0.61452329 & -1.62467597 & 0.01807428\\\\\n", "\teducation* & 3 & 700 & 4.164286 & 1.4253515 & 4 & 4.307143 & 1.4826 & 1 & 6 & 5 & -0.68079989 & -0.07489124 & 0.05387322\\\\\n", "\tage & 4 & 700 & 25.594286 & 9.4986466 & 22 & 23.862500 & 5.9304 & 13 & 65 & 52 & 1.64305735 & 2.42430535 & 0.35901510\\\\\n", "\tACT & 5 & 700 & 28.547143 & 4.8235599 & 29 & 28.842857 & 4.4478 & 3 & 36 & 33 & -0.65640259 & 0.53496913 & 0.18231343\\\\\n", "\tSATV & 6 & 700 & 612.234286 & 112.9025659 & 620 & 619.453571 & 118.6080 & 200 & 800 & 600 & -0.64381115 & 0.32519458 & 4.26731588\\\\\n", "\tSATQ & 7 & 687 & 610.216885 & 115.6392972 & 620 & 617.254083 & 118.6080 & 200 & 800 & 600 & -0.59292125 & -0.01776025 & 4.41191437\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A psych: 7 × 13\n", "\n", "| | vars <int> | n <dbl> | mean <dbl> | sd <dbl> | median <dbl> | trimmed <dbl> | mad <dbl> | min <dbl> | max <dbl> | range <dbl> | skew <dbl> | kurtosis <dbl> | se <dbl> |\n", "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "| X | 1 | 700 | 34731.085714 | 3026.8700206 | 34896 | 34739.841071 | 3809.5407 | 29442 | 39985 | 10543 | -0.04625483 | -1.18853714 | 114.40493322 |\n", "| gender* | 2 | 700 | 1.647143 | 0.4782004 | 2 | 1.683929 | 0.0000 | 1 | 2 | 1 | -0.61452329 | -1.62467597 | 0.01807428 |\n", "| education* | 3 | 700 | 4.164286 | 1.4253515 | 4 | 4.307143 | 1.4826 | 1 | 6 | 5 | -0.68079989 | -0.07489124 | 0.05387322 |\n", "| age | 4 | 700 | 25.594286 | 9.4986466 | 22 | 23.862500 | 5.9304 | 13 | 65 | 52 | 
1.64305735 | 2.42430535 | 0.35901510 |\n", "| ACT | 5 | 700 | 28.547143 | 4.8235599 | 29 | 28.842857 | 4.4478 | 3 | 36 | 33 | -0.65640259 | 0.53496913 | 0.18231343 |\n", "| SATV | 6 | 700 | 612.234286 | 112.9025659 | 620 | 619.453571 | 118.6080 | 200 | 800 | 600 | -0.64381115 | 0.32519458 | 4.26731588 |\n", "| SATQ | 7 | 687 | 610.216885 | 115.6392972 | 620 | 617.254083 | 118.6080 | 200 | 800 | 600 | -0.59292125 | -0.01776025 | 4.41191437 |\n", "\n" ], "text/plain": [ " vars n mean sd median trimmed mad \n", "X 1 700 34731.085714 3026.8700206 34896 34739.841071 3809.5407\n", "gender* 2 700 1.647143 0.4782004 2 1.683929 0.0000\n", "education* 3 700 4.164286 1.4253515 4 4.307143 1.4826\n", "age 4 700 25.594286 9.4986466 22 23.862500 5.9304\n", "ACT 5 700 28.547143 4.8235599 29 28.842857 4.4478\n", "SATV 6 700 612.234286 112.9025659 620 619.453571 118.6080\n", "SATQ 7 687 610.216885 115.6392972 620 617.254083 118.6080\n", " min max range skew kurtosis se \n", "X 29442 39985 10543 -0.04625483 -1.18853714 114.40493322\n", "gender* 1 2 1 -0.61452329 -1.62467597 0.01807428\n", "education* 1 6 5 -0.68079989 -0.07489124 0.05387322\n", "age 13 65 52 1.64305735 2.42430535 0.35901510\n", "ACT 3 36 33 -0.65640259 0.53496913 0.18231343\n", "SATV 200 800 600 -0.64381115 0.32519458 4.26731588\n", "SATQ 200 800 600 -0.59292125 -0.01776025 4.41191437" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "describe(sat.dat)" ] }, { "cell_type": "markdown", "id": "9bb0c2c5", "metadata": {}, "source": [ "**N.B.** The asterisk marks a factor variable. `describe` converts factors to their numeric level codes before computing the statistics (note that education, coded 0 to 5, is reported with a minimum of 1 and a maximum of 6), so the values in those rows are not meaningful and should be ignored." ] }, { "cell_type": "markdown", "id": "569385b7", "metadata": {}, "source": [ "### `describeBy` \n", "\n", "This function is also part of the **psych** package and yields the same output as `describe`, but computed separately for each level of a grouping (factor) variable."
] }, { "cell_type": "code", "execution_count": 38, "id": "dbb163ec", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", " Descriptive statistics by group \n", "group: 1\n", " vars n mean sd median trimmed mad min max range skew kurtosis se\n", "X1 1 247 28.79 5.06 30 29.23 4.45 3 36 33 -1.06 1.89 0.32\n", "------------------------------------------------------------ \n", "group: 2\n", " vars n mean sd median trimmed mad min max range skew kurtosis se\n", "X1 1 453 28.42 4.69 29 28.63 4.45 15 36 21 -0.39 -0.42 0.22" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "describeBy(x = sat.dat$ACT, group = sat.dat$gender)" ] }, { "cell_type": "code", "execution_count": 39, "id": "8d2e3bc2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", " Descriptive statistics by group \n", "group: 0\n", " vars n mean sd median trimmed mad min max range skew kurtosis se\n", "X1 1 57 27.47 5.21 28 27.53 5.93 15 36 21 -0.1 -0.85 0.69\n", "------------------------------------------------------------ \n", "group: 1\n", " vars n mean sd median trimmed mad min max range skew kurtosis se\n", "X1 1 45 27.49 6.06 27 27.84 7.41 15 36 21 -0.39 -0.94 0.9\n", "------------------------------------------------------------ \n", "group: 2\n", " vars n mean sd median trimmed mad min max range skew kurtosis se\n", "X1 1 44 26.98 5.81 28 27.53 4.45 3 36 33 -1.63 4.59 0.88\n", "------------------------------------------------------------ \n", "group: 3\n", " vars n mean sd median trimmed mad min max range skew kurtosis se\n", "X1 1 275 28.29 4.85 29 28.56 4.45 16 36 20 -0.46 -0.6 0.29\n", "------------------------------------------------------------ \n", "group: 4\n", " vars n mean sd median trimmed mad min max range skew kurtosis se\n", "X1 1 138 29.26 4.35 30 29.47 4.45 16 36 20 -0.46 -0.26 0.37\n", "------------------------------------------------------------ \n", "group: 5\n", " vars n mean sd median trimmed mad min max range skew kurtosis se\n", "X1 1 141 
29.6 3.95 30 29.79 4.45 18 36 18 -0.45 -0.5 0.33" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "describeBy(x = sat.dat$ACT, group = sat.dat$education)" ] }, { "cell_type": "markdown", "id": "96661f99", "metadata": {}, "source": [ "**N.B.** You can also do this in a more flexible way with the **dplyr** package. We will see this in the next tutorial (it'll be super cool!)" ] }, { "cell_type": "markdown", "id": "fda15130", "metadata": {}, "source": [ "### `summarise` \n", "\n", "The last three functions that we have seen summarize the information most commonly used to describe data. However, each one yields a fixed set of statistics. What if we want to choose which statistics to report? We can do this with the `summarise` function from the **dplyr** package, which is loaded as part of the **tidyverse**." ] }, { "cell_type": "code", "execution_count": 40, "id": "09111a47", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "── \u001b[1mAttaching packages\u001b[22m ─────────────────────────────────────── tidyverse 1.3.2 ──\n", "\u001b[32m✔\u001b[39m \u001b[34mggplot2\u001b[39m 3.4.0 \u001b[32m✔\u001b[39m \u001b[34mpurrr \u001b[39m 1.0.1 \n", "\u001b[32m✔\u001b[39m \u001b[34mtibble \u001b[39m 3.1.8 \u001b[32m✔\u001b[39m \u001b[34mdplyr \u001b[39m 1.0.10\n", "\u001b[32m✔\u001b[39m \u001b[34mtidyr \u001b[39m 1.2.1 \u001b[32m✔\u001b[39m \u001b[34mstringr\u001b[39m 1.5.0 \n", "\u001b[32m✔\u001b[39m \u001b[34mreadr \u001b[39m 2.1.3 \u001b[32m✔\u001b[39m \u001b[34mforcats\u001b[39m 0.5.2 \n", "── \u001b[1mConflicts\u001b[22m ────────────────────────────────────────── tidyverse_conflicts() ──\n", "\u001b[31m✖\u001b[39m \u001b[34mggplot2\u001b[39m::\u001b[32m%+%()\u001b[39m masks \u001b[34mpsych\u001b[39m::%+%()\n", "\u001b[31m✖\u001b[39m \u001b[34mggplot2\u001b[39m::\u001b[32malpha()\u001b[39m masks \u001b[34mpsych\u001b[39m::alpha()\n", "\u001b[31m✖\u001b[39m \u001b[34mdplyr\u001b[39m::\u001b[32mfilter()\u001b[39m masks 
\u001b[34mstats\u001b[39m::filter()\n", "\u001b[31m✖\u001b[39m \u001b[34mdplyr\u001b[39m::\u001b[32mlag()\u001b[39m masks \u001b[34mstats\u001b[39m::lag()\n" ] } ], "source": [ "library(tidyverse)" ] }, { "cell_type": "code", "execution_count": 41, "id": "69e23906", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\n", "
A data.frame: 1 × 8
mu pop_med sigma pop_iqr pop_min pop_max pop_q1 pop_q3
<dbl> <dbl> <dbl> <dbl> <int> <int> <dbl> <dbl>
28.54714 29 4.82356 7 3 36 25 32
\n" ], "text/latex": [ "A data.frame: 1 × 8\n", "\\begin{tabular}{llllllll}\n", " mu & pop\\_med & sigma & pop\\_iqr & pop\\_min & pop\\_max & pop\\_q1 & pop\\_q3\\\\\n", " & & & & & & & \\\\\n", "\\hline\n", "\t 28.54714 & 29 & 4.82356 & 7 & 3 & 36 & 25 & 32\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A data.frame: 1 × 8\n", "\n", "| mu <dbl> | pop_med <dbl> | sigma <dbl> | pop_iqr <dbl> | pop_min <int> | pop_max <int> | pop_q1 <dbl> | pop_q3 <dbl> |\n", "|---|---|---|---|---|---|---|---|\n", "| 28.54714 | 29 | 4.82356 | 7 | 3 | 36 | 25 | 32 |\n", "\n" ], "text/plain": [ " mu pop_med sigma pop_iqr pop_min pop_max pop_q1 pop_q3\n", "1 28.54714 29 4.82356 7 3 36 25 32 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "summarise(sat.dat, \n", " mu = mean(ACT), \n", " pop_med = median(ACT), \n", " sigma = sd(ACT), \n", " pop_iqr = IQR(ACT),\n", " pop_min = min(ACT), \n", " pop_max = max(ACT),\n", " pop_q1 = quantile(ACT, 0.25), # first quartile, i.e. the 25th percentile\n", " pop_q3 = quantile(ACT, 0.75)) # third quartile, i.e. the 75th percentile" ] }, { "cell_type": "markdown", "id": "c6d3e5b5", "metadata": {}, "source": [ "**N.B.** The `summarise` function is even more flexible than this. In the next tutorial we will see the cool things that you can do with it!" ] }, { "cell_type": "markdown", "id": "0da96343", "metadata": {}, "source": [ "# Descriptive statistics by means of visualization\n", "\n", "Ok, so we have already seen what measures we can use to describe our data, and how to calculate these in R using several functions. \n", "\n", "However, although numbers are really appealing and at some point you will probably end up showing summary tables in your future research projects, it is also a good idea to display these results as graphs. \n", "\n", "We are going to see here how to do this using **ggplot2**, a library whose basics we have already studied. This is just an introduction to the things that you can do. 
Here **curiosity** and **creativity** are essential!" ] }, { "cell_type": "code", "execution_count": 43, "id": "b18bc295", "metadata": {}, "outputs": [], "source": [ "library(tidyverse)" ] }, { "cell_type": "markdown", "id": "a97f2470", "metadata": {}, "source": [ "One nice way of visualizing your descriptive statistics is by using **BOXPLOTS**. A boxplot is a kind of graph that uses a box to display the central tendency (the median) and the spread (the IQR) of one or more groups. The whiskers show how far the rest of the data extend: in `geom_boxplot` they reach the most extreme values lying within 1.5 × IQR of the box, and any points beyond that are drawn individually." ] }, { "cell_type": "markdown", "id": "5f955946", "metadata": {}, "source": [ "In order to generate boxplots using **ggplot2**, you can make use of the function `geom_boxplot`." ] }, { "cell_type": "markdown", "id": "2b527b4a", "metadata": {}, "source": [ "
Practice: Try it yourself with the ACT data, using \"education\" as the x variable and \"ACT\" as the y variable in the aesthetic mappings
" ] }, { "cell_type": "code", "execution_count": 44, "id": "26f4a73f", "metadata": {}, "outputs": [], "source": [ "#YOUR CODE HERE" ] }, { "cell_type": "markdown", "id": "e3aa8bef", "metadata": {}, "source": [ "
Question: What happens if you make a boxplot with SATV on the x-axis instead of education?
" ] }, { "cell_type": "code", "execution_count": 46, "id": "448414b4", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Warning message:\n", "“\u001b[1m\u001b[22mContinuous \u001b[32mx\u001b[39m aesthetic\n", "\u001b[36mℹ\u001b[39m did you forget `aes(group = ...)`?”\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAIAAAByhViMAAAACXBIWXMAABJ0AAASdAHeZh94\nAAAgAElEQVR4nO3daXycdbnw8f+sSdMmbdNSoYUCIktLBQ+yyCaLsnlAhIIgUPblQA8+iCKi\nsrmA57ggCigIBYSyyi6oyKqogAXU0lY8bEUQKXRJ16TJzDwvAqVAm9K0yXSufr8v+DD3JJnr\nzn/umV/uSTqZSqWSAACofdlqDwAAwMoh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISw\nAwAIIl/tAd6XOXPmdHR0VHuK5davX7958+bV9D8B3dTUVC6X586dW+1Bui+fzxcKhQULFlR7\nkO4rFot9+vRZsGDBwoULqz1L9zU0NLS1tZVKpWoP0n39+vXLZrOzZ8+u9iDdl8lk+vbtW+tH\ndN++fdva2lpbW6s9S/fV19eXSqX29vZqD9J9DQ0NhUJh9uzZNf0c19jYOGfOnGpPsdyy2Wz/\n/v2Xdm1thF25XK7F54NMJlMqlWr6Tp/L5SqVSi1+8xfJZrOdC1HtQbqvUqlks9laX4hUswfy\nItlsNpvN1vou1Prh0LkKtX44VCqVWt+FTCaTzWbL5XK5XK72LN1X60f0EnkpFgAgCGEHABCE\nsAMACELYAQAEIewAAIIQdgAAQQg7AIAghB0AQBDCDgAgCGEHABCEsAMACELYAQAEIewAAIIQ\ndgAAQQg7AIAghB0AQBDCDgAgCGEHABCEsAMACELYAQAEIewAAIIQdgAAQQg7AIAghB0AQBDC\nDgAgCGEHABCEsAMACELYAQAEIewAAIIQdgAAQQg7AIAghB0AQBDCDgAgiHy1BwCoDddff/28\nefNGjx5d7UEAlsoZO4D35Y477hg/fny1pwDoirADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQd\nAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLAD\nAAhC2AEABCHsAACCyFd7gPelrq6urq6u2lMst1wu169fv0qlUu1BVkg2m+3Xr1+1p+i+bDbb\nuRDVHqT7crlcSqmuri6fr40Ddony+XxDQ0O5XK72IN2XyWRSSjV9X8pkMgGO6JRSsVjs/J8a\nlc/nK5VKoVCo9iDd1/m41Ldv35p+jstkMjV9OCxRbTxPlEqlWnw+KBaL7e3tNX2nr6+vr1Qq\n7e3t1R6k+3K5XCaTqeldSCkVCoVSqVTTe5HP5zs6OkqlUrUHWVE1vQqZTCafz9f0LuRyuWKx\nWC6Xa3ovstlsre9C58+Ztf4cV1dXV4ur0PlD5tLURth1dHTU4re+T58+bW1tNX2nb2xsrFQq\nbW1t1R6k+wqFQi6Xq+ld6DyGOzo6anovisXiwoULOzo6qj1I93UeyzW9Ctlstr6+vqZ3ofMs\nV60fDrlcrlQq1fQudL6MtnDhwlo87bJI3759a3EVOk+XLk0Nn8oGAGBxwg4AIAhhBwAQhLAD\nAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYA\nAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAA
hC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4A\nIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEA\nBCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCA\nIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQ\nhLADAAhC2AEABJGpVCrVnmHZWlpa2tvbe/pWHnvssRtuuGElfsFcLlcqlVbiF+x9+Xy+UqnU\n9F5kMplMJlMul6s9SPdlMpnO+1JNHK1Lk81ma3oVUkrPP/98e3v7xhtvXO1BVkitPy51Hg7l\ncrmm707ZbLZSqdT0EZ3L5TKZTEdHR7UHWSH5fH7l7sIee+yx5557rsQvuES5XG7gwIFLuzbf\n0zdfQ1paWqZMmVLtKYBVmkcJYGm22GKLao8g7N7jq1/96n777VftKQCAmvHoo4+efPLJ1Z4i\nJWG3RJlMptojAAAsN388AQAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhh\nBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHs\nAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQd\nAEAQwg4AIAhhBwAQhLADAAgi3+O3MO/Zuy+/7PbHX2jJfuDDux12wqEfG5JLKaVUevWRcT8Z\n/7u/v97etMEOB5507B7r1vf4LAAAgfX0GbuWh390zlUvfujIb1x4wVf+M//Q/5wz/v/KKaVU\nmnL1uRc83rjvGd+/4OwDP/C3n5592YTWHh4FACC2Hg67Gb+/+9HCHscfu/0GQ4dtutcJ+23y\n8v0PPJNSan309ntmbHv0yXuOXGedEXt8fuzupQdue7ilZ2cBAIith8Nu4O5fu/i8AzZ681bK\n5VIql8sppecnT1m44ahRb774mh81akTlmSnPVHp2GACA0Ho47DLF/muvNSCXUmnBa0/ffsmt\nz673qd1HpFSePmNWvrm58a0PyzU3Ny18Y/rsnh0GACC0nv/jiZRSmnnHGUdc8feUWWO7sdsN\nzaa0oK0tFfsU3/6AQqGQ2tvbF11+5JFHzjrrrEUXv/vd726xxRY9PWXfvn17+iYAgKgaGhoG\nDRrU07fS+drn0vRO2DXtcc7Ne5amPXnd97932jc6fnz+HnXF1N7xdsel9vb2VFdXt+hyfX39\nsGHDFl0sFAqlUqmnp6xUvBYMAHRTuVzunVzJ5XJLu7Z3wi5X35BLaZ1tjx+z4wPnPvToG5/a\ncNCA9qkz5qbUL6WUUmnmzDl1gwb1W/QJW2655TXXXLPoYktLy6xZs3p6yvnz5/f0TQAAUbW2\ntvZCruRyuYEDBy7t2p79HbvK5J+feMhZv1m0j21z57Zn6uoKaYORI4rPTp68sHNzafLTUzIb\nb7JhpkeHAQCIrWfDLvOhrbfsP3H8D29+cuq/X33+8Zu+c+Wfm3baZ/v+qW6rfXZv/N2lP7h7\n4kv//Ptvf3Txvbld99upf4/OAgAQXA+/FFvc5Mhzv1Z/xXUXn3HTnNzA4ZvtdfZ3R2/RmFIq\njjrq7M+XLrnu21+4Mtu84fYnnXvcR73xBADAiujx37HLDdny0DO2PPS9VxTW3uXE83Y5sadv\nHwBgddHTbykGAEAvEXYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYA\nAEEIOwCAIIQdAEAQwg4AIIh8tQdY5UyYMKG9vb3aUwAANeOFF16o9ghvEnbvdu+99957773V\nngIAYLl5KRYAIAhhBwAQhJdi323s2LF77LFH
tacAAGrGk08+ec4551R7ipSE3Xv1799/6NCh\n1Z4CAKgZU6dOrfYIb/JSLABAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCE\nHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISw\nAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2\nAABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIO\nACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgB\nAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsA\ngCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcA\nEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEPlqD/C+FAqFfL7HRy0UCimliy666PLL\nL+/p2wIAwli4cGFKqVAo9OnTp6dvK5PJdHFtbYRd76ivrx88eHBKqVwur5QvmMlkKpXKSvlS\n1dJ57wmwF3ah6gLswqxZs0ql0qBBg6o9yAoJsBABdiEAzw7vks/nBw8e3NDQsFK+2oqojcOj\npaWlvb292lMstwEDBrS0tNTEd3hpBg8e3NHRMWvWrGoP0n2FQqG+vn7OnDnVHqT76uvr+/Xr\nN3fu3NbW1mrP0n2NjY0LFizo6Oio9iDdd+KJJ7722mu33nprtQfpvmw229TUVOtHdP/+/efP\nnz9//vxqz9J9DQ0NpVKpra2t2oN0X1NTU7FYnDFjxso6FVIVzc3NM2bMqPYUyy2Xyw0cOHBp\n1/odOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4A\nIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEA\nBCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCA\nIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQ\nhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACC\nEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQ\nwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC\n2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEI\nOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQXYXdqaNG\n7fjNR3ttFAAAVkRXYffSpElTXpnba6MAALAi8j19A5VZk24bd/Vv/vLijI7GoSN2/Nxxh3xs\nzWJKKZVefWTcT8b/7u+vtzdtsMOBJx27x7r1PT0LAEBkPf07dv++8/xzbnp1xOFnXvCj88fu\nVHn4/LOunNSWUipNufrcCx5v3PeM719w9oEf+NtPz75sQmsPjwIAENsyztgteHL8t761zF+z\nG/H1r49e8jUvP/zbKUP2veio7YenlIbtf+rnnjp03ENTTth0k0dvv2fGtl/84Z4j61Na5/Nj\nnz/q7NsePmLLPfp3YxcAAEhpmWE3/89XnfnnZX
6R0UsNu0G7nnL+1s3rvHUxk0mprXVBOT0/\necrCDQ8Z9eaLr/lRo0ZU7p/yTGWPrTPve3IAAN5hGWHXtM/37jj1o8v6Imss9Zo+a3xo00XX\nlp6//a6/1W/5pQ9ny0/NmJVvbm5865pcc3PTwpenz07pzVN2EyZMuPDCCxd9mdNOO23kyJHL\nGmOVk8vl+vev+ZOQuVxuwIAB1Z6i+zKZTDabreldyGazKaWGhob6+hr+RdRcLpfP5yuVSrUH\n6b5MJpNSqun7UgpxRKeU6uvri8VitWfpvmw2W6lU+vTpU+1Bui+Xy6WUmpqaqj3ICqnRZ4eu\nH0iXEXaFoZvvvPPOK2WM1/9w4Xk3T//of399h36pta0tFfssdlAWCoXU3t6+6HJra+srr7yy\n6GJ7e3vnfai2ZDKZWhz7XQLsRYBdSLW/F52FXe0pVoKaXoVU+3ekTtlstrPwaleM+QPcl2px\nF8rlchfX9vhfxaaUUmp/+b4fnH3xXz8w5hunf3JIJqViXTG1d7Qv9gHt7amurm7R5R122OGB\nBx5YdLGlpWX69Om9MurKNGDAgJaWlpo+RTF48OCOjo5Zs2ZVe5DuKxQK9fX1c+bMqfYg3Vdf\nX9+vX7958+a1ttbwnxg1NjYuWLCgo6Oj2oN0X+exXIuPRYtks9mmpqZaP6L79+8/f/78+fPn\nV3uW7mtoaCiVSm1tbdUepPuampqKxeLMmTO7joxVXHNz84wZM6o9xXLL5XIDBw5c2rW9EHYL\n/vGLb5177SsjTjj/y3ut23mWLjdo0ID2qTPmptQvpZRSaebMOXWDBvXr+WEAAMLq6pWRvcaO\nPXaXdbr4gPehNPWOb541fvo2X/7uV9+qupRS2mDkiOKzkycvfPODJj89JbPxJhvW9nlpAIDq\n6uqM3TEXXdT5P62vvrpgrbUWP+v3/P3X/n3NT+25afMyfmXmlTt+eNWkfjucsH39P//y5D87\ntzUN3/xDg7faZ/fGMy/9wUZ9Dtms8MwvLr43t+vXdqr5PzMAAKimZb0UO+eJH590wjdvqJz2\njydOW3/R1hdvOn3MGU/03fjAc6++9IvbLPV13vTKH3/3XKmSHv7puQ+/vXGL/77+nN37jjrq\n7M+XLrnu21+4Mtu84fYnnXvcR2v47/0AAFYBXYbdvEe/vsvu335iwZCPfnbIO37Fs9+uY886\n/NLLrr35S7s+P+f3fzhni7olf4FhB/7wzgOX8sULa+9y4nm7nNitqQEAeI+uXkqd+L3jz3si\nbXfWQ89MGH/EJotfM3jro869+o9P/erkzdqf+NYJP/xHDw8JAMCydRV2N984sTLq1CvO2X7J\n/3hfds3dv/fTE9cvTbjuxp6ZDQCA5dBV2D33XBq8w46bdPG3qsVtd/t4Y/qHM3YAANXXVdj1\n65eW+a+J1tcXUy2/KQoAQBhdhd3GG6dZjz36TFef/swf/jg9DR++kocCAGD5dRV2Bx2yU27i\nj0678sXSkq8vT7369Ismps1G798jowEAsDy6Crthx3z3y5vPu+u4j+995g1/eaN98avapj11\n45n77HTcHTOGH/Htk0b28JAAACxbl/+OXZ+tvnXHDS2jj7rkW5/79bcb1tpk0w8Na67vmDP9\n1ecm/+O11kpq2vTwq2/72d6DemtYAACWbhnvPJFd9zMX/2mrz1558c+uu/XeCU/9fkpHSpli\n09qj9jpm7wOPGTtm2yG53pkTAIBlWNZbiqWUCsN2Ov68nY4/L6XKwjkz5+eaBjS847Pa33ij\nMHhwTw0IAMD709Xv2L1HptjY/HbVlWZNvueSLx2w5drDvC8YAED1vY8zdu9WmfPsAzdcOW7c\n1bc++kprSqm45rYrfSwAAJbX8oTd/Jd+94urxo278uaHX5yfUkr1w3c88viTxh43esseGg4A\ngPfv/YRd278eu+3qK8aNu/H+Z2eXU0p1A/rXz5r7
ictfueOYZn87AQCwiugq7NqnPXXnNePG\njbvuN5NnlFLqM2yr/Y4YfcDoA/Ze57ZdN/jKgEGqDgBgFdJV2B2x9hbXtxcGbbL9QV/ad7/R\n+39qm+ENmZRSSi/2ymgAACyPrv4qtr09pb5DNxm50QfXHjKgb10+02tTAQCw3LoKu0sn3XPJ\nydtXJlz7rVMO3W2zoWt8aKfDTv/xbY+/vKDXpgMA4H3rKuyaR+514vnj//Dia88/dPU3jvnE\nmm/8Yfz/fn7/bYavsfX/TElp1hvTS702JgAAy/I+/oHiTL/1dzr8zMvvfea1fz528/dP/vTm\nDbPemJ9Kdx83bJ1tDv7qz+57bk6l5+cEAGAZluedJ+rW2vqAU390x1P/+vekX17ylYO3GzLr\n8RvPP363Ddc6qsfGAwDg/VqutxR7U7555H+eeP71f5j62nMPXnnuMbsO68bbVwAAsJKtSJNl\nGj+485Fn7XzkWSttGgAAuq07Z+wAAFgFCTsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh\n7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCE\nHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISw\nAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2\nAABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIO\nACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgB\nAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsA\ngCCEHQBAEMIOACCIfLUHeF/y+Xw2W3sNms1m6+rqKpVKtQdZIZlMpq6urtpTdF8ul+tciGoP\n0n35fL7zvzW9F7lcrlgs5nK5ag/SfZlMJqVU06uQyWQCHNGp9g+HGn1SW1zn/MVisaaf42r0\ncOh8LFrqtTWxJK2trdUeoTvq6uoWLlxYE9/hpamvry+XywsXLqz2IN2XzWZzuVx7e3u1B+m+\nXC5XKBTa29tLpVK1Z+m+QqFQKpXK5XK1B+m+lpaWUqnU3Nxc7UG6L5PJFAqFWj+ii8ViR0dH\nR0dHtWfpvnw+X6lUav2IzuVybW1tNf0cV1dX19bWVu0puqO+vn5pV9XGGbu2trZafGLO5/Nz\n586t6Tt9Z9jNnTu32oN0X6FQqK+vr+ldqK+vLxQKbW1tNfoTTqfGxsYFCxbU9JPx4MGDs9ns\n9OnTqz1I92Wz2aamppo+HAqFQrFYXLhw4fz586s9S/c1NDSUSqUaTYpOTU1NuVxu3rx5Nf3T\nWrFYrMXDIZfLdRF2tX0qGACARYQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEI\nOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhh\nBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHs\nAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQd\nAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLAD\nAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYA\nAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABC
HsAACCEHYAAEEIOwCAIIQdAEAQwg4A\nIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEA\nBCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCA\nIIQdAEAQwg4AIAhhBwAQRK+F3ayHzjv01F+8+vaG0quP/OysE8d89oCDjz39ot9Mbe2tQQAA\nguqVsCvPmHDp2Rc9OmexTaUpV597weON+57x/QvOPvADf/vp2ZdNkHYAACuix8Ou9MojP/3K\n//v2nzJr9V9sa+ujt98zY9ujT95z5DrrjNjj82N3Lz1w28MtPT0LAEBkPR5286ZMmLr+If/z\noy9ut3jYPT95ysINR42q77yUHzVqROWZKc9UenoYAIDA8j19A02fPOX8T6aUXn5isY3l6TNm\n5ZubG9+6nGtublr48vTZKb1Zf5MmTbrmmmsWffyRRx65/vrr9/SoK10ul2tsbKxUajtYO/ei\n2lN0XzabrfVdyOVyKaX6+vpCoVDtWbqvUChks9lyuVztQbovm81mMpmavi9lMplaPxyy2WxK\nqa6urvO4qFH5fL5SqRSLxWoP0n35fD6l1K9fv5p+jqv1I3qJejzslqitrS0V+yx2jy4UCqm9\nvX3R5WnTpt13332LLu6///51dXW9OeHKUtPHbadMJlOj3/zFBdiFfD7f+UhauzqfkmtdgPtS\ngF3I5XI1HXadav2ITiGe42rxcOj6J+Tq3KuKdcXU3vF2x6X29vZ3fHN33HHHBx54YNHFUqk0\nffr03pxwpejfv//s2bNr+qeZQYMGdXR0tLTU8O8/FgqFurq6uXPnVnuQ7quvr+/bt+/cuXPb\n2tqqPUv39evXr7W1taOjo9qDdN+AAQOy2eyMGTOqPUj3ZbPZxsbGWj+im5qaFixYMH/+/GrP\n0n0NDQ2lUqmmj+jGxsZisThz5syaPg0/cODAmTNnVnuK5ZbL5QYMGLC0a6sTdrlBgwa0T50x\nN6V+KaWUSjNnzqkbNKjf22Pl801NTYsutrS0lEqlXh9zJahUKjUddp1qehc6hw+wC6nG9yI5\nHFYBYQ6HWr8vVd5S7UFWVIC9qMX5u565Sq+MbDByRPHZyZMXdl4qTX56SmbjTTbMVGcYAIAQ\nqhR2dVvts3vj7y79wd0TX/rn33/7o4vvze263079l/15AAAsTbV+c7M46qizP1+65Lpvf+HK\nbPOG25907nEfra/SKAAAMfRa2K39uYvu/NziGwpr73Liebuc2Fu3DwAQXYR/fQAAgCTsAADC\nEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQ\nwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC\n2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEI\nOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdADDmk34AABFu\nSURBVEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQ\nhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACC\nEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQ\nwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC\n2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQ
hLADAAhC2AEABCHsAACCEHYAAEHk\nqz3A+5LNZvP52hh1cZlMJp/PVyqVag+yQjr3otpTdF8ulwuwC6lmj4JFan3+RWp6L7LZrMNh\nVZDNZlON35cymUxKKZ/Pl8vlas+yQmpxFTrvP0uTqYnsaGtr63o3Vk35fL6jo6PaU6yQQqFQ\nqVRqei8ymUw2my2VStUepPuy2WwulyuVSjX9AJrL5crlck084CxNPp/PZDLt7e3VHmSF1Prj\nUmeYlsvlmj6oc7lcpVKp9SM6m812dHTU+kFdi4dDpVIpFotLu7Y2QrW1tbUWH0wHDBgwe/bs\nmr7TDx48uFQqtbS0VHuQ7isUCvX19XPmzKn2IN1XX1/fr1+/BQsWtLa2VnuW7mtsbFywYEEt\nPoYuMnDgwGw2W9OHQzabbWpqquldKBQK/fv3b21tnT9/frVn6b6GhoZSqdTW1lbtQbqvqamp\nWCzOnj27pvO0ubm5Fg+HXC7XRdjV3mkwAACWSNgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIO\nACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgB\nAAQh7AAAghB2AABBCDsAgCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsA\ngCCEHQBAEMIOACAIYQcAEISwAwAIQtgBAAQh7AAAghB2AABBCDsAgCDy1R4AYFVXKpV+9atf\nTZ48uVKpbLDBBp/+9KeLxWK1hwJYAmEH0JVSqXT66ac//fTTnRcffPDB3/72tz/84Q/79OlT\n3cEA3stLsQBduf322xdVXaepU6f+/Oc/r9Y8AF0QdgBdeeKJJ967ccKECb0/CcAyCTuArnR0\ndLzPjQBVJ+wAujJy5Mj3uRGg6oQdQFc++9nPrrXWWotvaWpqOvroo6s1D0AX/FUsQFcaGhou\nvPDCa6+9duLEiR0dHSNGjBgzZsygQYOqPRfAEgg7gGVoamo66aSTBg4cmM1mp0+fXu1xAJbK\nS7EAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQ\nwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAhC\n2AEABCHsAACCEHYAAEEIOwCAIIQdAEAQwg4AIAhhBwAQhLADAAgiU6lUqj0Dq67vfOc7Q4YM\nOfroo6s9yGrtySef/PWvf7333ntvttlm1Z5ltXbZZZfNnj37S1/6UrUHWa1NnTp1/Pjx2223\n3c4771ztWVZrt9xyyzPPPHPyySc3NjZWexbewRk7unL77bc/+OCD1Z5idffCCy/ceuutU6dO\nrfYgq7v77rvvzjvvrPYUq7tp06bdeuutkydPrvYgq7vHHnvs1ltvbW1trfYgvJuwAwAIQtgB\nAAQh7AAAgvDHEwAAQThjBwAQhLADAAhC2AEABJGv9gBU27xn7778stsff6El+4EP73bYCYd+\nbEgupZRS6dVHxv1k/O/+/np70wY7HHjSsXusW9/ldlaGGQ99++QflI77xVk7F1NKVqHX/f2K\nI758x8y3L4868drz9mqyEL2u0jLxlsuuvGfCS3OLa2z88cNOPGr7ofmULEQvevQHnznvofI7\ntw3/3I8v+ty6VmEVlzvnnHOqPQNV1PLw97982Sv/8V+n/7+DP9Y08fof3Tnno3ttPiiTSlOu\n+vJ5j615yGlfHLPzkKl3/PTGf35oz22G5pe6nZXhjfv/95u3vNg+dNvP7rReLiWr0OvmPnHH\n+KkjPv+VY/betdM2I9ZubshaiF5WmfqLr37l1rZdT/7K2NFb1k+8+Sf3zNvqU5s3ZyxEL2oc\nNnKLbXd580DYdYu+Lz7+4pB9j9l/5ICKVVjVVVidTb/rtE
8f+bPJpTcv3fmVfQ7/6eRKpbLg\nkfNGH/jdPy7o3N7+l0sO2/erv5619O2suPK0e8489JSvnX7wPuc+2FapVKxCFTx92RH7fuvB\nhe/aaiF62cLHvnfQgd+4/61v5eu/+sZxZ931soWonhkPfuOQA75+z2vlilWoAX7HbvU2cPev\nXXzeARu9eTcol0upXC6nlJ6fPGXhhqNGvXkaPT9q1IjKM1OeqSx1Oyuq8u97Lrx27j6nHLj+\n28ekVeht8156acZaw4cX3rXZQvSyf/zlL20f3nG7/m9eHLznmZedu/cwC1EtrRN+fuWTaxxw\n/J5DMskq1ABnSVdvmWL/tddKKaXSgtem/ObSW59d71MnjEipPH3GrHxz86J3ds41NzctfHn6\n7HLbkren1H9JX533q/Kvuy64bv7+5x+wbsvlizZahV730ksvpVzd3WePnfCPmbnBG2174DFj\nPr5OvYXoZfNffbVl4Fr5idd88+f3TXoju8YmOx5ywpht1yxYiOp45ZfXPNCxy5mfWSeTksel\nWuCMHSmlmXeccdBxXx33RH7rvbcbmk2pra0tFQvFtz+gUCik9vb2pW3v/YlDKb986wU3VEZ/\nYb/hucU3W4XeNuell2ZmWtoG7HbS1885/bAtF/7+e1+98E+zLURvmz9/fpr70KXjXt70kC+f\n+7Ujt2p78Dtfv2JSm4WoitLEu+5+ce1PfeajdZ2XrcKqzxk7UkpNe5xz856laU9e9/3vnfaN\njh+fv0ddMbV3LHZEtre3p7q6uuJStvf+xIGUpt5ywc1p9Hc/PbRSKpXKlZRSuVwqV3JL+25b\nhZ7S+Mkzr96qvbG5MZ9S2nDjDQuvHPndO383c+v+FqJX5XP51Fq343+ftv+m+ZTSxmeUnj/y\nx/dMOO5UR0TvK/31/oenb3TQbsMzb27wuLTqc8aOlFKuvqGurnGdbY8fs2Nm0kOPvpEbNGhA\n+4wZc9+6vjRz5py6QYP6LW17daYOYtqjj/zf/GeuHrv/fvvtt99+Z94zJ0344Wf3+/Lt06xC\nr8vVD+ysupRSSg3rrbtGmj59uoXoZY3NzYW09nrrvbUSjeuu11ya9rqFqILKpMcmzPvQ9tsN\nWbTFKqz6hN1qrTL55ycectZvZr11uW3u3PZMXV0hbTByRPHZyZMXdm4uTX56SmbjTTbMLHU7\n3Td49y/94G0n7dCQRh76vz84ZZdmq9DLKpPGHffZL98x7a3Lc5577rX8uusOtRC9LDdi5EaZ\nqc892/Hm5XkvvTSjbq01B1qI3vfSpEmz1/rI5kMW22QVVnn+HbvVWqZ/3WsPXnfX07l11x+U\nZkz+1cU/+/WCbY49YZdhDWsOnHXfVXc80/jB4X1mPn7lj298fZvjx358aD6/lO3V3pFaluvT\nv/ltC56+7YGFHz9hv836ZtPSvttWoWdkBtS/9uBNd08sr/vBNdKMp3950WX3Z/c65b+2GlSw\nEL2rfu3mGb+56q7/a1xveGPb8/f+5NJflz9x0jFbDi5aiF7W/pc7r/jzGnsfv/3wtwPN49Iq\nL1Op+Gvk1Vpp2oQbrrjugYn/nJMbOHyzTxxy5Ogt1sinlFL7yw9efsl1Dz8zM9u84fYHnXjc\nJ4d3/lrs0razUky89NCvvfb2O09Yhd7V8e/Hr7/y+oee/uesjsbh/7HbmGMP3mJwNiUL0etK\n//7TNZeNv/9v/2rtM+zDnzjsv8Zs0/mWOBaiV037xanH/maz7/3syI3esdkqrNqEHQBAEH7H\nDgAgCGEHABCEsAMACELYAQAEIewAAIIQdgAAQQg7AIAghB2wOpn/7O3nHbP7R9Yd1LeuvnGN\n9f5jj2O+fduzC5b0kZPPHZnJZIqfvHza29se+u81M10Ydc7f07RLdi1kMlv/YOoSb/6p0z+U\nyQw/5dFyj+wcgHf7AFYbC548d5ePn/N46xof3nWX/XcbkGY8
P+GhK79+7w23fvWB3317m77v\n+Ngnrhk/pW/fvvMeuOKaF4/94nqdG4fueMQJHS1vfsS/Hrn6rkl9tzrogC0GvLll2JYD05CD\nxux+yoP33HTzC6d+af133X7l8etvfC5teMbhH/MzNdBDKgCrh1cu2TWfGnY476l5izaVZ/3+\nK1vUpczm35z0jg8tP3rK8FTc/2tf3CSlkWdNXuKXe3DsoJQ2/+aUd2+fff2+fVLa6n9efPcV\n5UdOGZ7Spu/9DICVxY+NwOriiUcf7UjbHzX2Iw2LNmX67/DNcw/oV/nrL+95ebGPLP3u2hte\nSlt//MQD914/Tb5q3GPL89Jp475j9m1Kf77p5hfeub38h+tveimz1eGHbbJCewHQBWEHrC4G\nDRqU0l/v/+3r73iH7PzuF0564ZVfjR369qaO315z47/TRrvuOmzr0fuvnV76+RX3dSzH7fTZ\ne8zoAemJm25+fvGtpYevv/lf2Y8ffuh6K7IPAF0SdsDqYtvjv/CxvtOuO2iTj+x78neuvPvP\nL84upZRScdDw9YYO7PP2o2Hbr6+95fW0wX77fThlttl/v2Fp2o3jfjl/OW6obvcxnx2S/nzT\nTYuds+u4//qbpxU+cfjBw1bW7gC8l7ADVheZTb5w9wM/PuzDadKdF51x9N5br9/cvP62B3zh\nxw/+c+HiHzbvrmtvn5U+OHr0R1JKmW0P2H9Ymn37uF/MXI5byu982MHrvOOcXfv919/yep9P\nHX7A4JW1NwBLIOyA1Ujz1v99zVP/euWpe64475RDdhvV97XHbvnh53cdufWXfjv9rQ+Zc+s1\nd85Lw0eP3iqllFJmh9H7fSC1/Wrc+FeX43YyO4w5ZP301E03Pdt5ue3X1902o99nDt+vaWXu\nDcC7CTtgdVP3gY/sdfQZF4y/9y+vTJ/60EWHbtz+1++POf2+tpRSSjN+ce2vF6Q1dt56yIud\nXlrnYzsPSB0PX/HzZ5fjNjJbjjl0k/TETTc9n1JKrXdff3tL8+jD9+67rM8DWCHCDlg9TL/3\nG4cf9F9X/WPxbZk+6+w09pqrT/5geu2++yamlNK/b7z2/oUpvf7zA9d/ywaH3TgrpfSXK698\nanlub9MxY/7jzXN28+++/s7Zax50+G7Flbg/AEsg7IDVQ+OciTfe9LOLb33+3Vdk+vdvTKlY\nLKaUXrn+2odKacOjfnbbO1x+zEYpPXP1uD+UluMGNzpszNaZv95yy3Pz77n57nnrHHr4zrmV\ntzcAS+SdJ4DVQ/E/jz7sA78Y962DT9305vP3Wbfuzc1zJ1/+lUv+mj546j6bpvT8+Gv+WM5u\nfdzZx35m3Xd88tYvXX31//v9+Ct+873tP1X33q+9ZMM/N+bjp33+rluuWu9XCzYc6+0mgF7g\ngQZYTdTv9f0bv7h55c8XfHrDoSN23vdzhx9xyGd23mT45sfdMWuz0686e6tcmnLttU+lwieP\nPWLdd3/u0CNO2rdvmnnzuNvmLsctrnnwmN3yf/rO1++avfnhh2+2EncFYCmEHbDaGLDT9x6b\n9MvvnvSpjcr/98hdN15/x++fL251+Lfu+Nuj39mxKaWJ14yfmPruc+zBQ977qf1Hjz10zTT3\nrnE3vbEcNzj4gDF7FmfOnOPtJoBekqlUKsv+KAAAVnnO2AEABCHsAACCEHYAAEEIOwCAIIQd\nAEAQwg4AIAhhBwAQhLADAAhC2AEABCHsAACCEHYAAEH8f8XXfNGWt4BnAAAAAElFTkSuQmCC\n", "text/plain": [ "plot without title" ] }, "metadata": { "image/png": { "height": 420, "width": 420 } }, "output_type": "display_data" } ], "source": [ "ggplot(data = sat.dat, mapping = 
aes(x=SATV, y=ACT)) + geom_boxplot()" ] }, { "cell_type": "markdown", "id": "0b2c161d", "metadata": {}, "source": [ "Now, a cool way of depicting the shape of the data is by using **violin plots**. These plots are similar to boxplots, but they also show the probability density of the data at different values. Typically, these plots are shown in addition to boxplots. In ggplot, this kind of plot can be rendered using the function `geom_violin`." ] }, { "cell_type": "markdown", "id": "21c985d7", "metadata": {}, "source": [ "
Practice: Add a layer to the previous graph using the function `geom_violin`.
" ] }, { "cell_type": "markdown", "id": "7e2136ce", "metadata": {}, "source": [ "**N.B.** Remember that layers are shown on top of previous ones. Sometimes you may need to play with the order of the layers or with the aesthetics in order to make the final graph more appealing." ] }, { "cell_type": "markdown", "id": "a4e51f8f", "metadata": {}, "source": [ "
(HOME) Practice: Things are getting prettier and prettier, right? Take a couple of minutes to make your graph more visually appealing. You could play around with the colors, the width of the boxplots, the transparency... Take a look at the documentation of the function that we've just used and have fun!
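
As a minimal sketch of how layer order and aesthetics interact, the example below uses the built-in `sleep` data rather than `sat.dat`, so the exercise is still yours to solve. The fill color and the `alpha` and `width` values are arbitrary choices made for illustration, not recommendations.

```r
library(ggplot2)

# Built-in sleep data: 'extra' hours of sleep for two drug groups
p <- ggplot(data = sleep, mapping = aes(x = group, y = extra)) +
  # The violin is added first, so it is drawn underneath the boxplot
  geom_violin(fill = 'lightblue', alpha = 0.5) +
  # A narrow boxplot is drawn on top of the violin
  geom_boxplot(width = 0.2)

print(p)
```

Swapping the two `geom_` lines would draw the violin over the boxplot instead, which is one way to see why the order of the layers matters.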
" ] }, { "cell_type": "markdown", "id": "e6638f7e", "metadata": {}, "source": [ "Another nice way of summarizing your data visually is by using the function `stat_summary` from ggplot. Let's have a look at its documentation." ] }, { "cell_type": "code", "execution_count": 51, "id": "296b6bb0", "metadata": {}, "outputs": [], "source": [ "?stat_summary" ] }, { "cell_type": "markdown", "id": "40964671", "metadata": {}, "source": [ "
Practice: Try it yourself by replacing `geom_boxplot` with `stat_summary` in the first graph that you made (i.e., the one showing ACT versus education).
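
To give a feel for how `stat_summary` is called, here is a hedged sketch on the built-in `sleep` data (not the `sat.dat` graph asked for in this exercise). It assumes ggplot2 version 3.3.0 or later, where the summary function is passed through the `fun` and `fun.data` arguments; the point size and error bar width are arbitrary.

```r
library(ggplot2)

# Group means computed by hand, to show what the points will represent
group_means <- tapply(sleep$extra, sleep$group, mean)

p <- ggplot(data = sleep, mapping = aes(x = group, y = extra)) +
  # fun = mean draws one point per group at the group mean
  stat_summary(fun = mean, geom = 'point', size = 3) +
  # fun.data = mean_se adds error bars at the mean +/- 1 standard error
  stat_summary(fun.data = mean_se, geom = 'errorbar', width = 0.2)

print(p)
```

The same pattern should carry over to other summaries, for example `fun = median`, since `stat_summary` applies the supplied function to the y values within each x group.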
" ] }, { "cell_type": "code", "execution_count": 55, "id": "1041fabf", "metadata": {}, "outputs": [], "source": [ "# ggplot(data = sat.dat, mapping = aes(x = education, y = ACT)) + YOUR CODE" ] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "4.2.2" } }, "nbformat": 4, "nbformat_minor": 5 }