--- title: 'Laboratory Exercise Week 4' author: "Your Name and Section, 10 pts" date: "Todays Date" output: word_document --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` *Directions*: * Write your R code inside the code chunks after each question. * Write your answer comments after the `#` sign. * To generate the word document output, click the button `Knit` and wait for the word document to appear. * RStudio will prompt you (only once) to install the `knitr` package. * Submit your completed laboratory exercise using Blackboard's Turnitin feature. Your Turnitin upload link is found on your Blackboard Course shell under the Laboratory folder. *** For this exercise, you will need to use the package `mosaic` to find numerical and graphical summaries. ```{r warning=FALSE, message=FALSE} # install packages if necessary if (!require(mosaic)) install.packages(`mosaic`) if (!require(dplyr)) install.packages(`dplyr`) if (!require(gapminder)) install.packages(`gapminder`) # load the package in R library(mosaic) # load the package mosaic to use its functions library(dplyr) # load the package dplyr to use its functions library(gapminder) # load the package gapminder for question 1 ``` 1. Using the gapminder data in the lesson, do the following: i) use `filter` to select all countries with the following arguments: a) life expectancy larger than 60 years. b) United Kingdom and Vietnam and years greater than 1990. ii) use `arrange` and `slice` to select the countries with the top 15 GDP per capital `gdpPercap`. Use the pipe `%>%` operator to string multiple functions. iii) use `mutate` to create a new variable called `gdpPercap_lifeExp` which is the quotient of `gdpPercap` and `lifeExp` and display the output. iv) use `summarise` to find the average or mean value of the variable `gdpPercap_lifeExp` created in part (iii). v) use `group_by` to group the countries by `continent`; and `summarise` to compute the average life expectancy `lifeExp` within each continent. Use the pipe `%>%` operator to string multiple functions. ### Code chunk ```{r} # load the necessary packages library(mosaic) library(dplyr) library(gapminder) # last R code line ``` 2. The data set `MLB-TeamBatting-S16.csv` contains MLB Team Batting Data for selected variables. Load the data set from the given url using the code below. This data set was obtained from [Baseball Reference](https://www.baseball-reference.com/leagues/MLB/2016-standard-batting.shtml). * Tm - Team * Lg - League: American League (AL), National League (NL) * BatAge - Batters’ average age * RPG - Runs Scored Per Game * G - Games Played or Pitched * AB - At Bats * R - Runs Scored/Allowed * H - Hits/Hits Allowed * HR - Home Runs Hit/Allowed * RBI - Runs Batted In * SO - Strikeouts * BA - Hits/At Bats * SH - Sacrifice Hits (Sacrifice Bunts) * SF - Sacrifice Flies Using the `mlb16.data` data, do the following: i) use `filter` to select teams with the following arguments: a) Cardinals team `STL`. b) teams with Hits `H` more than 1400 last 2016 season. c) team league `Lg` is National League `NL`. ii) use `arrange` to select teams in decreasing number of home runs `HR`. iii) use `arrange` to display the teams in decreasing number of `RBI`. iv) use `group_by` to group the teams per league; and `summarise` to compute the average `RBI` within each league. Use the pipe `%>%` operator to string multiple functions. ### Code chunk ```{r} # load the data set mlb16.data <- read.csv("https://raw.githubusercontent.com/jpailden/rstatlab/master/data/MLB-TeamBatting-S16.csv") str(mlb16.data) # check structure head(mlb16.data) # show first six rows # last R code line ```