## Working with dates: lubridate

https://rpubs.com/davoodastaraky/lubridate

## Working with Strings: stringr

http://edrub.in/CheatSheets/cheatSheetStringr.pdf

## Working with categorical data

In [1]:
data <- data.frame(Smoker=c(72L,34L),Non_Smoker=c(44L,53L),row.names = c("Male","Female"))
dataM <- as.matrix(data)

In [2]:
print(dataM)

       Smoker Non_Smoker
Male       72         44
Female     34         53


In [3]:
### Calculate sums of table entries using base functions

In [3]:
margin.table(dataM) # grand total of all cells (203 people)

In [8]:
print(margin.table(dataM, 1)) # row totals: total males and total females
print(margin.table(dataM, 2)) # column totals: total smokers and total non-smokers

  Male Female 
   116     87 
    Smoker Non_Smoker 
       106         97 
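
As a cross-check, the same totals can be reproduced with `sum()`, `rowSums()` and `colSums()`. This is a minimal sketch that rebuilds the table with `matrix()` so the block runs on its own; the names `grand_total`, `row_totals` and `col_totals` are illustrative and not part of the original notebook.

```r
# Rebuild the 2x2 table of counts used above (matrix() fills column by column)
dataM <- matrix(c(72L, 34L, 44L, 53L), nrow = 2,
                dimnames = list(c("Male", "Female"), c("Smoker", "Non_Smoker")))

grand_total <- sum(dataM)      # same value as margin.table(dataM)
row_totals  <- rowSums(dataM)  # same values as margin.table(dataM, 1)
col_totals  <- colSums(dataM)  # same values as margin.table(dataM, 2)

print(grand_total)
print(row_totals)
print(col_totals)
```

`margin.table()` is mainly a convenience when the table has more than two dimensions; for a two-way table the base sums are equivalent.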


In [16]:
addmargins(dataM) # append row and column totals (margins) to the table

       Smoker Non_Smoker Sum
Male       72         44 116
Female     34         53  87
Sum       106         97 203


In [20]:
ftable(dataM)

        Smoker Non_Smoker
                         
Male        72         44
Female      34         53

In [19]:
ftable(addmargins(dataM))

        Smoker Non_Smoker Sum
                             
Male        72         44 116
Female      34         53  87
Sum        106         97 203

In [25]:
prop.table(table(dataM)) # table() counts the distinct cell values, so each count appears once (0.25); use this form on raw single-variable data, not on an already-tabulated two-way table

dataM
  34   44   53   72 
0.25 0.25 0.25 0.25 
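
The result above is uninformative because `table()` was applied to a matrix that already holds counts. A sketch of the intended single-variable use follows, assuming we first expand the counts back into one row per person. The data frame `raw` and its column names `sex` and `status` are hypothetical, introduced only for this illustration.

```r
# Rebuild the 2x2 table of counts (matrix() fills column by column)
dataM <- matrix(c(72L, 34L, 44L, 53L), nrow = 2,
                dimnames = list(c("Male", "Female"), c("Smoker", "Non_Smoker")))

# Long format: one row per cell, with columns Var1, Var2 and Freq
counts <- as.data.frame(as.table(dataM))

# Repeat each cell's row Freq times to get one row per person
raw <- counts[rep(seq_len(nrow(counts)), counts$Freq), c("Var1", "Var2")]
names(raw) <- c("sex", "status")

# Proportions for a single variable: share of smokers and non-smokers
status_prop <- prop.table(table(raw$status))
print(status_prop)
```

Here `table()` sees one value per person, so `prop.table()` returns the share of each category rather than 0.25 for every distinct count.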

In [21]:
prop.table(dataM, 1) # row proportions: each row sums to 1

          Smoker Non_Smoker
Male   0.6206897  0.3793103
Female 0.3908046  0.6091954


In [23]:
prop.table(dataM, 2) # column proportions: each column sums to 1

          Smoker Non_Smoker
Male   0.6792453  0.4536082
Female 0.3207547  0.5463918


In [22]:
prop.table(dataM) # cell proportions: all four cells sum to 1

          Smoker Non_Smoker
Male   0.3546798  0.2167488
Female 0.1674877  0.2610837
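
The three `prop.table()` calls differ only in the denominator: row totals, column totals, or the grand total. A minimal sketch of the same arithmetic done by hand, with illustrative variable names that are not part of the original notebook:

```r
# Rebuild the 2x2 table of counts (matrix() fills column by column)
dataM <- matrix(c(72L, 34L, 44L, 53L), nrow = 2,
                dimnames = list(c("Male", "Female"), c("Smoker", "Non_Smoker")))

# Row proportions: divide each row by its own total.
# Recycling runs down the columns, so element [i, j] is divided by rowSums(dataM)[i].
row_prop <- dataM / rowSums(dataM)

# Column proportions: sweep() divides each column by its own total
col_prop <- sweep(dataM, 2, colSums(dataM), "/")

# Cell proportions: divide every cell by the grand total
cell_prop <- dataM / sum(dataM)

# Each hand-computed table should match the corresponding prop.table() call
print(all.equal(row_prop,  prop.table(dataM, 1)))
print(all.equal(col_prop,  prop.table(dataM, 2)))
print(all.equal(cell_prop, prop.table(dataM)))
```

Reading the row table: about 62% of males smoke, compared with about 39% of females.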


## Working with categorical data: using the gmodels package

In [14]:
#install.packages("gmodels")
library(gmodels)

In [15]:
CrossTable(dataM) # full contingency table (counts, chi-square contributions, row/column/table proportions) via CrossTable() from gmodels


 
   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|

 
Total Observations in Table:  203 

 
             |  
             |     Smoker | Non_Smoker |  Row Total | 
-------------|------------|------------|------------|
        Male |         72 |         44 |        116 | 
             |      2.156 |      2.356 |            | 
             |      0.621 |      0.379 |      0.571 | 
             |      0.679 |      0.454 |            | 
             |      0.355 |      0.217 |            | 
-------------|------------|------------|------------|
      Female |         34 |         53 |         87 | 
             |      2.875 |      3.142 |            | 
             |      0.391 |      0.609 |      0.429 | 
             |      0.321 |      0.546 |            | 
             |      0.167 |      0.261 |            | 
-------------
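
The "Chi-square contribution" line in each cell can be reproduced in base R. The sketch below assumes the usual Pearson definition, (observed - expected)^2 / expected, with expected counts computed under independence of rows and columns; the names `expected` and `contribution` are illustrative.

```r
# Rebuild the 2x2 table of counts (matrix() fills column by column)
dataM <- matrix(c(72L, 34L, 44L, 53L), nrow = 2,
                dimnames = list(c("Male", "Female"), c("Smoker", "Non_Smoker")))

# Expected count per cell under independence: row total * column total / grand total
expected <- outer(rowSums(dataM), colSums(dataM)) / sum(dataM)

# Per-cell contribution to the Pearson chi-square statistic
contribution <- (dataM - expected)^2 / expected
print(round(contribution, 3))

# The contributions add up to the uncorrected chi-square statistic
print(sum(contribution))
print(unname(chisq.test(dataM, correct = FALSE)$statistic))
```

Rounded to three decimals, the contributions match the second line of each cell in the `CrossTable()` output (2.156, 2.356, 2.875, 3.142).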

## data.table

https://s3.amazonaws.com/assets.datacamp.com/img/blog/data+table+cheat+sheet.pdf

## Piping with `%>%` from `magrittr` package
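
The pipe passes the value on its left as the first argument of the function on its right, so a chain of calls reads left to right instead of inside out. A minimal sketch using the smoker table from the categorical-data section, assuming the `magrittr` package is installed; the names `nested` and `piped` are illustrative.

```r
library(magrittr)  # provides %>%; assumed to be installed

# Rebuild the 2x2 table of counts (matrix() fills column by column)
dataM <- matrix(c(72L, 34L, 44L, 53L), nrow = 2,
                dimnames = list(c("Male", "Female"), c("Smoker", "Non_Smoker")))

# Nested form: read from the innermost call outwards
nested <- round(prop.table(dataM, 1), 2)

# Piped form: dataM becomes the first argument of prop.table(),
# and that result becomes the first argument of round()
piped <- dataM %>% prop.table(1) %>% round(2)

print(piped)
print(identical(nested, piped))
```

Both forms compute the same row proportions; the piped version simply lists the steps in the order they are applied.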

## dplyr

![Imgur](https://i.imgur.com/DNMxDWh.png?1)

**course:**  
https://courses.edx.org/courses/course-v1:HarvardX+PH125.6x+2T2019/course/  

**Data transformation**  
https://r4ds.had.co.nz/transform.html  

## Working with fivethirtyeight data

http://www.storybench.org/getting-started-with-tidyverse-in-r/  
http://www.storybench.org/how-to-explore-a-dataset-from-the-fivethirtyeight-package-in-r/  

<span style="color:red; font-family:Comic Sans MS">References</span>  
<a href="https://tbrieder.org/epidata/course_reading/e_aragon.pdf" target="_blank">https://tbrieder.org/epidata/course_reading/e_aragon.pdf</a>

<span style="color:red; font-family:brandon">Further  Resources</span>  
<a href="https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf"   target="_blank">https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf</a>    
<a href="https://ugoproto.github.io/ugo_r_doc/dplyr.pdf" target="_blank">https://ugoproto.github.io/ugo_r_doc/dplyr.pdf</a>  
<a href="https://rpubs.com/bradleyboehmke/data_wrangling" target="_blank">https://rpubs.com/bradleyboehmke/data_wrangling</a>  