---
title: "Homework 1"
author: "Michael Pearce"
date: "`r Sys.time()`"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

##Loading data
data(swiss)

##Counting variables and observations
var_num <- ncol(swiss)
obs_num <- nrow(swiss)
var_names <- names(swiss)
```

## Explanatory Analyses


In this *swiss* dataset, we have `r obs_num` observations of `r var_num` variables. The variable names are: `r var_names`. In the interest of time, space, and mental exhaustion, let's just pick one to explore!

Let's see, maybe R can randomly pick something for us...

```{r random number, echo=FALSE}
random_num <- floor(runif(1, min=1, max=6))
```

A random number generator gives us: `r random_num`. That corresponds to `r var_names[random_num]`, cool! Now while it would be fun to do the rest of this assignment with a random variable, we can't really make comments on plots and such without knowing it in advance. Let's plot this variable!

```{r plots, echo=FALSE}
hist(swiss$Education, xlab="Education Index Value", ylab="Number of Municipalities", main="Histogram of Education Indices", col="cornflowerblue")
```

```{r values, echo=FALSE}
ed_mean <- round(mean(swiss$Education), 2)
ed_sd <- round(sd(swiss$Education), 2)
ed_med <- round(median(swiss$Education), 2)
```
Hmm. A couple of those municipalities are way up there in this right-skewed distribution, specifically. As it stands currently, this variables has a mean of `r ed_mean` and a standard deviation of `r ed_sd`. But since it's so skewed, the median (`r ed_med`) is the better indicator of its center. It would be worth doing a log transformation to get a more normally distributed variable as we see below. We would definitely want to work with this transformed variable in any modeling.

```{r transformation, echo=FALSE}
swiss$trans_ed <- log10(swiss$Education)
hist(swiss$trans_ed, xlab="Log of Education Index", ylab="Number of Municipalities", main="Histogram of log-transformed Education Indices", col="darkseagreen2")
```

And finally, a pairwise scatterplot to give us an idea of correlative relationships to explore in the future. It'll have to be tabled (heh) for now.

### Pairwise comparisons of *swiss* variables
```{r pairwise plot, echo=FALSE}
pairs(swiss, pch = 8, col = "cornflowerblue")
```