---
title: "Thoughts on Trial Homework"
author: "Eric C. Anderson"
output:
  html_document:
    toc: yes
  bookdown::html_chapter:
    toc: no
layout: default_with_disqus
---


```{r setup, echo=FALSE, include=FALSE}
# PLEASE DO NOT EDIT THIS CODE BLOCK
library(knitr)
library(rrhw)
# tell knitr where to find the inserted file in case
# jekyll is building this in the top directory of the repo
opts_knit$set(child.path = paste(prj_dir_containing("rep-res-course.Rproj"), "extras/knitr_children/", sep=""))

init_homework("Trial Homework Redux")
rr_github_name <- NA
rr_pull_request_time <- NA
rr_question_chunk_name <- "NotSet"
rr_branch_name <- "ex-test"
rr_hw_file_name <- "exercises/trial_homework.rmd"
```


# Comments and thoughts on Homework #1 (Trial Homework) {#trial-homework-thoughts}

## Preliminaries

### First off!
1. Woo-hoo!  Way to go everyone who got those in!
1. Woo-hoo!  Way to go everyone who is still working on it!

I'm pumped by how many people made their first pull request.


### What does a pull request look like to me?

* Check it out!
    * I get an email, and Gmail is GitHub-aware
    * I can see the changes that you have made
    * I can comment, etc.
* You can all do this too! Just go to https://github.com/eriqande/rep-res-course and find the pull requests button.
    * In fact, if you aren't sure how to do the homework or what the 
    best answer is, feel free to browse what other people have done and get ideas.
        + I don't consider this cheating---especially if you view everyone's responses with a 
        scientific attitude.  You'll be learning about GitHub and reviewing lots of R code.
        + Keep in mind that some suggested answers you see from other people might not be optimal.
        + If you see that someone has made a mistake and want to let them know,
        just comment on their commit.
* Note, please keep your pull requests __Open__.  That way my scripts
can fetch your work easily.
* I will __Close__ them when we are done with them. You can always Re-open them.

### What if I want to change an answer?

*  By all means, feel free.  This is where GitHub really excels.
*  Just make your changes, commit them, and push them up to the same branch.  As long as the
pull request is still open, it is updated automatically with the new commits.


### My responses database
Show it to them: `View(ans)`


## General comments from what I saw

It is great to have everyone's responses.  Here are some comments that should be helpful to
everyone.

### Strive for Economy of characters
When you are writing code, shorter is usually (but not always) going to be

1. easier to read
1. easier to debug
1. easier to maintain

as long as it clearly expresses the intent of the program.


Along those lines (intermixed with some of my OCD code-style ideas), some guidelines are:

1. You don't have to define intermediate variables. Sometimes it is helpful to break up
a long calculation with some intermediates, but not always. So:
    ```{r, eval=FALSE}
    # this is preferred
    gnames > "github"

    # this makes unnecessary variable assignments
    a <- "github"
    b <- a < gnames
    b

    # this also makes unnecessary variable assignments
    y <- c("github")
    x <- gnames > y
    x
    ```
The important take home is that an _expression_ basically behaves like a _variable_ anywhere in R.
1. The strings in a character vector don't have to be a single character, so you can say what you want!
    ```{r, eval=FALSE}
    # this is preferred
    gnames > "github"

    # this is not so precise.  Might work in a certain
    # problem, but is not general:
    gnames > "g"
    ```
1. You don't have to repeat the question in the answer:
    ```{r, eval=FALSE}
    # here are some github names of people taking the course
    gnames <- c("cpetrik", "wildflowermt", "mad4mocha", "sjohnson216", "okisutch99", "sczTWilliams", "rbeas", "mtarjan", "aaronmams", "lslefebvre")

    # return a logical vector that gives TRUE for each name that comes after
    # the word "github" alphabetically
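    submit_answer({
    # gnames was already defined in the question above; redefining it here is unnecessary
    gnames <- c("cpetrik", "wildflowermt", "mad4mocha", "sjohnson216", "okisutch99", "sczTWilliams", "rbeas", "mtarjan", "aaronmams", "lslefebvre")
    b <- c("github")  # unnecessary intermediate variable (and unnecessary c())
    gnames > b
    })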
    ```
2. If doing comparisons, put the variable on the left and the constant (if there is one) on the right:
    ```{r, eval=FALSE}
    gnames > "github"  # eric prefers this
    "github" < gnames  # rather than this
    ```
3. Some things aren't necessary.  They aren't wrong, but they are not economical and make code
harder to read.  The top few from the last homework:
    1. If it is a vector, you don't have to put `c()` around it to make it a vector:
        ```{r, eval=FALSE}
        y <- c(gnames[x])  # gnames[x] is a vector already.
        y <- gnames[x]     # same thing as above, but preferred
        ```
        The `c()` function is for _catenating_ vectors (but beware of "growing" vectors; see
        below).
    2. Logical vectors index as logical vectors.  They don't have to be wrapped in `which()`.  The
    function `which(LL)` returns the indexes for which the logical vector `LL` is `TRUE`.  Many
    people wrap their logical vectors in it.  Don't.
        ```{r, eval=FALSE}
        gnames[which(gnames > "github")] <- "zzz"  # unnecessary which
        gnames[gnames > "github"] <- "zzz"   # same thing and simpler
        ```
    2. Also, if it is a logical vector, you need not coerce it to a logical---it already is:
        ```{r, eval=FALSE}
        as.logical(gnames > "github")  # unnecessary coercion
        gnames > "github" # the > comparison operator returns a logical vector anyway
        ```
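    2. Get comfortable with operator precedence.  Comparison operators like `>` bind more tightly
    than the assignment `<-`, so no parentheses are needed here: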
        ```{r, eval=FALSE}
        isAfterGithub <- (gnames > "github") # parentheses unnecessary
        isAfterGithub <- gnames > "github"   # same as above but easier to read
        gnames > "github"  # best: no intermediate assignment when not needed
        ```
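
As a quick check of the guidelines above, here is a minimal sketch you can paste into the console. It reuses the `gnames` vector from the homework; the variable name `after` is introduced here only for illustration and is not part of the homework.

```r
gnames <- c("cpetrik", "wildflowermt", "mad4mocha", "sjohnson216", "okisutch99",
            "sczTWilliams", "rbeas", "mtarjan", "aaronmams", "lslefebvre")

# hypothetical helper variable, kept only so the comparisons below are easy to read
after <- gnames > "github"

# wrapping an existing vector in c() changes nothing
identical(c(gnames[after]), gnames[after])      # TRUE

# which() is not needed for indexing
# (the two agree here because the comparison contains no NA values)
identical(gnames[which(after)], gnames[after])  # TRUE

# a comparison already returns a logical vector
identical(as.logical(after), after)             # TRUE

# an expression can be used anywhere a variable can
sum(gnames > "github")                          # 8
```

Each `identical()` call returns `TRUE`, which is the point: the longer forms do the same work as the shorter ones.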

### Don't use a for loop if the vectorized operation will get you there
This was one of the hardest things for me as a C programmer, and I suspect
that Python programmers might find it difficult too.

* Remember: R is a vectorized language.  If you give it a vector, it wants to
operate elementwise on every element in that vector.  This means that quite
often you needn't write for loops for operations that would require
for loops in C or Python.
    ```{r, eval=FALSE}
    # this is concise and precise (and computationally efficient)
    gnames > "github"

    # this is how a C programmer thinks about it:
    x <- c()   # make an empty vector
    for (name in gnames) { # let name cycle over the values in gnames
        if (name > "github")  # test each value
            x <- c(x, TRUE) # if it is TRUE, "grow" x with a TRUE
        else x <- c(x, FALSE) # if it is FALSE, "grow" x with a FALSE
    }
    x # return x
    ```
The latter is clearly harder to write, harder to maintain, and easier to
hide bugs in than the former.
* BUT, did you know that it is also orders
of magnitude slower in R?  
    + Try this at home, comparing 10^5 numbers:
    ```{r, eval=FALSE}
    x <- rnorm(n = 10^5, mean=1.0, sd=5)  # make 10^5 numbers
    
    # test which ones are greater than 2

    # the fast, vectorized way
    gt_fast <- x > 2

    # the slower for-loop way, growing the result one element at a time
    gt_slow <- c()
    for (i in seq_along(x)) {
      gt_slow <- c(gt_slow, x[i] > 2)
    }

    # see that you get the same result with either method:
    all(gt_fast == gt_slow)
    # but the vectorized operation is clearly faster
    ```
The much-maligned "slowness" of R is sometimes attributable to not doing
vectorized operations.
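
To put numbers on the difference, one option is to wrap each approach in `system.time()`. The sketch below is only an illustration: the names `t_fast`, `t_grow`, `t_prealloc` and the matching `gt_*` vectors are made up here, and the third variant (a loop that fills a preallocated vector) is included only to separate the cost of looping from the cost of "growing" the vector. Timings depend on your machine, so none are shown.

```r
set.seed(1)                                # only so the example is reproducible
x <- rnorm(n = 10^5, mean = 1.0, sd = 5)   # make 10^5 numbers

# vectorized comparison
t_fast <- system.time(gt_fast <- x > 2)

# for loop that grows the result with c() on every iteration
t_grow <- system.time({
  gt_grow <- c()
  for (i in seq_along(x)) gt_grow <- c(gt_grow, x[i] > 2)
})

# for loop that fills a vector allocated once, up front
t_prealloc <- system.time({
  gt_prealloc <- logical(length(x))
  for (i in seq_along(x)) gt_prealloc[i] <- x[i] > 2
})

# all three give the same answer
identical(gt_fast, gt_grow)
identical(gt_fast, gt_prealloc)

# elapsed seconds for each approach
c(vectorized  = t_fast[["elapsed"]],
  grow        = t_grow[["elapsed"]],
  preallocate = t_prealloc[["elapsed"]])
```

If you do need a loop, allocating the result once avoids copying the whole vector on every pass, but when a vectorized operation is available it is still the one to reach for.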