Consult the general homework guidelines.
Due anytime Wednesday 2015-10-28.
Goals:
Remember the sampler concept. Your homework should serve as your own personal cheatsheet in the future for canonical tasks. Make things nice – your future self will thank you!
You can work with the Gapminder excerpt or take this chance to play with something else. In which case, you’ll have to create comparable tasks for yourself.
Drop Oceania. Filter the Gapminder data to remove observations associated with the continent of Oceania. Additionally, use droplevels() to remove unused factor levels. Provide concrete information on the data before and after removing these rows and the Oceania level; address the number of rows and the levels of the affected factors. Use a figure that includes a legend to further explore the effects of filtering data and/or changing factor levels.
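A minimal base R sketch of the before/after bookkeeping for this task. The data frame `gap` is a hypothetical toy stand-in for the Gapminder excerpt, with invented values; with the real data you would likely use dplyr's filter() in place of subset().

```r
# Toy stand-in for the Gapminder excerpt; the values are invented for illustration.
gap <- data.frame(
  country   = factor(c("Canada", "Japan", "Kenya", "France", "Australia", "New Zealand")),
  continent = factor(c("Americas", "Asia", "Africa", "Europe", "Oceania", "Oceania")),
  lifeExp   = c(81, 83, 54, 81, 81, 80)
)

# Before: number of rows and the levels of the affected factors.
nrow(gap)
levels(gap$continent)
nlevels(gap$country)

# Filtering removes the rows, but "Oceania" survives as an empty level.
gap_filtered <- subset(gap, continent != "Oceania")
nrow(gap_filtered)
nlevels(gap_filtered$continent)

# droplevels() removes unused levels from every factor in the data frame.
gap_dropped <- droplevels(gap_filtered)
levels(gap_dropped$continent)
nlevels(gap_dropped$country)
```

Comparing nlevels() on `gap_filtered` and `gap_dropped` is one way to show that filtering rows and dropping levels are separate steps.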
Reorder the levels of country or continent. Use reorder() to change the order of the factor levels, based on a summary statistic of one of the quantitative variables or another derived quantity, such as estimated intercept or slope. If you use a summary of, e.g., life expectancy, try something besides the default of mean().
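As a rough illustration of reorder() with something other than the default mean(), the sketch below uses a hypothetical toy data frame `toy` with invented life expectancy values, chosen so that different summaries give different level orders.

```r
# Toy data; the numbers are invented for illustration only.
toy <- data.frame(
  continent = factor(rep(c("Africa", "Americas", "Asia"), each = 3)),
  lifeExp   = c(50, 52, 75,   70, 71, 72,   40, 80, 84)
)

levels(toy$continent)  # alphabetical by default

# reorder(x, X, FUN) orders the levels of x by FUN applied to X within each level.
by_mean   <- reorder(toy$continent, toy$lifeExp)               # default FUN = mean
by_min    <- reorder(toy$continent, toy$lifeExp, FUN = min)
by_median <- reorder(toy$continent, toy$lifeExp, FUN = median)

levels(by_mean)
levels(by_min)
levels(by_median)
```

Only the level order changes; the values stored in the factor stay the same.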
Characterize the (derived) data before and after your factor re-leveling.
arrange(). Does merely arranging the data have any effect on, say, a figure?
reorder() and reorder() + arrange(). What effect does this have on a figure?
Remake at least one figure, in light of something you learned in the recent class meetings about visualization design and color. Maybe juxtapose before and after and reflect on the differences. Consult the guest lecture from Tamara Munzner and everything here.
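For the arrange() versus reorder() comparison above, one way to see the difference without a figure is to inspect row order and level order separately. This sketch uses base R's order() as a stand-in for dplyr's arrange(), and a hypothetical three-row data frame with invented values.

```r
toy <- data.frame(
  country = factor(c("B", "A", "C")),  # hypothetical country labels
  lifeExp = c(60, 80, 70)              # invented values
)

# Sorting rows (the job arrange() does) changes the row order only.
sorted <- toy[order(toy$lifeExp), ]
as.character(sorted$country)  # row order changed
levels(sorted$country)        # level order unchanged

# reorder() changes the level order only; rows stay where they were.
releveled <- toy
releveled$country <- reorder(releveled$country, releveled$lifeExp)
as.character(releveled$country)  # row order unchanged
levels(releveled$country)        # level order changed
```

Since plotting functions generally lay out a factor by its level order rather than its row order, checking levels() before and after is a useful companion to the figures you make.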
Use ggsave() to explicitly write a figure to file. Then use Markdown image syntax to embed it in your report. Things to play around with:
Arguments of ggsave(), such as width, height, resolution or text scaling.
Passing the plot object p explicitly via ggsave(..., plot = p). Show a situation in which this actually matters.
You have 6 weeks of R Markdown and GitHub experience now. You’ve reviewed 8 peer assignments. Surely there are aspects of your current repo organization that could be better. Deal with that. Ideas:
A descriptive name such as hw03 dplyr verbs rather than a bare hw03.
Anything that is Rmd but that could be md? Convert it.
Play with the factor(..., levels = ...) function for explicitly setting factor levels. Its behavior can be surprising!
stringsAsFactors = FALSE in read.table() followed by an explicit call to factor(). When might you do this?
What about a write.table() / read.table() workflow?
Reorder country based on estimated slope or intercept in j_coefs, THEN apply those factor levels back to country in the raw Gapminder data.
Revalue a factor
Map some country factor levels into new ones.
plyr package: revalue() or mapvalues()
car package: recode()
Experiment with gluing two factors together. What if they have the same levels? Different levels? Try gluing two “naked” factors together, then try it again when those factors are embedded in two conformable data.frames.
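Two of the experiments above, setting levels explicitly with factor() and gluing factors together, can be sketched in base R as follows. All object names and values here are hypothetical, and this is only one possible starting point.

```r
# Explicit levels: values not listed in `levels` silently become NA.
x <- c("low", "high", "medium", "low")
f_ok  <- factor(x, levels = c("low", "medium", "high"))
f_bad <- factor(x, levels = c("low", "high"))  # "medium" is not listed
sum(is.na(f_ok))
sum(is.na(f_bad))

# Gluing two "naked" factors with different levels.
a <- factor(c("x", "y"))
b <- factor(c("y", "z"))

# Going through character and back is a conservative route.
glued <- factor(c(as.character(a), as.character(b)))
levels(glued)

# c(a, b) depends on the R version: integer codes before R 4.1.0,
# a combined factor from R 4.1.0 on. Try it and see what you get.

# Factors embedded in conformable data frames: rbind() combines the levels.
df1 <- data.frame(g = a)
df2 <- data.frame(g = b)
stacked <- rbind(df1, df2)
levels(stacked$g)
as.character(stacked$g)
```

The silent NA in `f_bad` is the kind of surprise worth documenting in your report.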
You’re encouraged to reflect on what was hard/easy, problems you solved, helpful tutorials you read, etc. Give credit to your sources, whether it’s a blog post, a fellow student, an online tutorial, etc.
Follow instructions on How to submit homework
Check minus: One or more elements are missing or sketchy. Missed opportunities to complement code and numbers with a figure and interpretation. Technical problem that is relatively easy to fix. It’s hard to find the report in this crazy repo.
Check: Hits all the elements. No obvious mistakes. Pleasant to read. No heroic detective work required. Well done! This should be the most typical mark!
Check plus: Exceeded the requirements in a number of dimensions. Developed novel tasks that were indeed interesting and “worked”. Impressive use of R – maybe involving functions, packages or workflows that weren’t given in class materials. Impeccable organization of repo and report. You learned something new from reviewing their work and you’re eager to incorporate it into your work.