R is open-source software, which means using it is completely free. Second, open-source software is developed collaboratively, meaning the source code is open to public inspection, modification, and improvement.
R is widely used in the physical and social sciences, as well as in government, non-profits, and the private sector.
Many developers and social scientists write programs in R. As a result, there is also a large support community available to help troubleshoot problematic code. As seen in the Redmonk programming language rankings (which compare languages’ appearances on Github [usage] and StackOverflow [support]), R appears near the top of both rankings.
R, like any computing language, relies on programmatic execution of functions. That is, to do anything you must write code. This differs from popular statistical software such as Stata or SPSS which at their core utilize a command language but overlay them with drop-down menus that enable a point-and-click interface. While much easier to operate, there are several downsides to this approach - mainly that it makes it impossible to reproduce one’s analysis.
graphics
package is comprehensive and powerful, additional libraries such as ggplot2
and lattice
make R the go-to language for power data visualization approaches.Python was developed in the 1990s as a general-purpose programming language. It emphasizes simplicity over complexity in both its syntax and core functions. As a result, code written in Python is (relatively) easy to read and follow as compared to more complex languages like Perl or Java. As you can see in the above references, Python is just as, if not more, popular than R. It does many things well, like R, but is perhaps better in some aspects:
That said, there are also things it does not do as well as R:
matplotlib
, pygal
, and seaborn
), but are still behind R in terms of comprehensiveness and ease of use. Of course, once you wish to create interactive and advanced information visualizations, you can also used more specialized software such as Tableau or D3.numpy
), data analysis (pandas
), and machine learning (scikit-learn
).This course could be taught exclusively in Python (as it was in previous incarnations) or a combination of R and Python (as it was in fall 2016). From my past experience, I think the best introduction to computer programming focuses on a single language. Learning two languages simultaneously is extremely difficult. It is better to stick with a single language and syntax. Once you complete this course, you will have the basic skills necessary to learn Python on your own.
This work is licensed under the CC BY-NC 4.0 Creative Commons License.