2.1 A bit of R history
R is a statistical programming language designed by Ross Ihaka and Robert Gentleman as an implementation of the S programming language.
Originally, the goal of the researchers at the University of Auckland, New Zealand, was to develop a language capable of data analysis, statistics and graphical models in a user-friendly way. The project was first conceived in 1992, with its first version released in 1995 and a stable beta version in 2000. Nowadays, R is the lingua franca of statistics, and it is developed by the R Core Team, of which John Chambers, the creator of S, is a member.
Curiously, R is named partly after the first names of its first two authors and partly as a play on the name of S (ref).
2.2 Advantages of R
There are many reasons why R is a leading language for data science and statistics.
- It is free and open source.
- It runs on UNIX, Windows and macOS.
- It is designed for vector operations, so many tasks need no explicit for loops.
- It has one of the biggest online communities, where you can ask questions, get help, etc.
- It offers 7000+ packages that expand its capabilities; because anyone can contribute, R's possibilities are practically endless.
- It is a programming language based on S with vectorized built-in functions that can be very fast, which is one reason it is so widely used in data science.
- There are several user interfaces which you can use (e.g., RStudio, Jupyter).
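To illustrate the point about vector operations in the list above, here is a minimal sketch in base R. The variable names (`heights_cm`, `heights_m_loop`, `heights_m_vec`) and the data are made up for illustration; only base R functions are used.

```r
# Hypothetical data: heights in centimetres (values invented for illustration)
heights_cm <- c(150, 165, 172, 180)

# Loop version: convert each element one at a time
heights_m_loop <- numeric(length(heights_cm))
for (i in seq_along(heights_cm)) {
  heights_m_loop[i] <- heights_cm[i] / 100
}

# Vectorized version: the division is applied to every element at once
heights_m_vec <- heights_cm / 100

# Check that both approaches give the same result
identical(heights_m_loop, heights_m_vec)

# Vectorized functions summarise a whole vector without a loop
mean(heights_m_vec)
```

The vectorized line does the same work as the loop in a single expression, which is usually both shorter to write and easier to read.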
If you would like a more informative description of why you should learn R, there is one blog post that explains it at length. There is also this one.
2.3 Advantages of R with RStudio
While R has a command-line interface, there are several graphical front-ends available. In this course we will explore RStudio, which has many (many!) features that will be useful in learning R. Here’s what the partnership between R and RStudio can do.
2.4 R with RStudio vs. other statistical software
If you are interested in knowing how R (and RStudio) compare to other software, here’s a good source. The information contained in the link is summarized in the table below.
2.4.1 R with RStudio vs. Python
If you are wondering whether to learn R or Python, both are great. Most statisticians probably have a slight preference for R, while data scientists tend to prefer Python. In my personal view, R may be a tad easier, but Python can be really useful, particularly for deep learning, scripting, and large data sets (millions of cases or more). One of the best freely available resources discussing this issue is on datacamp.com.
2.5 Getting Started with R & RStudio
2.5.1 Downloading and Installing R
This is the website where you can download R, and many of the library packages that are available.
2.5.2 Updating R
If you already have R installed, you will want to update it to the latest version. On Windows, you can do so by running the code below (the installr package only supports Windows). It will check for newer versions, and if one is available, it will guide you through the decisions you will need to make.
install.packages("installr")  # install the R package that facilitates the process
library(installr)             # load the package in R
updateR()                     # update R
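To see which version of R is currently running, for example before or after an update, the following minimal sketch uses only base R. The threshold `"4.0.0"` stored in `minimum_version` is an arbitrary value chosen for illustration, not a requirement of this course.

```r
# Print the full version string of the running R session
print(R.version.string)

# getRversion() returns a version object that can be compared with a string
current_version <- getRversion()

# Hypothetical threshold, chosen only for illustration
minimum_version <- "4.0.0"

if (current_version < minimum_version) {
  message("R ", as.character(current_version),
          " is older than ", minimum_version, "; consider updating.")
} else {
  message("R ", as.character(current_version),
          " is at least ", minimum_version, ".")
}
```

Comparing version objects rather than raw strings avoids mistakes such as treating "4.10" as smaller than "4.9".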
2.5.3 Downloading RStudio
RStudio is a great interface that makes R a lot more accessible. RStudio includes a console, a syntax-highlighting editor that supports direct code execution, and tools for plotting, history, debugging and workspace management.
2.5.4 Updating RStudio
If you have RStudio installed, you will also want its latest version. Go to Help > Check for Updates in the menu.
2.6 Need more help?
Here’s a video depicting the installation of R and RStudio (link).
If you would like to learn R with video lessons, on this page you will find a collection of online R video courses on YouTube.
2.7 RStudio Settings: personal recommendations
Before we start the workshop, let's go through a number of settings that are worth knowing about.
These settings will give us:
- Code completion
- Inline documentation
- Live preview of R markdown documents
Click on the Tools menu and find Global Options (the last option).