In this lesson, we will learn step by step how to install R packages from CRAN repositories with the install.packages() command and from GitHub with the devtools package.
We will look at the syntax of install.packages() and learn exactly how to use it.
We will also look into how package management in R works, how CRAN mirrors are selected and used, and how package masking works when two packages use the same function name. And if we learn to install R packages from the command line, we should also learn how to remove them. Right?
We will not cover installing R and RStudio on your computer; I assume you have already done that. However, if not, you can follow the guide in the link above and get R and RStudio up and running on your operating system before proceeding further with this guide.
But first, let’s clear the air regarding packages in R.
Understand Package Management In R
Simply put, an R package is a collection of R functions, datasets, support files, and compiled code, bundled together in a compact and well-defined format.
These packages are just compressed files that must be unzipped and placed in the right location on your machine before you can use them in R.
This entire process happens automatically and does not require user intervention. Aside from the software itself, the R installation file includes about 30 default or recommended packages, of which about seven are loaded into memory as soon as R is launched.
These packages are stored in a designated folder on your computer and are used for a broad range of computational tasks, such as data management and statistical analysis.
Nevertheless, because R is an open-source programming language, many user-contributed packages are available for various purposes and are publicly accessible to anyone.
These packages can be accessed from the CRAN website or R’s repositories. The following segment provides further specifics regarding how to download and install the user-contributed R packages.
Launch R and execute the following command in the R console:
search()
And the output:
The R output above (second and third lines) displays the items on the search path. The numbers in square brackets show the positional index of the item immediately to the right; e.g., the number [5] indicates that the fifth item in the list is the grDevices package.
You can observe that not all packages have a position number [x] assigned to them, but as long as you can count them, you can figure out the respective position yourself.
The above output may look slightly different on your system. One reason is that R wraps the output to fit the width of the console window, so the items that start each line, and therefore the bracketed numbers you see, depend on the size of the window.
Another reason might be that you may have other packages loaded in the system, either because you installed them yourself or because the R team has included (or removed) regular packages in newer versions of R.
Next, let’s have a look at some of the most important entries in the search() output shown above:
The .GlobalEnv entry is always located in the first position [1] of the search path and is not an R package. .GlobalEnv stands for Global Environment and represents the place where freshly created R objects are stored in memory.
The package:base entry is always positioned last, and unlike the other packages, it cannot be removed.
If you are using RStudio, you will see an additional entry, tools:rstudio, on top of the output above.
Remember that the 30 default R packages are not all loaded into memory when R is launched. You can use the library() function to load any of them, or any other package that is installed but not yet loaded.
For instance, MASS is one of the 30 packages that are installed but not loaded into memory at startup. I will use the library() command to load it:
library(MASS)
And the output:
As you can see, package:MASS is placed at position [2] once loaded, and all the other packages are pushed one position higher.
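It is important to remember that the position [x] of each package in the search path is significant, because it establishes the priority used to resolve function names: when two packages define a function with the same name, R uses the one from the package with the lower position number.
When we load packages into R memory, we ensure the functions in these packages are available throughout the session. Remember that every loaded package also takes up memory, and memory management plays a crucial role in programming.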
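To see the effect of loading a package on the search path, here is a minimal sketch. It assumes a fresh R session and uses splines, a base package that is not attached at startup, as a stand-in for MASS so that it runs even where the recommended packages are not installed.

```r
# Record the search path before and after attaching a package.
before <- search()
library(splines)          # splines stands in for MASS here (assumption)
after <- search()

# Position of the newly attached package on the search path.
pos <- match("package:splines", after)
print(pos)

# Entries that are new on the search path after the call to library().
new_entries <- setdiff(after, before)
print(new_entries)
```

The same two calls to search() around any library() call show where the new package lands and which entries were pushed further down.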
Packages that are no longer necessary in a session can be removed from the search path. To unload a package from memory, use the detach command as follows:
detach(package:MASS)
You can also remove a package from memory by specifying its position number, as shown below:
detach(pos=2)
If you run both commands above in sequence in a default R session, the second one removes the stats package from the R search path, because stats moves to position 2 once MASS has been detached. Once it is removed, you will receive an error message if you try to use the functions in this package.
For more details, check the help page of detach. You can load the package again at any time, without any adverse effects on your system, using the command below:
library(stats)
As mentioned before, the package:base and .GlobalEnv entries cannot be removed.
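Calling detach on a package that is not on the search path raises an error, so a small guard can be handy in scripts. The helper below is a sketch only: detach_if_attached is a hypothetical name, not a function that ships with R, and splines is again used as a stand-in package.

```r
# Hypothetical helper: detach a package only if it is currently attached.
# Returns TRUE if the package was detached and FALSE if it was not attached.
detach_if_attached <- function(pkg) {
  entry <- paste0("package:", pkg)
  if (entry %in% search()) {
    detach(entry, character.only = TRUE)
    return(TRUE)
  }
  FALSE
}

library(splines)
first_call  <- detach_if_attached("splines")   # package was attached
second_call <- detach_if_attached("splines")   # nothing left to detach
```

Passing the entry as a string with character.only = TRUE is what allows the package name to come from a variable.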
An essential thing to remember is that packages that you load manually using the library command will be automatically detached when you quit R and will not be reloaded when you start another R session.
And one more thing. You may have heard about user-contributed R packages. If not, you should know that these packages are developed by R users worldwide and are entirely free.
The user-contributed packages are available on the CRAN project website, and there are enough of them to cover almost any scenario you may need. Go have a look.
There are over 14,000 community packages ready to install and use in R at the time of writing this guide – though the number increases by the day. The user-contributed packages aim to reduce the complexity of numerous commands required to accomplish specific tasks in R.
How To Install R Packages
When we choose to use a function or dataset from a user-contributed package in R, we must follow two basic steps:
1. Install R packages by executing the install.packages() function.
Let’s install the RMySQL package, which connects R to MySQL databases, using the command:
install.packages("RMySQL")
R will most likely show the following output message in your terminal:
# — Please select a CRAN mirror for use in this session —
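This message does not mean that something went wrong. It means that no CRAN mirror has been selected for the current session yet, so R asks us to choose the mirror from which the RMySQL package will be downloaded.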
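If you downloaded the package manually from the Internet, you can install it directly from the respective .zip or .tar.gz file by placing the path to the file on your computer between the quotes ("") in the command above. Adding the argument repos = NULL tells R explicitly that it should install from a local file rather than from a repository.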
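As an illustration of installing from a downloaded file, here is a hedged sketch. The function name install_local_package and the example path are hypothetical, and the choice of type is a simplification: it treats .zip files as Windows binaries and everything else as source archives, and does not handle macOS binary archives.

```r
# Hypothetical helper: install a package from a file on disk.
install_local_package <- function(path) {
  if (!file.exists(path)) {
    stop("Package file not found: ", path)
  }
  # Simplifying assumption: .zip means a Windows binary, anything else is source.
  pkg_type <- if (grepl("\\.zip$", path)) "win.binary" else "source"
  # repos = NULL makes install.packages() read the local file
  # instead of looking the name up in a repository.
  install.packages(path, repos = NULL, type = pkg_type)
}

# Example call with a made-up path (left commented out on purpose):
# install_local_package("~/Downloads/RMySQL.tar.gz")
```

Installing a source archive may require build tools on your machine, which is one reason CRAN binaries are usually the easier route.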
2. Install R packages using CRAN mirrors.
For instance, when installing the RMySQL package, you will be asked to select the CRAN mirror from which the package is downloaded.
You may get a repository list window or a text menu with a few choices. If neither appears, you can still choose the mirror from which to download the packages using the repos= parameter, and when you do so, R will not ask you to select a CRAN mirror for that call.
Here is an example of using the US mirror to get the RMySQL package in my R system:
install.packages('RMySQL', repos='http://cran.us.r-project.org')
And the command output:
Here, you can find the list of all available R mirrors in various geographical locations. You should select the nearest CRAN mirror to your place, especially if you have a slow Internet connection.
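In scripts, it is common to install a package only when it is missing and to pass the mirror explicitly so that R never stops to ask. The sketch below assumes the hypothetical helper name install_if_missing and uses https://cloud.r-project.org as the default mirror; substitute the mirror closest to you.

```r
# Hypothetical helper: install only the packages that are not available yet.
# Returns (invisibly) the names of the packages that had to be installed.
install_if_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!available]
  if (length(missing_pkgs) > 0) {
    install.packages(missing_pkgs, repos = repos)
  }
  invisible(missing_pkgs)
}

# stats ships with R, so nothing is downloaded here.
needed <- install_if_missing("stats")

# Example call that would contact the mirror (left commented out):
# install_if_missing("RMySQL")
```

Note that requireNamespace() loads a package's namespace without attaching it to the search path, so the check does not change the output of search().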
Finally, it is essential to remember that a package cannot be detached while another attached package depends on it.
Where Are R Packages Installed?
Depending on the operating system you are using (Windows, macOS, Linux/UNIX) or your user privileges, the location of the installed R packages may differ, as well as the access to the R package installation folder.
To find out the path where R is storing its packages, type the following command in R:
.libPaths()
Typically, on a Windows machine, the R packages will be located in the library subfolder of your R version inside the “C:\Program Files\R” folder.
On a macOS machine, the R packages are typically installed in the “/Library/Frameworks/R.framework/Resources/library” folder.
If you prefer a custom location for installing R packages, you will need to define it in the .Rprofile file. For instance, on a Mac computer, we can instruct R to install R packages at a custom location using:
.libPaths( "/Users/tex/lib/R" )
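On Windows, the site-wide startup file (Rprofile.site) is typically located in the C:\Program Files\R\R-…\etc folder, or in the corresponding folder if you specified a custom location in the R installation wizard, while a per-user .Rprofile file is read from your home directory.
Once the line above is in the startup file, R will use the new path in every session and install the packages at this location from now on.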
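To try out a custom library location without touching any startup file, here is a short sketch. The folder name my-r-library inside the session's temporary directory is an assumption chosen so that the example cleans up after itself; a real setup would use a permanent folder.

```r
# Create a throwaway library folder (hypothetical location for illustration).
custom_lib <- file.path(tempdir(), "my-r-library")
dir.create(custom_lib, recursive = TRUE, showWarnings = FALSE)

# Put the new folder in front of the existing library paths.
# .libPaths() silently ignores folders that do not exist, hence dir.create() above.
.libPaths(c(custom_lib, .libPaths()))

# The first entry is where install.packages() installs by default.
print(.libPaths())
```

A change made this way lasts only for the current session, which is why a permanent setting belongs in a startup file.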
How To Install R Packages From GitHub
Some R packages developed by the R community are located on GitHub. To install R packages from GitHub, we will need to install the devtools package in R first. To do this, type the following command in the R console:
install.packages("devtools")
The devtools package and several dependencies are now installed in your system. However, devtools is not loaded into R memory yet; therefore, we need to instruct R to do so using the following command:
library(devtools)
To install R packages from GitHub, head over to GitHub and take note of the package author and package name.
In this example, I will install Allison Horst’s palmerpenguins package by using the install_github function.
install_github("allisonhorst/palmerpenguins")
As you can see, palmerpenguins is now listed in the Packages tab in RStudio.
And, as mentioned before, the palmerpenguins package is not loaded into memory until we call the library function:
library(palmerpenguins)
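The GitHub steps above can be wrapped into one function. This is a sketch under stated assumptions: install_from_github is a hypothetical name, and the format check only accepts the plain "author/package" form described in this guide, although install_github itself understands more elaborate specifications.

```r
# Hypothetical helper: install a package from GitHub given "author/package".
install_from_github <- function(repo) {
  # Simplified check (assumption): exactly one slash separating two non-empty parts.
  if (!grepl("^[^/]+/[^/]+$", repo)) {
    stop("Expected 'author/package', got: ", repo)
  }
  # Install devtools first if it is not available yet.
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }
  # Calling the function with devtools:: avoids attaching devtools.
  devtools::install_github(repo)
}

# Example call (left commented out because it downloads from GitHub):
# install_from_github("allisonhorst/palmerpenguins")
```

Using the devtools:: prefix keeps the search path unchanged, which matters once masking comes into play.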
How To Uninstall R Packages
We’ve seen that it is quite easy to install R packages. What about uninstalling them? That is simple as well. The command to uninstall R packages is remove.packages(). The package’s name must be placed between quotes (""), as shown in the following example.
Earlier, we installed the palmerpenguins package from GitHub. Using the command below, we can remove this package from R:
remove.packages("palmerpenguins")
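Calling remove.packages() on a package that is not installed raises an error, so a script may want to check first. The following is a minimal sketch; remove_if_installed is a hypothetical helper name, and the package name in the example is made up on purpose so that nothing is removed.

```r
# Hypothetical helper: remove a package only if it is installed.
# Returns (invisibly) TRUE if the package was found and removed, FALSE otherwise.
remove_if_installed <- function(pkg) {
  installed <- pkg %in% rownames(installed.packages())
  if (installed) {
    remove.packages(pkg)
  }
  invisible(installed)
}

# A made-up package name: nothing is installed under it, so nothing is removed.
was_removed <- remove_if_installed("aPackageThatDoesNotExist123")
```

installed.packages() can be slow when many packages are installed, so this check is best kept out of tight loops.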
How To Mask Packages In R
Some user-contributed R packages may contain functions with the same name as functions in another package. When you load such a package, R prints a message in the console listing the affected functions. This situation is called masking.
Two functions in the same package cannot share a name, just as you cannot create two files with the same name in one directory on your machine. However, functions in different packages can have the same name and do completely different things.
For example, after installing the dplyr package and loading it into memory with library(dplyr), R reports that objects from three packages already loaded in memory are masked, because dplyr defines functions with the same names.
If you want to use a function that a recently loaded package has masked, you have the following options:
- Detach the package you don’t use with the detach function, or
- Give the package you want to use a higher priority by loading it after the other packages in your project, so that it sits closer to the top of the search path.
To check which package has the highest priority, check the search path:
search()
The package with the smallest position number, that is, the one closest to .GlobalEnv, has the highest priority.
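Masking can be reproduced without installing anything. In the sketch below, a user-defined function named filter in the global environment masks the filter function from the stats package; this stands in for the dplyr case, which is an assumption made so that the example runs on any R installation. The package::function form reaches the masked function regardless of the search path order.

```r
# A function named 'filter' in the global environment masks stats::filter.
filter <- function(x) "my filter"

# The plain name resolves to the entry with the highest priority (.GlobalEnv).
mine <- filter(1:3)

# The :: operator names the package explicitly and bypasses the masking.
explicit <- stats::filter(1:5, rep(1 / 3, 3))

# conflicts() lists names that appear more than once on the search path.
masked <- "filter" %in% conflicts()

# Clean up so that 'filter' refers to stats::filter again.
rm(filter)
```

Writing stats::filter() or dplyr::filter() in your own code is a simple way to make it independent of the order in which packages were loaded.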
In conclusion, user-contributed packages should be used only when you need them. If you plan not to use a package that is loaded in the memory, a good practice is to detach it to avoid further function conflicts.
Remember that when you quit R, all the packages loaded in the memory will be automatically detached. When starting a new session, R will load only the base packages.
Wrapping Up
Though there are definitely more scenarios to cover, by now you should be fairly confident with how to install R packages from various sources and manage R packages in your system.
However, if something does not go as planned, you can always refer to the R manuals available by executing the help.start() command in the R console.