R

Description

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment, itself developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, and so forth) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

The R home page is at http://www.r-project.org.

Available modules

Packages with modules

Module                   NeSI Cluster
R/3.0.1                  fitzroy
R/3.4.2-gimkl-2017a      pan
R/3.0.3-goolf-1.5.14     pan
R/3.1.1-goolf-1.5.14     pan
R/3.1.1-iomkl-6.5.4      pan
R/3.1.2-goolf-1.5.14     pan
R/3.2.1-intel-2015a      pan
R/3.3.0-intel-2015a      pan
R/3.4.0-gimkl-2017a      pan

Licence

R is made available at no cost under the terms of version 2 of the GNU General Public Licence. The full text of the R licence is available at https://www.r-project.org/COPYING.

Example scripts

Example R scripts

Example serial R script

png(filename="plot.png")  # This line redirects plots from screen to plot.png file.

# Define the cars vector with 5 values
cars <- c(1, 3, 6, 4, 9)

# Graph the cars vector with all defaults
plot(cars)

# Close the PNG device so that plot.png is written out completely
dev.off()

Example array R script

# Read the index of this task within the job array (set by Slurm for each array task)
jobid <- as.numeric(Sys.getenv("SLURM_ARRAY_TASK_ID"))
jobid
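
A common use of the task ID is to pick one item from a list of inputs, so that each array task works on a different case. The following is a minimal sketch of that idea. The parameters vector and the fallback to task 1 when the variable is unset (for example, when testing outside Slurm) are assumptions made for illustration and are not part of the example above.

```r
# Read the array task ID; fall back to 1 when running outside Slurm
# (an assumption for illustration, so the script can be tested interactively).
task_id <- suppressWarnings(
  as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID", unset = "1"))
)
if (is.na(task_id)) task_id <- 1L

# Hypothetical list of inputs: one entry per array task (--array 1-10).
parameters <- seq(10, 100, by = 10)

# Each task selects its own parameter.
my_parameter <- parameters[task_id]
cat("Task", task_id, "uses parameter", my_parameter, "\n")
```

With --array 1-10, as in the array job submission script, each of the ten tasks would pick a different element of parameters.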

Example parallel script using doParallel

library(doParallel)
registerDoParallel(strtoi(Sys.getenv('SLURM_CPUS_PER_TASK')))

# 51 calculations to be done (z = 1000000 to 1000050):
foreach(z=1000000:1000050) %dopar% {
  x <- sum(rnorm(z))
}
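
For comparison, a similar pattern can be written with the parallel package that ships with R, which avoids the dependency on doParallel. This is only a sketch and swaps in a different technique (mclapply instead of foreach). The smaller problem sizes and the default of one core when SLURM_CPUS_PER_TASK is unset are assumptions made so that it runs quickly anywhere.

```r
library(parallel)

# Number of cores allocated by Slurm; default to 1 if the variable is unset
# (assumption for illustration).
n_cores <- suppressWarnings(
  as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
)
if (is.na(n_cores) || n_cores < 1) n_cores <- 1L

# Ten small calculations (much smaller than the example above).
sizes <- 1001:1010
results <- mclapply(sizes, function(z) sum(rnorm(z)), mc.cores = n_cores)

# mclapply returns a list; flatten it to a numeric vector of sums.
sums <- unlist(results)
print(length(sums))
```

Unlike the foreach loop above, whose return value is discarded, this sketch keeps the results in sums so that they can be saved or summarised later in the script.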

Example parallel script using doMPI

library(doMPI, quietly=TRUE)
cl <- startMPIcluster()
registerDoMPI(cl)

# 51 calculations to be done (z = 1000000 to 1000050):
foreach(z=1000000:1000050) %dopar% {
  x <- sum(rnorm(z))
}

closeCluster(cl)
mpi.quit()

Example parallel script using snow

library(snow)
# If there are multiple tasks, only one continues past this point; the others become workers.

# Select MPI-based or socket-based parallelism depending on ntasks
if (strtoi(Sys.getenv('SLURM_NTASKS')) > 1) {
    cl <- makeMPICluster()
} else {
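    # Fall back to a single worker if SLURM_CPUS_PER_TASK is unset (strtoi then gives NA or 0)
    cl <- makeSOCKCluster(max(strtoi(Sys.getenv('SLURM_CPUS_PER_TASK')), 1, na.rm=TRUE))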
}

# 51 calculations to be done (z = 1000000 to 1000050):
x <- clusterApply(cl, 1000000:1000050, function(z) sum(rnorm(z)))

stopCluster(cl)

Example job submission scripts for the Pan cluster

Example job submission script for a serial R job on the Pan cluster

#!/bin/bash -e

#SBATCH --job-name    MySerialRJob
#SBATCH --account     nesi99999
#SBATCH --time        01:00:00
#SBATCH --mem-per-cpu 4G
#SBATCH --output      MySerialRJob.%j.out # Include the job ID in the names of
#SBATCH --error       MySerialRJob.%j.err # the output and error files

module load R/3.3.0-intel-2015a

# Help R to flush errors and show overall job progress by printing
# "executing" and "finished" statements.
echo "Executing R ..."
srun Rscript MySerialRJob.R
echo "R finished."

Example job submission script for an array R job on the Pan cluster

#!/bin/bash -e

#SBATCH --job-name    MyArrayRJob
#SBATCH --account     nesi99999
#SBATCH --time        01:00:00
#SBATCH --array       1-10
#SBATCH --mem-per-cpu 4G
#SBATCH --output      MyArrayRJob.%j.out # Include the job ID in the names of
#SBATCH --error       MyArrayRJob.%j.err # the output and error files

module load R/3.3.0-intel-2015a

# Help R to flush errors and show overall job progress by printing
# "executing" and "finished" statements.
echo "Executing R ..."
srun Rscript MyArrayRJob.R
echo "R finished."

Example job submission script for an MPI R job on the Pan cluster

#!/bin/bash -e

#SBATCH --job-name      MyMPIRJob
#SBATCH --account       nesi99999
#SBATCH --time          01:00:00
#SBATCH --ntasks        12
#SBATCH --cpus-per-task 1
#SBATCH --mem-per-cpu   2G
#SBATCH --output        MyMPIRJob.%j.out # Include the job ID in the names of
#SBATCH --error         MyMPIRJob.%j.err # the output and error files

module load R/3.3.0-intel-2015a

# Help R to flush errors and show overall job progress by printing
# "executing" and "finished" statements.
echo "Executing R ..."
# Our R has a patched copy of the snow library so that there is no need to use
# RMPISNOW.
srun Rscript MyMPIRJob.R
echo "R finished."

Example job submission script for the Fitzroy cluster

#!/bin/bash -e

#@ job_name         = MyRJob
#@ account_no       = nesi99999
#@ class            = General
#@ wall_clock_limit = 01:00:00
#@ initialdir       = /hpcf/working/nesi99999/MyRJob
#@ output           = $(job_name).$(jobid).out
#@ error            = $(job_name).$(jobid).err
#@ queue

# LoadLeveler has an annoying habit of transferring parts of the user's
# environment as it existed at the time of submission to the job. Clear any
# loaded modules.
module purge

module load R/3.0.1

# Help R to flush errors and show overall job progress by printing
# "executing" and "finished" statements.
echo "Executing R ..."
Rscript MyRJob.R
echo "R finished."

Further notes

Generating images and plots

Normally when plotting or generating other sorts of images, R expects a graphical user interface to be available so it can render and display the image on the fly. However, it is possible to instruct R to export the image directly to a file instead of displaying it on the screen, using code like the following:

png(filename="plot.png")

This statement instructs R to send all subsequent graphical output to a PNG file named plot.png, until the device is closed with dev.off() or a different device driver is selected.

For more information about graphical device drivers, please see the R documentation.
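
As a slightly fuller illustration, the sketch below opens a PNG device, draws a plot, and closes the device again so that the file is written out completely. The output location (a file in R's temporary directory), the image size and the plot title are assumptions chosen for illustration; in a real job you would normally write to a path in your project directory.

```r
# Hypothetical output location: a file in the session's temporary directory.
plot_file <- file.path(tempdir(), "cars.png")

# Open the PNG device; width and height are in pixels.
png(filename = plot_file, width = 800, height = 600)

# Everything plotted now goes into the file instead of the screen.
cars <- c(1, 3, 6, 4, 9)
plot(cars, type = "o", main = "Cars")

# Close the device so that the file is flushed to disk.
invisible(dev.off())

print(file.exists(plot_file))
```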

Dealing with packages

Much R functionality is not supplied with the base installation, but is instead added by means of packages written by the R developers or by third parties.

Getting a list of available packages

It is best to view the list of available R packages interactively. To do so, go to an appropriate node, such as a build node on the Pan cluster, and call up the package library:

[jblo123@login-01 ~]$ ssh build-sb
[jblo123@build-sb ~]$ module load R/3.2.1-intel-2015a
[jblo123@build-sb ~]$ R
...
> library()

Please note that different installations of R, even on the same NeSI cluster, may contain different collections of packages. Furthermore, if you have your own packages in a directory that R can automatically detect, these will also be shown in a separate section.

Getting a list of available libraries

You can print a list of the library directories in which R will look for packages by running the following command:

> .libPaths()

Specifying custom library directories

You can add your own custom library directories by putting a list of extra directories in the .Renviron file in your home directory. This list should look like the following:

R_LIBS="/home/jblo123/R/foo:/home/jblo123/R/bar"

Note that, of the contents of the R_LIBS variable, only those directories that actually exist will show up in the output of .libPaths().
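
A library directory can also be added from within a running R session or script by calling .libPaths() with an argument, which affects the current session only. The following is a minimal sketch; the directory used is hypothetical and is created under R's temporary directory so that the example can be run anywhere.

```r
# Hypothetical personal library directory (placed in tempdir() for illustration).
my_lib <- file.path(tempdir(), "R", "foo")

# The directory must exist, otherwise R silently leaves it out.
dir.create(my_lib, recursive = TRUE, showWarnings = FALSE)

# Put the new directory at the front of the library search path.
.libPaths(c(my_lib, .libPaths()))

# The first entry is now the new directory.
print(.libPaths()[1])
```

Because the new directory comes first, install.packages() would use it as the default installation location for the rest of the session.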

Downloading and installing a new package

To install a package into R, use the install.packages command on an appropriate interactive node, such as a build node on the Pan cluster.

For example, to install the sampling package:

[jblo123@login-01 ~]$ ssh build-sb
[jblo123@build-sb ~]$ module load R/3.2.1-intel-2015a
[jblo123@build-sb ~]$ R
...
> install.packages("sampling")

You will most likely be asked if you want to use a personal library and, if you have not previously done so, whether you wish to create a new personal library. Answer "y" to both questions.

Enter the number for New Zealand from the list of download mirrors that will appear.

R will then download, compile and install the new package for you.

You can confirm the package has been installed by using the library() command:

> library("sampling")

If the package has been correctly installed, you will get no response. On the other hand, if the package is missing or was not installed correctly, an error message will typically be returned:

> library("foo")
Error in library("foo") : there is no package called ‘foo’
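
In a batch script it can be more convenient to test for packages without stopping at the first error. One way to do this, sketched below, is requireNamespace(), which returns TRUE or FALSE instead of raising an error. The package names are examples only: "stats" ships with R, and "foo" is a made-up name that is assumed not to be installed.

```r
# Packages to check: "stats" ships with R; "foo" is a made-up name that is
# assumed not to be installed.
packages <- c("stats", "foo")

# requireNamespace() returns TRUE if the package can be loaded and FALSE
# otherwise, without raising an error.
installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Report anything that is missing.
missing_packages <- packages[!installed]
if (length(missing_packages) > 0) {
  message("Missing packages: ", paste(missing_packages, collapse = ", "))
}
```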

Compiling a C library for use with R

You can compile custom C libraries for use with R using the R shared library compiler. It is best to do this from one of the build nodes:

[jblo123@login-01 ~]$ ssh build-sb
[jblo123@build-sb ~]$ module load R/3.2.1-intel-2015a
[jblo123@build-sb ~]$ R CMD SHLIB mylib.c

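This will create the shared object mylib.so in the current directory. You can then load the library in your R script by passing its location to dyn.load(). The following example assumes that mylib.so has been moved to ~/R/lib64: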

[jblo123@build-sb ~]$ R
...
> dyn.load("~/R/lib64/mylib.so")

Quitting an interactive R session

At the R command prompt, when you want to quit R, type the following:

> quit()

You will be asked "Save workspace image? [y/n/c]". Type n to quit without saving the workspace.
