R

R+RStudio+Packages

Note: R and RStudio are separate downloads and installations. R is the underlying statistical computing environment, but using R alone is no fun. RStudio is a graphical integrated development environment that makes using R much easier. You need R installed before you install RStudio.

  1. Install R. You’ll need R version 3.1.2 or higher [1]. Download and install R for Windows or Mac OS X (download the latest R-3.x.x.pkg file for your version of OS X).
  2. Install RStudio. Download and install RStudio Desktop. RStudio version 1.0 was released November 1, 2016, and contains many improvements over the 0.99 release. Please download version 1.0 or later.
  3. Install R packages. Launch RStudio (RStudio, not R itself), then enter the following command into the Console panel (the lower-left panel by default). A few notes:
    • Commands are case-sensitive.
    • You must be connected to the internet.
    • The tidyverse package is a meta-package that automatically installs/loads many core packages that we use throughout the workshops [2].
    • Even if you’ve installed these packages in the past, do re-install the most recent version. Many of these packages are updated often, and we may use new features in the workshop that aren’t available in older versions.
    • If you’re using Windows you might see errors about not having permission to modify the existing libraries. You can disregard these, or avoid them by running RStudio as an administrator (right click the RStudio icon, then click “Run as Administrator”).
install.packages("tidyverse")

You can check that you’ve installed everything correctly by closing and reopening RStudio and entering the following commands at the console window (don’t worry about the Conflicts with tidy packages warning):

library(tidyverse)

This may produce some notes or other output, but as long as you don’t get an error message, you’re good to go. If you get a message that says something like: Error in library(packageName) : there is no package called 'packageName', then the required packages did not install correctly. Please do not hesitate to email me prior to the course if you are still having difficulty.
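
As a quick sanity check after installation, the sketch below uses only base R to confirm the R version requirement and to test whether a package is installed without attaching it. The names min_r_version, r_ok, and is_installed are made up for this illustration; they are not part of R or the tidyverse.

```r
# Minimum R version required by the workshop packages (3.1.2, as stated above).
min_r_version <- "3.1.2"

# getRversion() returns the running R version as a comparable object.
r_ok <- getRversion() >= min_r_version

# is_installed() is a hypothetical helper: TRUE if the package can be loaded,
# without attaching it to the search path.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

message("R version OK: ", r_ok)
message("tidyverse installed: ", is_installed("tidyverse"))
```

If either line reports FALSE, revisit the corresponding installation step above before the class.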

Bioconductor

For some classes (e.g., RNA-seq), you’ll need to install a few Bioconductor packages. These packages are installed differently than “regular” R packages from CRAN. Copy and paste these lines of code into your R console.

source("http://bioconductor.org/biocLite.R")
biocLite()
biocLite("DESeq2")

A few notes:

  • R may ask you if you want to update any old packages by asking Update all/some/none? [a/s/n]:. If you see this, type a at the prompt and hit Enter to update any old packages.
  • If you see a note along the lines of “binary version available but the source version is later”, followed by a question, “Do you want to install from sources the package which needs compilation? y/n”, type n for no, and hit Enter.
  • You can check that you’ve installed everything correctly by closing and reopening RStudio and entering the following commands at the console window:
library(DESeq2)

If you get a message that says something like: Error in library(DESeq2) : there is no package called 'DESeq2', then the required packages did not install correctly. Please do not hesitate to email me prior to the course if you are still having difficulty.
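
Note that biocLite() was replaced by the BiocManager package starting with Bioconductor 3.8, so the commands above may fail on a recent R installation. The sketch below picks an installer based on the running R version. The helper names bioc_installer and install_bioc are hypothetical, introduced here for illustration, and the 3.5.0 cutoff is an assumption based on the minimum R version that BiocManager declares.

```r
# Decide which Bioconductor installer applies to a given R version.
# Assumption: BiocManager for R >= 3.5.0, biocLite() for older releases.
bioc_installer <- function(r_version = getRversion()) {
  if (numeric_version(as.character(r_version)) >= "3.5.0") {
    "BiocManager"
  } else {
    "biocLite"
  }
}

# Install Bioconductor packages with whichever installer applies.
# Requires internet access; nothing is installed until this function is called.
install_bioc <- function(pkgs) {
  if (bioc_installer() == "BiocManager") {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(pkgs)
  } else {
    source("https://bioconductor.org/biocLite.R")
    biocLite(pkgs)
  }
}

# Example call (commented out so that pasting this block installs nothing):
# install_bioc("DESeq2")
```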

RMarkdown

Several additional setup steps are required for the Reproducible Research with RMarkdown class.

  1. First, install R, RStudio, and the tidyverse package as described above. Also install the knitr and rmarkdown packages.

    install.packages("knitr")
    install.packages("rmarkdown")
  2. Next, launch RStudio (not R). Click File, New File, R Markdown. This may tell you that you need to install additional packages (knitr, yaml, htmltools, caTools, bitops, and rmarkdown). Click “Yes” to install these.
  3. Sign up for a free account at RPubs.com.
  4. Optional: If you want to convert to PDF, you will need to install a LaTeX typesetting engine. This differs on Mac and Windows. Note that this part of the installation may take up to several hours, and isn’t strictly required for the class.
    • Windows LaTeX instructions:
      1. Download the installer using this link (or this link if you’re using an older 32-bit version of Windows). It is important to use the full installer, not the basic installer. Run the installer .exe that you downloaded.
      2. Run the installer twice, making sure to use the Complete, not Basic, installation:
        1. First, when prompted, select the box to “Download MiKTeX.” Select the closest mirror to your location. If you’re doing this from Charlottesville, the United States / JMU mirror is likely the closest. This may take a while.
        2. Run the installer again, but this time select “Install” instead of “Download.” When prompted “Install missing packages on-the-fly”, drag your selection up to “Yes.”
    • Mac LaTeX instructions:
      1. Download the installer .pkg file using this link. This is a very large download (>2 gigabytes). It can take a while depending on your network speed.
      2. Run the installer package.
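
Before trying to convert to PDF, it can help to confirm that a LaTeX engine is visible to R. The following is a minimal sketch using base R's Sys.which(), assuming the engine is exposed as pdflatex on your PATH; your installation may differ, and you may need to restart RStudio after installing LaTeX before it is found.

```r
# Sys.which() returns the full path of an executable found on the PATH,
# or an empty string if it cannot be found.
latex_path <- Sys.which("pdflatex")
has_latex <- unname(nzchar(latex_path))

if (has_latex) {
  message("Found LaTeX engine at: ", latex_path)
} else {
  message("pdflatex not found; PDF output will likely fail until LaTeX is installed.")
}
```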

Interactive Visualization

The Interactive Visualization with JavaScript and R lesson requires installation of several R packages in addition to those mentioned above:

install.packages("highcharter")
install.packages("d3heatmap")
install.packages("leaflet")
install.packages("visNetwork")
install.packages("jsonlite")

To check that these are correctly installed, first close RStudio and then reopen it and run the following:

library(highcharter)
library(d3heatmap)
library(leaflet)
library(visNetwork)
library(jsonlite)

These commands may produce some notes or other output, but as long as they work without an error message, you’re good to go. If you get a message that says something like: Error in library(packageName) : there is no package called 'packageName', then the required packages did not install correctly. Please do not hesitate to email me prior to the course if you are still having difficulty.
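
To check all of these packages in one pass rather than one library() call at a time, a sketch along the following lines reports which ones are missing. The helper name find_missing is hypothetical, not part of any package.

```r
# Packages required for the Interactive Visualization lesson.
viz_pkgs <- c("highcharter", "d3heatmap", "leaflet", "visNetwork", "jsonlite")

# find_missing() is a hypothetical helper: it returns the names of the
# packages that cannot be loaded on this machine.
find_missing <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

missing_pkgs <- find_missing(viz_pkgs)
if (length(missing_pkgs) == 0) {
  message("All packages are installed.")
} else {
  message("Missing: ", paste(missing_pkgs, collapse = ", "))
}
```

The same helper works for any other lesson: pass it a character vector of that lesson's package names.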

Shiny

The Building Shiny Web Apps in R lesson requires installation of several R packages in addition to those mentioned above:

install.packages("shiny")
install.packages("shinythemes")
install.packages("lubridate")

To check that these are correctly installed, first close RStudio and then reopen it and run the following:

library(shiny)
library(shinythemes)
library(lubridate)

These commands may produce some notes or other output, but as long as they work without an error message, you’re good to go. If you get a message that says something like: Error in library(packageName) : there is no package called 'packageName', then the required packages did not install correctly. Please do not hesitate to email me prior to the course if you are still having difficulty.

Get Data

The data used in any of these classes can be found at the data link on the navbar at the top. Create a new folder somewhere on your computer that’s easy to get to (e.g., your Desktop). Name it bioconnector. Inside that folder, make a folder called data, all lowercase. Download datasets as needed by clicking here or using the link at the top. Save these data files to the new bioconnector/data folder you just made.
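
The folder layout described above can also be created from the R console. This is a minimal sketch using base R; make_workshop_dirs is a hypothetical helper name, and the Desktop path in the commented example is an assumption about where you want the folder to live.

```r
# Create <base>/bioconnector/data and return the path to the data folder.
make_workshop_dirs <- function(base) {
  data_dir <- file.path(base, "bioconnector", "data")
  # recursive = TRUE also creates the parent bioconnector folder;
  # showWarnings = FALSE keeps this quiet if the folders already exist.
  dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)
  data_dir
}

# Example call (adjust the base path for your machine):
# make_workshop_dirs(path.expand("~/Desktop"))
```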

RNA-seq

Software setup: Follow instructions above for R+RStudio+Packages and Bioconductor. See the sections above for full instructions and troubleshooting tips, but in summary, after installing R+RStudio, you’ll need the tidyverse, Bioconductor core, and DESeq2 packages.

# tidyverse pkg installs dplyr, tibble, tidyr, ggplot2, readr, etc.
install.packages("tidyverse")
# Install Bioconductor core packages and DESeq2
source("http://bioconductor.org/biocLite.R")
biocLite()
biocLite("DESeq2")

Download data we’ll use in class. Create a new folder somewhere on your computer that’s easy to get to (e.g., your Desktop). Name it bioconnector. Inside that folder, make a folder called data, all lowercase. Download the 3 data files below, saving them to the new bioconnector/data folder you just made.

Prerequisites! This is not an introductory R class. This class assumes you’re comfortable working in R, using ggplot2 for visualization, and using dplyr verbs and the %>% pipe operator for chaining together multiple operations. Work through the workshop materials below if you need a refresher.

Recommended reading prior to class:

  1. Conesa et al. “A survey of best practices for RNA-seq data analysis.” Genome Biology 17:13 (2016).
  2. Soneson et al. “Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences.” F1000Research 4 (2015).
  3. Abstract and introduction sections of Himes et al. “RNA-Seq transcriptome profiling identifies CRISPLD2 as a glucocorticoid responsive gene that modulates cytokine function in airway smooth muscle cells.” PLoS ONE 9.6 (2014): e99625.

Survival Analysis

Software setup: Follow instructions at the top for R+RStudio+Packages. Next, install the additional packages listed below from CRAN and Bioconductor by typing the commands below into your RStudio console window.

# Install from CRAN
install.packages("tidyverse")
install.packages("survminer")

# Install from Bioconductor
# Install Bioconductor core packages
source("http://bioconductor.org/biocLite.R")
biocLite()
# Install RTCGA and RTCGA data packages
biocLite("RTCGA")
biocLite("RTCGA.clinical")
biocLite("RTCGA.mRNA")

Prerequisites! This is not an introductory R class. Work through the workshop materials below if you need a refresher on R, data frames, data manipulation and visualization.




  1. R version 3.1.2 was released October 2014. If you have not updated your R installation since then, you need to upgrade to a more recent version, since several of the required packages depend on a version at least this recent. You can check your R version with the sessionInfo() command.

  2. Installing/loading the tidyverse package will install/load the core tidyverse packages that you are likely to use in almost every analysis: ggplot2 (for data visualisation), dplyr (for data manipulation), tidyr (for data tidying), readr (for data import), purrr (for functional programming), and tibble (for tibbles, a modern re-imagining of data frames). It also installs a selection of other tidyverse packages that you’re likely to use frequently, but probably not in every analysis (these are installed, but you’ll have to load them separately with library(packageName)). This includes: hms (for times), stringr (for strings), lubridate (for date/times), forcats (for factors), DBI (for databases), haven (for SPSS, SAS and Stata files), httr (for web APIs), jsonlite (for JSON), readxl (for .xls and .xlsx files), rvest (for web scraping), xml2 (for XML), modelr (for modelling within a pipeline), and broom (for turning models into tidy data). After installing tidyverse with install.packages("tidyverse") and loading it with library(tidyverse), you can use tidyverse_update() to update all the tidyverse packages installed on your system at once.