This page (revision-14) was last changed on 10-Jul-2014 13:17 by Andrew Tikhonov

This page was created on 02-Jun-2011 09:10 by Rodrigo Santamaria

Difference between version and

At line 17 changed 3 lines
!!Overview of the pipeline
The pipeline starts from raw CEL files located in a single folder and produces standard Bioconductor R objects containing expression levels after background correction, normalization and summarization via RMA.
!!Running aPE on the EBI R-Cloud
At line 21 removed 8 lines
__The different steps of the pipeline are:__
# Preparing the data and experimental metadata
# Generate a quality report
# Align to a reference
# Filter reads
# Generate an alignment report
# Estimate expression
# Generate a compared report
At line 30 changed 16 lines
!!Contents of this documentation
* [Running AEHTS on the EBI R cloud|ArrayExpressHTS R-Cloud Configuration]
** Find a dataset to analyse or upload your own data
** Launch the Workbench, register and create a new project
** Run the whole pipeline with default options
* [Installing and running AEHTS on a local computer|ArrayExpressHTS Local Configuration]
** [Pre-requisites|ArrayExpressHTS Local Configuration]
** [Preparing your own data|ArrayExpressHTS Data Preparation]
** [Preparing custom references|ArrayExpressHTS Custom References]
** [Running the pipeline|ArrayExpressHTS Local Configuration]
** [Re-running the pipeline|ArrayExpressHTS Re-running the Pipeline]
* [Advanced Configuration Options|ArrayExpressHTS Advanced Configuration Options]
** [Default options|ArrayExpressHTS Advanced Configuration Options]
** [Examples|ArrayExpressHTS Advanced Configuration Options]
* [Downstream Differential Expression Analysis|ArrayExpressHTS Differential Expression]
# __Choose a dataset to pre-process or upload your own data__ \\ aPE can be used on the EBI R-Cloud to pre-process private datasets by uploading their CEL files to your R/Bioconductor Workbench account. \\ You can also pre-process public datasets available through ArrayExpress. \\ You can use this [interface|] to search for a dataset.\\ \\
# __Launch the R-Cloud Workbench, register and create a new project__ \\ The EBI R-Cloud is a new service at the EBI that allows R users to log in and run distributed computational jobs remotely on its 64-bit Linux cluster. It is available through a Java client called the ArrayExpress R/Bioconductor Workbench. Open the following address in a browser and follow the instructions on the page to download or launch the Workbench: [] \\ \\ When connecting for the first time you will be asked to register. You will need to provide a username, password and e-mail address so that you can retrieve your long-running projects the next time you log in. Once registered, you can log in and create a new project.\\ \\ [Find your way around the workbench|] \\ \\
# __Pre-process your data__ \\ If the path to the folder holding your CEL files is stored in the variable 'td', affyParaEBI can be run within R with the following code: \\ {{{
> library(affyParaEBI)
> cluster <- makeCluster(10,type="RCLOUD")
> e <- preproParaEBI(path=td, clust=cluster)
> stopCluster(cluster); rm(cluster)
}}} \\ \\ When the pipeline finishes, 'e' will contain an ExpressionSet object that can then be used for downstream analysis. \\ \\ Pre-processing time depends on the size of the dataset and on the number of computing nodes allocated with "makeCluster" (10 in the above example). Typically, 50 nodes can pre-process 1000 CEL files in about 20 minutes. \\ \\ If you want to pre-process one or more experiments available in ArrayExpress, you can download the CEL files to a folder via the ArrayExpress R/Bioconductor package: \\{{{
> library(ArrayExpress)
> library(affyParaEBI)
> #1) Create a two-node cluster
> cluster<-makeCluster(2,type="RCLOUD")
> #2) Retrieve CEL files from experiment E-MEXP-328
> td<-tempdir()
> emexp328.raw<-ArrayExpress(input = "E-MEXP-328", path=td, save=TRUE)
> #3) Replace low performance nodes
> cluster<-clusterOptimization(cluster, subst=TRUE)
> #4) Perform parallel preprocessing of CEL files
> emexp328.proc<-preproParaEBI(path=td, clust=cluster)
> #5) Clean up cluster nodes and CEL files
> stopCluster(cluster); rm(cluster)
> file.remove(list.files(td, full.names=TRUE))
}}} \\
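Because the pipeline reads every CEL file from a single folder, it can be worth confirming what that folder contains before requesting cluster nodes. The following is a minimal sketch in base R only. The helper name find_cel_files and the scratch folder are hypothetical and are not part of affyParaEBI; the sketch also assumes uncompressed files ending in .CEL or .cel.

```r
# Hypothetical helper (not part of affyParaEBI): list the CEL files in a
# folder so its contents can be checked before a cluster is created.
find_cel_files <- function(dir) {
  if (!dir.exists(dir)) stop("Folder not found: ", dir)
  # Match the .CEL extension case-insensitively and return full paths.
  list.files(dir, pattern = "\\.cel$", ignore.case = TRUE, full.names = TRUE)
}

# Demonstration on a scratch folder with two empty placeholder CEL files
# and one unrelated file that should be ignored.
demo_dir <- file.path(tempdir(), "cel_check_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("sample1.CEL", "sample2.cel", "notes.txt")))

cel_files <- find_cel_files(demo_dir)
cat(length(cel_files), "CEL files found\n")
```

In practice the same call would be made on 'td', and the number of files found can help in choosing how many nodes to pass to makeCluster.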
At line 47 removed 3 lines
* [Example Quality Reports|ArrayExpressHTS Quality Reports]
* [Use cases|ArrayExpressHTS Use cases]
At line 51 changed one line
|| Version || Date Modified || Size || Author
| 14 | 10-Jul-2014 13:17 | 3.621 kB | Andrew Tikhonov
| 13 | 10-Jul-2014 09:44 | 3.611 kB |
| 12 | 10-Jul-2014 09:40 | 3.647 kB |
| 11 | 25-Apr-2014 13:01 | 3.89 kB | ZnivFv
| 10 | 27-Sep-2013 17:17 | 3.713 kB | Andrew Tikhonov
| 9 | 10-Jan-2012 11:43 | 3.643 kB | Andrew Tikhonov
| 8 | 02-Jun-2011 10:04 | 3.635 kB | Rodrigo Santamaria
| 7 | 02-Jun-2011 09:58 | 3.647 kB | Rodrigo Santamaria
| 6 | 02-Jun-2011 09:57 | 3.651 kB | Rodrigo Santamaria
| 5 | 02-Jun-2011 09:50 | 3.662 kB | Rodrigo Santamaria
| 4 | 02-Jun-2011 09:48 | 3.731 kB | Rodrigo Santamaria
| 3 | 02-Jun-2011 09:34 | 0.931 kB | Rodrigo Santamaria
| 2 | 02-Jun-2011 09:17 | 2.498 kB | Rodrigo Santamaria
| 1 | 02-Jun-2011 09:10 | 1.787 kB | Rodrigo Santamaria