ENCODE virtual machine

A new approach to reporting and sharing Methods

The ENCODE consortium have published an integrated analysis of ENCODE genome-wide data. Each analysis depends on specific software that processes a series of source data files, transforming them into output files that correspond to specific statements and figures in the paper.

We have established a virtual machine instance of this software, using the code bundles from, where each analysis program has been tested and run.

Where possible, the virtual machine enables complete reproduction of the analysis as it was performed, and generates figures, tables or other information. In cases where the analysis involved highly parallelised processing within a specialised multiprocessor environment, a partial example has been implemented, leaving it to the reader to decide whether and how to scale to a full analysis.

We hope that this structure gives readers the opportunity to run the same analyses in their own environments.

ENCODE cloud instance

You can "instantiate" an Amazon EC2 instance to examine figures from the ENCODE analysis in the cloud. To get started, follow the instructions at

