Short Read Bioinformatics
The European Bioinformatics Institute (EMBL-EBI) announces a workshop that will take place at the EBI on 30th April - 1st May 2009. The course will commence at 09:30 hrs on Thursday 30th April and conclude at 16:30 hrs on Friday 1st May. The venue is the EBI Training Suite, EBI East Wing.
With the advent of next-generation DNA sequencing technologies, researchers face new challenges in analysing their data. As a primary nucleotide sequence resource, the EBI has tackled these challenges by developing new tools for the European Reads Archive (ERA), in close collaboration with our colleagues at the Short Read Archive (NCBI).
These massively parallel platforms require finishing pipelines that either rely on a reference genome (MAQ) or attempt de novo assembly (Velvet). This eminently practical workshop will cover both approaches, as well as enhancing the signal produced by the sequencers (AYB).
At the end of this course, attendees will:
- have a good understanding of ERA (submission tools, metadata, web services);
- have a working knowledge of running their annotation pipeline with MAQ or Velvet.
The course will start with an introduction to the European Nucleotide Archive (ENA): its schema (XML metadata format), the submission routes available for next-generation sequencing data, the formats supported, and the pipeline implementation at the Sanger Institute.
Tim Massingham (EBI) will briefly introduce AYB, a replacement for Illumina's "Bustard" base caller that uses explicit statistical models to correct for cross-talk and phasing, producing cleaner data from which bases can be called. This means more correct base calls for a given run and the potential to produce longer reads from the same machines.
MAQ (Mapping and Assembly with Quality) will be introduced by Li Heng (Sanger Institute), and there will be hands-on exercises to become familiar with this software.
Daniel Zerbino (EBI) will introduce Velvet, a sequence assembler for very short reads, and there will be a chance to run the algorithm to understand how it works.
|Day 1 - Thursday 30th April 2009|
|Workshop Outline (Xose M Fernandez)|
|European Nucleotide Archive (Guy Cochrane)|
|XML metadata format (Chris Hunter)|
|Submission to ERA: procedures, submission tools (Chris Hunter)|
|ERA data handling and storage (Rasko Leinonen)|
|ERA formats (SRA, SRF, FASTQ) (Rasko Leinonen)|
|SRF and the short read pipeline at the WTSI (James Bonfield)|
|ERA data retrieval: FTP sequence/metadata search, browser (Rasko Leinonen)|
|European Genotype Archive (Ilkka Lappalainen)|
|Talk: 1000 Genomes (Paul Flicek)|
|Tour of the Sanger Sequencing Facility|
|Day 2 - Friday 1st May|
|Improving read quality: base calling with AYB (Tim Massingham, EBI)|
|Assembly, alignment and other tools (Li Heng, Sanger Institute)|
|Talk: IT for Next Generation Sequencing (Tony Cox, Sanger Institute)|
|De novo genome sequencing strategies using Next-Gen (Daniel Zerbino, EBI)|
Participants can opt for residential or non-residential registration.
Residential registration costs £190. The charge includes the course fee, overnight bed-and-breakfast accommodation on 30th April, and transport to and from the hotel. Those choosing the residential option will stay at The Gonville Hotel in Cambridge, from where a shuttle service will run to the Hinxton campus. If you require accommodation the night before the workshop, you can arrange it directly with the hotel, or book it when registering at an additional cost of £119.00 (bed and breakfast).
Non-residential registration costs £50. Applicants selecting this option must report to security at the campus main entrance on the morning of Thursday 30th April before proceeding to EBI reception and registration.