Mosquito Informatics (INFRAVEC)


Wednesday 5 - Thursday 6 February 2014


European Bioinformatics Institute (EMBL-EBI) - Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom


Rebecca Greenhaff

This international workshop will be held at the European Bioinformatics Institute (EBI) at Hinxton, UK, as part of the INFRAVEC transnational access activities. The goal of the workshop is to offer a training and data analysis opportunity to external researchers and research teams applying via the INFRAVEC Access funding programme. The workshop will focus on the use of computational techniques for the exploration of vector genomics, especially programmatic access to data through VectorBase and analyses of high-throughput datasets, including alignment of RNAseq or genome re-sequencing data and subsequent measures of gene expression or variation, respectively. This two-day course will be delivered using a mixture of lectures, hands-on practical sessions and an interactive session where users can discuss practical exercises with the trainers and fellow attendees. The trainers will be experienced vector bioinformaticians and biologists from EBI and elsewhere.

Day 1 consists of introductions and summary talks, with a guest lecture on the use of high-throughput genomic data sets in vector biology, followed by a walk-through with exercises explaining programmatic access to VectorBase data using the Ensembl Application Programming Interface (API). Finally, participants will initiate alignment of an NGS dataset to a reference genome within the Galaxy platform.

Day 2 builds on the aligned NGS data from Day 1, including visualisation of the alignments and some simple analyses to obtain FPKM expression values for genes (to identify differential expression) and to call variants (SNP detection). The session will end with a discussion on how to improve data at VectorBase (community annotation efforts) and on outreach activities (where to get help, what to expect from INFRAVEC/VectorBase).
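For readers unfamiliar with the FPKM measure mentioned above, the following is a minimal sketch of the standard definition (Fragments Per Kilobase of transcript per Million mapped fragments). It is not the workshop's own pipeline; the gene names, counts and lengths are hypothetical values chosen for illustration, and dedicated RNAseq tools apply additional corrections that are not shown here.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Return FPKM for one gene.

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)
    The factor 1e9 combines "per kilobase" (1e3) and "per million" (1e6).
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical example data: gene -> (fragments mapped to gene, gene length in bp)
example_counts = {
    "geneA": (500, 2000),
    "geneB": (100, 1000),
}
# Hypothetical library size (total mapped fragments in the sample)
library_size = 10_000_000

fpkm_values = {
    gene: fpkm(count, length, library_size)
    for gene, (count, length) in example_counts.items()
}

for gene, value in sorted(fpkm_values.items()):
    print(f"{gene}\t{value:.2f}")
```

Normalising by both gene length and library size lets expression values be compared between genes of different lengths and between samples sequenced to different depths.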


The workshop is organised by the INFRAVEC Project as part of its Transnational Access activities aimed at providing funding to all researchers. Eligibility criteria for the provision of funding in the form of units of access are set by the EU Commission.


Time Topic Trainer
Day 1 - 5 February 2014
13:00-13:15 Workshop Introduction, information for participants Dan Lawson
13:15-14:00 Introduction to INFRAVEC & VectorBase Dan Lawson
14:00-15:00 Introductory concepts - Overview of basic concepts (Genome sequencing/assembly, annotation & HTG data sets) Dan Lawson
15:00-15:30 Tea/coffee break
15:30-16:30 Lecture on use of Galaxy Bob MacCallum / Dan Lawson
16:30-17:30 Sequence alignments to reference genomes. Alignment of RNAseq transcriptome or genomic re-sequencing data to reference genomes; participants are encouraged to use their own data for this exercise. Set up alignments at the Pathogen Portal or VectorBase Galaxy sites. James Allen
19:30 Evening meal at the Hinxton Red Lion
Day 2 - 6 February 2014
09:00-09:15 Review progress on NGS alignments from previous day Dan Lawson
RNAseq analysis. Analysis of aligned transcriptome data, FPKM expression values, visualisation in VectorBase. James Allen
10:30-11:00 Tea/coffee break
11:00-12:30 Programmatic access to VectorBase data. Walk-through and hands-on exercises for programmatic access to datasets, including BioMart queries and the Perl API. Gareth Maslen
12:30-13:30 Lunch
13:30-15:00 Genomic re-sequencing. Analysis of aligned genomic data, SNP calling, SNP consequences, visualisation in VectorBase. Gareth Maslen
15:00-15:30 Tea/coffee break
15:30-16:30 Free time to work on participants' datasets
16:30-17:00 Submission of data and community outreach efforts. Getting help (both offline and interactive), examples of interactions with VectorBase/INFRAVEC staff. Dan Lawson
17:00-17:30 Wrap-up session and questions Dan Lawson