Student internships
The EMBL Visitors and Scholars Programme also offers opportunities at other EMBL sites.
Outreach and Training Team – Cath Brooksbank
The Outreach and Training Team offers internships to work on identifying and cataloguing courses and learning tools of relevance to the pharmaceutical industry.
We have an opening for an intern to help us to develop the content of courses for our new online training resource, Train online. This is an excellent opportunity for an early-stage researcher to become an expert user of some of Europe's most important freely available biological data resources, and to turn this knowledge into beautifully structured courses that will enable like-minded scientists to get to grips with their biological data.
During your time here you will gain a basic grounding in bioinformatics training, especially for online training materials. You will work closely with the Outreach and Training Team and with other teams at the EBI, who will provide the expert content to be used in the course.
If you take the position, we expect that you will find it challenging, enjoyable and very useful for your future employment opportunities.
Please send:
- a brief paragraph explaining why you are interested in this opportunity and a set of bullet points indicating your background
- any relevant experience you have with creating course (teaching/training) materials
- your CV
by email to: vicky@ebi.ac.uk, with the subject title 'Training Intern'.
Cheminformatics and Metabolism Team – Christoph Steinbeck
The Steinbeck group offers computational student projects embedded within the Services and Research team. All students are supervised by experienced staff members and have access to a broad array of local experts for the technologies used. All applicants should have a good computational background. The following projects are currently on offer:
Automatic Structure Diagram Generation (SDG)
In chemoinformatics it is often necessary to create 2D coordinates for compounds which either have wrong coordinates or none at all. This task is difficult because, on the one hand, the algorithm needs to lay out arbitrary structures, but on the other hand, there are certain conventions on how particular cases should be handled. In this project, the student would work on the existing code for SDG in the Chemistry Development Kit (CDK) and improve it with respect to, e.g., handling of stereochemistry, deterministic layout, IUPAC-conforming alignment of the final structure (major ring system/largest chain placed horizontally) and globally optimised layout. This requires good general programming skills, preferably object-oriented, and a basic command of the Java language. The project would include assessment of the literature with respect to standards, specification of the requirements and implementation of a solution.
Kemia JavaScript editor
We are looking for someone to contribute to Kemia, a pure JavaScript editor for editing 2D chemical structures and reactions. Kemia uses the Google Closure library extensively. It is an open source project (shared on GitHub) to which our group is contributing. The editor is not yet finished, and many features could still be added. A student or visitor could work on implementing cheminformatics algorithms in JavaScript or on the GUI aspects of the editor.
Portal prototype biology/chemistry
We are interested in mining biological resources which reference our source database, Chemical Entities of Biological Interest (ChEBI). ChEBI is a dictionary and ontology of small molecules. We would like to use ChEBI identifiers as a means to investigate biological knowledge that is stored in our sister databases but not easily visible to our users. Examples include the species in which a specific molecule exists, or whether a molecule occurs at pH 7. This project would involve firstly learning about the different resources and what information they hold, then defining what information would be useful for users. The technical implementation would require a system which queries each data source and displays the relevant information within a web interface. Different interfaces will be built for user testing. Such a project would require a 6-month internship and good Java skills.
Ontology visualisations
We would like to investigate different ontology visualisation methods to best harness the ChEBI ontology. ChEBI is a dictionary and ontology of small molecules. The ChEBI ontology is too in-depth and detailed for the average user to comprehend easily. Hence, we would like to develop prototypes which simplify our visualisation of the ChEBI ontology. Each prototype will then be user-tested, and further alterations will be made to refine the results. Such a project would require a 4-month internship and preferably some programming skills.
ChEMBL Team – John Overington
Current projects include:
1) ChEMBL is a database containing bioactivity data on drug-like molecules. We are interested in analysing the distribution of aromatic rings and molecular frameworks within the database and in investigating whether the occurrence of these molecular fragments is related to activity at specific protein targets. A good knowledge of chemistry is required, and knowledge of statistical methods and pipelining tools would be useful.
2) We are interested in characterising the target space of the molecular targets in the ChEMBL database by using knowledge of the properties of compounds that bind to specific targets. A knowledge of chemistry and biology is needed. Experience in manipulating large datasets and statistical analysis would also be useful.
3) The ChEMBL database is a rich source of pharmacological data. We are interested in mining these data to gain insights into the history of the development of drugs and pharmacological techniques. Of key interest is the identification of bioassays of greatest therapeutic relevance. Such a project would require a 4-month internship. Scripting skills (Perl, Python, etc.) would be required. Some knowledge of SQL and an understanding of the fundamentals of pharmacology would be an advantage.
4) Enhancement of ChEMBLdb web interface:
ChEMBLdb is an online database of information on the properties and activities of drugs and drug-like small molecules and their targets. We are interested in extending the capabilities of the web interface to our large SAR (Structure Activity Relationship) database. A good knowledge of web and database programming is required; experience in developing and using novel visualisation methods would be greatly advantageous.
5) Protein structure analysis of drug target domains:
We are interested in extending the annotation of our drug targets to include Pfam and structural domain coverage. Experience of sequence searching strategies and structural bioinformatics is required.
Proteomics Services Team – Henning Hermjakob
Computational projects will usually implement new features for existing database systems, in particular the IntAct and PRIDE databases. The available projects range across a broad spectrum, from data analysis, evaluation, and statistics, to web interfaces and data visualisation. Projects will always be based on our open source, production quality database applications, and will contribute to the publicly accessible systems.
The following 2011 publications all result from a traineeship or visit in the Proteomics Services Team and have the trainee/visitor as first author:
- Ndegwa N, et al. Critical amino acid residues in proteins: a BioMart integration of Reactome protein annotations with PRIDE mass spectrometry data and COSMIC somatic mutations. Database (Oxford). 2011 Oct 23;2011:bar047.
- Villaveces JM, et al. Dasty3, a WEB framework for DAS. Bioinformatics. 2011 Sep 15;27(18):2616-7. Epub 2011 Jul 28.
- Griss J, et al. Published and perished? The influence of the searched protein database on the long-term storage of proteomics data. Mol Cell Proteomics. 2011 Sep;10(9):M111.008490.
- Salazar GA, et al. DAS writeback: a collaborative annotation system. BMC Bioinformatics. 2011 May 10;12:143.
- Gel Moreno B, et al. easyDAS: automatic creation of DAS servers. BMC Bioinformatics. 2011 Jan 18;12:23.
Quality control and visualisation of proteomics data in the PRIDE database
PRIDE holds proteomics data acquired by mass spectrometry and is maintained by the Proteomics Services Team. We have opportunities for Masters student projects in which the goal is to visualise summary data and metadata, so that users get a quick idea of the quality of their data and can compare their data with other PRIDE experiments.
Saez-Rodriguez Research Group – Julio Saez-Rodriguez
We are broadly interested in how the dynamics of signal transduction, mediated for example by protein post-translational modifications, ultimately influence cell fate decisions. We build predictive mathematical models using high-throughput experimental data collected after applying many different perturbations to the pathways of interest to get at the underlying network structure. Specifically, research in our group aims to combine statistical methods with models describing the mechanisms of signal transduction either as logical or physico-chemical systems. We then use these models to better understand how signalling is altered in human disease and predict effective therapeutic targets.
Projects connected to our ongoing research (http://www.ebi.ac.uk/saezrodriguez/research.html) are frequently available. They range from methods development to specific applications, but in most cases entail a bit of both.
Computational Systems Neurobiology Group – Nicolas Le Novère
Curation of Biochemical models
BioModels Database is the reference resource for mathematical models of biological interest encoded in SBML. One of the great strengths of the resource is that it provides models which are fully curated and annotated, and therefore trustworthy. These models are easier to reuse, exchange and convert to other representations.
The aim of this project is to improve the quality of the models we distribute, as well as encode new ones, whilst working alongside our team of curators. This has proved in the past to be the best possible training in computational systems biology, and is the perfect preparation for a PhD involving mathematical modelling of biological processes.
The post-holder will have university-level experience in numerical analysis, a good background in biology, and some familiarity with bioinformatics and/or modelling software.
Our other projects are currently full, but we will consider applications from students who would be happy to start after September 2011. Please email Camille (see table) for more information.
UniProt Group – Rolf Apweiler