|
OVERVIEW OF PROGRESS DURING THE REPORTING PERIOD
Main objectives of the project for this reporting period
The objectives for the final year of the project were to consolidate the emdep deposition system (http://www.ebi.ac.uk/msd-srv/emdep/) and to provide a basic search interface to the data deposited, processed and released via emdep so far. Consolidation aims included further outreach activities to gain acceptance for the database system and for the idea that 3D-EM researchers should deposit data in a public database.
Overview of the scientific progress of the project as a whole in the period
The main achievement of the final year has been the involvement of the three-dimensional electron microscopy (3D-EM) community in the IIMS project and its acceptance of the project's developments and results. The Workshop held in Hinxton in November 2002 to promote software development in the 3D-EM field was particularly successful: it brought together representatives of the main laboratories working on new methodological developments for 3D-EM. The emdep 3D-EM deposition system has steadily gained acceptance within the 3D-EM community. Several journals (Nature, Science, Nature Structural Biology) now have editorial policies that require deposition of 3D-EM volume data prior to publication. The emdep submissions come from 20 different laboratories, which represent many of the major European and USA groups.
The EM search facility (Deliverable 8, Query/Search interface specification document) was designed to let the user specify search criteria and corresponding values in a text-based HTML form. The search will cover all deposited data in the EM database. The search criteria to be included are author name, keyword, aggregation type (e.g., single particle), release date, and sample name (e.g., E. coli). The results will be displayed on a separate HTML page, at several levels of detail. At the top level, a brief summary of each matched deposition in the database will give the user an overview of that entry. As a minimum, it will include the accession code, sample name, release date and names of the authors. A link from each such summary will lead to the next level, that of an individual deposition. At this lower level the available information (i.e., that already past its release date) will be presented and made available for download. This information will include the XML header file and the released map(s). A separate "help" page will be accessible from the top level of the query interface, to assist users unfamiliar with the interface.
The interface was implemented as per the specification (Deliverable 9, Query/Search interface implementation; see Figure 1). It is linked from the general MSD searches page (http://www.ebi.ac.uk/pdbe/docs/Services.html) under the title "EM Search". The interface can also be accessed directly by pointing a browser to http://www.ebi.ac.uk/msd-srv/emsearch/. The help page (http://www.ebi.ac.uk/msd-srv/emsearch/Search_EMDep_help.html) is linked directly from the interface page, under the title "help". This link also appears in the results pages. In addition to the search criteria in the specification, there is an option to retrieve all depositions. The search criteria were implemented so that two different search values can be combined with an "and" or an "or" condition between them, which allows greater flexibility in a single search. The results page shows a summary for each matched deposition, with the following information: accession code, sample name, aggregation type, reported resolution, release date, submission date and author names. In a keyword search it is possible that only some of the keywords are matched, so a relative hit score is also shown on the results page. On the page containing the details of a single entry, the files available for download appear as links, with additional comments if provided by the depositor.
Figure 1: The EMDB query web interface http://www.ebi.ac.uk/msd-srv/emsearch/
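The combination of two search values with an "and" or an "or" condition, and the relative hit score for keyword searches, can be illustrated with a minimal sketch. It is not the code behind the EM Search interface: the `Deposition` record, its field names, the accession codes and the definition of the hit score (fraction of query keywords found) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Deposition:
    # Hypothetical record; the field names are illustrative only.
    accession: str
    sample_name: str
    authors: list
    keywords: set = field(default_factory=set)


def by_author(name):
    """Criterion: case-insensitive substring match on any author name."""
    return lambda d: any(name.lower() in a.lower() for a in d.authors)


def by_sample(name):
    """Criterion: case-insensitive substring match on the sample name."""
    return lambda d: name.lower() in d.sample_name.lower()


def combine(first, second, operator):
    """Join two criteria with an "and" or an "or" condition."""
    if operator == "and":
        return lambda d: first(d) and second(d)
    if operator == "or":
        return lambda d: first(d) or second(d)
    raise ValueError("operator must be 'and' or 'or'")


def keyword_hit_score(deposition, query_keywords):
    """Assumed definition: fraction of the query keywords that were matched."""
    wanted = {k.lower() for k in query_keywords}
    if not wanted:
        return 0.0
    have = {k.lower() for k in deposition.keywords}
    return len(wanted & have) / len(wanted)


def search(depositions, criterion):
    """Return the accession codes of all depositions matching the criterion."""
    return [d.accession for d in depositions if criterion(d)]


# Made-up example entries (not real accession codes or depositions).
DEPOSITIONS = [
    Deposition("EX-0001", "E. coli ribosome", ["A Smith"],
               {"ribosome", "single particle"}),
    Deposition("EX-0002", "GroEL", ["B Jones", "A Smith"], {"chaperonin"}),
]
```

With these made-up entries, `search(DEPOSITIONS, combine(by_author("smith"), by_sample("groel"), "and"))` selects only the second entry, while the same call with `"or"` selects both.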
Development of new validity measures for 3D-EM data as well as for combined studies: Several factors limit the quality of the structural results obtained by 3D-EM, from technical limitations imposed by the electron microscope to artefacts introduced by the reconstruction algorithms. At present it is still not possible to assess the quality of a given reconstruction by standard measures and methods accepted among 3D-EM scientists. Nevertheless, efforts have been concentrated on a number of potential quality indicators:
CTF correction: During the last three years work has continued in the 3D-EM community towards better methods and procedures for 3D-EM structure determination. Partner 2 has worked on a new parametric technique for determining the Contrast Transfer Function (CTF) of EM micrographs (Velazquez-Muriel et al., 2003). Faithful CTF determination is an essential step prior to CTF correction. Several methods have been proposed to correct the effect of the CTF in a given reconstruction:
- Wiener filtering of the reconstructed volume (Frank and Penczek, 1995)
- Regularised steepest-descent technique (Zhu et al. 1997)
- Inverse CTF filtering of the reconstructed volume (Stark et al. 1997)
- Incorporation of CTF for each projection and maximum-entropy reconstruction (Skoglund et al. 1996)
- Incorporation of the CTF in a Wiener-like fashion for each projection and Fourier reconstruction algorithm (Grigorieff 1998)
- CTF of projections and weighted Fourier reconstruction (Ludtke et al. 1999, Ludtke et al. 2001)
Partner 2 has also proposed a novel methodology based on Iterative Data Refinement (Sorzano et al., submitted). The existence of so many approaches indicates that there is no agreed standard method for the correction of the CTF in the 3D-EM field.
Distribution of projection directions (Euler angle distribution): The experimental setup for collecting individual images in the electron microscope does not allow the desired angular distribution of projections to be defined in advance. In a great number of cases, this technical limitation gives an uneven distribution of projection directions in 3D space. The behaviour of some reconstruction algorithms can be affected by this limitation. Partner 2 demonstrated that the exact behaviour of the algorithms depends on the free parameters chosen. If the parameters are well selected, an uneven distribution will not result in inferior reconstructions (Sorzano et al. 2001). Thus, knowledge of the exact distribution of projections of a given reconstruction does not by itself provide enough information to assess the quality of the structural data obtained in a given experiment.
Validity measurements and procedures for resolution assessment in 3D-EM (D10): Validation of 3D-EM data is closely linked to the available data. One of the most important numbers associated with a deposition is the resolution that the depositors claim for their map. However, this resolution cannot be verified fully independently unless the entire raw data are submitted for analysis. A consensus on how to assess the resolution of a particular map deposited with emdep emerged from the IIMS Workshop at the Genome Campus in November 2002, which involved the leading laboratories in the field of 3D-EM. It was decided to adopt a single method of resolution determination, the Fourier Shell Correlation (FSC) method, as representative of the quality of the uploaded map. The FSC is calculated in concentric shells around the centre of the image and is therefore not a single number but a function of the radius. It was agreed at the November 2002 meeting that the plot of the FSC against radius is the best indicator of the quality of the data. An effective resolution level can be derived from this plot by deciding on an acceptable minimal level of correlation and extracting the radius at which this level is reached (see Figure 2; M5, New validation measurements and procedures for 3D-EM).

Figure 2: Fourier Shell Correlation
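For reference, the FSC in its commonly used form can be written as below. This assumes the usual setting of two volumes reconstructed from independent halves of the data; the symbols $F_1$, $F_2$ (the Fourier transforms of the two volumes) and $S_r$ (the set of Fourier-space voxels in the shell of radius $r$) are notation introduced here, not taken from the deliverable.

```latex
\mathrm{FSC}(r) =
  \frac{\sum_{\mathbf{k} \in S_r} F_1(\mathbf{k})\, F_2^{*}(\mathbf{k})}
       {\sqrt{\sum_{\mathbf{k} \in S_r} \lvert F_1(\mathbf{k}) \rvert^{2}
              \;\sum_{\mathbf{k} \in S_r} \lvert F_2(\mathbf{k}) \rvert^{2}}}
```

The normalisation by the shell energies keeps the value between -1 and 1, so curves from different maps can be compared on the same scale.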
The IIMS project group therefore developed the means for submitting the FSC plot as part of the deposition process. In addition, a dedicated program was written for calculating this plot from raw coordinates. This program is publicly available. Once raw data are submitted directly as part of the deposition process, the uploaded coordinate data, in XML format, will be converted to an FSC plot. Developing means for validating maps is an ongoing research process, which will continue in other frameworks, such as the NOE programme.
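Deriving an effective resolution from an FSC plot, by choosing a minimal level of correlation and finding the radius at which the curve reaches it, can be sketched as follows. This is an illustrative sketch only, not the project's publicly available program: the function name, the default threshold of 0.5, the use of linear interpolation between shells and the example values are all assumptions.

```python
def effective_resolution(radii, fsc, threshold=0.5):
    """Return the radius at which the FSC curve first falls below `threshold`.

    `radii` and `fsc` are equal-length sequences describing the plot, with
    radii in increasing order. The crossing point is estimated by linear
    interpolation between neighbouring shells. Returns None if the curve
    never drops below the threshold.
    """
    if len(radii) != len(fsc):
        raise ValueError("radii and fsc must have the same length")
    if not fsc:
        return None
    if fsc[0] < threshold:
        # Already below the threshold in the innermost shell.
        return radii[0]
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Fraction of the way from shell i-1 to shell i at the crossing.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            return radii[i - 1] + frac * (radii[i] - radii[i - 1])
    return None


# Made-up example curve; radii are spatial frequencies in 1/Angstrom.
example_radii = [0.02, 0.04, 0.06, 0.08]
example_fsc = [0.98, 0.80, 0.40, 0.10]
crossing = effective_resolution(example_radii, example_fsc)
```

For this made-up curve the crossing lies at a spatial frequency of about 0.055 per Angstrom; if the radii are spatial frequencies, the resolution is the reciprocal of that value. A different threshold simply moves the crossing point along the same curve, which is why the full plot carries more information than any single number.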
References:
J A Velazquez-Muriel, C O S Sorzano, J J Fernandez, J M Carazo
A method for estimating the CTF in electron microscopy based on ARMA models and parameter adjustment. Ultramicroscopy (2003) 96: 17-35
C O S Sorzano, R Marabini, N Boisset, E Rietzel, R Schroder, G T Herman, J M Carazo
The effect of overabundant projection directions on 3D reconstruction algorithms. J Struct Biol (2001) 133: 108-118
C O S Sorzano, R Marabini, G T Herman, Y Censor, J M Carazo
Transfer function restoration in 3D electron microscopy via Iterative Data Refinement (submitted)
The IIMS 3D-EM data have been completely merged into the MSD database (D17, The transfer of 3D-EM aspects of IIMS implementation from Partner 2 to Partner 1 sites; and M6, A complete implementation of the IIMS running at Partner 1 site). Figure 3 illustrates a portion of the common schema.

Figure 3: Part of the combined MSD/3D-EM database schema
Table 1 Workpackage List - STATUS
| Work-package No. | Work-package title | Responsible participant No. | When | Deliverable | Status |
|---|---|---|---|---|---|
| WP1 | Standardisation of metadescriptors for 3D-EM data as well as combined studies | 2 | Months 1-12 | 1-3 | Completed |
| WP2 | Common data model | 1 | Months 1-18 | 4-5 | Completed |
| WP3 | Development and revision of prototype deposition interface | 1 | Months 12-21 | 6-7 | Completed |
| WP4 | Development and revision of prototype query/search interface | 1 | Months 15-24 | 8-9 | Completed |
| WP5 | Validation measures for 3D-EM and combined studies | 1* | Months 1-36 | 10 | Incomplete |
| WP6 | Validation software | 3 | Months 1-36 | 10 | Incomplete |
| WP7 | Initial data input for prototype testing | 2 | Months 16-18 | 13-15 | Completed |
| WP8 | Input from 3D-EM community | 4 | Months 6-36 | 16 | Completed |
| WP9 | Consolidation of IIMS at Partner 1 site | 1 | Months 27-36 | 17 | Completed |
* Transferred from ex-Partner 4
Table 2 Deliverables list - STATUS
(see Supplementary CD containing Deliverable files)
| Deliverable No | Deliverable title | Date due (month) | Status |
|---|---|---|---|
| 1 | Draft document on metadata | 6 | Completed |
| 2 | Document on metadata | 9 | Completed |
| 3 | Revised document on metadata | 12 | Completed |
| 4 | Draft data model for 3D-EM metadata | 12 | Completed |
| 5 | Full data model | 18 | Completed |
| 6 | Deposition interface specification document | 15 | Completed |
| 7 | Deposition interface implementation | 21 | Completed |
| 8 | Query/Search interface specification document | 18 | Completed |
| 9 | Query/Search interface implementation | 24 | Completed |
| 10 | Validity measurements and procedures | 36 | Incomplete |
| 11 | Pilot study on data harvesting | 18 | Completed |
| 12 | Pilot study on data harvesting revision | 36 | Completed |
| 13 | New data sets for first prototype | 18 | Completed |
| 14 | New data sets for validation testing | 24 | Completed |
| 15 | New data sets for data harvesting testing | 18 | Completed |
| 16 | Reports on extended coordination meetings | 12, 24, 36 | Completed |
| 17 | Consolidation of IIMS database at Partner 1 site | 36 | Completed |
Table 3 Milestones List - STATUS
| Milestone No | Milestone title | Date (month) | Participants | Description and status |
|---|---|---|---|---|
| 1 | 3D-EM metadata description | 12 | 1, 2 | Completed |
| 2 | Fully integrated 3D-EM data model | 18 | 1, 2 | Full integration of 3D-EM, X-ray and NMR data models. Completed |
| 3 | Final deposition interface | 21 | 1, 2 | Prototype deposition interface. Completed |
| 4 | Query/Search interface | 24 | 1, 2 | Prototype query/search interface. Completed |
| 5 | New validation measurements and procedures for 3D-EM | 36 | 2, 3, 4 | Incomplete |
| 6 | A complete implementation of the IIMS running at Partner 1's site | 36 | 1, 2 | Completed |
|