DEN home page
The DEN is being developed as a resource for software development. It contains publicly available raw NMR data sets that were obtained from a number of websites and directly from NMR labs. Unlike on those websites, all acquisition and derived information (peak lists, ...) for the spectra has been organized within the CCPN data model, so that automatic procedures can easily be run on many data sets.
Setup
The DEN is organized by molecule and by the laboratory from which the data originate. Within that, the data are further organized by experimental conditions and spectrum type. All information about a data set, except for the raw data itself, is available as a CCPN XML file.
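The organization described above can be pictured as a nested index. The following is only a minimal sketch in Python, not the actual DEN or CCPN implementation: the class `SpectrumEntry`, the function `build_index`, and all field names are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpectrumEntry:
    """Hypothetical record for one spectrum (not a CCPN class)."""
    molecule: str
    laboratory: str
    conditions: str
    spectrum_type: str


def build_index(entries):
    """Group entries as molecule -> laboratory -> conditions -> [spectrum types]."""
    index = {}
    for entry in entries:
        by_lab = index.setdefault(entry.molecule, {})
        by_conditions = by_lab.setdefault(entry.laboratory, {})
        by_conditions.setdefault(entry.conditions, []).append(entry.spectrum_type)
    return index
```

With such an index, a script could look up every spectrum type recorded for one molecule at one laboratory under one set of conditions, which is the kind of automatic access the layout is meant to support.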
Contribute data
You can contribute data to the DEN as either publicly or privately accessible data sets. Privately accessible data sets will be available only to the depositor and for use within the NMR projects at the MSD.
Data sets
The data sets themselves can be downloaded as an archive together with the CCPN XML files describing them.
More information and aims within the EBI
We want to collect raw (time domain) NMR datasets for a number of purposes.
- To extract general information about the datasets (average number of peaks per experiment type and number of residues, ...)
- To derive 'peak quality' indicators based on statistical analysis over many different raw datasets that are processed and peak picked by different methods. It is hoped that a new peak picking algorithm will evolve from this.
- To test different methods (e.g. automated assignment, ...) across a range of datasets.
- To test the CCPN data model, and further develop applications based on it. The aim is to integrate existing methods via CCPN so they can be used together. These methods are accessible via a GUI in order to be easy to use. An example is a Format Converter window where one can read in the acquisition parameters from Varian/Bruker data, add certain specifications and automatically write an nmrPipe or Azara processing script.
Software developed during this process will be publicly available with the CCPN distribution.
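As an illustration of the first aim in the list above (general statistics such as the average number of peaks per experiment type and number of residues), here is a minimal Python sketch. It is an assumption about how such a statistic might be computed, not the project's actual code: the function name `mean_peaks_per_residue` and the `(experiment_type, n_peaks, n_residues)` record layout are hypothetical.

```python
from collections import defaultdict


def mean_peaks_per_residue(records):
    """Average peaks-per-residue ratio for each experiment type.

    records: iterable of (experiment_type, n_peaks, n_residues) tuples.
    Records with a non-positive residue count are skipped.
    """
    ratios = defaultdict(list)
    for experiment_type, n_peaks, n_residues in records:
        if n_residues <= 0:
            continue  # cannot normalize without a residue count
        ratios[experiment_type].append(n_peaks / n_residues)
    return {
        experiment_type: sum(values) / len(values)
        for experiment_type, values in ratios.items()
    }
```

Normalizing by the number of residues is one simple way to make peak counts comparable between proteins of different size; other normalizations may suit particular experiment types better.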
Aims for NMRQUAL
Subject to permission from the contributor, the data will also be used by other groups working within NMRQUAL (or even outside of it). The aims here are geared more towards structure validation and require some additional data.
Data requirements
Ideally, we would like to get an extended set of raw data for one (or more) proteins. We need data on 'badly behaved' proteins as well as soluble, folded ones, and everything from clean to noisy raw NMR datasets (2D, 3D, NOE type, through-bond, ...). We think that having a broad range of data (from all sorts of spectrometers and all sorts of conditions) is the best way to derive general characteristics of peaks.
Click here for more information on which data is required for deposition. Contact Wim Vranken if you are interested in contributing data to this project.
What will happen to the data
The data will be stored within the EBI. We will ask the contributor for permission whenever anyone outside the NMR part of the MSD group at the EBI requests the data.
BioMagResBank
We'd also like to remind you that the BioMagResBank now accepts time domain depositions. We have already integrated the available BMRB time domain sets into our data structure.
Current status
At this point, a data structure has been developed that holds the time domain data in an organized way, allowing scripts to access and write out information automatically. Beyond the Bruker/Varian acquisition files, at most a couple of additional settings have to be specified for each spectrum before it can be processed automatically. Further development is concentrated on referencing the spectra correctly and on running peak picking programs automatically on the processed spectra.
Primary developers: Wim Vranken
Last modified: Fri Dec 01 09:44:14 GMT 2006