

Reference data for protein, DNA, RNA:

The core information for creating the CCPN chemComp files comes from the E-MSD database. It was supplemented with information (mainly on atom naming systems) from the BioMagResBank and from the AQUA and CYANA programs. The information was further refined within the CCPN framework by adding relevant atoms, atomSets and naming system information. Chemical variants covering protonation state and common linking were also specified for the amino acids.

Below you can download the information from the CCPN XML data files, decomposed into separate tab-delimited (.csv) or XML (.xml) files. Because the data is highly organized and hierarchical inside the CCPN data model, it has to be split across these separate files. The easiest way to access the full data is to use the CCPN Python or Java API directly.

If you use this information and find inconsistencies or problems, please contact us.

The table files are briefly described below (the same applies to the XML files):

  • chemCompData.csv: Describes the attributes of the chemComps, by CCPN ccpCode.
  • chemCompNames.csv: Describes the names used in naming systems for the above.
  • atomData.csv: Describes the 'superset' of atoms per ccpCode for this molType.
  • atomNames.csv: Describes the names used in naming systems for the above atoms.
  • atomSetData.csv: Describes the 'superset' of atomSets per ccpCode for this molType. Also listed are 'variants' of atomSets (e.g. H* can refer to H1,H2,H3 when the N-terminus is fully protonated, or to H1,H2 when it is not).
  • atomSetNames.csv: Describes the names used in naming systems for the above atomSets.
  • torsionData.csv: Describes the torsion angles.
  • torsionNames.csv: Describes the names used in naming systems for the above torsion angles.
  • .csv files by CCPN ccpCode ('ALA','VAL','A',...): These describe the 'variants' of a particular chemComp (essentially limited to differences in protonation and linking state).
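
If you prefer to work with the table files rather than the CCPN API, the minimal Python sketch below shows one way to read a tab-delimited file and group its rows by their first field. It makes several assumptions that are not documented here: that the first column holds the ccpCode, that there is no header row to skip, and that the sample rows resemble real content. The helper names read_tab_delimited and group_by_first_column are hypothetical; check the downloaded files for the actual column layout before relying on this.

```python
import csv
import io
from collections import defaultdict


def read_tab_delimited(handle):
    """Return all non-empty rows of a tab-delimited file as lists of strings."""
    reader = csv.reader(handle, delimiter="\t")
    return [row for row in reader if row]


def group_by_first_column(rows):
    """Group rows by their first field (ASSUMED here to be the ccpCode)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row[1:])
    return dict(grouped)


# Hypothetical sample content; the real column layout may differ.
sample = io.StringIO("ALA\tCA\tC\nALA\tCB\tC\nVAL\tCA\tC\n")
rows = read_tab_delimited(sample)
atoms_by_code = group_by_first_column(rows)
```

For a downloaded file, replace the in-memory sample with something like open("atomData.csv", newline="") inside a with statement, and skip the first row if the file turns out to have a header.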

