Entities of External database links

The entities that belong to a specific mart
Entity Details: A=Number of attributes of the Entity
R=Number of relations of the Entity
T=Name of the database table
I=Approximation of the number of instances of the entity
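The summary lines that follow each entity name (for example `A:6, R:2, T:EC_MAPPING, I:11600`) can be read mechanically using the legend above. The following is a minimal sketch, assuming the line always has the four fields in the order A, R, T, I; the names `EntitySummary` and `parse_entity_summary` are hypothetical and not part of the database.

```python
import re
from typing import NamedTuple


class EntitySummary(NamedTuple):
    attributes: int  # A = number of attributes of the entity
    relations: int   # R = number of relations of the entity
    table: str       # T = name of the database table
    instances: int   # I = approximate number of instances


# Assumed layout: "A:<int>, R:<int>, T:<name>, I:<int>"
_SUMMARY_RE = re.compile(r"A:(\d+),\s*R:(\d+),\s*T:(\w+),\s*I:(\d+)")


def parse_entity_summary(line: str) -> EntitySummary:
    """Parse one entity summary line into its four fields."""
    match = _SUMMARY_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not an entity summary line: {line!r}")
    a, r, table, i = match.groups()
    return EntitySummary(int(a), int(r), table, int(i))


summary = parse_entity_summary("A:6, R:2, T:EC_MAPPING, I:11600")
print(summary.table, summary.instances)  # EC_MAPPING 11600
```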

External database links
Cross-references on the entry, chain or residue level to external databases such as: the Swiss-Prot protein sequence database (http://www.ebi.ac.uk/swissprot/), the EC enzyme database (http://ca.expasy.org/enzyme/), the GO gene ontology database (http://www.geneontology.org/), the Interpro database of protein families, domains and functional sites (http://www.ebi.ac.uk/interpro/), the Pfam database of protein families and hidden Markov models (http://pfam.wustl.edu/), the SCOP database for structural classification of proteins (http://scop.mrc-lmb.cam.ac.uk/scop/) and the PubMed database, which provides access to millions of MEDLINE citations (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed)
Swiss-Prot link per Chain
A:7, R:4, T:SWP_CHAIN, I:0

Links to Swiss-prot protein sequence database (http://www.ebi.ac.uk/swissprot/) on the chain level
EC Enzyme Database
A:6, R:2, T:EC_MAPPING, I:11600

Links to the EC enzyme database (http://ca.expasy.org/enzyme/) and Swiss-Prot on the molecule level
PubMed Citation Database
A:4, R:1, T:PUBMEDLIST, I:30800

Links to the PubMed database that provides access to millions of MEDLINE citations (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed) on the entry level
SCOP per chain
A:6, R:2, T:SCOP_INT, I:8800000

Links to SCOP database (http://nar.oupjournals.org/cgi/content/full/30/1/264) for structural classification of proteins (http://scop.mrc-lmb.cam.ac.uk/scop/) on the chain level
CATH per chain
A:6, R:2, T:CATH_INT, I:9040000

Links to the CATH Protein Structure Classification database (http://www.biochem.ucl.ac.uk/bsm/cath/) on a detailed chain level
PFAM per chain
A:5, R:2, T:PFAM_INT, I:0

Links to the PFAM database of protein families and profile hidden Markov models (http://nar.oupjournals.org/cgi/content/full/32/suppl_1/D138), (http://nar.oupjournals.org/cgi/content/full/30/1/276?ijkey=wfgjAWVRY.wto&keytype=ref&siteid=nar) on the chain level (http://www.sanger.ac.uk/Software/Pfam/help/index.shtml)
GO Gene Ontology
A:5, R:1, T:GO_SWP_MAPPING, I:311000

Links to the GO gene ontology database (http://www.geneontology.org/) and Interpro database on the entry level
Interpro Protein Family
A:5, R:1, T:INT_SWP_MAPPING, I:262000

Links to the Interpro database of protein families, domains and functional sites (http://www.ebi.ac.uk/interpro/) on the chain level
Representative Entries
A:5, R:1, T:REPRESENTATIVE, I:7620

Grouping of entries into sets of representative entries for various purposes (e.g. SCOP, DALI). This is very convenient for research or statistical work because it removes the bias that the many structures of the same or similar proteins in the PDB would otherwise introduce into the results
Related Entry
A:7, R:2, T:RELATED_ENTRY, I:27200

Entries that are related in some way (type of relation)
Swiss-Prot link per Entry
A:4, R:3, T:SP_MAP_INT, I:0

Links to Swiss-prot protein sequence database (http://www.ebi.ac.uk/swissprot/) on the entry level
Swiss-Prot description
A:2, R:3, T:SP_DESCRIPTION, I:0

Descriptions provided by the Swiss-Prot database
Swiss-Prot keyword
A:2, R:3, T:SP_KEYWORD, I:0

Keywords used in the Swiss-Prot database
Swiss-Prot Protein Knowledgebase
A:28, R:7, T:SWISS_PROT_MAPPING, I:9950000

Links to Swiss-prot protein sequence database (http://www.ebi.ac.uk/swissprot/) on the very detailed residue level
SCOP Structural Classification of Proteins
A:31, R:5, T:SCOP_MAPPING, I:8800000

Links to SCOP database (http://nar.oupjournals.org/cgi/content/full/30/1/264) for structural classification of proteins (http://scop.mrc-lmb.cam.ac.uk/scop/) on the residue level
CATH Protein Structure Classification
A:29, R:5, T:CATH_MAPPING, I:9040000

Links to the CATH Protein Structure Classification database (http://www.biochem.ucl.ac.uk/bsm/cath/) on a detailed residue level
PFAM per Residue
A:30, R:5, T:PFAM_MAPPING, I:0

Links to the PFAM database of protein families and profile hidden Markov models (http://nar.oupjournals.org/cgi/content/full/32/suppl_1/D138), (http://nar.oupjournals.org/cgi/content/full/30/1/276?ijkey=wfgjAWVRY.wto&keytype=ref&siteid=nar) on the detailed residue level (http://www.sanger.ac.uk/Software/Pfam/help/index.shtml)
PFAM Protein Family
A:9, R:1, T:PFAM_SWP_MAPPING, I:262000

Links to the Pfam database of protein families and hidden Markov models (http://nar.oupjournals.org/cgi/content/full/30/1/276?ijkey=wfgjAWVRY.wto&keytype=ref&siteid=nar) on the chain level


Attributes/Relations of External database links Entities
The attributes and relations that belong to a specific entity
Attribute Details: Type of the attribute String:String, Integer:Integer, Number:Number, Date:Date, Unknown:Unknown
C=Name of the database column
S=Maximum size of the attribute
A=Actual average size used for the attribute
Naming The attribute is a part of the name of an instance
Reference The attribute is a part of the reference key of an instance
Hidden The attribute is not supposed to be visible and used for queries
Summary The attribute is supposed to be used in summary reports (lists) for the entity
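Each attribute in the listings below is introduced by a line of the form `C:<column>, S:<maximum size>, A:<average size>`. As a rough sketch (the names `AttributeInfo` and `parse_attribute_line` are hypothetical), such a line can be parsed and the average size compared with the declared maximum. Several columns in this section are listed with `S:0`, so the ratio is left undefined in that case rather than guessed.

```python
import re
from typing import NamedTuple, Optional


class AttributeInfo(NamedTuple):
    column: str      # C = name of the database column
    max_size: int    # S = maximum size of the attribute
    avg_size: float  # A = actual average size used

    def fill_ratio(self) -> Optional[float]:
        """Average size as a fraction of the maximum, or None when S is 0."""
        if self.max_size == 0:
            return None
        return self.avg_size / self.max_size


# Column names may contain characters such as '#', so match up to the comma.
_ATTRIBUTE_RE = re.compile(r"C:([^,\s]+),\s*S:(\d+),\s*A:(\d+(?:\.\d+)?)")


def parse_attribute_line(line: str) -> AttributeInfo:
    """Parse one 'C:..., S:..., A:...' attribute line."""
    match = _ATTRIBUTE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not an attribute line: {line!r}")
    column, max_size, avg_size = match.groups()
    return AttributeInfo(column, int(max_size), float(avg_size))


info = parse_attribute_line("C:SP_SECONDARY_ID, S:255, A:9.0")
print(info.column, info.fill_ratio())
```

A low ratio, as for `SP_SECONDARY_ID`, simply means the column is declared much wider than the values it typically holds.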

Relation Details: Cardinality of the relation Optional:Optional, Many:Many
Reverse Optional Reverse Many=Reverse relation of the entity that the relation refers to
Reverse Entity Reverse External Entity=Entity with which this relation establishes an association (reverse entity)

Containment The relation is the containment relation of the entity
External The relation is associated with an external entity from a different mart
Hidden The relation is not supposed to be visible and used for queries
Swiss-Prot link per Chain
Links to Swiss-prot protein sequence database (http://www.ebi.ac.uk/swissprot/) on the chain level
Reference attributes:Chain Id,SwissProt accession - Naming attributes:Accession Code,Chain Code,SwissProt accession
C:CHAIN_ID, S:10, A:0.0
The database identifier of the Chain
C:SP_PRIMARY_ID, S:15, A:0.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_SECONDARY_ID, S:255, A:0.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
C:ACCESSION_CODE, S:8, A:0.0
The PDB accession code of the entry
C:CHAIN_CODE, S:8, A:0.0
The standard code of the chain that uniquely identifies it in the assembly. It is an extension of the PDB chain Id. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3 ... The chain id specified in the PDB file is also ignored for waters and bound molecules, whose codes are derived from the name of the chain that they are bound to. Finally, there are no "null" chain codes: where no id was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B)
C:CHAIN_PDB_CODE, S:1, A:0.0
The original code of the chain as found in the PDB. The PDB chain code is problematic because it is not used consistently in the PDB. Firstly, it is often null when the entry contains a single chain. Additionally, the same chain code is very often used both for a polymer chain and for a molecule bound to it. The PDB chain code is therefore often not a distinct identifier for a chain. For this reason the chain code was introduced, which is consistent and uniform; its purpose is to uniquely identify a chain in an assembly. So where chain A is used 4 times in an assembly, the generated chains will have chain codes A, A1, A2, A3. However, the chain that has been marked as non-symmetric valid (the one that should be used to extract the original asymmetric PDB data) keeps the original PDB code (if it is correct), e.g. A. Where a chain in the PDB did not have a chain code, the first unused letter is reserved (e.g. A). When 2 different chains (e.g. a polymer chain and a bound molecule chain) share the same PDB code, the chain code of the bound molecule is consistently derived from the chain code of the polymer chain
C:ENTRY_ID, S:10, A:0.0
The database identifier of the Entry
related   Reverse Many Reverse Entity 
Reverse relation:for of Reverse entity:Swiss-Prot description - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:for of Reverse entity:Swiss-Prot keyword - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Chain - Relation attributes:Chain Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
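The CHAIN_CODE and CHAIN_PDB_CODE descriptions above outline how unique chain codes are derived from the original PDB codes. The sketch below illustrates only two of the stated rules (numeric suffixes for repeated codes, and the first unused letter for chains without a code). It is an illustration under those assumptions, not the procedure actually used to build the database; the function name `assign_chain_codes` is hypothetical, and the handling of waters and bound molecules is deliberately left out because the text does not specify how their codes are derived.

```python
from string import ascii_uppercase
from typing import Dict, List, Optional


def assign_chain_codes(pdb_codes: List[Optional[str]]) -> List[str]:
    """Derive unique chain codes from (possibly null or repeated) PDB codes."""
    # Letters already taken by chains that do have a PDB code.
    used_letters = {code for code in pdb_codes if code}
    seen: Dict[str, int] = {}
    result: List[str] = []
    for code in pdb_codes:
        if not code:
            # No "null" chain codes: reserve the first letter not yet used.
            code = next((c for c in ascii_uppercase if c not in used_letters), None)
            if code is None:
                raise ValueError("no unused letter left for a chain without code")
            used_letters.add(code)
        count = seen.get(code, 0)
        seen[code] = count + 1
        # First occurrence keeps the plain code; later copies get 1, 2, 3 ...
        result.append(code if count == 0 else f"{code}{count}")
    return result


print(assign_chain_codes(["A", "A", "A", "A"]))  # ['A', 'A1', 'A2', 'A3']
print(assign_chain_codes([None, None]))          # ['A', 'B']
```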

EC Enzyme Database
Links to the EC enzyme database (http://ca.expasy.org/enzyme/) and Swiss-Prot on the molecule level
Reference attributes:Molecule Id,SwissProt accession,EC Number - Naming attributes:Accession Code,EC Number
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:EC_NUMBER, S:12, A:8.0
EC Number of the Enzyme Nomenclature (http://www.chem.qmul.ac.uk/iubmb/enzyme/)
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:MOLECULE_ID, S:10, A:4.0
The database identifier of the Molecule
C:SP_PRIMARY_ID, S:15, A:6.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_SECONDARY_ID, S:255, A:9.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Molecule - Relation attributes:Molecule Id

PubMed Citation Database
Links to the PubMed database that provides access to millions of MEDLINE citations (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed) on the entry level
Reference attributes:Entry Id,Ordinal - Naming attributes:Accession Code,Ordinal,PubMed Id
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:ORDINAL, S:3, A:2.0
Indicates the location in the PDB data to which the PubMed entry corresponds
C:ACCESSION_CODE, S:4, A:4.0
The PDB accession code of the entry
C:PUBMEDID, S:10, A:5.0
The PubMed Unique Identifier (PMID): a unique number assigned to each PubMed citation (http://www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html#MEDLINEDisplayFormat)
related   Reverse Many Reverse Entity 
Reverse relation:referred of Reverse entity:Entry - Relation attributes:Entry Id,Ordinal

SCOP per chain
Links to SCOP database (http://nar.oupjournals.org/cgi/content/full/30/1/264) for structural classification of proteins (http://scop.mrc-lmb.cam.ac.uk/scop/) on the chain level
Reference attributes:Scop sunid,Chain Id - Naming attributes:Accession Code,Chain Code,Scop sunid,Sccs
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:CHAIN_CODE, S:8, A:2.0
The standard code of the chain that uniquely identifies it in the assembly. It is an extension of the PDB chain Id. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3 ... The chain id specified in the PDB file is also ignored for waters and bound molecules, whose codes are derived from the name of the chain that they are bound to. Finally, there are no "null" chain codes: where no id was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B)
C:CHAIN_ID, S:10, A:4.0
The database identifier of the Chain
C:ENTRY_ID, S:0, A:4.0
The database identifier of the Entry
C:SCCS, S:20, A:8.0
Scop family identifier
C:SUNID, S:0, A:4.0
A number which uniquely identifies each entry in the SCOP hierarchy, including leaves and entries corresponding to the protein level (http://scop.mrc-lmb.cam.ac.uk/scop/release-notes.html#sunid)
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Chain - Relation attributes:Chain Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
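The SCCS column above is described only as the SCOP family identifier. Assuming it holds the usual SCOP concise classification string of the form class.fold.superfamily.family (for example `a.1.1.2`), it can be split into its levels as sketched below. The names `Sccs`, `parse_sccs` and `same_superfamily` are hypothetical helpers, not part of the schema.

```python
from typing import NamedTuple


class Sccs(NamedTuple):
    scop_class: str   # single letter, e.g. 'a'
    fold: int
    superfamily: int
    family: int


def parse_sccs(sccs: str) -> Sccs:
    """Split an sccs string such as 'a.1.1.2' into its four levels."""
    parts = sccs.strip().split(".")
    if len(parts) != 4 or not parts[0].isalpha():
        raise ValueError(f"unexpected sccs format: {sccs!r}")
    scop_class, fold, superfamily, family = parts
    return Sccs(scop_class, int(fold), int(superfamily), int(family))


def same_superfamily(first: str, second: str) -> bool:
    """True when two sccs strings agree on class, fold and superfamily."""
    return parse_sccs(first)[:3] == parse_sccs(second)[:3]


print(parse_sccs("a.1.1.2"))
print(same_superfamily("a.1.1.2", "a.1.1.1"))  # True
```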

CATH per chain
Links to the CATH Protein Structure Classification database (http://www.biochem.ucl.ac.uk/bsm/cath/) on a detailed chain level
Reference attributes:CATH domain name,Chain Id - Naming attributes:Accession Code,Chain Code,CATH superfamily code
C:CHAIN_ID, S:10, A:5.0
The database identifier of the Chain
C:CHAIN_CODE, S:24, A:2.0
The standard code of the chain that uniquely identifies it in the assembly. It is an extension of the PDB chain Id. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3 ... The chain id specified in the PDB file is also ignored for waters and bound molecules, whose codes are derived from the name of the chain that they are bound to. Finally, there are no "null" chain codes: where no id was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B)
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:ACCESSION_CODE, S:24, A:4.0
The PDB accession code of the entry
C:CATH_ID, S:18, A:6.0
CATH domain name: must be SIX characters (e.g. 1cuk03). Characters 1-4, PDB code: the first 4 characters give the PDB code, e.g. 1cuk. Character 5, chain character: determines which PDB chain is represented; a chain character of zero ('0') indicates that the PDB file has no chain field. Character 6, domain number: a domain number of zero ('0') indicates that the domain is a whole PDB chain. (http://www.biochem.ucl.ac.uk/bsm/cath/formats/CathList.html)
C:CATHCODE, S:45, A:11.0
CATH superfamily code that provides information about the CATH hierarchy. (http://www.biochem.ucl.ac.uk/bsm/cath/cath_info.html) The hierarchy is built up from the following levels - Architecture, A-level: This describes the overall shape of the domain structure as determined by the orientations of the secondary structures but ignores the connectivity between the secondary structures. - Topology (Fold family), T-level: Structures are grouped into fold families at this level depending on both the overall shape and connectivity of the secondary structures. - Homologous Superfamily, H-level: This level groups together protein domains which are thought to share a common ancestor and can therefore be described as homologous. - Sequence families, S-level: Structures within each H-level are further clustered on sequence identity. (http://www.biochem.ucl.ac.uk/bsm/cath/formats/CathDomainDescriptionFile.html)
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Chain - Relation attributes:Chain Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
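The six-character CATH domain name stored in CATH_ID follows the layout given in its description above (4-character PDB code, chain character, domain number). A minimal sketch of decoding it follows; `CathDomainName` and `parse_cath_domain_name` are hypothetical names chosen for illustration.

```python
from typing import NamedTuple, Optional


class CathDomainName(NamedTuple):
    pdb_code: str          # characters 1-4
    chain: Optional[str]   # character 5; None when it is '0' (no chain field)
    domain_number: int     # character 6; 0 means the domain is a whole chain

    @property
    def is_whole_chain(self) -> bool:
        return self.domain_number == 0


def parse_cath_domain_name(name: str) -> CathDomainName:
    """Decode a six-character CATH domain name such as '1cuk03'."""
    if len(name) != 6:
        raise ValueError(f"CATH domain names must be six characters: {name!r}")
    pdb_code, chain_char, domain_char = name[:4], name[4], name[5]
    if domain_char not in "0123456789":
        raise ValueError(f"domain number must be a digit: {name!r}")
    chain = None if chain_char == "0" else chain_char
    return CathDomainName(pdb_code, chain, int(domain_char))


domain = parse_cath_domain_name("1cuk03")
print(domain.pdb_code, domain.chain, domain.domain_number)  # 1cuk None 3
```

For `1cuk03` the chain character is `0`, so the sketch reports no chain, and the domain number 3 means the domain is not the whole chain.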

PFAM per chain
Reference attributes:Pfam Id,Chain Id - Naming attributes:Accession Code,Chain Code,Pfam Id
C:ACCESSION_CODE, S:24, A:0.0
The PDB accession code of the entry
C:ENTRY_ID, S:10, A:0.0
The database identifier of the Entry
C:CHAIN_ID, S:10, A:0.0
The database identifier of the Chain
C:CHAIN_CODE, S:24, A:0.0
The standard code of the chain that uniquely identifies it in the assembly. It is an extension of the PDB chain Id. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3 ... The chain id specified in the PDB file is also ignored for waters and bound molecules, whose codes are derived from the name of the chain that they are bound to. Finally, there are no "null" chain codes: where no id was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B)
C:PFAM_ID, S:30, A:0.0
The PFAM accession number
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Chain - Relation attributes:Chain Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id

GO Gene Ontology
Links to the GO gene ontology database (http://www.geneontology.org/) and Interpro database on the entry level
Reference attributes:Entry Id,SwissProt accession,GO Id - Naming attributes:Accession Code,GO Id
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:GO_ID, S:10, A:10.0
The database identifier of the Go
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:SP_PRIMARY_ID, S:15, A:6.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_SECONDARY_ID, S:255, A:10.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id

Interpro Protein Family
Links to the Interpro database of protein families, domains and functional sites (http://www.ebi.ac.uk/interpro/) on the chain level
Reference attributes:Entry Id,SwissProt accession,InterPro Id - Naming attributes:Accession Code,InterPro Id
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:INTERPRO_ID, S:9, A:9.0
Interpro entry accession number (http://www.ebi.ac.uk/interpro/tutorial.html#N556)
C:SP_PRIMARY_ID, S:15, A:6.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_SECONDARY_ID, S:255, A:10.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:

Representative Entries
Grouping of entries into sets of representative entries for various purposes (e.g. SCOP, DALI). This is very convenient for research or statistical work because it removes the bias that the many structures of the same or similar proteins in the PDB would otherwise introduce into the results
Reference attributes:Entry Id,Source - Naming attributes:Source,Accession Code
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:SOURCE, S:10, A:4.0
The source of the set of representative entries (e.g. SCOP, DALI). It also serves as the identifier and discriminator of representative sets
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:DETAILS, S:100, A:1.0
C:REL_STATUS, S:1, A:1.0
A flag that specifies if the current entry is publicly released or not
related   Reverse Many Reverse Entity 
Reverse relation:in of Reverse entity:Entry - Relation attributes:Entry Id
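To illustrate how the Representative Entries entity might be used to reduce redundancy bias, the sketch below groups (SOURCE, ACCESSION_CODE) pairs by source and then restricts a per-entry measurement to one representative set. The function names are hypothetical, and the accession codes and values in the usage lines are made up purely for demonstration; they are not taken from the database.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def representative_sets(rows: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group (SOURCE, ACCESSION_CODE) rows into one set of codes per source."""
    sets: Dict[str, Set[str]] = defaultdict(set)
    for source, accession_code in rows:
        sets[source].add(accession_code.lower())
    return dict(sets)


def restrict_to_representatives(
    values: Dict[str, float], representatives: Set[str]
) -> Dict[str, float]:
    """Keep only the measurements whose entry is in the representative set."""
    return {
        code: value
        for code, value in values.items()
        if code.lower() in representatives
    }


# Made-up rows and measurements, for illustration only.
rows = [("SCOP", "1abc"), ("SCOP", "2xyz"), ("DALI", "1abc")]
measurements = {"1abc": 1.8, "2xyz": 2.4, "9zzz": 3.1}

sets = representative_sets(rows)
print(restrict_to_representatives(measurements, sets["SCOP"]))
```

Statistics computed on the restricted dictionary count each representative entry once, instead of counting every structure of the same or a similar protein.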

Related Entry
Entries that are related in some way (type of relation)
Reference attributes:Related Entry Id - Naming attributes:Accession Code,Related Accession Code
C:RELATED_ENTRY_ID, S:10, A:4.0
The database identifier of the Related Entry
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:RELATED_ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the related entry
C:DRE_ID, S:10, A:4.0
The database identifier of the Dre
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:RELATION_TYPE, S:80, A:7.0
The type and source of the relation with the related entry (e.g. PDB remark 900)
C:RELATIONSHIP_DETAILS, S:255, A:59.0
Textual information about the relation
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
related   Reverse Many Reverse Entity 
Reverse relation:related of Reverse entity:Entry - Relation attributes:Related Entry Id

Swiss-Prot link per Entry
Links to Swiss-prot protein sequence database (http://www.ebi.ac.uk/swissprot/) on the entry level
Reference attributes:SwissProt accession,Entry Id - Naming attributes:Accession Code,SwissProt accession
C:SP_PRIMARY_ID, S:15, A:0.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_SECONDARY_ID, S:255, A:0.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
C:ACCESSION_CODE, S:8, A:0.0
The PDB accession code of the entry
C:ENTRY_ID, S:10, A:0.0
The database identifier of the Entry
related   Reverse Many Reverse Entity 
Reverse relation:for of Reverse entity:Swiss-Prot description - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:for of Reverse entity:Swiss-Prot keyword - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id

Swiss-Prot description
Descriptions provided by the Swiss-Prot database
Reference attributes:SwissProt accession,Description text - Naming attributes:SwissProt accession,Description text
C:PRIMARYACC#, S:15, A:0.0
The accession number (AC) associated with a swissprot entry
C:TEXT, S:4000, A:0.0
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Swiss-Prot Protein Knowledgebase - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Swiss-Prot link per Chain - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Swiss-Prot link per Entry - Relation attributes:SwissProt accession

Swiss-Prot keyword
Keywords used in the Swiss-Prot database
Reference attributes:SwissProt accession,Keyword - Naming attributes:SwissProt accession,Keyword
C:PRIMARYACC#, S:15, A:0.0
The accession number (AC) associated with a swissprot entry
C:KEYWORD, S:80, A:0.0
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Swiss-Prot Protein Knowledgebase - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Swiss-Prot link per Entry - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Swiss-Prot link per Chain - Relation attributes:SwissProt accession

Swiss-Prot Protein Knowledgebase
Links to Swiss-prot protein sequence database (http://www.ebi.ac.uk/swissprot/) on the very detailed residue level
Reference attributes:Swp Mapping Id - Naming attributes:Accession Code,Assembly Serial,Chain Code,Residue Serial,SwissProt accession
C:SWP_MAPPING_ID, S:0, A:5.0
The database identifier of the Swp Mapping
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:ASSEMBLY_SERIAL, S:38, A:2.0
The serial identifier of the assembly in the entry
C:CHAIN_CODE, S:8, A:2.0
The standard code of the chain that uniquely identifies it in the assembly. It is an extension of the PDB chain Id. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3 ... The chain id specified in the PDB file is also ignored for waters and bound molecules, whose codes are derived from the name of the chain that they are bound to. Finally, there are no "null" chain codes: where no id was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B)
C:RESIDUE_SERIAL, S:38, A:3.0
Serial number of the residue in the chain. Starts with 1 for the first residue (N-terminal or 5'-terminal) in the chain, and increases by 1 with each position along the chain uniquely identifying the residue in the chain.
C:SP_PRIMARY_ID, S:15, A:7.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:ASSEMBLY_ID, S:10, A:4.0
The database identifier of the Assembly
C:CHAIN_CODE_1_LETTER, S:1, A:1.0
This is an additional 1 letter code that uniquely identifies the chain in the assembly. It is arbitrary and its purpose is to be able to export files in PDB format
C:CHAIN_ID, S:10, A:4.0
The database identifier of the Chain
C:CHAIN_PDB_CODE, S:1, A:1.0
The original code of the chain as found in the PDB. The PDB chain code is problematic because it is not used consistently in the PDB. Firstly, it is often null when the entry contains a single chain. Additionally, the same chain code is very often used both for a polymer chain and for a molecule bound to it. The PDB chain code is therefore often not a distinct identifier for a chain. For this reason the chain code was introduced, which is consistent and uniform; its purpose is to uniquely identify a chain in an assembly. So where chain A is used 4 times in an assembly, the generated chains will have chain codes A, A1, A2, A3. However, the chain that has been marked as non-symmetric valid (the one that should be used to extract the original asymmetric PDB data) keeps the original PDB code (if it is correct), e.g. A. Where a chain in the PDB did not have a chain code, the first unused letter is reserved (e.g. A). When 2 different chains (e.g. a polymer chain and a bound molecule chain) share the same PDB code, the chain code of the bound molecule is consistently derived from the chain code of the polymer chain
C:CHEM_COMP_CODE, S:12, A:7.0
The standard extended molecule code of the amino acid or ligand. It is composed of the PDB 3 letter code with an optional topological indicator appended after an underscore
C:CIF_SERIAL, S:8, A:1.0
(obsolete)
C:CODE_1_LETTER, S:5, A:1.0
One-letter code for the residue as specified in the PDB sequence and structure.
C:CONFLICT_TYPE, S:255, A:2.0
It marks and classifies conflicts between the PDBe (PDB) and Swiss-prot (for future use)
C:DSC_TYPE, S:3, A:3.0
Provides information about the location of the residue on the segment mapped to Swiss-Prot (e.g. beginning, end)
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:MOLECULE_ID, S:10, A:4.0
The database identifier of the Molecule
C:NCBI_TAX_ID, S:15, A:4.0
The NCBI taxonomy identifier (taxid) that points to a node of the taxonomy tree
C:NON_ASSEMBLY_VALID, S:1, A:1.0
This item is to be used not only in an assembly context, but also to represent the original asymmetric unit
C:NOT_OBSERVED, S:1, A:1.0
The residue's coordinates are not available because the residue was not observed in the experiment data. There are no coordinates for any of its atoms.
C:RESIDUE_ID, S:0, A:5.0
The database identifier of the Residue
C:RESIDUE_PDB_CODE, S:3, A:3.0
The code of the residue or ligand as defined in the PDB. The reference ligand (chem comp) code should be used instead, since where the two differ there was some error in the original PDB data that was identified during clean up. Common cases are: 1) The chemical structure implied by the PDB coordinates is entirely unrelated to the ligand with this code in the chemical dictionary, i.e. the PDB data are seriously inconsistent. 2) The structure in the PDB is a structurally modified version of the ligand with this code, for example an extra atom was introduced (e.g. modified amino acids). In these cases a new ligand is defined in the chemical dictionary and is assigned to this residue or bound molecule. 3) The PDB coordinates imply a different stereoisomer. A new ligand is introduced in the chemical dictionary for the new stereoisomer.
C:RESIDUE_PDB_INSERT_CODE, S:1, A:1.0
The insertion code of the residue, as originally found in the PDB. The residue serial should be used instead, since the PDB SEQ and INSERT CODE are not used consistently and uniformly in the PDB
C:RESIDUE_PDB_SEQ, S:4, A:3.0
The sequence of the residue, as was originally found in the PDB (has to be used together with insert code).
C:RESIDUE_TYPE, S:1, A:1.0
The type of the component R:residue, B:bound molecule, W:water. This normally has to correspond to the type of the chain to which the residue belongs
C:SP_SECONDARY_ID, S:255, A:10.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
C:SP_SERIAL, S:10, A:3.0
The serial identifier of the residue in the swiss-prot sequence. This does not correspond to the PDB residue serial because often only partial fragments of the actual sequence are involved and observed in the PDB experiment
C:SP_1_LETTER_CODE, S:1, A:1.0
This is the 1 letter code of the amino acid as specified in the Swiss-Prot database. It can differ from the code in the PDB in the very few cases where, for various reasons, the protein sequence and the actual structure observed in the experiment deviate.
related   Reverse Many Reverse Entity 
Reverse relation:for of Reverse entity:Swiss-Prot description - Relation attributes:SwissProt accession
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Assembly - Relation attributes:Assembly Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Residue - Relation attributes:Residue Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Molecule - Relation attributes:Molecule Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Chain - Relation attributes:Chain Id
related   Reverse Many Reverse Entity 
Reverse relation:for of Reverse entity:Swiss-Prot keyword - Relation attributes:SwissProt accession
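Because this entity maps residues one by one, and SP_SERIAL does not follow RESIDUE_SERIAL when only fragments of the sequence are present, it is often handy to collapse the per-residue rows of one chain into contiguous segments. The following is a minimal sketch under the assumption that each row contributes a (RESIDUE_SERIAL, SP_SERIAL) pair; the function name `collapse_to_segments` and the serial numbers in the usage line are invented for illustration.

```python
from typing import Iterable, List, Tuple

# (first RESIDUE_SERIAL, last RESIDUE_SERIAL, first SP_SERIAL, last SP_SERIAL)
Segment = Tuple[int, int, int, int]


def collapse_to_segments(pairs: Iterable[Tuple[int, int]]) -> List[Segment]:
    """Merge per-residue (RESIDUE_SERIAL, SP_SERIAL) pairs into segments.

    A segment is extended only while both serials advance by exactly one,
    so any gap or jump on either side starts a new segment.
    """
    segments: List[Segment] = []
    for residue_serial, sp_serial in sorted(pairs):
        if segments:
            res_start, res_end, sp_start, sp_end = segments[-1]
            if residue_serial == res_end + 1 and sp_serial == sp_end + 1:
                segments[-1] = (res_start, residue_serial, sp_start, sp_serial)
                continue
        segments.append((residue_serial, residue_serial, sp_serial, sp_serial))
    return segments


# Invented pairs: residues 1-3 map to 25-27, residues 4-5 map to 40-41.
pairs = [(1, 25), (2, 26), (3, 27), (4, 40), (5, 41)]
print(collapse_to_segments(pairs))  # [(1, 3, 25, 27), (4, 5, 40, 41)]
```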

SCOP Structural Classification of Proteins
Links to SCOP database (http://nar.oupjournals.org/cgi/content/full/30/1/264) for structural classification of proteins (http://scop.mrc-lmb.cam.ac.uk/scop/) on the residue level
Reference attributes:Scop Mapping Id - Naming attributes:Accession Code,Assembly Serial,Chain Code,Residue Serial,Sccs,Scop sunid
C:SCOP_MAPPING_ID, S:0, A:5.0
The database identifier of the Scop Mapping
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:ASSEMBLY_SERIAL, S:38, A:2.0
The serial identifier of the assembly in the entry
C:CHAIN_CODE, S:8, A:2.0
The standard code of the chain that uniquely identifies it in the assembly. It is an extension of the PDB chain Id. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3 ... The chain id specified in the PDB file is also ignored for waters and bound molecules, whose codes are derived from the name of the chain that they are bound to. Finally, there are no "null" chain codes: where no id was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B)
C:RESIDUE_SERIAL, S:38, A:3.0
Serial number of the residue in the chain. Starts with 1 for the first residue (N-terminal or 5'-terminal) in the chain, and increases by 1 with each position along the chain uniquely identifying the residue in the chain.
C:SCOP_ID, S:8, A:7.0
The old scop identifier (sid) (http://scop.bic.nus.edu.sg/release-notes.html)
C:ASSEMBLY_ID, S:10, A:4.0
The database identifier of the Assembly
C:BEG_RES, S:8, A:2.0
The PDB seq of the first residue of the chain segment that belongs in the same mapping with this SCOP domain
C:CHAIN, S:15, A:1.0
This field contains the chain identifier as given in the original SCOP data. It should be the same as CHAIN_PDB_CODE, except in cases where this particular entry has been cleaned up
C:CHAIN_CODE_1_LETTER, S:1, A:1.0
This is an additional 1 letter code that uniquely identifies it in the assembly. It is arbitrary and its purpose is to be able to export files in PDB format
C:CHAIN_ID, S:10, A:4.0
The database identifier of the Chain
C:CHAIN_MSD_CODE, S:8, A:5.0
An internal longer code for a chain (defined by MSD) that includes the type of the chain (protein, bound molecule etc). It does not identify uniquely a chain in an assembly; the chain code has to be used instead
C:CHAIN_PDB_CODE, S:1, A:1.0
The original code of the chain as found in the PDB. The PDB chain code is not used consistently. First, it is often null when the entry contains a single chain. Second, the same chain code is very often used both for a polymer chain and for a molecule bound to it. The PDB chain code is therefore often not a distinct identifier for a chain. For this reason the chain code was introduced, which is consistent and uniform; its purpose is to uniquely identify a chain in an assembly. So where chain A is used 4 times in an assembly, the generated chains have chain codes A, A1, A2, A3. However, the chain that has been marked as non-symmetric valid (the one that should be used to extract the original asymmetric PDB data) keeps the original PDB code (if it is correct), e.g. A. Where a chain in the PDB did not have a chain code, the first unused letter is reserved (e.g. A). When 2 different chains (e.g. a polymer chain and a bound molecule chain) share the same PDB code, the chain code of the bound molecule is consistently derived from the chain code of the polymer chain
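The suffix rule described above (A, A1, A2, A3) can be illustrated with a minimal sketch. The function name `assign_chain_codes` is hypothetical and not part of the database; the sketch covers only the repeated-code case and ignores null PDB codes, bound molecules and the non-symmetric valid flag.

```python
from collections import Counter

def assign_chain_codes(pdb_codes):
    """Return one unique chain code per chain, in order of appearance.

    The first chain with a given PDB code keeps it; later copies get a
    numeric suffix (A, A1, A2, ...). Illustrative only.
    """
    seen = Counter()
    codes = []
    for pdb_code in pdb_codes:
        n = seen[pdb_code]
        codes.append(pdb_code if n == 0 else f"{pdb_code}{n}")
        seen[pdb_code] += 1
    return codes

# Chain A appears four times in the assembly, chain B once.
chain_codes = assign_chain_codes(["A", "A", "A", "A", "B"])
```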
C:CHEM_COMP_CODE, S:12, A:7.0
The standard extended molecule code of the amino acid or ligand. It is composed of the PDB 3 letter code with an optional topological indicator appended after an underscore
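As a rough sketch of the format just described, the code can be split at the first underscore into the 3 letter code and the optional topological indicator. The function name and the sample value `ALA_XYZ` are hypothetical and chosen only for illustration.

```python
def split_chem_comp_code(code):
    """Split a CHEM_COMP_CODE into (3 letter code, topological indicator).

    The indicator is None when no underscore is present.
    """
    base, sep, topology = code.partition("_")
    return base, (topology if sep else None)

plain = split_chem_comp_code("ALA")         # no indicator
extended = split_chem_comp_code("ALA_XYZ")  # hypothetical indicator "XYZ"
```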
C:CODE_1_LETTER, S:5, A:1.0
One code letter for the residue as specified in the PDB sequence and structure.
C:CODE_3_LETTER, S:3, A:3.0
This attribute provides a code from the chem comp dictionary for standard residues. This attribute must be the same for small molecules that represent our variations on topology/chemistry for a polymer component e.g. All ALA's should have a code_3_letter of ALA. All adenosine nucleotides should have a 3 letter code of A, except for those that have a topology of 'free'. This code is now obsolete and the Comp Code should be used instead in most cases
C:END_RES, S:8, A:2.0
The PDB seq of the last residue of the chain segment that belongs in the same mapping with this SCOP domain
C:ENTRY_ID, S:0, A:4.0
The database identifier of the Entry
C:NCBI_TAX_ID, S:15, A:4.0
The NCBI taxonomy identifier (taxid) that points to a node of the taxonomy tree
C:NON_ASSEMBLY_VALID, S:1, A:1.0
This item is to be used not only in an assembly context, but also to represent the original asymmetric unit
C:ORDER_IN, S:0, A:2.0
The serial of the mapping to a SCOP domain when 2 different segments of the same chain map to the same SCOP domain
C:PDB_INSERT_CODE, S:1, A:1.0
The insertion code of the residue, as was originally found in the PDB.
C:RESIDUE_ID, S:0, A:5.0
The database identifier of the Residue
C:RESIDUE_PDB_CODE, S:3, A:3.0
The code of the residue or ligand as defined in the PDB. The reference ligand (chem comp) code should be used instead, because where the two differ there was an error in the original PDB data that was identified during clean up. Common cases are: 1) The chemical structure implied by the PDB coordinates is entirely unrelated to the ligand with this code in the chemical dictionary, i.e. the PDB data are badly inconsistent. 2) The structure in the PDB is a structurally modified version of the ligand with this code, for example an extra atom was introduced (e.g. modified amino acids). In these cases a new ligand is defined in the chemical dictionary and assigned to this residue or bound molecule. 3) The PDB coordinates imply a different stereoisomer. A new ligand is introduced in the chemical dictionary for the new stereoisomer.
C:RESIDUE_PDB_SEQ, S:4, A:3.0
The sequence of the residue, as was originally found in the PDB (has to be used together with insert code).
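Because RESIDUE_PDB_SEQ is only unique in combination with PDB_INSERT_CODE, a lookup built from these rows needs both columns in its key. The sketch below assumes rows are available as dictionaries keyed by column name; the helper names and sample values are hypothetical.

```python
def residue_key(chain_code, pdb_seq, insert_code):
    """Normalise chain code, PDB seq and insert code into one hashable key."""
    return (chain_code, str(pdb_seq).strip(), (insert_code or "").strip())

def build_residue_index(rows):
    """Map (chain code, PDB seq, insert code) to RESIDUE_SERIAL."""
    index = {}
    for row in rows:
        key = residue_key(
            row["CHAIN_CODE"], row["RESIDUE_PDB_SEQ"], row["PDB_INSERT_CODE"]
        )
        index[key] = row["RESIDUE_SERIAL"]
    return index

# Invented sample rows: residues 100 and 100A share the same PDB seq.
rows = [
    {"CHAIN_CODE": "A", "RESIDUE_PDB_SEQ": "100", "PDB_INSERT_CODE": None, "RESIDUE_SERIAL": 57},
    {"CHAIN_CODE": "A", "RESIDUE_PDB_SEQ": "100", "PDB_INSERT_CODE": "A", "RESIDUE_SERIAL": 58},
]
index = build_residue_index(rows)
```

Keying on the PDB seq alone would silently overwrite residue 100 with residue 100A.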
C:SCCS, S:20, A:8.0
The SCOP concise classification string (sccs), which identifies the SCOP family of the domain
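A minimal sketch of reading an SCCS value, assuming it follows the usual SCOP dotted form class.fold.superfamily.family (for example b.1.2.1). The function name is hypothetical; values in other forms are rejected rather than guessed at.

```python
def parse_sccs(sccs):
    """Split an sccs string such as 'b.1.2.1' into its four levels."""
    parts = sccs.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected class.fold.superfamily.family, got {sccs!r}")
    cls, fold, superfamily, family = parts
    return {
        "class": cls,
        "fold": int(fold),
        "superfamily": int(superfamily),
        "family": int(family),
    }

levels = parse_sccs("b.1.2.1")
```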
C:SP_SECONDARY_ID, S:1, A:1.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
C:SP_PRIMARY_ID, S:15, A:6.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_SERIAL, S:0, A:3.0
The serial identifier of the residue in the swiss-prot sequence. This does not correspond to the PDB residue serial because often only partial fragments of the actual sequence are involved and observed in the PDB experiment
C:SUNID, S:0, A:4.0
A number which uniquely identifies each entry in the SCOP hierarchy, including leaves and entries corresponding to the protein level (http://scop.mrc-lmb.cam.ac.uk/scop/release-notes.html#sunid)
C:SP_1_LETTER_CODE, S:0, A:0.0
This is the 1 letter code of the aminoacid as specified in the Swiss-prot database. This can be different from the code in the PDB in very few cases where for several reasons the protein sequence and the actual structure observed in the experiment deviate.
related   Reverse Many Reverse Entity 
Reverse relation:belong of Reverse entity:NCBI Taxonomy - Relation attributes:Ncbi Tax Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Chain - Relation attributes:Chain Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Residue - Relation attributes:Residue Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Assembly - Relation attributes:Assembly Id

CATH Protein Structure Classification    Entities  Marts
Links to the CATH Protein Structure Classification database (http://www.biochem.ucl.ac.uk/bsm/cath/) on a detailed residue level
Reference attributes:Cath Mapping Id - Naming attributes:Accession Code,Assembly Serial,Chain Code,Residue Serial,Ligand code,CATH domain name
C:CATH_MAPPING_ID, S:0, A:5.0
The database identifier of the CATH mapping
C:ASSEMBLY_ID, S:10, A:4.0
The database identifier of the Assembly
C:ASSEMBLY_SERIAL, S:38, A:2.0
The serial identifier of the assembly in the entry
C:RESIDUE_ID, S:0, A:5.0
The database identifier of the Residue
C:NCBI_TAX_ID, S:15, A:4.0
The NCBI taxonomy identifier (taxid) that points to a node of the taxonomy tree
C:CHAIN_ID, S:10, A:5.0
The database identifier of the Chain
C:CHAIN_CODE, S:24, A:2.0
The standard code of the chain, which uniquely identifies it in the assembly. It is an extension of the PDB chain ID. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3. The chain ID specified in the PDB file is ignored for waters and bound molecules; their codes are derived from the name of the chain they are bound to. Finally, there are no "null" chain codes: where no ID was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B).
C:CHAIN_MSD_CODE, S:24, A:5.0
An internal longer code for a chain (defined by MSD) that includes the type of the chain (protein, bound molecule etc). It does not identify uniquely a chain in an assembly; the chain code has to be used instead
C:CHAIN_PDB_CODE, S:3, A:1.0
The original code of the chain as found in the PDB. The PDB chain code is not used consistently. First, it is often null when the entry contains a single chain. Second, the same chain code is very often used both for a polymer chain and for a molecule bound to it. The PDB chain code is therefore often not a distinct identifier for a chain. For this reason the chain code was introduced, which is consistent and uniform; its purpose is to uniquely identify a chain in an assembly. So where chain A is used 4 times in an assembly, the generated chains have chain codes A, A1, A2, A3. However, the chain that has been marked as non-symmetric valid (the one that should be used to extract the original asymmetric PDB data) keeps the original PDB code (if it is correct), e.g. A. Where a chain in the PDB did not have a chain code, the first unused letter is reserved (e.g. A). When 2 different chains (e.g. a polymer chain and a bound molecule chain) share the same PDB code, the chain code of the bound molecule is consistently derived from the chain code of the polymer chain
C:CHAIN_CODE_1_LETTER, S:3, A:1.0
This is an additional 1 letter code that uniquely identifies it in the assembly. It is arbitrary and its purpose is to be able to export files in PDB format
C:NON_ASSEMBLY_VALID, S:1, A:1.0
This item is to be used not only in an assembly context, but also to represent the original asymmetric unit
C:CODE_3_LETTER, S:9, A:3.0
This attribute provides a code from the chem comp dictionary for standard residues. This attribute must be the same for small molecules that represent our variations on topology/chemistry for a polymer component e.g. All ALA's should have a code_3_letter of ALA. All adenosine nucleotides should have a 3 letter code of A, except for those that have a topology of 'free'. This code is now obsolete and the Comp Code should be used instead in most cases
C:CODE_1_LETTER, S:15, A:1.0
One code letter for the residue as specified in the PDB sequence and structure.
C:CHEM_COMP_CODE, S:36, A:7.0
The standard extended molecule code of the amino acid or ligand. It is composed of the PDB 3 letter code with an optional topological indicator appended after an underscore
C:RESIDUE_PDB_CODE, S:9, A:3.0
The code of the residue or ligand as defined in the PDB. The reference ligand (chem comp) code should be used instead, because where the two differ there was an error in the original PDB data that was identified during clean up. Common cases are: 1) The chemical structure implied by the PDB coordinates is entirely unrelated to the ligand with this code in the chemical dictionary, i.e. the PDB data are badly inconsistent. 2) The structure in the PDB is a structurally modified version of the ligand with this code, for example an extra atom was introduced (e.g. modified amino acids). In these cases a new ligand is defined in the chemical dictionary and assigned to this residue or bound molecule. 3) The PDB coordinates imply a different stereoisomer. A new ligand is introduced in the chemical dictionary for the new stereoisomer.
C:RESIDUE_PDB_SEQ, S:4, A:3.0
The sequence of the residue, as was originally found in the PDB (has to be used together with insert code).
C:PDB_INSERT_CODE, S:3, A:1.0
The insertion code of the residue, as was originally found in the PDB.
C:RESIDUE_SERIAL, S:5, A:3.0
Serial number of the residue in the chain. Starts with 1 for the first residue (N-terminal or 5'-terminal) in the chain, and increases by 1 with each position along the chain uniquely identifying the residue in the chain.
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:ACCESSION_CODE, S:24, A:4.0
The PDB accession code of the entry
C:SP_PRIMARY_ID, S:45, A:6.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_SECONDARY_ID, S:765, A:9.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
C:SP_1_LETTER_CODE, S:3, A:1.0
This is the 1 letter code of the aminoacid as specified in the Swiss-prot database. This can be different from the code in the PDB in very few cases where for several reasons the protein sequence and the actual structure observed in the experiment deviate.
C:SP_SERIAL, S:10, A:3.0
The serial identifier of the residue in the swiss-prot sequence. This does not correspond to the PDB residue serial because often only partial fragments of the actual sequence are involved and observed in the PDB experiment
C:CATH_ID, S:18, A:6.0
CATH domain name: must be SIX characters (e.g. 1cuk03). Characters 1-4, PDB code: the first 4 characters give the PDB code, e.g. 1cuk. Character 5, chain character: determines which PDB chain is represented; a chain character of zero ('0') indicates that the PDB file has no chain field. Character 6, domain number: a domain number of zero ('0') indicates that the domain is a whole PDB chain. (http://www.biochem.ucl.ac.uk/bsm/cath/formats/CathList.html)
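The six-character layout described above can be unpacked as in the following sketch. The class and function names are hypothetical; only the field positions and the meaning of '0' come from the description.

```python
from typing import NamedTuple, Optional

class CathDomainName(NamedTuple):
    pdb_code: str          # characters 1-4
    chain: Optional[str]   # character 5; None when the PDB file has no chain field
    domain_number: int     # character 6; 0 means the domain is the whole chain

def parse_cath_domain_name(name):
    """Split a six-character CATH domain name such as '1cuk03'."""
    if len(name) != 6:
        raise ValueError(f"CATH domain name must be six characters: {name!r}")
    pdb_code, chain_char, domain_char = name[:4], name[4], name[5]
    if not domain_char.isdigit():
        raise ValueError(f"domain number must be a digit: {name!r}")
    chain = None if chain_char == "0" else chain_char
    return CathDomainName(pdb_code, chain, int(domain_char))

domain = parse_cath_domain_name("1cuk03")
```

For `1cuk03` this gives PDB code `1cuk`, no chain field, and domain number 3.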
C:CATHCODE, S:45, A:11.0
CATH superfamily code that provides information about the CATH hierarchy. (http://www.biochem.ucl.ac.uk/bsm/cath/cath_info.html) The hierarchy is built up from the following levels - Architecture, A-level: This describes the overall shape of the domain structure as determined by the orientations of the secondary structures but ignores the connectivity between the secondary structures. - Topology (Fold family), T-level: Structures are grouped into fold families at this level depending on both the overall shape and connectivity of the secondary structures. - Homologous Superfamily, H-level: This level groups together protein domains which are thought to share a common ancestor and can therefore be described as homologous. - Sequence families, S-level: Structures within each H-level are further clustered on sequence identity. (http://www.biochem.ucl.ac.uk/bsm/cath/formats/CathDomainDescriptionFile.html)
C:ORDER_IN, S:0, A:2.0
The serial of the mapping to a cath domain when 2 different segments of the same chain map to the same CATH domain
C:BEG_RES, S:18, A:2.0
The PDB seq of the first residue of the chain segment that belongs in the same mapping with this CATH domain
C:END_RES, S:18, A:3.0
The PDB seq of the last residue of the chain segment that belongs in the same mapping with this CATH domain
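ORDER_IN, BEG_RES and END_RES together describe the chain segments that make up one domain. Since this entity is at the residue level, each segment is repeated on every residue row it covers. The sketch below collects the distinct segments per CATH_ID in ORDER_IN order; it assumes rows are dictionaries keyed by column name, and the function name and sample values are hypothetical.

```python
from collections import defaultdict

def domain_segments(rows):
    """Return {CATH_ID: [(BEG_RES, END_RES), ...]} ordered by ORDER_IN."""
    grouped = defaultdict(set)
    for row in rows:
        # A set removes the per-residue repetition of the same segment.
        grouped[row["CATH_ID"]].add((row["ORDER_IN"], row["BEG_RES"], row["END_RES"]))
    return {
        cath_id: [(beg, end) for _, beg, end in sorted(segs, key=lambda s: s[0])]
        for cath_id, segs in grouped.items()
    }

# Invented rows: one domain built from two segments of the same chain.
rows = [
    {"CATH_ID": "1abcA1", "ORDER_IN": 2, "BEG_RES": "120", "END_RES": "150"},
    {"CATH_ID": "1abcA1", "ORDER_IN": 1, "BEG_RES": "1", "END_RES": "40"},
    {"CATH_ID": "1abcA1", "ORDER_IN": 1, "BEG_RES": "1", "END_RES": "40"},
]
segments = domain_segments(rows)
```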
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Assembly - Relation attributes:Assembly Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Chain - Relation attributes:Chain Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Residue - Relation attributes:Residue Id
related   Reverse Many Reverse Entity 
Reverse relation:belong of Reverse entity:NCBI Taxonomy - Relation attributes:Ncbi Tax Id

PFAM per Residue    Entities  Marts
Reference attributes:Pfam Mapping Id - Naming attributes:Accession Code,Assembly Serial,Chain Code,Residue Serial,Ligand code,Pfam Id
C:PFAM_MAPPING_ID, S:0, A:0.0
Database identifier of the PFAM mapping
C:ACCESSION_CODE, S:24, A:0.0
The PDB accession code of the entry
C:ENTRY_ID, S:10, A:0.0
The database identifier of the Entry
C:ASSEMBLY_ID, S:10, A:0.0
The database identifier of the Assembly
C:ASSEMBLY_SERIAL, S:38, A:0.0
The serial identifier of the assembly in the entry
C:RESIDUE_ID, S:0, A:0.0
The database identifier of the Residue
C:NCBI_TAX_ID, S:15, A:0.0
The NCBI taxonomy identifier (taxid) that points to a node of the taxonomy tree
C:CHAIN_ID, S:10, A:0.0
The database identifier of the Chain
C:CHAIN_CODE, S:24, A:0.0
The standard code of the chain, which uniquely identifies it in the assembly. It is an extension of the PDB chain ID. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3. The chain ID specified in the PDB file is ignored for waters and bound molecules; their codes are derived from the name of the chain they are bound to. Finally, there are no "null" chain codes: where no ID was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B).
C:CHAIN_MSD_CODE, S:24, A:0.0
An internal longer code for a chain (defined by MSD) that includes the type of the chain (protein, bound molecule etc). It does not identify uniquely a chain in an assembly; the chain code has to be used instead
C:CHAIN_PDB_CODE, S:3, A:0.0
The original code of the chain as found in the PDB. The PDB chain code is not used consistently. First, it is often null when the entry contains a single chain. Second, the same chain code is very often used both for a polymer chain and for a molecule bound to it. The PDB chain code is therefore often not a distinct identifier for a chain. For this reason the chain code was introduced, which is consistent and uniform; its purpose is to uniquely identify a chain in an assembly. So where chain A is used 4 times in an assembly, the generated chains have chain codes A, A1, A2, A3. However, the chain that has been marked as non-symmetric valid (the one that should be used to extract the original asymmetric PDB data) keeps the original PDB code (if it is correct), e.g. A. Where a chain in the PDB did not have a chain code, the first unused letter is reserved (e.g. A). When 2 different chains (e.g. a polymer chain and a bound molecule chain) share the same PDB code, the chain code of the bound molecule is consistently derived from the chain code of the polymer chain
C:CHAIN_CODE_1_LETTER, S:3, A:0.0
This is an additional 1 letter code that uniquely identifies it in the assembly. It is arbitrary and its purpose is to be able to export files in PDB format
C:NON_ASSEMBLY_VALID, S:1, A:0.0
This item is to be used not only in an assembly context, but also to represent the original asymmetric unit
C:CODE_3_LETTER, S:9, A:0.0
This attribute provides a code from the chem comp dictionary for standard residues. This attribute must be the same for small molecules that represent our variations on topology/chemistry for a polymer component e.g. All ALA's should have a code_3_letter of ALA. All adenosine nucleotides should have a 3 letter code of A, except for those that have a topology of 'free'. This code is now obsolete and the Comp Code should be used instead in most cases
C:CODE_1_LETTER, S:15, A:0.0
One code letter for the residue as specified in the PDB sequence and structure.
C:CHEM_COMP_CODE, S:36, A:0.0
The standard extended molecule code of the amino acid or ligand. It is composed of the PDB 3 letter code with an optional topological indicator appended after an underscore
C:RESIDUE_PDB_CODE, S:9, A:0.0
The code of the residue or ligand as defined in the PDB. The reference ligand (chem comp) code should be used instead, because where the two differ there was an error in the original PDB data that was identified during clean up. Common cases are: 1) The chemical structure implied by the PDB coordinates is entirely unrelated to the ligand with this code in the chemical dictionary, i.e. the PDB data are badly inconsistent. 2) The structure in the PDB is a structurally modified version of the ligand with this code, for example an extra atom was introduced (e.g. modified amino acids). In these cases a new ligand is defined in the chemical dictionary and assigned to this residue or bound molecule. 3) The PDB coordinates imply a different stereoisomer. A new ligand is introduced in the chemical dictionary for the new stereoisomer.
C:RESIDUE_PDB_SEQ, S:4, A:0.0
The sequence of the residue, as was originally found in the PDB (has to be used together with insert code).
C:PDB_INSERT_CODE, S:3, A:0.0
The insertion code of the residue, as was originally found in the PDB.
C:RESIDUE_SERIAL, S:5, A:0.0
Serial number of the residue in the chain. Starts with 1 for the first residue (N-terminal or 5'-terminal) in the chain, and increases by 1 with each position along the chain uniquely identifying the residue in the chain.
C:CHAIN, S:3, A:0.0
This field contains the chain identifier as given in the original PFAM data. It should be the same as CHAIN_PDB_CODE, except where this particular entry has been cleaned up
C:SERIAL, S:5, A:0.0
(obsolete) same as the residue serial
C:PDB_ID, S:9, A:0.0
This is the 3 letter code of the amino acid as specified in the PDB. This can be different from the 3 letter code in the PDBe in very rare cases, mainly when some cleanup was involved
C:PDB_SEQ, S:123, A:0.0
The sequence of the residue, as was originally found in the PDB (has to be used together with insert code).
C:SP_PRIMARY, S:45, A:0.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_COMPONENT, S:3, A:0.0
This is the 1 letter code of the aminoacid as specified in the Swiss-prot database. This can be different from the code in the PDB in very few cases where for several reasons the protein sequence and the actual structure observed in the experiment deviate.
C:SP_SERIAL, S:10, A:0.0
The serial identifier of the residue in the swiss-prot sequence. This does not correspond to the PDB residue serial because often only partial fragments of the actual sequence are involved and observed in the PDB experiment
C:PFAM_ID, S:30, A:0.0
The PFAM accession number
C:SP_RES_FROM, S:5, A:0.0
The Swissprot residue serial of the beginning of the sequence segment that maps to the same PFAM family
C:SP_RES_TO, S:5, A:0.0
The Swissprot residue serial of the end of the sequence segment that maps to the same PFAM family
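Because only part of a Swiss-Prot sequence is usually observed in the PDB experiment, it can be useful to check how much of a PFAM segment the residue rows actually cover. The sketch below assumes SP_RES_FROM and SP_RES_TO are inclusive bounds, which the description does not state; the function name and sample numbers are hypothetical.

```python
def segment_coverage(sp_serials, sp_res_from, sp_res_to):
    """Fraction of the Swiss-Prot segment [sp_res_from, sp_res_to] that is observed.

    sp_serials: SP_SERIAL values of the residue rows mapped to the family.
    Bounds are treated as inclusive (an assumption).
    """
    if sp_res_to < sp_res_from:
        raise ValueError("SP_RES_TO must not be smaller than SP_RES_FROM")
    length = sp_res_to - sp_res_from + 1
    observed = {s for s in sp_serials if sp_res_from <= s <= sp_res_to}
    return len(observed) / length

# Three of the ten positions 10..19 are observed; serial 30 lies outside.
coverage = segment_coverage([10, 11, 12, 30], 10, 19)
```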
related   Reverse Many Reverse Entity 
Reverse relation:belong of Reverse entity:NCBI Taxonomy - Relation attributes:Ncbi Tax Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Residue - Relation attributes:Residue Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Assembly - Relation attributes:Assembly Id
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Chain - Relation attributes:Chain Id

PFAM Protein Family    Entities  Marts
Links to the Pfam database of protein families and hidden markov models (http://nar.oupjournals.org/cgi/content/full/30/1/276?ijkey=wfgjAWVRY.wto&keytype=ref&siteid=nar) on the chain level
Reference attributes:Entry Id,Chain Code,SwissProt accession,Ending Residue,PFAM Id - Naming attributes:Accession Code,Chain Code,PFAM Id
C:ACCESSION_CODE, S:8, A:4.0
The PDB accession code of the entry
C:CHAIN_CODE, S:8, A:2.0
The standard code of the chain, which uniquely identifies it in the assembly. It is an extension of the PDB chain ID. Where symmetry operations have been applied to a chain, the resulting chains are named with a numeric suffix, e.g. A, A1, A2, A3. The chain ID specified in the PDB file is ignored for waters and bound molecules; their codes are derived from the name of the chain they are bound to. Finally, there are no "null" chain codes: where no ID was specified in the PDB file, arbitrary chain codes are assigned (e.g. A, B).
C:PFAM_ID, S:10, A:7.0
The database identifier of the Pfam
C:CHAIN_PDB_CODE, S:1, A:1.0
The original code of the chain as found in the PDB. The PDB chain code is not used consistently. First, it is often null when the entry contains a single chain. Second, the same chain code is very often used both for a polymer chain and for a molecule bound to it. The PDB chain code is therefore often not a distinct identifier for a chain. For this reason the chain code was introduced, which is consistent and uniform; its purpose is to uniquely identify a chain in an assembly. So where chain A is used 4 times in an assembly, the generated chains have chain codes A, A1, A2, A3. However, the chain that has been marked as non-symmetric valid (the one that should be used to extract the original asymmetric PDB data) keeps the original PDB code (if it is correct), e.g. A. Where a chain in the PDB did not have a chain code, the first unused letter is reserved (e.g. A). When 2 different chains (e.g. a polymer chain and a bound molecule chain) share the same PDB code, the chain code of the bound molecule is consistently derived from the chain code of the polymer chain
C:ENTRY_ID, S:10, A:4.0
The database identifier of the Entry
C:FROM_SP_SERIAL, S:5, A:3.0
The Swissprot residue serial of the beginning of the sequence segment that maps to the PFAM family
C:SP_PRIMARY_ID, S:15, A:6.0
The accession number (AC) associated with a swissprot entry (http://ca.expasy.org/sprot/userman.html#AC_line)
C:SP_SECONDARY_ID, S:255, A:10.0
Swissprot entry name: The first item on the ID line is the entry name of the sequence. This name is a useful means of identifying a sequence (http://ca.expasy.org/sprot/userman.html#ID_line)
C:TO_SP_SERIAL, S:5, A:4.0
The Swissprot residue serial of the end of the sequence segment that maps to the PFAM family
related   Reverse Many Reverse Entity 
Reverse relation:has of Reverse entity:Entry - Relation attributes:Entry Id