1. Introduction

This manual is designed to enable an annotator (curator) to follow the sequence of steps involved in checking and amending entries in ChEBI. All operations are carried out using the ChEBI annotator tool.

Entering a User Name and Password on the login page takes the user to the 'Welcome' page.

2. Main Menu

Located on the left-hand side of the page.

2.1 Free Text Search

Enables the annotator to search the complete content of the database. Annotators may enter names or partial names, ChEBI IDs, other IDs (e.g. KEGG, Beilstein), synonyms, InChI, SMILES, etc. Searches are case insensitive unless the 'Case Sensitive?' box is checked. The wildcard character is *.
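The wildcard and case-sensitivity behaviour described above can be illustrated with a small Python sketch. This is not the annotator tool's actual search code, which is not documented here; the function name `matches` and the choice to translate `*` into a regular expression are assumptions made for illustration.

```python
import re

def matches(term: str, pattern: str, case_sensitive: bool = False) -> bool:
    """Return True if `term` matches `pattern`, where '*' stands for any run of characters.

    Hypothetical helper: every character other than '*' is matched literally,
    so brackets and hyphens in chemical names need no special treatment.
    """
    # Escape each literal fragment, then join the fragments with '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.fullmatch(regex, term, flags) is not None
```

For example, `matches("porphyrin", "PORPH*")` succeeds with the default case-insensitive setting but fails when `case_sensitive=True` is passed.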

2.2 Search

Allows the annotator to perform searches of unchecked entries only or to base the search criteria on the classification within the ontology. Further options provide hyperlinked lists of pending, submitted and unsubmitted submissions. For the unchecked-entry search, a drop-down menu allows the source of the entry (Chemical Ontology, IntEnz, or KEGG COMPOUND) to be specified; the search is case-insensitive unless the 'Case Sensitive' box is checked. For the Ontology Classification search, the annotator is able to base the search on compound status (CHECKED or OK), classification (CLASSIFIED or UNCLASSIFIED) and relation status (CHECKED or OK).

2.3 Merge

Allows two entities to be merged by inputting the IDs. The Merge Compounds screen requires the annotator to select which of the original ChEBI names, definition, default structures and ontology trees are to be retained. Failure to check any of the options will result in an error notice. Full details of one or both entries may be displayed by clicking on the relevant 'show compound' links. The merge procedure can be cancelled at any time by use of the Cancel Changes tab. Care must be taken when merging entries as there may be far-reaching consequences, especially if one or more of the entries is already publicly visible. The annotator must make absolutely certain by checking all data sources that the two entries being merged relate to one and the same entity. If in doubt, or if for example there are unproven stereochemical differences or ambiguities between two entries, the annotator should not perform a merge but relate the entries to one another through the ontology.

2.4 Demerge

Takes the annotator to the Demerge Compounds screen, allowing selection of which children are to be demerged from the parent. As for 'Merge', the demerge procedure can be cancelled by use of the Cancel Changes tab. For the same reasons as given above for merging, care must also be taken when demerging entries. Examples of when demerging is justified are (1) when differences exist between names and synonyms for entries that have been subject to a previous automatic merging and (2) when it is desirable to distinguish between acids and their conjugate bases. Caution: all demerged entries will inherit the ontology structure of the parent entry. Therefore the annotator will need to modify or delete relationships which are no longer valid.

2.5 Add Compound Entry

Allows addition of a new entry to the database. Specifying the ChEBI name (see below) and submitting will allow the system to generate a new ChEBI ID on a Compound Result screen, with the new entry having an initial status OK. Before adding a new compound the annotator should always conduct a thorough search in the database for names, registry numbers and database links. If a compound is then found already to exist, the annotator should use this rather than creating a new one.

2.6 Logs

Allows the annotator to view logs relating to automated procedures (e.g. KEGG updates, incorporation of new sources, automated merges and demerges).

2.7 Help

Leads to the 'Special Characters List', a list of xml tags used to enhance the chemical mark-up.

2.8 Logout

Logs the annotator out of the annotator tool.

2.9 <<hide

Allows the side menu to be hidden, maximising the view of the main screen, a particularly useful feature when using a browser such as Mozilla Firefox which does not allow line wraps. The side menu may be reinstated via the <<show link.

3. Compound Result screen

This is the main screen on which the results for a ChEBI entity are displayed. It offers the annotator six tabs: View, View SC, Edit Compound, Edit Ontology, Edit Structure, Edit Comment.

3.1 View

Displays the following features:

3.1.1 General Information

ChEBI Name – The name recommended by the annotator for use within the biological community.

ChEBI ID – The stable ID assigned by the system. IDs are assigned in sequence and their absolute values have no inherent meaning.

ChEBI ASCII Name – The ChEBI name with any special characters rendered in ASCII format.

Definition – a definition of the entry, especially relevant for classes of entities, less so for instances.

3.1.2 Default Structure

The view of an entity selected by the annotator as being of prime importance and which will be the main structure displayed on the public web interface. If there is more than one graphical structure, these may be viewed via a 'more>>' link. The status of the structures (OK, CHECKED, DELETED, OBSOLETE) is indicated by a colour-coded frame. Also displayed are the SMILES string and the InChI, both derived from the MDL molfile corresponding to the default structure.

3.1.3 Status

The status of the entry is indicated as OK, CHECKED, DELETED or OBSOLETE. Also supplied are the type of merger (automatic or annotator), who created the entry and when, and who last modified it and when.

3.1.4 Formula

Assigned (manually by the annotator) whenever possible; generally the molecular formula. The use of subscripts is avoided. The source is stated, together with the status and an indication of which child from a merged entry was the source or whether this was from a parent.

3.1.5 Additional chemical data

Mass and charge, calculated automatically from the default structure, will appear here and should be checked.

3.1.6 ChEBI Ontology

Shows the relationships relevant to the entity being viewed. Clicking on 'Tree View' opens up a visual depiction of the tree with the different types of relationship being indicated by different symbols. Entries and relationships with status OK (i.e. unchecked) are shown in light grey, while those with status CHECKED are in dark grey. The line for the entity currently being viewed is shown in bold type. Clicking on 'Parents and Children View only' returns the annotator to the default (textual) display of relationships.

3.1.7 IUPAC Name(s)

Shows one or more names for an entity based on current IUPAC recommendations.

3.1.8 Synonym

Shows all synonyms and their sources. Only those with status CHECKED will be viewable to the public.

3.1.9 Database Accession

Provides accession numbers and (if available) links to source databases. Only those with status CHECKED will be viewable to the public.

3.1.10 Registry Numbers

Lists CAS, Beilstein and Gmelin Registry Nos., where these are available.

3.1.11 Comments

Shows comments added by an annotator, relating either to a single data entry or to a complete entity.

3.1.12 Submitter Remarks

For submitted entries, shows remarks added by the submitter or annotator and any subsequent follow-ups. All remarks are visible on the public website.

3.2 View SC

Displays the standard view but with the addition of xml tags around special characters.

3.3 Edit Compound

The main screen used by the annotators for editing the main text of an entry.

3.4 Edit Ontology

The screen used for editing the ontology.

3.5 Edit Structure

The screen used for editing the various structural representations of an entity.

3.6 Edit Comment

Allows the annotator to add and edit a comment, either to a single data entry or to an entity as a whole. For submitted entries, also allows the annotator to add or respond to Submitter Remarks.

4. Edit Compound screen

This is the screen upon which an annotator can edit all details of a ChEBI entry except for its structural data, comments and submitter remarks, and relationships within the ontology.

4.1 ChEBI Name

The recommended name may be changed by the annotator to bring it into line with current usage within the biological community. Although there is a limit on the number of characters in a ChEBI Name, this is enormous (around 4000) and it is highly desirable that such names are kept short – abbreviations (e.g. ATP, NAD) are acceptable. A good maximum number of characters to work to is 50. Special characters are encoded using the xml tags listed in the Help file. Care must be taken to use the correct tags with characters that can be used in more than one context, e.g. to distinguish between <stereo>alpha</stereo> and <locant>alpha</locant>. To aid in selection of the correct tags a Special Character tool has been incorporated, accessible via a link next to the ChEBI Name field; similar links to this tool are found next to all those fields on the Edit Compound screen into which free text can be input.

Unless it is an abbreviation (e.g. ATP, NADPH), a ChEBI Name should start with a lower case letter, not a capital (unless this is a special character relating to stereochemistry or denoting an element).

Changing the ChEBI Name has consequences for other databases and resources which use the ChEBI Name as a reference and hence great care must be taken when making changes to ChEBI Names.

NB. In the case of IntEnz the ChEBI Name may be used within the Reactions field if no IntEnz Name exists. However, if an IntEnz Name exists then changing the ChEBI Name will have no effect on IntEnz.

A singular name should always be used unless the entity is a class and a singular entity already exists within the database; in this case a plural can be used. For example: porphyrin (CHEBI:8337) is an entity, porphyrins (CHEBI:26214) is a class.

The curator tool will produce validation errors if:

  • Unicode characters are included in the name. All Unicode should be encoded with the Special Characters XML package.
  • Brackets in a name are incorrectly nested. For example, [(1->4]-alpha-D-galacturonide)n will throw a validation error because the closing square bracket occurs before the closing round bracket.
  • A ChEBI name is not unique, i.e. the same ChEBI name already exists.
  • The XML tags are not valid Special Characters XML or they are not closed properly.
  • New line or tab characters are present.
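As a rough illustration of two of the checks listed above, the following Python sketch tests bracket nesting with a stack and looks for forbidden characters. It is not the curator tool's implementation; the function names are hypothetical, and "Unicode characters" is interpreted here as any non-ASCII character, which is an assumption.

```python
# Map each closing bracket to its opening partner. Angle brackets are left out
# on purpose because they delimit the Special Characters XML tags.
PAIRS = {")": "(", "]": "[", "}": "{"}

def brackets_nested_correctly(name: str) -> bool:
    """Return True if every bracket is closed in the reverse order it was opened."""
    stack = []
    for ch in name:
        if ch in PAIRS.values():
            stack.append(ch)
        elif ch in PAIRS:
            # A closing bracket must match the most recently opened one.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack  # leftover opening brackets are also an error

def has_forbidden_characters(name: str) -> bool:
    """Return True for new lines, tabs or (assumed) non-ASCII characters."""
    return any(ch in "\n\t" or ord(ch) > 127 for ch in name)
```

Applied to the example from the list, `brackets_nested_correctly("[(1->4]-alpha-D-galacturonide)n")` returns `False`, because `]` arrives while `(` is still open.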

4.2 Definition

A definition may be added. This is especially relevant to classes of compounds which appear at the higher levels of the ontology. Good sources of definitions for the Chemical Entity Ontology are the IUPAC Gold Book (http://goldbook.iupac.org/ ) and the various IUPAC documents on nomenclature and terminology (see http://www.chem.qmul.ac.uk/iupac/ ), while for the Biological Function and Application Ontologies, (modified) definitions of MeSH terms can be adopted. No sources of definitions need to be cited.

The curator tool will produce validation errors if:

  • Unicode characters are included in the definition. All Unicode should be encoded with the Special Characters XML package.
  • The XML tags are not valid Special Characters XML or they are not closed properly.
  • The length of the definition is less than 10 characters.
  • New line or tab characters are present.

4.3 Status

The annotator should change the status of an entry to CHECKED only when all details, including those relating to structure and the ontology, have been edited to the annotator's satisfaction. An entry which has status CHECKED will be viewable on the public web interface and included in the downloadable files at the next release.

4.4 Formula

Any formula derived from a primary source should be checked and, if correct, its status changed to CHECKED. If a different formula is to be added, the status of the incorrect formula should be changed to DELETED and the new formula added with status CHECKED. Subscripts and hyphens should not be used. The order of atomic elements within molecular formulae should follow the Hill system (http://en.wikipedia.org/wiki/Hill_system ). The source must be specified using the dropdown menu; if the formula was worked out by the annotator, the source should be indicated as 'ChEBI'. If an entry cannot be assigned a formula (typically in the case of a class of compounds), then a dot '.' should be entered into the formula field and its status kept as 'OK'.
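The Hill system places carbon first and hydrogen second when carbon is present, followed by the remaining elements in alphabetical order; when carbon is absent, all elements (including hydrogen) are listed alphabetically. A minimal Python sketch of that ordering is given below. The function name `hill_formula` and the dictionary input format are assumptions for illustration and are not part of the annotator tool.

```python
def hill_formula(counts: dict) -> str:
    """Build a formula string in Hill order from a mapping of element symbol to atom count."""
    if "C" in counts:
        # With carbon present: C, then H, then everything else alphabetically.
        leading = [symbol for symbol in ("C", "H") if symbol in counts]
        others = sorted(symbol for symbol in counts if symbol not in ("C", "H"))
        order = leading + others
    else:
        # Without carbon: strictly alphabetical, hydrogen included.
        order = sorted(counts)
    # A count of 1 is left implicit, as in 'H2O'.
    return "".join(
        symbol + (str(counts[symbol]) if counts[symbol] != 1 else "")
        for symbol in order
    )
```

For instance, `hill_formula({"O": 4, "S": 1, "H": 2})` gives `H2O4S`, since no carbon is present and the symbols are simply sorted.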

The following conventions regarding ChEBI formulae should be followed:

  • Unless immediately following a dot '.' any numeral refers to the preceding element in the formula. Example: H2O means there are two hydrogen atoms and one oxygen atom.
  • The dot '.' convention is used when dividing a formula into parts. Any numeral following a dot refers to all the elements within that part of the formula that follow it. Example: C2H3O2.Na.3H2O (CHEBI:32138) really means that after C2H3O2 there is one sodium (Na), six hydrogen and three oxygen atoms.
  • Parentheses are used within ChEBI formulae to mean multiplication of elements.
  • The 'n' convention is used to show an unknown quantity by which a formula is multiplied. For example: (C12H20O11)n from CHEBI:15443 really means that a C12H20O11 unit is multiplied by an unknown quantity.
  • A comma can be used to indicate that there is one or more of the elements divided by the comma but that the exact stoicheiometry can vary. For instance, actinolite is a mineral with the chemical formula Ca2(Mg,Fe)5Si8O22(OH)2, which means that it could be anything in the continuous series between Ca2Mg5Si8O22(OH)2 and Ca2Fe5Si8O22(OH)2.
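The numeral and dot conventions can be made concrete with a short Python sketch that counts atoms in a formula such as C2H3O2.Na.3H2O. This is only an illustration under simplifying assumptions: the function name `count_atoms` is hypothetical, and parentheses, the 'n' multiplier and comma-separated alternatives are deliberately not handled.

```python
import re
from collections import Counter

# An element symbol (capital letter plus optional lower-case letters)
# followed by an optional count.
TOKEN = re.compile(r"([A-Z][a-z]*)([0-9]*)")

def count_atoms(formula: str) -> Counter:
    """Count atoms in a dot-separated formula without parentheses."""
    total = Counter()
    for part in formula.split("."):
        # A numeral at the start of a part multiplies every element in that part.
        prefix = re.match(r"([0-9]*)(.*)", part)
        multiplier = int(prefix.group(1) or 1)
        for symbol, number in TOKEN.findall(prefix.group(2)):
            # A numeral after a symbol applies to that symbol only.
            total[symbol] += multiplier * int(number or 1)
    return total
```

For C2H3O2.Na.3H2O the third part contributes six hydrogen and three oxygen atoms, so the overall count is two carbon, nine hydrogen, five oxygen and one sodium.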

The curator tool will produce validation errors if:

  • Unicode characters are included in the formula.
  • Brackets in a formula are incorrectly nested.
  • The formulae contain symbols which do not belong in the allowed set of symbols for formulae. The allowed set of symbols for formulae include all element symbols from the periodic table which are case-sensitive:
    • H|Li|Na|K|Rb|Cs|Fr|Be|Mg|Ca|Sr|Ba|Ra|Sc|Y|La|Ce|Pr|Nd|Pm|Sm|Eu|Gd|Tb|Dy|Ho| Er|Tm|Yb|Lu|Ac|Th|Pa|U|Np|Pu|Am|Cm|Bk|Cf|Es|Fm|Md|No|Lr|Ti|Zr|Hf|Rf|Ku|V| Nb|Ta|Ha|Ns|Cr|Mo|W|Unh|Mn|Tc|Re|Uns|Fe|Ru|Os|Uno|Co|Rh|Ir|Une|Ni|Pd|Pt|Cu| Ag|Au|Zn|Cd|Hg|B|Al|Ga|In|Tl|C|Si|Ge|Sn|Pb| N|P|As|Sb|Bi|O|Db|Sg|Bh|Hs|Mt|Ds| Rg|S|Se|Te|Po|F|Cl|Br|I|At|He|Ne|Ar|Kr|Xe|Rn|Mu|D|T|
    In addition the following characters are allowed:
    • Numerals 0 through 9
    • Dot character '.'
    • The opening and closing parentheses '(' and ')'
    • The characters 'n', 'm' and 'R'
  • New line or tab characters are present.
  • A formula item with a single dot as its formula has a status of 'CHECKED'. The single dot should only be used to indicate that a curator has verified that no formula can be added to an item and it should have a status of 'OK'.
  • A new formula is added which already exists in the entry with the same source as the formula being added.
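The allowed-symbol rule can be sketched in Python as a single regular expression built from the element list quoted above plus the extra permitted characters. This is an illustrative sketch rather than the curator tool's code; the names `ELEMENTS`, `FORMULA_RE` and `uses_allowed_symbols` are hypothetical.

```python
import re

# Element symbols transcribed from the allowed set listed above.
ELEMENTS = (
    "H|Li|Na|K|Rb|Cs|Fr|Be|Mg|Ca|Sr|Ba|Ra|Sc|Y|La|Ce|Pr|Nd|Pm|Sm|Eu|Gd|Tb|Dy|Ho|"
    "Er|Tm|Yb|Lu|Ac|Th|Pa|U|Np|Pu|Am|Cm|Bk|Cf|Es|Fm|Md|No|Lr|Ti|Zr|Hf|Rf|Ku|V|"
    "Nb|Ta|Ha|Ns|Cr|Mo|W|Unh|Mn|Tc|Re|Uns|Fe|Ru|Os|Uno|Co|Rh|Ir|Une|Ni|Pd|Pt|Cu|"
    "Ag|Au|Zn|Cd|Hg|B|Al|Ga|In|Tl|C|Si|Ge|Sn|Pb|N|P|As|Sb|Bi|O|Db|Sg|Bh|Hs|Mt|Ds|"
    "Rg|S|Se|Te|Po|F|Cl|Br|I|At|He|Ne|Ar|Kr|Xe|Rn|Mu|D|T"
).split("|")

# The additional letters 'n', 'm' and 'R' are allowed alongside the elements.
_symbols = sorted(ELEMENTS + ["n", "m", "R"], key=len, reverse=True)

# One or more allowed tokens: a symbol, a digit, a dot or a parenthesis.
FORMULA_RE = re.compile(
    r"(?:%s|[0-9.()])+" % "|".join(re.escape(s) for s in _symbols)
)

def uses_allowed_symbols(formula: str) -> bool:
    """Return True if the whole formula is built from allowed symbols (case-sensitive)."""
    return FORMULA_RE.fullmatch(formula) is not None
```

Note that the comma is not in the allowed set as quoted, so this sketch rejects formulae that use the comma convention; bracket nesting would need a separate check.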

4.5 Synonym

Synonyms derived from primary sources will be displayed along with details of the source and status. The annotator should check the status of each synonym and amend if necessary. NB. Annotators must take extra care when contemplating deletion of any synonym derived from IntEnz that has type NAME, as this will also in effect cause a similar deletion within IntEnz.

4.6 Add Names

Any new synonym which the annotator considers relevant should be added along with its source. Cross-reference to the source should be via links in the Database Accession or Registry Numbers sections (see below).

A synonym taken from an external source should not normally be altered when being entered into ChEBI. However, if there is a real need to make alterations (e.g. in order to rearrange an index style of presentation, or to correct errors in the nesting of brackets), then the 'Adapted' checkbox next to the synonym should be ticked.

IUPAC names should also be added here. An 'IUPAC name' is a name based on current recommendations of IUPAC. It need not be fully systematic as it can make use of 'retained' and 'preselected' names. Some relevant sources are:

  • IUPAC 'Revised Blue Book' – Awaiting publication, currently available as a draft document at http://www.och.bme.hu/ifj-nagy/Nevezéktan/CompleteDraft.pdf
  • A Guide to IUPAC Nomenclature of Organic Compounds, Recommendations 1993
  • Nomenclature of Organic Chemistry, Sections A, B, C, D, E, F and H, 1979 Edition. ('The Blue Book') – largely superseded but still useful for class names and older trivial names.
  • Compendium of Biochemical Nomenclature, 1993 Edition ('The White Book') – however many sections have been superseded.
  • Nomenclature of Inorganic Chemistry (recommendations 1990) ('The Red Book')
  • Nomenclature of Inorganic Chemistry II. Recommendations 2000
  • Nomenclature of Inorganic Chemistry - IUPAC Recommendations 2005 ('The Revised Red Book'; largely supersedes the 1990 and 2000 editions)
  • IUPAC Compendium of Chemical Terminology ('The Gold Book'), 1987. A revised version in electronic form is available at http://goldbook.iupac.org/index.html
  • Compendium of Macromolecular Nomenclature, 1991 ('The Purple Book')

Further details of these and other IUPAC nomenclature documents are available at http://www.iupac.org/publications/books/seriestitles/nomenclature.html and http://www.chem.qmul.ac.uk/iupac/

Annotators should bear in mind the following points when entering IUPAC Names:

  • An IUPAC Name should start with a lower case letter, not a capital (unless this is a special character relating to stereochemistry or denoting an element).
  • Forward locants should not be used. Example: 'octane-1-sulfonic acid' is the correct form, not '1-octanesulfonic acid'.
  • So-called 'systematic names' taken from external sources should not be assumed to have been constructed in strict adherence to IUPAC recommendations.
  • The following spellings are recommended by IUPAC and should be used in ChEBI IUPAC Names: sulfur (not sulphur), aluminium (not aluminum), icosane (not eicosane). The alternative spellings may be used as synonyms.

The curator tool will produce validation errors if:

  • Unicode characters are included in the name. All Unicode should be encoded with the Special Characters XML package.
  • The XML tags are not valid Special Characters XML or they are not closed properly.
  • New line or tab characters are present.
  • Brackets in the names are incorrectly nested.
  • The entry already contains another IntEnz name of type 'NAME' and status 'CHECKED' when an IntEnz name of type 'NAME' and status 'CHECKED' is added.
  • An IUPAC name is not unique within the database, i.e. the same IUPAC name already exists.

4.7 Database Accession

All database accessions listed should be checked and amended if necessary. Status must be 'CHECKED' for lines to be viewable on the public web interface. New entries are added using the 'Add Database Accessions' facility (see below).

The curator tool will produce validation errors if:

  • A Chemical Ontology accession has status "CHECKED".
  • The database link does not conform to the specific database link format. The link formats for each database are listed in table 1.

    Database accession name | Description | Example
    PDBeChem accession | A combination of characters and numerical digits, with a minimum of one character. | FAD
    COMe accession | One of the prefixes "MOL", "BIM" or "PRX" followed by six digits. | MOL000039
    KEGG COMPOUND accession | The prefix character "C" followed by five digits. | C00001
    KEGG GLYCAN accession | The prefix character "G" followed by five digits. | G00001
    KEGG DRUG accession | The prefix character "D" followed by five digits. | D00001
    PDB accession | A combination of upper-case characters and digits, exactly four characters in total. | 1A7Y
    RESID accession | The prefix "AA" followed by four digits. | AA0008
    UMBBD accession | The prefix character "c" followed by four digits. | c0779
    LIPID MAPS instance accession | The prefix "LM" followed by two upper-case characters and then between two and eight digits. | LMST01010087
    LIPID MAPS class accession | The prefix "LM" followed by two upper-case characters and then between two and four digits. | LMFA0301
    MolBase accession | Any combination of digits of any length. | 79
    WebElements accession | Any element symbol from the periodic table; see the formula validations. | Fe
    Table 1: The list of database link validations for the curator application.
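Most of the formats in Table 1 translate directly into regular expressions. The Python sketch below shows one possible encoding; it is an illustration and not the curator application's own validation code. The dictionary keys and the function name `is_valid_accession` are hypothetical, and the PDBeChem and WebElements rows are omitted because their descriptions are not precise enough to encode without guessing.

```python
import re

# Patterns transcribed from the descriptions in Table 1.
ACCESSION_PATTERNS = {
    "COMe": r"(MOL|BIM|PRX)[0-9]{6}",
    "KEGG COMPOUND": r"C[0-9]{5}",
    "KEGG GLYCAN": r"G[0-9]{5}",
    "KEGG DRUG": r"D[0-9]{5}",
    "PDB": r"[A-Z0-9]{4}",
    "RESID": r"AA[0-9]{4}",
    "UMBBD": r"c[0-9]{4}",
    "LIPID MAPS instance": r"LM[A-Z]{2}[0-9]{2,8}",
    "LIPID MAPS class": r"LM[A-Z]{2}[0-9]{2,4}",
    "MolBase": r"[0-9]+",
}

def is_valid_accession(database: str, accession: str) -> bool:
    """Return True if the accession matches the format listed for the database."""
    pattern = ACCESSION_PATTERNS[database]
    # fullmatch ensures the whole accession conforms, not just a prefix.
    return re.fullmatch(pattern, accession) is not None
```

Each example in the table passes its own pattern, while a near miss such as `C0001` (four digits) is rejected for KEGG COMPOUND.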

4.8 Registry Numbers

All numbers listed should be checked and amended if necessary. Status must be 'CHECKED' for lines to be viewable on the public web interface. New entries are added using the 'Add Database Accessions' facility (see below). Beilstein and Gmelin Registry Numbers can be added if known (but note that these numbers constitute the only data that ChEBI can include from these two sources, owing to the databases not being freely accessible).

The curator tool will produce validation errors if:

  • The registry number does not conform to the specific registry number format. The registry number formats for each database are listed in table 2.

    Database accession name | Description | Example
    CAS Registry Number | Conforms to the CAS validation system found on the CAS website. | 107-07-3
    Beilstein Registry Number | Any combination of digits of any length. | 85288
    Gmelin Registry Number | Any combination of digits of any length. | 141781
    Table 2: The list of registry number validations for the curator application.
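CAS Registry Numbers carry a final check digit: each of the other digits is multiplied by its position counted from the right (starting at 1), the products are summed, and the sum modulo 10 must equal the check digit. The Python sketch below applies that publicly documented rule; whether the curator tool performs exactly this check is an assumption, and the function name `is_valid_cas` is hypothetical.

```python
import re

def is_valid_cas(number: str) -> bool:
    """Check the layout and check digit of a CAS Registry Number such as 107-07-3."""
    match = re.fullmatch(r"([0-9]{2,7})-([0-9]{2})-([0-9])", number)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... from right to left and sum the products.
    checksum = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return checksum % 10 == int(match.group(3))
```

For 107-07-3 the weighted sum is 7×1 + 0×2 + 7×3 + 0×4 + 1×5 = 33, and 33 modulo 10 is 3, which matches the final digit.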

4.9 Add Database Accessions

Used by the annotator for the entering of new database accessions or registry numbers. The type and source must be selected from the dropdown menus.

Changes may be incorporated by clicking on 'Submit Changes'. Erroneous changes may be cancelled at any time up to submission by clicking on 'Cancel Changes'.

5. Edit Ontology screen

Using this screen, an annotator can both edit existing relationships between entities and create new ones.

In the two sub-ontologies "Role" and "Subatomic Particle" a singular ChEBI Name should always be used. A plural ChEBI Name is allowed within the "Chemical Entity" sub-ontology if the entity is a class and the singular ChEBI Name already exists.

5.1 Parents and Children View

This view lists all the parent and child relationships directly pertaining to an entity and their status [CHECKED, OK, DELETED or OBSOLETE (the OBSOLETE status can be created only by the system)]. Only relationships with status CHECKED and OK will be included in the tree structure and be visible on the public web interface. The annotator must check each existing relationship and amend if necessary.

When editing an existing entry, the annotator needs to check all its non-OBSOLETE relationships and leave these with status CHECKED or DELETED. No relationships may be deleted which would cause an entity to be separated from the tree: it is necessary to create a new relationship prior to deleting the last unwanted one.

Hint: When creating and editing relationships, it is useful to open the annotator tool in two or more tabs or separate windows to facilitate rapid copying and pasting of ChEBI IDs.

5.2 Tree View

Displays in graphical form the tree structure. All direct lines upwards are shown together with downward lines only as far as immediate children. Checked entries are shown in a darker grey with the line for the entity currently being viewed being in bold type. Annotators may navigate around the ontology in the tree view by clicking on any displayed line. A table of relationships and their shorthand symbols is displayed at the right-hand side of the tree view. Brief descriptions of the sub-ontologies and relationships in the ChEBI Ontology are provided in Sections 5.4 and 5.5 respectively, with fuller descriptions and examples being included in the ChEBI User Manual, accessible via the public web interface.

5.3 Add Relationship

Allows the annotator to add a new relationship. Dropdown menus are provided for selecting the type of relationship while the ID for the entity to which the new relationship refers is entered into the relevant box.

Changes may be incorporated by clicking on 'Submit Changes'. Erroneous changes may be cancelled at any time up to submission by clicking on 'Cancel Changes'.

The tool has general validations which apply to most relationship types. In general when the term "enabled" is used to describe a relationship it means that its relationship status is either "CHECKED" or "OK".

The curator tool will produce validation errors if:

  • The two relationships of a cyclic pair (cyclic relationships being "is tautomer of", "is enantiomer of", "is conjugate base of" and "is conjugate acid of") do not have the same status. For example, if A "is tautomer of" B with status "CHECKED" then B "is tautomer of" A should also have a status "CHECKED".
  • A redundant relationship is entered, whether it is a parent or a child relationship. A relationship is redundant if the same relationship already exists in that entry.
  • The graph would become disconnected. Disconnected graph validation is carried out whenever changes are made to the ontology and ensures that no entry is disconnected from the graph. If, for instance, all relationships are deleted from an entry, the tool will try to create an "is a" relationship with the "unclassifieds" domain. It will not do this if the entry has children; in this case it will return errors, because entries in the "unclassifieds" domain are not allowed to have children themselves.
  • An attempt is made to create a cycle in the graph with a non-cyclic relationship. Directed acyclic graph validation ensures that no change creates a cycle in the ontology through relationships which are strictly non-cyclic; if this occurs, the curator tool returns an error. The non-cyclic relationships are "is a", "has part", "is substituent group from", "has parent hydride" and "has functional parent".
  • The compound status is neither "CHECKED" nor "OK" when modifications are made to the ontology.
  • A non-cyclic parent relationship is not enabled in the entry.
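The directed acyclic graph check mentioned in the list above can be sketched in Python as a reachability test: adding a non-cyclic relationship from one entry to another closes a cycle exactly when the first entry can already be reached from the second. This is a minimal illustration, not the curator tool's algorithm; the function name `would_create_cycle`, the dictionary representation and the placeholder entry labels are all assumptions.

```python
def would_create_cycle(edges: dict, source: str, target: str) -> bool:
    """Return True if adding the edge source -> target would close a cycle.

    `edges` maps each entry to the set of entries it already points to
    through non-cyclic relationships (for example "is a").
    """
    stack, seen = [target], set()
    while stack:
        node = stack.pop()
        if node == source:
            # The source is reachable from the target, so the new edge loops back.
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return False
```

With the placeholder graph `{"A": {"B"}, "B": {"C"}}`, adding C -> A would create a cycle, whereas adding A -> C would not.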

When creating new ontology relationships, validation errors are also produced if:

  • You try to create a parent relationship with the root of the ontology, namely "CHEBI:23091 ChEBI Ontology".
  • You try to create a child relationship with any of the following entities: "CHEBI:23091 ChEBI Ontology", "CHEBI:27189 unclassifieds", "CHEBI:24431 chemical entity ontology", "CHEBI:24432 biological role ontology" and "CHEBI:33232 application ontology".
  • You try to create a parent relationship with the entry itself.
  • You try to enter a ChEBI ID in an invalid format.
  • You try to create a new relationship which already exists in the entry.

5.4 The Sub-ontologies

ChEBI Ontology is subdivided into three separate sub-ontologies:

5.4.1 Chemical Entity

Classifies chemical entities according to their structural features and properties.

5.4.2 Role

The role ontology encompasses three sub-ontologies:

  • Biological Role. Classifies entries on the basis of their roles played within a biological context, e.g. as antibiotics, antiviral agents, coenzymes, enzyme inhibitors.
  • Chemical Role. Classifies entries on the basis of their roles played within a chemical context, e.g. as acids or bases, solvents, ligands, surfactants.
  • Application. Classifies entities on the basis of their applications, e.g. as pesticides, detergents, healthcare products, fuel.

5.4.3 Subatomic Particle

Classifies particles which are smaller than atoms.

5.5 The Relationships

Relationships can be created between an entity and either a parent or a child. To create a new relationship between two entries, open the Edit Ontology feature for one of them and enter the ChEBI ID for the other in the appropriate box, selecting the type of relationship from the dropdown menu.

The relationships used in ChEBI are:

5.5.1 is_a Is a

Used to imply that 'Entity A' is an instance of 'Entity B' or that 'Class A' is an instance of 'Class B'. This is the chief hierarchical non-cyclic relationship used throughout the ontologies.

  • If A "is a" B then A cannot have the following relationships with B:
    • is conjugate base of
    • is conjugate acid of
    • has part
    • is enantiomer of
    • is substituent group from
    • is tautomer of
  • If A "is a" B then B cannot have the following relationships with A:
    • is conjugate base of
    • is conjugate acid of
    • is enantiomer of
    • is tautomer of

5.5.2 has_part Has part

Used to denote the relationship between the whole and a part, especially between a salt or an addition compound and its components, or between a class of compounds and a substituent group characteristic for that class. This relationship is the reverse of the formerly used relationship "is part of".

  • If A "has part" B then A cannot have the following relationships with B:
    • is a
    • is conjugate acid of
    • is conjugate base of
    • is enantiomer of
    • is tautomer of
    • is substituent group from
  • If A "has part" B then B cannot have the following relationships with A:
    • is a
    • is conjugate acid of
    • is conjugate base of
    • is enantiomer of
    • is tautomer of
    • is substituent group from

5.5.3 is_conjugate_base_of Is conjugate base of and is_conjugate_acid_of Is conjugate acid of

Cyclic relationships which are used mainly between acids and their conjugate bases. When creating a new relationship, only one of these needs to be entered, as the system will create the reverse relationship. Note that although the IUPAC definition of conjugate acid/base refers to a difference in charge of 1 unit only, for ChEBI this is relaxed to include multiple charge differences. This is especially relevant to di- and poly-carboxylic acids [e.g. ChEBI uses the relationship "succinic acid is_conjugate_acid_of succinate(2−)"].

  • If A "is conjugate base of" B then A cannot have the following relationships with B:
    • is conjugate acid of
  • If A "is conjugate base of" B then B cannot have the following relationships with A:
    • is conjugate base of
  • If A "is conjugate acid of" B then A cannot have the following relationships with B:
    • is conjugate base of
  • If A "is conjugate acid of" B then B cannot have the following relationships with A:
    • is conjugate acid of
  • In addition there should only be one pair of acid/base relationships per two items; for example, if A "is conjugate base of" B then A cannot have another "is conjugate base of" relationship with B.
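The automatic creation of the reverse relationship described above can be pictured with a small Python sketch. It is purely illustrative: how the system stores relationships is not documented here, so the tuple representation, the `REVERSE` table and the function name `with_reverse` are assumptions.

```python
# Each conjugate relationship type and the type of its reverse.
REVERSE = {
    "is conjugate base of": "is conjugate acid of",
    "is conjugate acid of": "is conjugate base of",
}

def with_reverse(subject: str, relationship: str, obj: str) -> list:
    """Return the entered relationship together with its reverse counterpart."""
    return [
        (subject, relationship, obj),
        # The reverse runs from the object back to the subject.
        (obj, REVERSE[relationship], subject),
    ]
```

Entering "succinic acid is conjugate acid of succinate(2-)" would, under this sketch, also yield "succinate(2-) is conjugate base of succinic acid".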

5.5.4 is_tautomer_of Is tautomer of

A cyclic relationship used to show the interrelationship between two tautomers, where the differences between the structures are significant enough to warrant their separate inclusion in ChEBI.

  • If A "is tautomer of" B then A cannot have the following relationships with B:
    • is a
    • is conjugate acid of
    • is conjugate base of
    • has part
    • has functional parent
    • has parent hydride
    • is enantiomer of
    • is tautomer of
  • If A "is tautomer of" B then B cannot have the same relationships above with A except of course the "is tautomer of" relationship.

    5.5.5 is_enantiomer_of Is enantiomer of

    A cyclic relationship used when two entities are enantiomers of each other. An entity may have this relationship with only one other entity.

    • An entry can only have one "is enantiomer of" relationship with another entry.
    • If A "is enantiomer of" B then A cannot have the following relationships with B:
      • is a
      • is conjugate acid of
      • is conjugate base of
      • has part
      • has functional parent
      • has parent hydride
      • is tautomer of
    • If A "is enantiomer of" B then B cannot have the same relationships above with A.
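
    As a rough sketch of the enantiomer validations listed above, the check below rejects a new "is enantiomer of" relationship when either entry already has an enantiomer or when a forbidden relationship exists between the two entries in either direction. The list-of-triples representation and the function name can_add_enantiomer are assumptions for illustration only.

```python
# Relationship types that may not coexist with "is enantiomer of"
# between the same two entries (transcribed from the list above).
FORBIDDEN_WITH_ENANTIOMER = {
    "is a",
    "is conjugate acid of",
    "is conjugate base of",
    "has part",
    "has functional parent",
    "has parent hydride",
    "is tautomer of",
}
ENANTIOMER = "is enantiomer of"


def can_add_enantiomer(relations, a, b):
    """Return (ok, reason) for adding A 'is enantiomer of' B."""
    for src, rel_type, dst in relations:
        # Each entry may take part in only one enantiomer pair.
        if rel_type == ENANTIOMER and (src in (a, b) or dst in (a, b)):
            return False, f"{src} already is enantiomer of {dst}"
        # Forbidden types are checked in both directions (A to B and B to A).
        if {src, dst} == {a, b} and rel_type in FORBIDDEN_WITH_ENANTIOMER:
            return False, f"conflicts with: {src} {rel_type} {dst}"
    return True, "ok"


# A, B and C are placeholder entry names, as in the rules above.
existing = [("A", "is a", "C")]
print(can_add_enantiomer(existing, "A", "B"))
```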

    5.5.6 has_functional_parent Has functional parent

    Used to denote the relationship between two molecular entities (or classes of entities), one of which possesses one or more characteristic groups from which the other can be derived by functional modification. This relationship is especially useful to demonstrate the relationships between a number of functionalised entities and a common less-functionalised parent.

    • If A "has functional parent" B then A cannot have the following relationships with B:
      • is enantiomer of
      • is substituent group from
      • is tautomer of
      • has part
    • If A "has functional parent" B then B cannot have the same relationships above with A.

    5.5.7 has_parent_hydride Has parent hydride

    Used to denote the relationship between an entity and its parent hydride (defined by IUPAC as "an unbranched acyclic or cyclic structure or an acyclic/cyclic structure having a semisystematic or trivial name to which only hydrogen atoms are attached").

    • If A "has parent hydride" B then A cannot have the following relationships with B:
      • is enantiomer of
      • is substituent group from
      • is tautomer of
      • has part
    • If A "has parent hydride" B then B cannot have the same relationships above with A.

    5.5.8 is_substituent_group_from Is substituent group from

    Indicates the relationship between a substituent group (or atom) and its parent molecular entity, from which it is formed by loss of one or more protons or simple groups.

    • If A "is substituent group from" B then A cannot have the following relationships with B:
      • is conjugate acid of
      • is conjugate base of
      • has part
      • is enantiomer of
      • is tautomer of
    • If A "is substituent group from" B then B cannot have the following relationships with A:
      • is conjugate acid of
      • is conjugate base of
      • has part
      • is enantiomer of
      • is tautomer of
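
    The validation lists in sections 5.5.6 to 5.5.8 share one shape: a relationship type and the set of types that may not exist between the same two entries in either direction. A table-driven check along the following lines is one way to express them. It is a sketch under stated assumptions: the FORBIDDEN table, the triple representation and the function name find_conflicts are illustrative and do not come from the annotator tool.

```python
# Forbidden companions per relationship type, transcribed from sections
# 5.5.6 to 5.5.8. Each rule applies in both directions (A to B and B to A).
FORBIDDEN = {
    "has functional parent": {
        "is enantiomer of",
        "is substituent group from",
        "is tautomer of",
        "has part",
    },
    "has parent hydride": {
        "is enantiomer of",
        "is substituent group from",
        "is tautomer of",
        "has part",
    },
    "is substituent group from": {
        "is conjugate acid of",
        "is conjugate base of",
        "has part",
        "is enantiomer of",
        "is tautomer of",
    },
}


def find_conflicts(relations, a, new_type, b):
    """Return existing relationships between A and B that block new_type."""
    blocked = FORBIDDEN.get(new_type, set())
    return [
        (src, rel_type, dst)
        for src, rel_type, dst in relations
        if {src, dst} == {a, b} and rel_type in blocked
    ]


# Placeholder entries A and B; the existing relationship is hypothetical.
existing = [("B", "has part", "A")]
print(find_conflicts(existing, "A", "is substituent group from", "B"))
```

    An empty list means the new relationship passes these particular checks; keeping the rules in a table makes it easy to compare them against the manual line by line.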

    5.5.9 has_role Has role

    Indicates the relationship between a molecular entity and its role. For relationship A "has role" B to be valid, A should belong to the chemical entity ontology while B should belong to the role ontology.


    Validations: If A "has role" B then A cannot have any other relationship with B.
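
    The two conditions for a valid "has role" relationship can be sketched as a small check. The ontology_of lookup (entry name to ontology label), the labels "chemical entity" and "role" as dictionary values, and the function name validate_has_role are all assumptions made for this example.

```python
def validate_has_role(a, b, ontology_of, relations):
    """Return a list of problems with adding A 'has role' B (empty if valid)."""
    problems = []
    # A must come from the chemical entity ontology, B from the role ontology.
    if ontology_of.get(a) != "chemical entity":
        problems.append(f"{a} is not in the chemical entity ontology")
    if ontology_of.get(b) != "role":
        problems.append(f"{b} is not in the role ontology")
    # A may not have any other relationship with B.
    for src, rel_type, dst in relations:
        if (src, dst) == (a, b):
            problems.append(f"{a} already has relationship {rel_type!r} with {b}")
    return problems


# Placeholder entries: A and C are chemical entities, B is a role.
ontology_of = {"A": "chemical entity", "B": "role", "C": "chemical entity"}
print(validate_has_role("A", "B", ontology_of, []))
```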

    6. Edit Structure screen

    6.1 Graphics

    Structures are input and edited using the MarvinSketch Applet. To open this, first open the Edit Structure screen and click in the box at the right-hand side. Structures for inputting may be drawn manually or copied and pasted from other applications, e.g. ACD/Name.

    6.1.1 Stereochemistry

    The applet allows 2D structures to be drawn, with stereochemistry at chiral centres being indicated by bold and dashed wedges, with the points of the wedges directed towards the stereocentre. In cases where stereochemistry at a centre is possible but not specified, a plain bond linking the stereocentre and the substituent is generally used (although in certain cases a wavy bond may be used to provide emphasis). Where stereochemistry across double bonds is not defined, this is indicated by use of a wavy bond to H or, if fully substituted, to one of the substituents.

    Attention is drawn to the document 'Graphical Representation of Stereochemical Configuration (IUPAC Recommendations 2006)', published in Pure Appl. Chem. Vol. 78, No. 10, pp 1897-1970, 2006, which gives recommendations on preferred and acceptable ways of displaying 3D stereochemical information in 2D diagrams, along with examples for all types of stereochemical configuration.

    6.1.2 3D Structures

    A manipulatable 3D view (e.g. ball-and-stick or wireframe) may be generated from the 2D structure by use of the 3D viewer (go to View, Open 3D Viewer). Such structures may be added to the compound information via an extra MarvinSketch applet on the Edit Structure screen, but should not be used as the default structure.

    If 3D coordinates are available as a 3D molfile, e.g. from a crystal-structure determination, these may also be added directly to the molfile box on the Edit Structure screen to create an extra graphical structure.

    6.1.3 Atom labels

    It is possible to generate simple whole-integer atom labels on a structural diagram using the MarvinSketch applet. Right-clicking on an atom and then selecting 'Map' will allow an atom label between 1 and 99 to be selected and added to that atom. However, such labelled structures must never be used as a default structure.

    6.1.4 Group structures

    The structure of a group contains at least one pseudoatom (attachment point) which is indicated with an asterisk, *. See the section How to edit the entry for a group.

    6.1.5 Zero-valence atoms and isotopes

    Annotators may experience some difficulty in entering structures for single atoms and isotopes due to the Marvin applet's propensity for adding charges and/or hydrogen atoms. In order to generate a zero-valence monatomic structure, the number '15' must appear in the sixth atom block number of the molfile as shown in the example for sodium-23 below. As stated in the molfile specification (see Appendix 1, Fig. 4), this position in the connection table is used to define valences from 1 to 14 but with a value of 15 being used to indicate zero valence. If a single isotope is being defined, the mass number (23 in the example) appears as a positive integer in the subsequent Isotope line.

    
        Marvin  06230911292D
        1  0  0  0  0  0            999 V2000
        7.7076  -10.0151    0.0000 Na  0  0  0  0  0 15  0  0  0  0  0  0
        M  ISO  1   1  23
        M  END
    
    

    Annotators may generate a zero-valence monatomic structure by inserting a single atom symbol into the Marvin applet, generating the molfile and then manually adjusting the connection table to make it conform to the above description. However, they may find it more convenient to generate the structure using an alternative drawing package and then transfer the structure by copy/paste to the Marvin applet.
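
    After a manual adjustment of this kind, it can help to confirm that the molfile really carries the zero-valence code and the intended mass number. The sketch below reads the sodium-23 example from this section. It makes two simplifying assumptions that should be noted: it splits lines on whitespace rather than using the fixed-width columns defined in the molfile specification, and it locates the counts line by its V2000 marker rather than by position. The function name describe_single_atom is hypothetical.

```python
def describe_single_atom(molfile_text):
    """Report symbol, zero-valence flag and isotope mass of a monatomic molfile."""
    lines = [ln for ln in molfile_text.splitlines() if ln.strip()]
    counts_idx = next(
        (i for i, ln in enumerate(lines) if ln.rstrip().endswith("V2000")), None
    )
    if counts_idx is None:
        raise ValueError("no V2000 counts line found")
    if int(lines[counts_idx].split()[0]) != 1:
        raise ValueError("expected a monatomic structure")
    fields = lines[counts_idx + 1].split()
    symbol = fields[3]             # x, y, z come first, then the atom symbol
    valence_code = int(fields[9])  # sixth number after the symbol
    mass_number = None
    for ln in lines[counts_idx + 2:]:
        parts = ln.split()
        if parts[:2] == ["M", "ISO"]:
            # After the entry count come (atom number, mass) pairs.
            n_entries = int(parts[2])
            pairs = parts[3:3 + 2 * n_entries]
            for atom_no, mass in zip(pairs[0::2], pairs[1::2]):
                if int(atom_no) == 1:
                    mass_number = int(mass)
    return {
        "symbol": symbol,
        "zero_valence": valence_code == 15,  # 15 encodes zero valence
        "mass_number": mass_number,
    }


SODIUM_23 = """\
Marvin  06230911292D
1  0  0  0  0  0            999 V2000
7.7076  -10.0151    0.0000 Na  0  0  0  0  0 15  0  0  0  0  0  0
M  ISO  1   1  23
M  END
"""
print(describe_single_atom(SODIUM_23))
```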

    6.2 Molfiles

    The MDL molfile for a structure is displayed in a window on the left-hand side of the screen. Information between this window and the graphic display is transferred by use of left and right radio buttons. Molfiles may be entered directly by copy-and-paste from other external databases, e.g. KEGG COMPOUND.

    Every compound entry which has a structure should be assigned a default structure. The InChI and the SMILES will automatically be generated from this default structure if possible. Tautomer generation for the InChI has been switched on, so that the InChIs generated for different tautomers are distinguishable.

    7. Edit Comment screen

    The annotator may find it useful to add one or more comments, either for public viewing or for internal use only. Such comments may be associated with a specific item of data or with the entry as a whole. The text is keyed into the Add Comment box and its association selected by checking the appropriate 'Select item' or 'General comment on compound' radio button. The comment is then incorporated by clicking on 'Submit Changes'. Erroneous comments may be cancelled at any time up to submission by clicking on 'Cancel Changes'.

    Changes to existing comments may also be made via the Edit Comment screen.

    8. Minimal requirements for a ChEBI entry

    The following are the minimal requirements for an entity or class to be checked.

    8.1 Entity

    Obligatory

    • IUPAC name
    • Formula
    • Classification

    Optional

    • Registry number(s)
    • Cross-reference(s)
    • Structure(s)
    • Synonym(s)

    8.2 Class

    Obligatory

    • Classification (non-cyclic, parent)
    • Classification (non-cyclic, child)

    Optional

    • IUPAC name
    • Definition
    • Cross-reference(s)
    • Structure(s)
    • Synonym(s)
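
    The obligatory items in sections 8.1 and 8.2 lend themselves to a simple completeness check before an entry is marked as checked. The following is only a sketch: the dictionary-based entry and the field names (iupac_name, parent_classification and so on) are hypothetical and do not reflect the annotator tool's data model.

```python
# Obligatory fields per entry kind, following sections 8.1 and 8.2.
# The field names are illustrative assumptions.
REQUIRED = {
    "entity": ("iupac_name", "formula", "classification"),
    "class": ("parent_classification", "child_classification"),
}


def missing_requirements(entry, kind):
    """Return the obligatory fields that are absent or empty for this kind."""
    return [field for field in REQUIRED[kind] if not entry.get(field)]


# Hypothetical draft entry with no formula yet.
draft = {"iupac_name": "sulfanediyl", "classification": "group"}
print(missing_requirements(draft, "entity"))
```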

    9. How to edit the entry for a group

    Example: sulfanediyl group (CHEBI:29830).

    It has:

    • one IUPAC name from the IUPAC Blue Book Appendix 2: 'sulfanediyl'
    • one structural formula which is put as SYNONYM with source IUPAC: <bond>1</bond>S<bond>1</bond> (these <bond>1</bond> characters should be interpreted as a dash if Unicode character representation is being used in the browser)
    • and two synonyms, 'sulfenyl' and 'thio', which are not recommended and therefore carry the comment: "This name is explicitly not recommended by IUPAC."

    Usually, the ChEBI name is formed as 'name + group'. This name is not necessarily 'IUPAC name + group'.

    A structure for a group must contain at least one pseudoatom (attachment point) which is indicated with an asterisk, *. It is important that the annotator tick the 'Validation Off' box since otherwise ChEBI will not accept the structure.

    The group should be attached to its parent molecule via the relationship that we use only for groups:

    sulfanediyl group (CHEBI:29830) is substituent group from hydrogen sulfide (CHEBI:16136)

    10. Editing procedure for focus datasets

    10.1 ChEBI Drugs

    The main data resource for annotating ChEBI Drugs is DrugBank. Note that an individual DrugBank entry may be linked to multiple ChEBI entities, because different salts, hydrates and isomers of a given compound share a common ID in DrugBank but are annotated as separate entities in ChEBI. Because DrugBank is our primary source, we initially entered all DrugBank brand names into ChEBI, but soon realised that most of them are deprecated. We have tried to distinguish between brand names currently in use (those authorised and/or commercialised in at least one country) and those which are deprecated (indicated in ChEBI by a yellow triangle surrounding an exclamation mark). However, this is a manual and time-consuming process, so drug brand names are currently not annotated in ChEBI. RxNorm is a potential resource from which non-deprecated brand names could be fetched automatically. To date, RxNorm includes FDA-approved brand names but only a few approved by governmental drug administrations in other countries. To our knowledge, brand names in RxNorm are assigned correctly to salts, hydrates and isomers of a given compound, so in future we might use this resource to annotate brand names automatically in ChEBI Drugs.

    International Nonproprietary Names (INNs) facilitate the identification of pharmaceutical substances or active pharmaceutical ingredients. Each INN is a unique name that is globally recognized and is public property. The World Health Organization collaborates closely with INN experts and national nomenclature committees to select a single name of worldwide acceptability for each active substance that is to be marketed as a pharmaceutical. Nonproprietary names, also known as generic names, are annotated for all ChEBI Drugs from WHO MedNet services.

    A racemic mixture, or racemate, is one that has equal amounts of left- and right-handed enantiomers of a chiral molecule. Pharmaceutical development is increasingly moving towards single isomers rather than racemates; however, around 400 racemic drugs are approved at present. Any drug which is a racemic compound is annotated as three ChEBI entities: (1) a non-stereospecified entity, (2) the left-handed enantiomer, (3) the right-handed enantiomer. Each of the two enantiomers is linked to the non-stereospecified entity via an "is_a" ontological relationship and to the other enantiomer via an "is_enantiomer_of" relationship. A single DrugBank ID is cross-referenced to all three entities except when a specific DrugBank ID exists for one or both enantiomers. The same applies to the INN of the racemic compound: this will be the same for all three ChEBI entities except when a specific INN exists for one or both isomers.
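
    The three-entity pattern for racemic drugs, and the fallback rule for cross-references, can be sketched as follows. The function names, the triple representation and the placeholder identifiers are assumptions for illustration; the relationship name is_enantiomer_of follows section 5.5.5.

```python
def racemate_relationships(unspecified, left, right):
    """Relationships linking the three entities created for a racemic drug."""
    return [
        (left, "is_a", unspecified),
        (right, "is_a", unspecified),
        (left, "is_enantiomer_of", right),
        (right, "is_enantiomer_of", left),
    ]


def xref_for(entity, shared_id, specific_ids):
    """Use an enantiomer-specific ID when one exists, else the shared one."""
    return specific_ids.get(entity, shared_id)


# Placeholder names and IDs, not real ChEBI or DrugBank data.
entities = ("drug", "(left)-drug", "(right)-drug")
specific = {"(right)-drug": "ID-RIGHT"}
for triple in racemate_relationships(*entities):
    print(triple)
print([xref_for(e, "ID-SHARED", specific) for e in entities])
```

    The same fallback function applies to INNs: pass the INN of the racemic compound as the shared value and any enantiomer-specific INNs in the dictionary.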

    11. Editing procedure for Submissions

    11.1 Structures

    Structures do not have a source. If a submission includes a structure, the annotator should, if possible, retain this in its original form. If a diagram needs modifying or completely redrawing in order to remove ambiguities, to match the appearance of related entries, to agree with IUPAC recommendations or merely for aesthetic reasons, then this new diagram should become the default and the submitter's original retained as an additional structure provided that this has no inaccuracies or ambiguities. If a significant change is made to a submitter's original, then the annotator should add a remark explaining the reasons for the change.

    11.2 Formula, mass and charge

    The formula, mass and charge information are derived from the submitted structure and will have as their source SUBMITTER. If a structure is modified then the source will become ChEBI. If a default structure is added/modified in ChEBI then a formula with source ChEBI will be added automatically. If the default structure has status=SUBMITTED and that is changed to status=OK/CHECKED/DELETED, then all chemical data with source=SUBMITTER will automatically be changed to source=CHEBI and the status changed to the relevant status=OK/CHECKED/DELETED. If the submitted default structure is changed to a secondary structure (because the annotator adds a new default structure) then the tool will regenerate the chemical data from the new default structure and update the status with source ChEBI.
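
    The status change described above, in which chemical data with source SUBMITTER are re-sourced when the submitted default structure is accepted or deleted, can be pictured with a short sketch. The record layout (dictionaries with type, source and status keys) and the function name promote_chemical_data are assumptions; the tool performs this step automatically and its internals are not documented here.

```python
FINAL_STATUSES = {"OK", "CHECKED", "DELETED"}


def promote_chemical_data(chemical_data, new_status):
    """Re-source SUBMITTER chemical data when a SUBMITTED structure changes status."""
    if new_status not in FINAL_STATUSES:
        raise ValueError(f"unexpected status: {new_status!r}")
    updated = []
    for item in chemical_data:
        if item["source"] == "SUBMITTER":
            # Source becomes CHEBI and the status follows the structure's status.
            item = {**item, "source": "CHEBI", "status": new_status}
        updated.append(item)
    return updated


# Hypothetical records for one submission.
data = [
    {"type": "formula", "source": "SUBMITTER", "status": "SUBMITTED"},
    {"type": "charge", "source": "KEGG COMPOUND", "status": "OK"},
]
print(promote_chemical_data(data, "CHECKED"))
```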

    11.3 Nomenclature

    If the annotator modifies a submitted name or synonym, the 'Adapted' tickbox should be checked and the source retained as SUBMITTER. A submitted IUPAC Name, if correct, should be left with source=SUBMITTER. However if changes need to be made, then the source of the corrected name should be changed to IUPAC.

    11.4 Ontology

    The submitter's classification must be checked carefully and modified if necessary. If the submitter has selected from the Simple classification view in the submission tool (i.e. is_a organic molecular entity, is_a inorganic molecular entity, is_a group, is_a biological role or is_a application), then in almost all cases a new lower-level classification should be added and the status of the submitter's original changed to DELETED.

    11.5 Xrefs

    If xrefs are correct, then these should remain as source=SUBMITTER. If a submitter includes a CAS Registry Number with no external source, then if possible the annotator should provide an authoritative (open-access) source for it. If none can be found, then source=SUBMITTER should be retained.

    11.6 Deletion of Entries

    If an entry is to be deleted then the annotator should always add a Remark with a clear reason why the entry is deleted. This Remark will be publicly visible.

    Appendix 1: MDL mol formats in ChEBI

    ChEBI follows the MDL mol format specification for its molfiles; what follows is a summary of that specification.

    Below is a list of the types of file formats available from MDL.


    • molfiles (Molecule files): Each molfile describes a single molecular structure which can contain disjoint fragments.
    • RGfiles (Rgroup files): An RGfile describes a single molecular query with Rgroups. Each RGfile is a combination of Ctabs defining the root molecule and each member of each Rgroup in the query.
    • rxnfiles (Reaction files): Each rxnfile contains the structural information for the reactants and products of a single reaction. Elsevier MDL currently supports only the REACCS type of rxnfile. The CPSS type of rxnfile written by CPSS programs is no longer supported and is not described in this document.
    • SDfiles (Structure-data files): An SDfile contains structures and data for any number of molecules. Together with RDfiles, SDfiles are the primary format for large-scale data transfer between MDL databases.
    • RDfiles (Reaction-data files): Similar to SDfiles in concept, the RDfile is a more general format that can include reactions as well as molecules, together with their associated data. Although RDfiles are used primarily by ISIS and REACCS, MACCS-II can also read and write RDfiles except for the reaction structure information (indicated by the square brackets in the MDL Program table).
    • XDfiles (XML-data files): XML-based data format for transferring recordsets of structure or reaction information with associated data. An XDfile can contain structures or reactions that use any of the CTfile formats, Chime strings, or SMILES strings. (Chime is an encrypted format that is used to render structures and reactions on a Web page. SMILES, the Simplified Molecule Input Line Entry System, is a line notation that uses character strings to represent a structure.)

    In ChEBI we use the molfile format but as we will see later on it allows various properties from the other files.

    In the table below is a list of properties allowed in the properties block of a connection table. The molfile format allows all properties except the [Reaction] properties.

    
        Marvin  06230911292D
        1  0  0  0  0  0            999 V2000
        7.7076  -10.0151    0.0000 Na  0  0  0  0  0 15  0  0  0  0  0  0
        M  ISO  1   1  23
        M  END
    
    

    Please refer to page 15 of the format specification for the complete properties table. All the properties listed in that table under molfiles are allowed in molfiles, but there are restrictions on when they can be used. For example, the RGroup attachment point (APO) requires that an RGroup be present in the connection table.

    Appendix 2: Special Character XML tags

    The following XML tags are used to encode special characters into the database. They can then be converted either into their encoded or ASCII form, depending on who is using the data.

    XML tag Encoding ASCII format
    <apostrophe/> ʼ '
    <tilde>a</tilde> ã a
    <tilde>n</tilde> ñ n
    <cedil>c</cedil> ç c
    <macron>A</macron> Ā A
    <macron>a</macron> ā a
    <arrow>leftright</arrow> <->
    <arrow>left</arrow> <-
    <arrow>right</arrow> ->
    <infin/> infin
    <greaterequal/> >=
    <bracket>open</bracket> [ [
    <bracket>close</bracket> ] ]
    <acute>U</acute> Ú U
    <acute>E</acute> É E
    <acute>u</acute> ú u
    <acute>e</acute> é e
    <acute>A</acute> Á A
    <acute>a</acute> á a
    <acute>o</acute> ó o
    <acute>O</acute> Ó O
    <acute>I</acute> Í I
    <acute>y</acute> ý y
    <acute>i</acute> í i
    <bond>3</bond> #
    <bond>2</bond> = =
    <bond>1</bond> -
    <bond>4</bond> #
    <bond>(</bond> ((̶ -(-
    <bond>)</bond> ))̶ -)-
    <plusmn/> ± +-
    <lessthan/> < <
    <activated>D</activated> D̅ D
    <activated>42</activated> 4̅2̅ 42
    <activated>B</activated> B̅ B
    <activated>C</activated> C̅ C
    <activated>1s</activated> 1̅s̅ 1s
    <activated>1r</activated> 1̅r̅ 1r
    <activated>423</activated> 4̅2̅3̅ 423
    <notequal/> =/=
    <ring>A</ring> Å A
    <ring>a</ring> å a
    <umlaut>U</umlaut> Ü Ue
    <umlaut>E</umlaut> Ë E
    <umlaut>u</umlaut> ü ue
    <umlaut>e</umlaut> ë e
    <umlaut>A</umlaut> Ä Ae
    <umlaut>a</umlaut> ä ae
    <umlaut>o</umlaut> ö oe
    <umlaut>O</umlaut> Ö Oe
    <umlaut>I</umlaut> Ï I
    <umlaut>i</umlaut> ï i
    <slash>o</slash> ø o
    <slash>O</slash> Ø O
    <minus/> -
    <sharp>s</sharp> ß ss
    <greek>Rho</greek> Ρ Rho
    <greek>Beta</greek> Β Beta
    <greek>nu</greek> ν nu
    <greek>eta</greek> η eta
    <greek>Alpha</greek> Α Alpha
    <greek>xi</greek> ξ xi
    <greek>theta</greek> θ theta
    <greek>iota</greek> ι iota
    <greek>Delta</greek> Δ Delta
    <greek>Mu</greek> Μ Mu
    <greek>kappa</greek> κ kappa
    <greek>Chi</greek> Χ Chi
    <greek>beta</greek> β beta
    <greek>Sigma</greek> Σ Sigma
    <greek>Gamma</greek> Γ Gamma
    <greek>Kappa</greek> Κ Kappa
    <greek>lambda</greek> λ lambda
    <greek>gamma</greek> γ gamma
    <greek>psi</greek> ψ psi
    <greek>sigma</greek> σ sigma
    <greek>alpha</greek> α alpha
    <greek>Eta</greek> Η Eta
    <greek>upsilon</greek> υ upsilon
    <greek>epsilon</greek> ε epsilon
    <greek>Upsilon</greek> Υ Upsilon
    <greek>rho</greek> ρ rho
    <greek>Theta</greek> Θ Theta
    <greek>phi</greek> φ phi
    <greek>Omicron</greek> Ο Omicron
    <greek>Phi</greek> Φ Phi
    <greek>Iota</greek> Ι Iota
    <greek>delta</greek> δ delta
    <greek>mu</greek> μ mu
    <greek>Tau</greek> Τ Tau
    <greek>Nu</greek> Ν Nu
    <greek>Psi</greek> Ψ Psi
    <greek>Omega</greek> Ω Omega
    <greek>omicron</greek> ο omicron
    <greek>Pi</greek> Π Pi
    <greek>omega</greek> ω omega
    <greek>Xi</greek> Ξ Xi
    <greek>chi</greek> χ chi
    <greek>tau</greek> τ tau
    <greek>Zeta</greek> Ζ Zeta
    <greek>Epsilon</greek> Ε Epsilon
    <greek>zeta</greek> ζ zeta
    <greek>Lambda</greek> Λ Lambda
    <greek>pi</greek> π pi
    <grave>U</grave> Ù U
    <grave>E</grave> È E
    <grave>u</grave> ù u
    <grave>e</grave> è e
    <grave>A</grave> À A
    <grave>a</grave> à a
    <grave>o</grave> ò o
    <grave>O</grave> Ò O
    <grave>I</grave> Ì I
    <grave>i</grave> ì i
    <frac12/> ½ 1/2
    <shy/> ­ -
    <caron>D</caron> Ď D
    <caron>c</caron> č c
    <caron>A</caron> Ǎ A
    <caron>C</caron> Č C
    <caron>a</caron> ǎ a
    <caron>n</caron> ň n
    <caron>N</caron> Ň N
    <caron>s</caron> š s
    <caron>r</caron> ř r
    <caron>S</caron> Š S
    <caron>R</caron> Ř R
    <caron>z</caron> ž z
    <caron>Z</caron> Ž Z
    <em_dash/> --
    <greaterthan/> > >
    <muchgreaterthan/> >>
    <ampersand/> & &
    <reversed_comma/> ʽ `
    <middle_dot/> · .
    <ligature>oe</ligature> œ oe
    <ligature>AE</ligature> Æ AE
    <ligature>OE</ligature> Œ OE
    <ligature>ae</ligature> æ ae
    <degree/> ° degree
    <locant>gamma</locant> γ gamma
    <locant>tau</locant> τ tele
    <locant>delta</locant> δ delta
    <locant>alpha</locant> α alpha
    <locant>epsilon</locant> ε epsilon
    <locant>beta</locant> β beta
    <locant>omega</locant> ω omega
    <locant>pi</locant> π pros
    <circ>u</circ> û u
    <circ>e</circ> ê e
    <circ>a</circ> â a
    <circ>o</circ> ô o
    <circ>O</circ> Ô O
    <quotes/> " "
    <trademark/> (TM)
    <scissile/> -|-
    <radical_dot/> .
    <asymp/> ~
    <parenthesis>open</parenthesis> ( (
    <parenthesis>close</parenthesis> ) )
    <muchlessthan/> <<
    <lessequal/> <=
    <element>Db</element> Db Db
    <element>Cs</element> Cs Cs
    <element>Cu</element> Cu Cu
    <element>Kr</element> Kr Kr
    <element>Cl</element> Cl Cl
    <element>Cm</element> Cm Cm
    <element>Co</element> Co Co
    <element>Cr</element> Cr Cr
    <element>Li</element> Li Li
    <element>Cd</element> Cd Cd
    <element>Cf</element> Cf Cf
    <element>Ce</element> Ce Ce
    <element>La</element> La La
    <element>Lu</element> Lu Lu
    <element>Tl</element> Tl Tl
    <element>Tm</element> Tm Tm
    <element>Th</element> Th Th
    <element>Ti</element> Ti Ti
    <element>Te</element> Te Te
    <element>Dy</element> Dy Dy
    <element>Lr</element> Lr Lr
    <element>Ta</element> Ta Ta
    <element>Mg</element> Mg Mg
    <element>Tc</element> Tc Tc
    <element>Tb</element> Tb Tb
    <element>Ds</element> Ds Ds
    <element>Md</element> Md Md
    <element>D</element> D D
    <element>F</element> F F
    <element>Fe</element> Fe Fe
    <element>B</element> B B
    <element>Mu</element> Mu Mu
    <element>C</element> C C
    <element>Mt</element> Mt Mt
    <element>N</element> N N
    <element>O</element> O O
    <element>H</element> H H
    <element>Eu</element> Eu Eu
    <element>I</element> I I
    <element>Mo</element> Mo Mo
    <element>Mn</element> Mn Mn
    <element>K</element> K K
    <element>U</element> U U
    <element>Er</element> Er Er
    <element>T</element> T T
    <element>W</element> W W
    <element>V</element> V V
    <element>Es</element> Es Es
    <element>Ni</element> Ni Ni
    <element>P</element> P P
    <element>S</element> S S
    <element>Nd</element> Nd Nd
    <element>Ne</element> Ne Ne
    <element>Nb</element> Nb Nb
    <element>Y</element> Y Y
    <element>Na</element> Na Na
    <element>Ge</element> Ge Ge
    <element>Gd</element> Gd Gd
    <element>Ga</element> Ga Ga
    <element>No</element> No No
    <element>Np</element> Np Np
    <element>Fr</element> Fr Fr
    <element>Fm</element> Fm Fm
    <element>Yb</element> Yb Yb
    <element>Pt</element> Pt Pt
    <element>Pu</element> Pu Pu
    <element>Pr</element> Pr Pr
    <element>Hg</element> Hg Hg
    <element>Hf</element> Hf Hf
    <element>He</element> He He
    <element>Pd</element> Pd Pd
    <element>Pa</element> Pa Pa
    <element>Ho</element> Ho Ho
    <element>Pb</element> Pb Pb
    <element>Pm</element> Pm Pm
    <element>Po</element> Po Po
    <element>Xe</element> Xe Xe
    <element>Hs</element> Hs Hs
    <element>Os</element> Os Os
    <element>Au</element> Au Au
    <element>Se</element> Se Se
    <element>In</element> In In
    <element>Sc</element> Sc Sc
    <element>Ar</element> Ar Ar
    <element>Si</element> Si Si
    <element>At</element> At At
    <element>As</element> As As
    <element>Sg</element> Sg Sg
    <element>Sn</element> Sn Sn
    <element>Sm</element> Sm Sm
    <element>Ba</element> Ba Ba
    <element>Sr</element> Sr Sr
    <element>Ir</element> Ir Ir
    <element>Ru</element> Ru Ru
    <element>Ag</element> Ag Ag
    <element>Ac</element> Ac Ac
    <element>Am</element> Am Am
    <element>Sb</element> Sb Sb
    <element>Al</element> Al Al
    <element>Rb</element> Rb Rb
    <element>Re</element> Re Re
    <element>Rf</element> Rf Rf
    <element>Br</element> Br Br
    <element>Rh</element> Rh Rh
    <element>Ca</element> Ca Ca
    <element>Rn</element> Rn Rn
    <element>Bh</element> Bh Bh
    <element>Bi</element> Bi Bi
    <element>Be</element> Be Be
    <element>Zn</element> Zn Zn
    <element>Bk</element> Bk Bk
    <element>Ra</element> Ra Ra
    <stereoref>r</stereoref> r r
    <oxs>3</oxs> III III
    <oxs>2</oxs> II II
    <oxs>1</oxs> I I
    <oxs>0</oxs> 0 0
    <oxs>7</oxs> VII VII
    <oxs>6</oxs> VI VI
    <oxs>5</oxs> V V
    <oxs>4</oxs> IV IV
    <oxs>8</oxs> VIII VIII
    <protein/>
    <smallsub/>
    <stereo>S*</stereo> S* S*
    <stereo>D</stereo> D D
    <stereo>E</stereo> E E
    <stereo>talo</stereo> talo talo
    <stereo>syn</stereo> syn syn
    <stereo>xi</stereo> ξ xi
    <stereo>allo</stereo> allo allo
    <stereo>altro</stereo> altro altro
    <stereo>meso</stereo> meso meso
    <stereo>L</stereo> L L
    <stereo>manno</stereo> manno manno
    <stereo>myo</stereo> myo myo
    <stereo>beta</stereo> β beta
    <stereo>gulo</stereo> gulo gulo
    <stereo>Myo</stereo> Myo Myo
    <stereo>lyxo</stereo> lyxo lyxo
    <stereo>S</stereo> S S
    <stereo>R</stereo> R R
    <stereo>alpha</stereo> α alpha
    <stereo>cis</stereo> cis cis
    <stereo>galacto</stereo> galacto galacto
    <stereo>arabino</stereo> arabino arabino
    <stereo>Z</stereo> Z Z
    <stereo>glycero</stereo> glycero glycero
    <stereo>anti</stereo> anti anti
    <stereo>c</stereo> c c
    <stereo>ambo</stereo> ambo ambo
    <stereo>ido</stereo> ido ido
    <stereo>all-cis</stereo> all-cis all-cis
    <stereo>gluco</stereo> gluco gluco
    <stereo>R*</stereo> R* R*
    <stereo>threo</stereo> threo threo
    <stereo>endo</stereo> endo endo
    <stereo>exo</stereo> exo exo
    <stereo>trans</stereo> trans trans
    <stereo>t</stereo> t t
    <stereo>s</stereo> s s
    <stereo>r</stereo> r r
    <stereo>all-trans</stereo> all-trans all-trans
    <stereo>xylo</stereo> xylo xylo
    <stereo>erythro</stereo> erythro erythro
    <stereo>rel</stereo> rel rel
    <stereo>ribo</stereo> ribo ribo
    <ital/>
    <ringsugar>f</ringsugar> f f
    <ringsugar>p</ringsugar> p p
    <smallsup/>
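
    The conversion described at the start of this appendix, from XML tags to either the encoded or the ASCII form, can be sketched with a small lookup. Only a handful of rows from the table are transcribed here, and the names SPECIAL_CHARS, TAG_PATTERN and convert are hypothetical; tags that are not in the lookup are left unchanged rather than guessed.

```python
import re

# A few rows transcribed from the table above: tag -> (encoded form, ASCII form).
SPECIAL_CHARS = {
    "<greek>alpha</greek>": ("α", "alpha"),
    "<greek>beta</greek>": ("β", "beta"),
    "<umlaut>u</umlaut>": ("ü", "ue"),
    "<plusmn/>": ("±", "+-"),
    "<locant>tau</locant>": ("τ", "tele"),
}

# Matches paired tags such as <greek>alpha</greek> and empty tags such as <plusmn/>.
TAG_PATTERN = re.compile(r"<(\w+)>[^<>]*</\1>|<\w+/>")


def convert(text, ascii_only=False):
    """Replace known special-character tags with their encoded or ASCII form."""
    index = 1 if ascii_only else 0

    def replace(match):
        entry = SPECIAL_CHARS.get(match.group(0))
        return entry[index] if entry else match.group(0)

    return TAG_PATTERN.sub(replace, text)


print(convert("<greek>alpha</greek>-D-glucose"))
print(convert("<greek>alpha</greek>-D-glucose", ascii_only=True))
```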