DopaNet Neuronal Ontology
What Is a Biological Ontology?
An ontology is:
An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest, and the relationships that hold among them (Free On-line Dictionary of Computing, 27 Sept. 03).
In our case, the ontology is a relational vocabulary, that is, a set of linked terms used to completely describe an area of knowledge. Each term has a definition and a unique identifier. Terms are related by "is a" relationships, which represent subclassing, and "part of" relationships, which represent deepening knowledge. For instance, the α6 nicotinic receptor subunit "is a" nicotinic receptor subunit, and is "part of" the (α6)2(β2)3 nAChR. Each term can be a child of several others. Therefore the complete picture is not a genealogical tree, but rather a "directed acyclic graph". One can nevertheless extract hierarchical views of the ontology. For example, the network:
A     C
 \   /
  \ /
   B
  / \
 /   \
D     E
can be translated into:
term A
    term B
        term D
        term E
term C
    term B
        term D
        term E
In this way, each bit of knowledge can be mapped into a set of codes.
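The translation from network to hierarchical view can be sketched in a few lines of Python. This is only an illustration, not DopaNet code: the `PARENTS` mapping and the function names are assumptions made for the example, and the terms are the placeholder letters of the network above. Because term B has two parents, its subtree is printed once under each of them.

```python
from collections import defaultdict

# Child -> parents edges of the example network (B has two parents).
PARENTS = {
    "A": [],
    "C": [],
    "B": ["A", "C"],
    "D": ["B"],
    "E": ["B"],
}

def children_index(parents):
    """Invert a child -> parents mapping into parent -> sorted children."""
    children = defaultdict(list)
    for child, its_parents in parents.items():
        for parent in its_parents:
            children[parent].append(child)
    return {term: sorted(kids) for term, kids in children.items()}

def hierarchical_view(parents, indent="    "):
    """Return the indented tree lines; shared subtrees are repeated."""
    children = children_index(parents)
    roots = sorted(term for term, ps in parents.items() if not ps)
    lines = []

    def walk(term, depth):
        lines.append(f"{indent * depth}term {term}")
        for child in children.get(term, []):
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines

VIEW = hierarchical_view(PARENTS)
print("\n".join(VIEW))
```

The recursion terminates only because the graph is acyclic; a cycle would make the walk loop forever.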
There are several biological ontologies, the best known (and most complete) being the Gene Ontology. Numerous other projects can be found at the repository of the Open Biological Ontologies.
Why Does DopaNet Need Such an Ontology?
Every piece of data stored in the DopaNet databases will be attached to one or several ontological codes. The ontology will therefore act as a glue, relating the various pieces of data to one another. The ontology will also affect the way we store the data, and will constrain the data itself. For instance, the "part of" children of a molecular complex will become the components of its DopaNet Molecular Page.
In addition, the inheritance relationships present in the ontology could be mirrored in the structure of the future functional simulations (through the class inheritance of object-oriented approaches, for example).
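As a rough illustration of that idea, an "is a" relationship can be mapped onto a Python subclass. The class names, the `term_name` attribute and the `is_a_chain` helper below are hypothetical and chosen only for this sketch; the two terms are the ones used in the α6 example above.

```python
class SimulatedEntity:
    """Hypothetical base class for anything taking part in a simulation."""
    term_name = None  # filled in by classes that mirror an ontology term

    @classmethod
    def is_a_chain(cls):
        """Names of this term and its 'is a' ancestors, nearest first."""
        return [k.term_name for k in cls.__mro__
                if getattr(k, "term_name", None)]

class NicotinicReceptorSubunit(SimulatedEntity):
    term_name = "nicotinic receptor subunit"

class Alpha6NicotinicReceptorSubunit(NicotinicReceptorSubunit):
    # The subclass link mirrors the "is a" link of the ontology.
    term_name = "alpha6 nicotinic receptor subunit"

CHAIN = Alpha6NicotinicReceptorSubunit.is_a_chain()
print(CHAIN)
```

Code written against the parent class would then accept any of its ontological children, which is the practical benefit of mirroring the hierarchy.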
Structure of the Ontology
The Gene Ontology consortium defined three different vocabularies:
Molecular function is what something does.
Biological process is a biological objective.
Cellular component is... a component of a cell.
Only the last of these is currently relevant for DopaNet's purposes, although the Biological process vocabulary is expected to be needed at some point.
A cellular component may be for instance an anatomical structure, e.g. "rough endoplasmic reticulum" or "nucleus", but also a protein. Note that a "molecule" is defined in the neuronal ontology as a set of atoms covalently linked. A molecule cannot contain other molecules. Therefore, a protein made up of several subunits, or a polypeptide and a co-enzyme, are not "molecules", but "molecular complexes". This is somewhat different from the definition used in the "Molecular Pages".
The whole DopaNet ontology is stored in a single large XML file, containing the list of all terms together with their definitions. Since this file is not easy for a human to read, an HTML version is provided, displaying the tree view as well as all the definitions.
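The layout of the XML file is not described here, so the following sketch invents one purely for illustration: the `ontology`, `term`, `definition` and `is_a` element names and the `id`, `name` and `ref` attributes are all assumptions, as is the "is a" link between the two sample terms. Only the identifiers, names and the quoted definition are taken from this page. The sketch shows how such a file could be loaded into a dictionary with Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: the real DopaNet file may look quite different.
SAMPLE_XML = """
<ontology>
  <term id="DA:0000259" name="molecular complex">
    <definition>Stable assembly of molecules</definition>
  </term>
  <term id="DA:0000003" name="molecule"/>
  <term id="DA:0000182" name="polypeptide">
    <is_a ref="DA:0000003"/>
  </term>
</ontology>
"""

def load_terms(xml_text):
    """Map each term id to its name, definition and 'is a' parent ids."""
    root = ET.fromstring(xml_text)
    terms = {}
    for node in root.iter("term"):
        terms[node.get("id")] = {
            "name": node.get("name"),
            "definition": (node.findtext("definition") or "").strip(),
            "is_a": [rel.get("ref") for rel in node.findall("is_a")],
        }
    return terms

TERMS = load_terms(SAMPLE_XML)
for term_id, term in TERMS.items():
    print(term_id, term["name"], term["is_a"])
```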
How Can I Contribute?
Once you have consulted the ontology browser and made up your mind about a change, just send an e-mail detailing the changes you request. For instance:
-
To delete a branch (the whole subtree will be removed):
Term DA:0000003; molecule
Delete term DA:0000182; polypeptide
-
To add an existing term to another existing term:
Term DA:0000182; molecule
Add term DA:0000153; noradrenaline transporter
-
To create a new term:
Term DA:0000014; integral membrane protein
Add new term "is a" neurotrophin receptor subunit
Definition: any protein subunit that participates in the formation of neurotrophin receptors.
-
To move a term down a level:
Move term DA:0000150; trkA receptor
From DA:0000014; integral membrane protein
To new term; neurotrophin receptor subunit
-
To move a term up a level:
Move term DA:0000040; M1 muscarinic acetylcholine receptor
From DA:0000039; muscarinic acetylcholine receptor
To DA:0000014; integral membrane protein
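To make the effect of these requests concrete, here is a minimal in-memory sketch of the operations. It is not the tool used by the DopaNet curators: the `Ontology` class, its method names and the provisional identifier `NEW:1` are hypothetical. It also assumes that deleting a branch removes a term only once it has no parent left, so that a term shared with another branch survives.

```python
class Ontology:
    """Toy store: term names plus parent -> children links."""

    def __init__(self):
        self.names = {}      # term id -> name
        self.children = {}   # term id -> set of child ids

    def add_new_term(self, parent_id, term_id, name):
        """Create a term; parent_id=None makes it a root."""
        self.names[term_id] = name
        self.children.setdefault(term_id, set())
        if parent_id is not None:
            self.children[parent_id].add(term_id)

    def add_existing_term(self, parent_id, term_id):
        """Attach an existing term under one more parent."""
        self.children[parent_id].add(term_id)

    def parents_of(self, term_id):
        return {p for p, kids in self.children.items() if term_id in kids}

    def delete_branch(self, parent_id, term_id):
        """Detach term_id from parent_id; drop orphaned terms recursively."""
        self.children[parent_id].discard(term_id)
        if not self.parents_of(term_id):
            for child in list(self.children[term_id]):
                self.delete_branch(term_id, child)
            del self.children[term_id]
            del self.names[term_id]

    def move_term(self, term_id, from_id, to_id):
        """Re-attach term_id from one parent to another."""
        self.children[to_id].add(term_id)
        self.children[from_id].discard(term_id)

# Replay the "create a new term" and "move a term down a level" examples.
onto = Ontology()
onto.add_new_term(None, "DA:0000014", "integral membrane protein")
onto.add_new_term("DA:0000014", "DA:0000150", "trkA receptor")
onto.add_new_term("DA:0000014", "NEW:1", "neurotrophin receptor subunit")
onto.move_term("DA:0000150", "DA:0000014", "NEW:1")
print(onto.children)
```

Moving a term up a level is the same `move_term` call with the parent and grandparent swapped.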
Guidelines
In order for the ontology to be useful for DopaNet purposes, a few rules should be followed.
-
Whenever possible, one should re-use terms, definitions or relationships present in other controlled vocabularies, such as those of the Cell Type ontology, the Gene Ontology, InterPro or ChEBI.
-
Although this ontology is built for DopaNet purposes, it can be viewed as a more general "neuronal ontology". Therefore, if necessary, we can incorporate terms related to components present (or events taking place) in any neuron. This is particularly advisable when it clarifies a hierarchical relationship.
-
A "molecular complex" (DA:0000259), defined as a "Stable assembly of molecules", contains one or several components. Those components should be defined as "molecule" (DA:0000003), defined as "Set of atoms linked together by covalent bonds". For instance (as of January 19th 2004), "(alpha4)_2(beta2)_3 nAChR" (DA:0000027) is a "nicotinic acetylcholine-gated receptor" (DA:0000022), itself a "protein" (defined as a "Stable assembly of one or several polypeptides, the alloprotein, possibly with a few non-peptidic components"). This "(alpha4)_2(beta2)_3 nAChR" is made up of two components: "alpha4 nicotinic receptor subunit" (DA:0000188) and "beta2 nicotinic receptor subunit" (DA:0000186). Those two components are defined in the "molecule" branch respectively as "polypeptide" (DA:0000182) and "nicotinic receptor subunit" (DA:0000189).
It can be considered redundant that all monomeric proteins are defined twice, as "molecule" and as "molecular complex". However, the meanings of the two branches are different. The "molecule" describes an ideal entity, while the "molecular complex" describes an actual physical object of the cell. Moreover, the hierarchical structure of the two branches is different. In addition, many proteins have only recently been found to be functional complexes (e.g. the polymeric G-protein coupled receptors), and more are to be discovered. Finally, the systematic separation between the functional molecular complex and its components is handy when it comes to writing the Molecular Pages.
-
The "polypeptide" tree should be built according to sequence similarity first, and then 3D structure, much as in the Structural Classification of Proteins (SCOP) database and InterPro. There should be no functional grouping of the kind used in the "protein" tree.
-
If a "molecular complex" contains only one polypeptide, this one is suffixed "polypeptide". If a "molecular complex" contains several polypeptides, they are suffixed "subunit".
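The suffix rule is simple enough to check mechanically. The helper below is a hypothetical sketch, not part of the ontology tooling: the function name is invented, and it assumes that "several polypeptides" means the total number of polypeptide chains in the complex. The base names in the usage lines are illustrative only.

```python
def component_name(base_name, polypeptide_count):
    """Apply the naming guideline to one component of a molecular complex."""
    if polypeptide_count < 1:
        raise ValueError("a molecular complex needs at least one polypeptide")
    suffix = "polypeptide" if polypeptide_count == 1 else "subunit"
    return f"{base_name} {suffix}"

# A complex of five chains, such as (alpha4)_2(beta2)_3 nAChR.
PENTAMER_NAME = component_name("alpha4 nicotinic receptor", 5)
# A hypothetical complex made of a single chain.
MONOMER_NAME = component_name("example protein", 1)
print(PENTAMER_NAME)
print(MONOMER_NAME)
```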