What is meant by a 'src_id' and a 'src_compound_id' in UniChem?


src_ids are integers that uniquely identify sources in UniChem. A list of valid src_ids can be found either on the sources page or by using the web services.


src_compound_ids are the individual compound identifiers provided by each of the sources. For example, 'CHEMBL12' is a src_compound_id from the 'chembl' source (src_id = 1). Since the same src_compound_id may occur in more than one source, querying UniChem with a src_compound_id also requires specifying the corresponding src_id. src_compound_ids are handled in a case-sensitive manner, as explained here.
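
The pairing of src_id and src_compound_id can be illustrated with a minimal Python sketch. It is not UniChem's implementation or its web service interface: the in-memory table, the lookup function, the second source (src_id = 99) and the structure labels are all hypothetical. Only 'CHEMBL12' with src_id = 1 is taken from the text above.

```python
from typing import Optional

# Hypothetical in-memory table keyed by (src_id, src_compound_id).
# Only ('CHEMBL12', src_id 1) comes from the text; the second source
# and the structure labels are invented for illustration.
ASSIGNMENTS = {
    (1, "CHEMBL12"): "structure-A",
    (99, "CHEMBL12"): "structure-B",  # hypothetical source reusing the same identifier
}


def lookup(src_id: int, src_compound_id: str) -> Optional[str]:
    """Return the structure label for the pair, or None if there is no exact match."""
    # A plain dict lookup compares strings exactly, so it is case sensitive.
    return ASSIGNMENTS.get((src_id, src_compound_id))


print(lookup(1, "CHEMBL12"))   # matches the 'chembl' entry
print(lookup(99, "CHEMBL12"))  # same identifier, different source, different structure
print(lookup(1, "chembl12"))   # differs only in case, so no match
```

Keying on the pair rather than on the src_compound_id alone is what removes the ambiguity: the identifier string by itself does not say which source it belongs to.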

'Current' vs 'Obsolete' src_compound_ids

All src_compound_ids for a source may be classified as either 'Current' or 'Obsolete':

 'Current src_compound_id' = a src_compound_id that is currently assigned to one or more structures.
 'Obsolete src_compound_id' = a src_compound_id that is NOT currently assigned to any structure.

Clearly, for an 'Obsolete src_compound_id' to exist within UniChem, it must have been assigned to a structure in at least one previous data release from the source but be absent from the source's most recent release. Note that these definitions are distinct from the definitions of 'Current and Obsolete Assignments'.

Thus, for example, a given src_compound_id may have an 'Obsolete Assignment' to structure X while the src_compound_id itself is 'Current', because it is 'Currently Assigned' to structure Y. Another src_compound_id may also have an 'Obsolete Assignment' to structure X and itself be 'Obsolete', simply because it is not assigned to any structure in the most recent release (i.e., it has no current assignment).
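
The distinction between the status of an assignment and the status of a src_compound_id can be sketched as follows. This is an illustrative model only, assuming each assignment carries a flag saying whether it appears in the source's most recent release; the Assignment class, the classify function and the identifiers 'ID_1' and 'ID_2' are hypothetical and are not part of UniChem.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Assignment:
    """Hypothetical record linking a src_compound_id to a structure."""
    src_compound_id: str
    structure: str
    is_current: bool  # True if the assignment is in the most recent release


def classify(src_compound_id: str, assignments: Sequence[Assignment]) -> str:
    """Classify an identifier as 'Current' or 'Obsolete' from its assignments."""
    own = [a for a in assignments if a.src_compound_id == src_compound_id]
    if not own:
        # An identifier that was never assigned does not exist in this model.
        raise KeyError(src_compound_id)
    # Current if at least one assignment is current, otherwise Obsolete.
    return "Current" if any(a.is_current for a in own) else "Obsolete"


# Mirrors the example in the text: both identifiers have an obsolete
# assignment to structure X, but only ID_1 is currently assigned (to Y).
history = [
    Assignment("ID_1", "X", is_current=False),
    Assignment("ID_1", "Y", is_current=True),
    Assignment("ID_2", "X", is_current=False),
]

print(classify("ID_1", history))
print(classify("ID_2", history))
```

Under these assumptions, 'ID_1' is classified as Current despite its obsolete assignment to X, and 'ID_2' is classified as Obsolete because none of its assignments is current.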

Using the 'Include non-mapped src_compound_ids' option on the Whole source mapping page will return both Current and Obsolete src_compound_ids, as explained here.
