  base_id_url: https://www.ebi.ac.uk/chembldb/compound/inspect/
  description: A database of bioactive drug-like small molecules and bioactivities abstracted from the scientific literature.
  name: chembl
  name_label: ChEMBL
  name_long: ChEMBL
    - CHEMBL50588
  src_url: https://www.ebi.ac.uk/chembl/
  base_id_url: http://www.genome.jp/dbget-bin/www_bget?
  description: KEGG LIGAND is a composite DB consisting of COMPOUND, GLYCAN, REACTION, RPAIR, RCLASS, and ENZYME DBs, whose entries are identified by C, G, R, RP, RC, and EC numbers, respectively.
  name: kegg_ligand
  name_label: KEGG Ligand
  name_long: KEGG (Kyoto Encyclopedia of Genes and Genomes) Ligand
    - C09421
  src_url: http://www.genome.jp/kegg/ligand.html
  base_id_url: http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI%3A
  description: ChEBI is a freely available dictionary of molecular entities focused on 'small' chemical compounds.
  name: chebi
  name_label: ChEBI
  name_long: ChEBI (Chemical Entities of Biological Interest)
    - 4781
  src_url: http://www.ebi.ac.uk/chebi/downloadsForward.do
  base_id_url: http://www-935.ibm.com/services/us/gbs/bao/siip/nih/?sid=
  description: A massive, searchable database of chemical and pharmaceutical data, extracted from millions of patents and scientific literature. Identifiers in UniChem are IBM compound identifiers.
  name: ibm
  name_label: IBM Patent System
  name_long: IBM strategic IP insight platform and the National Institutes of Health
    - 594590FBFAF1B145F8C67C9FA370F645
  src_url: http://www-935.ibm.com/services/us/gbs/bao/siip/nih/
  base_id_url: http://worldwide.espacenet.com/searchResults?DB=EPODOC&locale=en_EP&ST=advanced&compact=false&PN=
  description: "Data, provided by IBM-NIH, was originally extracted from patents from three publishing bodies (US, EPO and WIPO) with publication dates up to and including 2000-12-31. For UniChem, these data were parsed to include only whole molecules present in either the title or claims fields. Further filters included removal of: 1. All molecules mapping to > 10,000 patents, 2. Non-organic molecules, 3. Small molecules (mw < 90, number of atoms < 7). In addition, for structures mapping to > 100 patents, only 100 randomly selected patents were retained. Identifiers in UniChem are patent number identifiers."
  name: patents
  name_label: Patent
  name_long: IBM strategic IP insight platform and the National Institutes of Health
  src_url: http://www-935.ibm.com/services/us/gbs/bao/siip/nih/
  base_id_url: http://fdasis.nlm.nih.gov/srs/ProxyServlet?mergeData=true&objectHandle=DBMaint&APPLICATION_NAME=fdasrs&actionHandle=default&nextPage=jsp/srs/ResultScreen.jsp&TXTSUPERLISTID=
  description: "The primary goal of the FDA/USP Substance Registration System (SRS) is to unambiguously define all substances present in regulated products. Once a substance has been defined, the SRS assigns a strong identifier that is permanently associated with the substance: a UNII (Unique Ingredient Identifier). This is a non-proprietary, free, unique, unambiguous, nonsemantic, alphanumeric identifier based on a substance's molecular structure and/or descriptive information."
  name: fdasrs
  name_label: FDA SRS
  name_long: FDA/USP Substance Registration System (SRS)
    - X8D5EPO80M
  src_url: http://fdasis.nlm.nih.gov/srs/srs.jsp
  base_id_url: http://open.surechem.com/en/chemical?struct=
  description: "SureChem automatically extracts compounds from the full text of all major patent authorities. Compounds are derived from either chemical names found in text or the chemical depictions. Compounds deposited by SureChem originate from patents published before 11-March-2013. Compounds deposited have the following characteristics: 1) Molecular weight >= 300 and <= 800. 2) Contain at least one ring. 3) Rotatable bonds <= 15. 4) Free of bad valencies. 5) Do not contain MedChem unfriendly structural toxicophores [https://support.surechem.com/knowledgebase/articles/169485-non-medchem-friendly-smarts]. 6) May contain salt, counter ions or fragments in a stoichiometric amount <= 1. Compounds with a very high rate of occurrence (> 100,000) within the SureChem corpus, typically associated with very common or trivial chemistry, have also been removed."
  name: surechem
  name_label: SureChem
  name_long: SureChem
    - SureCN56277
  src_url: http://surechem.com
  base_id_url: http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?sid=
  description: "A subset of the PubChem DB: from the original depositor 'Thomson Pharma'."
  name: pubchem_tpharma
  name_label: "PubChem: Thomson Pharma"
  name_long: PubChem ('Thomson Pharma' subset)
    - 81075572
    - 15455076
  src_url: http://www.thomson-pharma.com/
  base_id_url: http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=
  description: A database of normalized PubChem compounds (CIDs) from the PubChem Database.
  name: pubchem
  name_label: PubChem
  name_long: PubChem Compounds
    - 10219
  src_url: http://pubchem.ncbi.nlm.nih.gov