Value Reporting Recommendations
Sample curation
BioSamples performs automatic curation and supports manual curation to improve sample data findability. It removes missing values, performs ontology annotation and text curation through automatic curation. BioSamples also imports curation from other services. The curation rules are described below and updated periodically.
The curation records are stored separately along with the original data. BioSamples applies the curation as separate layers on top of the original data.
Automatic curation, remove missing values
Missing values, e.g. "N/A", "none", are removed during submission. See details here.
For example, Field disease state contains N/A in the original data.
"characteristics":{
"disease state" : [ {
"text" : "N/A"
} ],
"organism" : [ {
"text" : "Homo sapiens"
} ]}
Field disease state is removed during submission.
"characteristics":{
"organism" : [ {
"text" : "Homo sapiens"
} ]}
Automatic curation, ontology annotations
Ontology annotation maps sample attributes to ontology terms. BioSamples uses ZOOMA to perform automatic ontology annotation. Only annotations with high mapping confidence are accepted.
For example, when users submit
"characteristics" :{
"disease state" : [ {
"text" : "hepatocellular carcinoma"
} ],
"organism" : [ {
"text" : "Homo sapiens"
} ]
}
BioSamples will automaticly mapp the text to ontology terms.
"characteristics" :{
"disease state" : [ {
"text" : "hepatocellular carcinoma",
"ontologyTerms" : [ "http://www.ebi.ac.uk/efo/EFO_0000182" ]
} ],
"organism" : [ {
"text" : "Homo sapiens",
"ontologyTerms" : [ "http://purl.obolibrary.org/obo/NCBITaxon_9606" ]
} ]
}
The automatic ontology curation skips fields with user-provided ontology terms.
💡 BioSamples only selects high confidence ontology annotation. However, the automatic annotation might be inaccurate. Users are recommended to examine the ontology annotation results manually.
Automatic curation, text curation
Text curation in Biosamples removes unnecessary special characters, and corrects typos. For example, when users submit
"characteristics" :{
"disease_state" : [ {
"text" : "hepatocellular_carcinoma",
"ontologyTerms" : [ "http://www.ebi.ac.uk/efo/EFO_0000182" ]
} ],
"Organism" : [ {
"text" : "Homo sapiens",
"ontologyTerms" : [ "http://purl.obolibrary.org/obo/NCBITaxon_9606" ]
} ],
"tissu": [{
"text":"liver"
}]
}
BioSamples removes the underscore in disease_state, changes Organism
to lower cases, and correct typo in tissu.
"characteristics" :{
"disease state" : [ {
"text" : "hepatocellular_carcinoma",
"ontologyTerms" : [ "http://www.ebi.ac.uk/efo/EFO_0000182" ]
} ],
"organism" : [ {
"text" : "Homo sapiens",
"ontologyTerms" : [ "http://purl.obolibrary.org/obo/NCBITaxon_9606" ]
} ],
"tissue": [{
"text":"liver"
}]
}
💡 The automatic text curation is limited to the attribute names, the attribute values will not be changed.
It takes up to 24 hours to generate the curation.
Manual curation
Users can also provide their manual curation. See details here.
How to find all curation records?
Users can access all curation records by adding /curationlinks to the
sample link.
For example, https://www.ebi.ac.uk/biosamples/samples/SAMEA1607017/curationlinks.
returns all curation records of sample SAMEA1607017
How to get uncurated data
Biosamples returns the curated data by default. It is also possible to
download the original data without curation by adding
.json?curationdomain= to the sample link.
For example,
https://www.ebi.ac.uk/biosamples/samples/SAMEA1607017.json?curationdomain= returns the original data of sample SAMEA1607017.