Protein Databases
Primary protein sequence databases - UniProtKB/TrEMBL
The amount of available sequence data has increased tremendously, driven by technological
advances such as sequencing machines, by new biochemical
methods such as PCR, and by projects
to sequence complete genomes.
These advances have produced a
flood of sequence information. Maintaining the high quality of UniProtKB/Swiss-Prot
requires, for each entry, a time-consuming process that involves the
extensive use of sequence analysis tools along with detailed curation
steps by expert annotators. This curation is the rate-limiting step in the production
of the database. Because it is vital to make new sequences available as quickly as possible
without relaxing the high editorial standards of UniProtKB/Swiss-Prot,
a supplement to UniProtKB/Swiss-Prot was created in 1996.
This supplement, UniProtKB/TrEMBL (Translation of EMBL Nucleotide Sequence Database),
consists of computer-annotated entries derived from the translation of
all coding sequences (CDS) in the EMBL Nucleotide Sequence Database,
except for those already included in UniProtKB/Swiss-Prot.
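
The idea of translating coding sequences and leaving out those already curated can be illustrated with a minimal Python sketch. It is not the actual UniProtKB/TrEMBL pipeline: the function names `translate_cds` and `build_supplement` are hypothetical, the standard genetic code is assumed for every sequence, and "already included" is approximated by exact match of the translated protein sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_cds(cds):
    """Translate a nucleotide CDS into a protein string, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def build_supplement(cds_records, curated_sequences):
    """Return {identifier: protein} for CDS whose translation is not already curated.

    cds_records: iterable of (identifier, nucleotide CDS) pairs (hypothetical input format).
    curated_sequences: set of protein sequences assumed to be in the curated database.
    """
    supplement = {}
    for identifier, cds in cds_records:
        protein = translate_cds(cds)
        if protein and protein not in curated_sequences:
            supplement[identifier] = protein
    return supplement
```

For example, `build_supplement([("a", "ATGGCCTAA"), ("b", "ATGAAATAG")], {"MA"})` keeps only entry `b`, because the translation of `a` is already in the curated set.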
UniProtKB/TrEMBL is split into
two main sections, SP-TrEMBL and REM-TrEMBL. SP-TrEMBL
contains the entries that should eventually be incorporated into
UniProtKB/Swiss-Prot. REM-TrEMBL (REMaining TrEMBL) contains the entries that
will not be included in UniProtKB/Swiss-Prot.
These are sequences that UniProtKB/Swiss-Prot is not interested in annotating:
synthetic sequences, truncated sequences, pseudogenes, patented sequences,
small fragments, and immunoglobulins and T-cell receptors.
REM-TrEMBL is now a deprecated database.
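
The split between the two sections can be sketched as a simple rule, assuming each entry carries a set of descriptive tags. The `Entry` class, the tag names, and `section_for` are hypothetical and serve only to illustrate the categories listed above; they do not reflect how the database actually assigned entries.

```python
from dataclasses import dataclass

# Categories the text lists as excluded from UniProtKB/Swiss-Prot (tag names are illustrative).
REM_CATEGORIES = frozenset({
    "synthetic",
    "truncated",
    "pseudogene",
    "patented",
    "small_fragment",
    "immunoglobulin",
    "t_cell_receptor",
})


@dataclass(frozen=True)
class Entry:
    accession: str
    tags: frozenset = frozenset()


def section_for(entry):
    """Return the TrEMBL section an entry would fall into under this simplified rule."""
    if entry.tags & REM_CATEGORIES:
        return "REM-TrEMBL"
    return "SP-TrEMBL"
```

An entry tagged `patented` lands in REM-TrEMBL, while an entry with no excluded tag lands in SP-TrEMBL as a candidate for later curation.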