EMBL - CDS

EMBLCDS is a database of the nucleotide sequences of CDS (coding sequence) features, as annotated in the EMBL database. Each EMBLCDS record contains the nucleotide sequence of the CDS region, the accompanying annotation from the parent nucleotide entry, and additional automatically generated annotation.

Data are distributed in flatfile and XML formats similar to those of the EMBL nucleotide sequence database, but with some differences and additional annotation; a short description of the format can be found in the readme file. In addition to the EMBLCDS flatfiles, some other files are distributed: for example, the index file cds_all.index.gz, which links the CRC32 of each nucleotide sequence to its protein_id, and the non-redundant FASTA file cds_nr.fasta.gz, where non-redundancy is based on CRC32.

EMBL CDS data are retrievable via EBI SRS (http://srs.ebi.ac.uk, EMBLCDS library), via dbfetch (http://www.ebi.ac.uk/cgi-bin/emblfetch), and via the Web Services version of dbfetch (http://www.ebi.ac.uk/Tools/webservices). Files are created daily.

Similarity searches (FASTA algorithm) can be run against the EMBL CDS dataset (http://www.ebi.ac.uk/fasta33/nucleotide.html).
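The CRC32-based index and non-redundant FASTA file mentioned above can be illustrated with a minimal Python sketch. It is not the code used to build cds_all.index.gz or cds_nr.fasta.gz, and it does not reproduce their file layouts. It assumes that the checksum is the standard CRC32 (as computed by `zlib.crc32`) over the upper-cased sequence, which the page does not confirm, and the function names are hypothetical.

```python
import zlib
from typing import Dict, Iterable, List, Tuple

# A record is a (protein_id, nucleotide_sequence) pair.
Record = Tuple[str, str]


def sequence_crc32(sequence: str) -> str:
    """Return the CRC32 of a sequence as 8 uppercase hex digits.

    Assumption: the sequence is upper-cased before hashing, so that
    case differences do not produce different checksums.
    """
    normalized = sequence.strip().upper()
    checksum = zlib.crc32(normalized.encode("ascii")) & 0xFFFFFFFF
    return format(checksum, "08X")


def build_index(records: Iterable[Record]) -> Dict[str, List[str]]:
    """Map each sequence CRC32 to the protein_ids that share it."""
    index: Dict[str, List[str]] = {}
    for protein_id, sequence in records:
        index.setdefault(sequence_crc32(sequence), []).append(protein_id)
    return index


def non_redundant(records: Iterable[Record]) -> List[Record]:
    """Keep the first record seen for each distinct CRC32."""
    seen = set()
    unique: List[Record] = []
    for protein_id, sequence in records:
        checksum = sequence_crc32(sequence)
        if checksum not in seen:
            seen.add(checksum)
            unique.append((protein_id, sequence))
    return unique
```

Because CRC32 is only 32 bits wide, two different sequences can occasionally share a checksum, so a pipeline that must never merge distinct sequences would need to compare the sequences themselves as well.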