EMBL Outstation - The European Bioinformatics Institute
EMBL Nucleotide Sequence Database
Release Notes
Release 89 December 2006
EMBL Outstation
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom
Telephone: +44-1223-494400
Telefax : +44-1223-494468
URL: http://www.ebi.ac.uk/embl/
Feedback form : http://www.ebi.ac.uk/support/
CONTENTS
1 RELEASE 89
1.1 Feature Table Definition Document v6.6
1.2 Database Files
1.2.1 Naming Conventions
1.2.2 CRC Values for Distributed Files
1.3 Cross-Reference Information
1.4 Digital Object Identifiers (DOI) and PubMed references
1.5 EMBL Database FAQ
1.6 Disclaimer
1.7 Acknowledgements
2 CHANGES IN THIS RELEASE
2.1 Introduction of TGN taxonomic division
2.2 Use of 'O' for pyrrolysine
2.3 Location descriptor "single base from a range" (n.m) is discontinued
2.4 Usage of qualifier /operon changed
2.5 New qualifier /mobile_element and dropping of two old qualifiers
3 FORTHCOMING CHANGES
3.1 New line type for project IDs
4 SEQUENCE SUBMISSION SYSTEMS
5 CITING THE EMBL NUCLEOTIDE SEQUENCE DATABASE
6 EBI NETWORK SERVICES
6.1 Electronic Mail Server
6.2 Anonymous FTP Server
6.3 World Wide Web (WWW) Server
6.4 Sequence Version Archive
6.5 Sequence Similarity Search Servers
7 RELEASE 89 FILES
APPENDIX A DATABASE GROWTH TABLE
1 RELEASE 89
The EMBL Nucleotide Sequence Database was frozen to make Release 89 on
30-NOV-2006. The release contains 83,666,567 sequence entries
comprising 150,163,403,742 nucleotides, of which 18,535,353 entries
(81,521,656,204 nucleotides) are WGS (whole genome shotgun) data.
The release 89 files total 69 GB compressed and 376 GB uncompressed.
A breakdown of Release 89 by dataclass and taxonomic division
is shown below:
Breakdown by dataclass
Class                                    entries     nucleotides
----------------------------------------------------------------
CON:Constructed                          849,126  62,900,561,886
EST:Expressed Sequence Tag            39,808,988  21,920,348,867
GSS:Genome Survey Sequence            16,073,640  10,211,729,434
HTC:High Throughput cDNA sequencing      450,564     553,627,818
HTG:High Throughput Genome sequencing     97,088  16,352,298,997
PAT:Patents                            3,556,157   2,137,088,250
STD:Standard                           3,399,493  16,630,347,569
STS:Sequence Tagged Site                 890,995     500,504,557
TPA:Third Party Annotation                 5,163     335,802,046
WGS:Whole Genome Shotgun              18,535,353  81,521,656,204
                                      ----------  --------------
Total                                 83,666,567 150,163,403,742
Breakdown by taxonomic division
Division                                 entries     nucleotides
----------------------------------------------------------------
ENV:Environmental Samples              2,162,420   1,545,072,677
FUN:Fungi                              1,488,539   2,230,258,753
HUM:Human                             11,475,120  21,333,688,751
INV:Invertebrates                      9,870,449  15,374,595,645
MAM:Other Mammals                     17,362,895  50,366,774,550
MUS:Mus musculus                       8,167,813  13,853,936,225
PHG:Bacteriophage                          4,081      22,102,277
PLN:Plants                            19,239,706  15,375,178,666
PRO:Prokaryotes                          516,770   3,219,947,843
ROD:Rodents                            3,615,200  15,116,847,721
SYN:Synthetic                            717,339     272,882,442
TGN:Transgenic                               789         458,737
UNC:Unclassified                       1,289,894     590,194,931
VRL:Viruses                              428,087     428,811,568
VRT:Other Vertebrates                  7,327,465  10,432,652,956
                                      ----------  --------------
Total                                 83,666,567 150,163,403,742
A breakdown by both taxonomic division and dataclass can be found in
divisions.ndx, distributed together with the release.
EMBL database statistics are available at
URL: http://www.ebi.ac.uk/embl/Services/DBStats/
Note: The nucleotide count for CON(structed) entries is included
in the tables, but not in the total, because it is already included
with the statistics for the segments of each constructed entry.
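The note above can be checked arithmetically against the dataclass table. The following is a minimal sketch, assuming the per-class figures are transcribed correctly from that table; the names DATACLASS_COUNTS and release_totals are illustrative only and are not part of any EBI tool. Entries are summed over every dataclass, while nucleotides are summed over every dataclass except CON.

```python
# Per-dataclass (entries, nucleotides), transcribed from the Release 89
# dataclass breakdown table. Names here are illustrative, not an EBI API.
DATACLASS_COUNTS = {
    "CON": (849_126, 62_900_561_886),
    "EST": (39_808_988, 21_920_348_867),
    "GSS": (16_073_640, 10_211_729_434),
    "HTC": (450_564, 553_627_818),
    "HTG": (97_088, 16_352_298_997),
    "PAT": (3_556_157, 2_137_088_250),
    "STD": (3_399_493, 16_630_347_569),
    "STS": (890_995, 500_504_557),
    "TPA": (5_163, 335_802_046),
    "WGS": (18_535_353, 81_521_656_204),
}

# Totals as printed in the release notes.
TOTAL_ENTRIES = 83_666_567
TOTAL_NUCLEOTIDES = 150_163_403_742


def release_totals(counts):
    """Sum entries over all classes and nucleotides over all classes but CON."""
    entries = sum(e for e, _ in counts.values())
    # CON nucleotides are already counted in the segments of each CON entry.
    nucleotides = sum(n for cls, (_, n) in counts.items() if cls != "CON")
    return entries, nucleotides


if __name__ == "__main__":
    print(release_totals(DATACLASS_COUNTS) == (TOTAL_ENTRIES, TOTAL_NUCLEOTIDES))
```

With the figures above, both sums agree with the printed totals.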
1.1 Feature Table Definition Document v6.6
The latest version of the Feature Table Definition Document (FTv6.6) was
implemented in October 2006. The document is available from the EBI
servers at:
http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
ftp://ftp.ebi.ac.uk/pub/databases/embl/doc/
The next edition of the Feature Table Definition Document will become
available in April 2007.
1.2 Database Files
For the full list of distribution files, see the table in section 7.
1.2.1 Naming Conventions
For all data apart from WGS, the data file names in the release
have the following form:
rel_dtc_tax_nn_rRN.dat
where
"dtc" is a three-letter lowercase abbreviation for the dataclass
"tax" is a three-letter lowercase abbreviation for the taxonomic division
"nn" is the number of the file within its series (starting from "01")
"RN" is the number of the release to which the file belongs
Examples:
rel_est_hum_01_r89.dat
rel_htg_mus_04_r89.dat
Dataclass list : EST, GSS, HTC, HTG, PAT, STS, STD, TPA, CON
Taxonomic division list : HUM, MUS, ROD, PRO, MAM, VRT, FUN, PLN, ENV,
INV, SYN, TGN, UNC, VRL, PHG
The STD dataclass abbreviation stands for "standard" entries.
File size is kept under 4 GB by regulating the number of entries in each file
(this does not apply to WGS files).
For WGS data - one data file is formed per WGS project, and filenames
incorporate the project prefix and the indication of the taxonomic
division of the entries, e.g. wgs_caae_vrt.dat.
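The two naming patterns above can be split into their components with a pair of regular expressions. This is a minimal sketch rather than an official parser: the names ReleaseFile and parse_release_filename are hypothetical, and the WGS project prefix is assumed to consist of lowercase letters only, as in the examples given in these notes.

```python
import re
from typing import NamedTuple, Optional

# rel_dtc_tax_nn_rRN.dat, as described in section 1.2.1
REL_PATTERN = re.compile(
    r"^rel_(?P<dtc>[a-z]{3})_(?P<tax>[a-z]{3})_(?P<nn>\d{2})_r(?P<rn>\d+)\.dat$"
)
# wgs_<project prefix>_<tax>.dat; the letters-only prefix is an assumption
WGS_PATTERN = re.compile(r"^wgs_(?P<prefix>[a-z]+)_(?P<tax>[a-z]{3})\.dat$")


class ReleaseFile(NamedTuple):
    dataclass: str                # e.g. "EST", or "WGS" for WGS project files
    division: str                 # e.g. "HUM"
    file_number: Optional[int]    # "nn"; None for WGS files
    release: Optional[int]        # "RN"; None for WGS files
    wgs_project: Optional[str]    # project prefix; None for non-WGS files


def parse_release_filename(name: str) -> Optional[ReleaseFile]:
    """Return the components of a release data file name, or None."""
    m = REL_PATTERN.match(name)
    if m:
        return ReleaseFile(m["dtc"].upper(), m["tax"].upper(),
                           int(m["nn"]), int(m["rn"]), None)
    m = WGS_PATTERN.match(name)
    if m:
        return ReleaseFile("WGS", m["tax"].upper(), None, None, m["prefix"])
    return None


if __name__ == "__main__":
    for example in ("rel_est_hum_01_r89.dat", "wgs_caae_vrt.dat", "crc.txt"):
        print(example, parse_release_filename(example))
```

Names that follow neither pattern, such as the documentation files, yield None.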
1.2.2 CRC Values for Distributed Files
To help users verify the integrity of release data files, we supply files
containing 32-bit checksum Cyclic Redundancy Check (CRC) values, plus byte
counts, for both compressed and uncompressed release files.
CRC values are calculated according to the POSIX standard, which is the
default behaviour of the 'cksum' command on most modern Unix and Linux
platforms. However, some implementations have been found to wrap the
file byte count to zero when the file size exceeds 4 Gbytes. We now use
'cksum' from the fileutils package for the calculations. If you are in
any doubt whether your 'cksum' is POSIX-compliant, or whether it has the
4 Gbyte limit (32-bit unsigned integer), you can download the utility from
http://www.gnu.org/software/fileutils/ and install it.
File: crc_gz.txt for compressed data files
File: crc.txt for uncompressed data files
Example from crc.txt
1985969636 415093160 rel_con_env_01_r89.dat
This output shows that the checksum of the file rel_con_env_01_r89.dat is
1985969636 and that the file contains 415093160 bytes.
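Where no trustworthy 'cksum' is at hand, the POSIX checksum can also be computed independently. The sketch below is a pure-Python rendering of the POSIX cksum algorithm (CRC-32 with polynomial 0x04C11DB7, processed most significant bit first, with the byte count appended and the result complemented). The function names are illustrative, and the three-column layout of crc.txt is assumed from the example line above. Python integers do not wrap at 4 Gbytes, but a pure-Python loop is slow on multi-gigabyte files, so this is intended for spot checks only.

```python
POLY = 0x04C11DB7  # CRC-32 generator polynomial used by POSIX cksum


def _build_table():
    """Precompute the 256-entry lookup table for the MSB-first CRC."""
    table = []
    for byte in range(256):
        crc = byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
        table.append(crc)
    return table


_TABLE = _build_table()


def posix_cksum(chunks):
    """Return (checksum, byte_count) for an iterable of bytes objects."""
    crc = 0
    length = 0
    for chunk in chunks:
        length += len(chunk)
        for byte in chunk:
            crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ byte]
    # POSIX feeds the byte count into the CRC, least significant octet first.
    remaining = length
    while remaining:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ (remaining & 0xFF)]
        remaining >>= 8
    return crc ^ 0xFFFFFFFF, length


def cksum_file(path, chunk_size=1 << 20):
    """Checksum a file on disk without loading it into memory at once."""
    with open(path, "rb") as handle:
        return posix_cksum(iter(lambda: handle.read(chunk_size), b""))


def parse_crc_line(line):
    """Split a crc.txt line into (checksum, byte_count, file_name)."""
    checksum, size, name = line.split(None, 2)
    return int(checksum), int(size), name.strip()


def file_matches(path, crc_line):
    """Compare a local file against its line from crc.txt or crc_gz.txt."""
    expected_crc, expected_size, _ = parse_crc_line(crc_line)
    return cksum_file(path) == (expected_crc, expected_size)
```

A typical use would be to read crc.txt line by line and call file_matches for each file that has been downloaded.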
1.3 Cross-Reference Information
Links to external databases allow integration with specialised data
collections, such as protein databases, species-specific databases,
taxonomy databases etc. The WWW-based sequence retrieval system SRS
enables users to easily navigate between cross-referenced database
entries.
EMBL Release 89 includes 64357622 cross-references to related
databases. 14235630 of these also refer to individual features e.g.
CDS (coding sequences) via the /db_xref feature qualifier in EMBL
entries.
EMBL cross-references to other databases:
DATABASES            Nr of Links
-------------------- -----------
UniProtKB/TrEMBL         3852587
SGD                        14423
PDB                       171502
CABRI                      67256
Flybase                   135745
GeneDB                      7171
GrainGenes               1080193
GOA                      2914298
HGNC                      145625
H-InvDB                   167992
HSSP                      724211
EPD                         7529
IMGT/HLA                    5864
Interpro                 5846489
IMGT/LIGM                 107738
TRANSFAC                    6620
RZPD                     8123298
UniProtKB/Swiss-Prot      450748
SubtiList                   4106
UNILIB                  38566191
Unite                         73
VectorBase                 30279
VBASE2                      2884
WormBase                   23732
GDB                      1663824
MGI                       220666
ZFIN                       16578
                     -----------
Total                   64357622
Cross-references in the feature table
DATABASES            Nr of Links
-------------------- -----------
UniProtKB/TrEMBL         3804320
SGD                        14422
PDB                       171502
Flybase                    63288
GeneDB                      7171
GOA                      2882024
HGNC                      145625
HSSP                      724211
Interpro                 5846489
UniProtKB/Swiss-Prot      421342
SubtiList                   4106
VectorBase                 29868
GDB                        48091
MGI                        56593
ZFIN                       16578
                     -----------
Total                   14235630
Apart from cross-references to the external resources listed above,
internal cross-references can be present in the header of EMBL entries.
Such "intradatabase" cross-references include EMBL-TPA, EMBL-ANN, EMBL-CON,
EMBL-ALIGN and EMBL-JOIN. Formats and explanation:
DR EMBL-TPA; acc#.
used in a standard entry that serves as a primary source for the TPA entry acc#
DR EMBL-ANN; acc#.
used in a standard entry that serves as a segment in the annotated CON entry acc#
DR EMBL-CON; acc#.
used in a standard entry that serves as a segment in the CON entry acc#
DR EMBL-ALIGN; acc#.
used in a standard entry that participates in the alignment entry acc# in the
EMBL Alignment database
DR EMBL-JOIN; acc#.
used in a standard entry of which any part of the sequence is used in a "join"
operator in a different entry acc#
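The intradatabase DR lines listed above can be picked out of an entry header with a short pattern match. This is a sketch under stated assumptions: the function name internal_xrefs is hypothetical, the accession numbers in the usage example are made-up placeholders, and the pattern tolerates any amount of whitespace after the "DR" line code.

```python
import re

# Matches e.g. "DR   EMBL-CON; <acc>." and captures the type and accession.
INTERNAL_XREF = re.compile(
    r"^DR\s+(EMBL-(?:TPA|ANN|CON|ALIGN|JOIN));\s*(\S+?)\.\s*$"
)


def internal_xrefs(lines):
    """Return (xref_type, accession) pairs for intradatabase DR lines."""
    found = []
    for line in lines:
        match = INTERNAL_XREF.match(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


if __name__ == "__main__":
    # Placeholder accessions, for illustration only.
    header = [
        "DR   EMBL-CON; XX000001.",
        "DR   UniProtKB/TrEMBL; Q00000.",
        "DR   EMBL-TPA; XX000002.",
    ]
    print(internal_xrefs(header))
```

DR lines that point to external databases are ignored by this pattern.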
1.4 Digital Object Identifiers (DOI) and PubMed references
Digital Object Identifiers (DOIs) provide unique references to the URLs
of full text versions of cited publications.
DOI identifiers are provided for 98897 citations.
The number of EMBL entries containing at least one citation with DOI is 18185398.
PubMed references are provided for 180222 citations.
The number of EMBL entries containing at least one citation with PubMed reference
is 19932232.
1.5 EMBL Database FAQ
The EMBL Database FAQ is available from the EBI at URL
http://www.ebi.ac.uk/embl/Documentation/FAQ/
1.6 Disclaimer
No guarantee is given and no legal liability or responsibility is assumed
for the completeness and accuracy of the database entries, in particular
the conformity of sequence data in the database with the journal
publication where the sequence is also disclosed.
1.7 Acknowledgements
The EMBL database is maintained by:
Ruth Akhtar, Philippe Aldebert, Nicola Althorpe, Alastair Baldwin,
Kirsty Bates, Sumit Bhattacharyya, Lawrence Bower, Paul Browne,
Matias Castro, Guy Cochrane, Nadeem Faruque, Gemma Hoad, Carola Kanz,
Tamara Kulikova, Rasko Leinonen, Quan Lin, Dariusz Lorenc, Rodrigo Lopez,
Hamish McWilliam, Gaurab Mukherjee, Francesco Nardone, Sheila Plaister,
Siamak Sobhany, Robert Vaughan, Dan Wu, Weimin Zhu, and Rolf Apweiler
2 CHANGES IN THIS RELEASE
2.1 Introduction of TGN taxonomic division
A new database taxonomic division, Transgenic (TGN), was created in
release 89. Entries representing transgenic organisms (indicated
by the inclusion of the /transgenic qualifier in one of the source features)
are now stored in the new TGN division.
2.2 Use of 'O' for pyrrolysine
The single-letter amino acid abbreviation "O" has been used to represent
pyrrolysine in CDS translations since October 2006.
2.3 Location descriptor "single base from a range" (n.m) is discontinued
Use of the location descriptor defined as "a single base chosen from a range
of bases", which was indicated by the first base number and the last
base number of the range separated by a single period (e.g., '12.21'), was
discontinued in October 2006.
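Parsers that still need to recognise the discontinued descriptor in older entries can look for two base numbers separated by exactly one period. The sketch below is illustrative only: the function name is hypothetical, and the pattern assumes that any digits preceded by a letter or digit belong to an accession number such as "J00194.1" and should not be reported.

```python
import re

# Two integers joined by a single period, not part of "n..m" and not part
# of an accession.version prefix (assumed to start with a letter).
LEGACY_RANGE_BASE = re.compile(r"(?<![\w.])(\d+)\.(\d+)(?![\d.])")


def find_single_base_from_range(location):
    """Return (first, last) pairs for each legacy 'n.m' descriptor found."""
    return [(int(a), int(b)) for a, b in LEGACY_RANGE_BASE.findall(location)]


if __name__ == "__main__":
    for loc in ("(12.21)..30", "12..21", "J00194.1:100..202"):
        print(loc, find_single_base_from_range(loc))
```

Ordinary ranges written with two periods are not matched.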
2.4 Usage of qualifier /operon changed
Qualifier /operon became valid on the rRNA feature in October 2006.
2.5 New qualifier /mobile_element and dropping of two old qualifiers
The new qualifier /mobile_element was introduced in December 2006 to hold the
type and the name or identifier of the mobile element described by the parent
feature. At the same time, two less generic qualifiers, /transposon and
/insertion_seq, were dropped and all existing instances of them retrofitted
to make use of the new qualifier. Please note that this change does not affect
the release itself, only update files produced after 8 December 2006.
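Software that reads both the release and later update files may wish to normalise the two dropped qualifiers to the new one. The following sketch shows one way to do so. It rests on assumptions not stated in these notes: that the new value has the form "type:name", that the type strings are "transposon" and "insertion sequence", and that the insertion sequence qualifier is spelled /insertion_seq in flat files. The function name retrofit_qualifier is hypothetical.

```python
# Assumed mapping from each dropped qualifier to the element type recorded
# in the /mobile_element value; not taken from these release notes.
LEGACY_TO_TYPE = {
    "transposon": "transposon",
    "insertion_seq": "insertion sequence",
}


def retrofit_qualifier(name, value):
    """Rewrite a legacy qualifier as /mobile_element; pass others through."""
    element_type = LEGACY_TO_TYPE.get(name)
    if element_type is None:
        return name, value
    # Assumed value layout: "<type>:<name>", or just "<type>" if unnamed.
    new_value = f"{element_type}:{value}" if value else element_type
    return "mobile_element", new_value


if __name__ == "__main__":
    print(retrofit_qualifier("transposon", "Tn5"))
    print(retrofit_qualifier("gene", "lacZ"))
```

Qualifiers other than the two legacy ones are returned unchanged.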
3 FORTHCOMING CHANGES
3.1 New line type for project IDs
A new line type, with the provisional two-character line type code PR, will be
introduced into EMBL flatfiles with the March release of the EMBL database.
The line will contain the INSDC-assigned ID for the sequencing project.
4 SEQUENCE SUBMISSION SYSTEMS
Information on submission of sequence data to the EMBL Nucleotide
Sequence Database is available at:
http://www.ebi.ac.uk/embl/Submission/
For further information on submission of sequence data to the
EMBL Nucleotide Sequence Database please contact database staff at:
EMBL Nucleotide Sequence Submissions
e-mail: datasubs@ebi.ac.uk
telephone: +44-1223-494499
telefax: +44-1223-494472
5 CITING THE EMBL NUCLEOTIDE SEQUENCE DATABASE
We encourage authors to include a reference to the EMBL Database in
publications related to their research.
When citing data in the EMBL Database, we suggest authors provide the
primary accession number and the publication in which the sequence first
appeared. For unpublished data, we suggest authors contact the original
submitters for recent publication information or revisions of the data.
We suggest authors also provide a reference to the EMBL Database itself.
Our recent publication describing the EMBL database should be cited:
Tamara Kulikova*, Ruth Akhtar, Philippe Aldebert, Nicola Althorpe,
Mikael Andersson, Alastair Baldwin, Kirsty Bates, Sumit Bhattacharyya,
Lawrence Bower, Paul Browne, Matias Castro, Guy Cochrane, Karyn Duggan,
Ruth Eberhardt, Nadeem Faruque, Gemma Hoad, Carola Kanz, Charles Lee,
Rasko Leinonen, Quan Lin, Vincent Lombard, Rodrigo Lopez, Dariusz Lorenc,
Hamish McWilliam, Gaurab Mukherjee, Francesco Nardone,
Maria Pilar Garcia Pastor, Sheila Plaister, Siamak Sobhany, Peter Stoehr,
Robert Vaughan, Dan Wu, Weimin Zhu and Rolf Apweiler
EMBL Nucleotide Sequence Database in 2006
Nucleic Acids Research, doi:10.1093/nar/gkl913
6 EBI NETWORK SERVICES
6.1 Electronic Mail Server
Copies of database entries and other information can be obtained by
sending commands via email to a server running at the EBI. New and updated
EMBL nucleotide sequence entries are made available on the server on a
daily basis.
Send file server commands to the address netserv@ebi.ac.uk. Each line
of the mail message should consist of a single file server request. The
first request, to get started, is:
HELP
When the file server receives this command, it will return a helpfile to
the sender, explaining in some detail how to use the facility.
6.2 Anonymous FTP Server
The file transfer protocol (ftp) can be used to access the EBI data archives.
Researchers with direct access to the Internet can use the FTP program on
their local machine to connect to the host ftp.ebi.ac.uk and enter the
username "anonymous" and their email address as password.
The directory pub/help contains detailed information about the data
available from the EBI anonymous FTP server which includes the complete
EMBL Nucleotide Sequence Database releases as well as daily and weekly
updates and a cumulative update file (gzip compressed format) in the
following directories:
EMBL quarterly release: /pub/databases/embl/release/
EMBL updates: /pub/databases/embl/new
Other EMBL database datasets are also available; please check
ftp://ftp.ebi.ac.uk/pub/databases/embl/README
for more detailed information.
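The anonymous FTP procedure described above can also be scripted. The sketch below uses Python's standard ftplib module with the host and directories given in this section. The helper names remote_path and fetch_file are illustrative, and the exact names of the files on the server (for example, whether compressed files carry a ".gz" suffix) are not stated in these notes, so the caller must supply the remote file name as it appears in the directory listing.

```python
import posixpath
from ftplib import FTP

FTP_HOST = "ftp.ebi.ac.uk"
RELEASE_DIR = "/pub/databases/embl/release/"  # quarterly release
UPDATES_DIR = "/pub/databases/embl/new"       # daily and weekly updates


def remote_path(directory, filename):
    """Join a server directory and a file name into one remote path."""
    return posixpath.join(directory, filename)


def fetch_file(directory, filename, destination, email):
    """Download one file by anonymous FTP, using the email as password."""
    with FTP(FTP_HOST) as ftp:
        ftp.login("anonymous", email)
        ftp.cwd(directory)
        with open(destination, "wb") as out:
            # Transfer in binary mode so compressed files arrive intact.
            ftp.retrbinary("RETR " + filename, out.write)


if __name__ == "__main__":
    print(remote_path(RELEASE_DIR, "relnotes.txt"))
```

After a download, the byte count and checksum can be compared with the values supplied in crc.txt or crc_gz.txt.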
6.3 World Wide Web (WWW) Server
EMBL database data directory at the EBI :
http://www.ebi.ac.uk/embl/
Data Retrieval:
Nucleotide sequences can be retrieved with a simple query by
accession number at
http://www.ebi.ac.uk/cgi-bin/emblfetch
Entries can be retrieved in flatfile format, fasta format and
in two XML formats - emblxml and insdxml.
For programmatic access to dbfetch, the new Web Services
version, implemented using SOAP and HTTP, is recommended:
http://www.ebi.ac.uk/Tools/webservices/
More complex queries can be constructed using the SRS databank browser at
http://srs.ebi.ac.uk
Data Submission:
Nucleotide sequences can be submitted to the
database using the interactive submission system Webin at
http://www.ebi.ac.uk/embl/Submission/webin.html
6.4 Sequence Version Archive
The EMBL Sequence Version Archive (SVA) is a publicly available
database containing all versions of any entry which has ever appeared
in the EMBL database.
The archive can be accessed programmatically via
dbfetch at http://www.ebi.ac.uk/cgi-bin/dbfetch or
interactively via a Web interface at http://www.ebi.ac.uk/embl/sva/
Batch retrieval from the SVA is available at
http://www.ebi.ac.uk/cgi-bin/sva/sva.pl?&do_batch=1
Note : Expanded versions of CON entries can be downloaded from the SVA
via the batch retrieval form.
6.5 Sequence Similarity Search Servers
The EBI offers two network servers for sequence similarity searches via
electronic mail or interactive WWW forms:
FASTA based on W. Pearson's FASTA algorithm. Allows local
similarity searches of protein and nucleotide sequence databases.
Send "help" to fasta@ebi.ac.uk or use URL http://www.ebi.ac.uk/fasta33/
Complete genomes and whole genome shotgun datasets can
be searched at the following URLs:
http://www.ebi.ac.uk/fasta33/genomes.html
http://www.ebi.ac.uk/fasta33/wgs.html
Alternatively, send "help" command to gpfasta@ebi.ac.uk
BLAST based on the NCBI and WU-BLAST software
Send "help" to blast@ebi.ac.uk or use URL http://www.ebi.ac.uk/blast2/
7 RELEASE 89 FILES
The release contains the files shown below.
File Number File Name Description
1 crc.txt Checksum CRC uncompressed files
2 crc_gz.txt Checksum CRC compressed files
3 deleteac.txt Deleted accession numbers
4 ftable.txt Feature Table Documentation
5 relnotes.txt Release Notes (this document)
6 subinfo.txt Data Submission Documentation
7 update.txt Data Update Form
8 usrman.txt User Manual
9 division.ndx Division Index
10 rel_est_env_01_r89.dat EST Sequences
11 rel_est_fun_01_r89.dat EST Sequences
12 rel_est_fun_02_r89.dat EST Sequences
13 rel_est_hum_01_r89.dat EST Sequences
14 rel_est_hum_02_r89.dat EST Sequences
15 rel_est_hum_03_r89.dat EST Sequences
16 rel_est_hum_04_r89.dat EST Sequences
17 rel_est_hum_05_r89.dat EST Sequences
18 rel_est_hum_06_r89.dat EST Sequences
19 rel_est_hum_07_r89.dat EST Sequences
20 rel_est_hum_08_r89.dat EST Sequences
21 rel_est_hum_09_r89.dat EST Sequences
22 rel_est_hum_10_r89.dat EST Sequences
23 rel_est_hum_11_r89.dat EST Sequences
24 rel_est_hum_12_r89.dat EST Sequences
25 rel_est_hum_13_r89.dat EST Sequences
26 rel_est_hum_14_r89.dat EST Sequences
27 rel_est_inv_01_r89.dat EST Sequences
28 rel_est_inv_02_r89.dat EST Sequences
29 rel_est_inv_03_r89.dat EST Sequences
30 rel_est_inv_04_r89.dat EST Sequences
31 rel_est_inv_05_r89.dat EST Sequences
32 rel_est_inv_06_r89.dat EST Sequences
33 rel_est_inv_07_r89.dat EST Sequences
34 rel_est_inv_08_r89.dat EST Sequences
35 rel_est_inv_09_r89.dat EST Sequences
36 rel_est_inv_10_r89.dat EST Sequences
37 rel_est_mam_01_r89.dat EST Sequences
38 rel_est_mam_02_r89.dat EST Sequences
39 rel_est_mam_03_r89.dat EST Sequences
40 rel_est_mam_04_r89.dat EST Sequences
41 rel_est_mam_05_r89.dat EST Sequences
42 rel_est_mus_01_r89.dat EST Sequences
43 rel_est_mus_02_r89.dat EST Sequences
44 rel_est_mus_03_r89.dat EST Sequences
45 rel_est_mus_04_r89.dat EST Sequences
46 rel_est_mus_05_r89.dat EST Sequences
47 rel_est_mus_06_r89.dat EST Sequences
48 rel_est_mus_07_r89.dat EST Sequences
49 rel_est_mus_08_r89.dat EST Sequences
50 rel_est_pln_01_r89.dat EST Sequences
51 rel_est_pln_02_r89.dat EST Sequences
52 rel_est_pln_03_r89.dat EST Sequences
53 rel_est_pln_04_r89.dat EST Sequences
54 rel_est_pln_05_r89.dat EST Sequences
55 rel_est_pln_06_r89.dat EST Sequences
56 rel_est_pln_07_r89.dat EST Sequences
57 rel_est_pln_08_r89.dat EST Sequences
58 rel_est_pln_09_r89.dat EST Sequences
59 rel_est_pln_10_r89.dat EST Sequences
60 rel_est_pln_11_r89.dat EST Sequences
61 rel_est_pln_12_r89.dat EST Sequences
62 rel_est_pln_13_r89.dat EST Sequences
63 rel_est_pln_14_r89.dat EST Sequences
64 rel_est_pln_15_r89.dat EST Sequences
65 rel_est_pln_16_r89.dat EST Sequences
66 rel_est_pln_17_r89.dat EST Sequences
67 rel_est_pln_18_r89.dat EST Sequences
68 rel_est_pln_19_r89.dat EST Sequences
69 rel_est_pro_01_r89.dat EST Sequences
70 rel_est_rod_01_r89.dat EST Sequences
71 rel_est_rod_02_r89.dat EST Sequences
72 rel_est_unc_01_r89.dat EST Sequences
73 rel_est_vrt_01_r89.dat EST Sequences
74 rel_est_vrt_02_r89.dat EST Sequences
75 rel_est_vrt_03_r89.dat EST Sequences
76 rel_est_vrt_04_r89.dat EST Sequences
77 rel_est_vrt_05_r89.dat EST Sequences
78 rel_est_vrt_06_r89.dat EST Sequences
79 rel_est_vrt_07_r89.dat EST Sequences
80 rel_est_vrt_08_r89.dat EST Sequences
81 rel_est_vrt_09_r89.dat EST Sequences
82 rel_est_vrt_10_r89.dat EST Sequences
83 rel_gss_env_01_r89.dat Genome Survey Sequences
84 rel_gss_fun_01_r89.dat Genome Survey Sequences
85 rel_gss_hum_01_r89.dat Genome Survey Sequences
86 rel_gss_hum_02_r89.dat Genome Survey Sequences
87 rel_gss_inv_01_r89.dat Genome Survey Sequences
88 rel_gss_inv_02_r89.dat Genome Survey Sequences
89 rel_gss_mam_01_r89.dat Genome Survey Sequences
90 rel_gss_mam_02_r89.dat Genome Survey Sequences
91 rel_gss_mam_03_r89.dat Genome Survey Sequences
92 rel_gss_mam_04_r89.dat Genome Survey Sequences
93 rel_gss_mus_01_r89.dat Genome Survey Sequences
94 rel_gss_mus_02_r89.dat Genome Survey Sequences
95 rel_gss_mus_03_r89.dat Genome Survey Sequences
96 rel_gss_phg_01_r89.dat Genome Survey Sequences
97 rel_gss_pln_01_r89.dat Genome Survey Sequences
98 rel_gss_pln_02_r89.dat Genome Survey Sequences
99 rel_gss_pln_03_r89.dat Genome Survey Sequences
100 rel_gss_pln_04_r89.dat Genome Survey Sequences
101 rel_gss_pln_05_r89.dat Genome Survey Sequences
102 rel_gss_pln_06_r89.dat Genome Survey Sequences
103 rel_gss_pln_07_r89.dat Genome Survey Sequences
104 rel_gss_pln_08_r89.dat Genome Survey Sequences
105 rel_gss_pro_01_r89.dat Genome Survey Sequences
106 rel_gss_rod_01_r89.dat Genome Survey Sequences
107 rel_gss_vrl_01_r89.dat Genome Survey Sequences
108 rel_gss_vrt_01_r89.dat Genome Survey Sequences
109 rel_htc_fun_01_r89.dat High throughput cDNAs
110 rel_htc_hum_01_r89.dat High throughput cDNAs
111 rel_htc_inv_01_r89.dat High throughput cDNAs
112 rel_htc_mam_01_r89.dat High throughput cDNAs
113 rel_htc_mus_01_r89.dat High throughput cDNAs
114 rel_htc_pln_01_r89.dat High throughput cDNAs
115 rel_htc_pro_01_r89.dat High throughput cDNAs
116 rel_htc_rod_01_r89.dat High throughput cDNAs
117 rel_htc_vrt_01_r89.dat High throughput cDNAs
118 rel_htg_env_01_r89.dat High Throughput Genome Sequences
119 rel_htg_fun_01_r89.dat High Throughput Genome Sequences
120 rel_htg_hum_01_r89.dat High Throughput Genome Sequences
121 rel_htg_hum_02_r89.dat High Throughput Genome Sequences
122 rel_htg_inv_01_r89.dat High Throughput Genome Sequences
123 rel_htg_inv_02_r89.dat High Throughput Genome Sequences
124 rel_htg_mam_01_r89.dat High Throughput Genome Sequences
125 rel_htg_mam_02_r89.dat High Throughput Genome Sequences
126 rel_htg_mam_03_r89.dat High Throughput Genome Sequences
127 rel_htg_mus_01_r89.dat High Throughput Genome Sequences
128 rel_htg_phg_01_r89.dat High Throughput Genome Sequences
129 rel_htg_pln_01_r89.dat High Throughput Genome Sequences
130 rel_htg_pro_01_r89.dat High Throughput Genome Sequences
131 rel_htg_rod_01_r89.dat High Throughput Genome Sequences
132 rel_htg_rod_02_r89.dat High Throughput Genome Sequences
133 rel_htg_rod_03_r89.dat High Throughput Genome Sequences
134 rel_htg_vrl_01_r89.dat High Throughput Genome Sequences
135 rel_htg_vrt_01_r89.dat High Throughput Genome Sequences
136 rel_pat_env_01_r89.dat Patent Sequences
137 rel_pat_fun_01_r89.dat Patent Sequences
138 rel_pat_hum_01_r89.dat Patent Sequences
139 rel_pat_hum_02_r89.dat Patent Sequences
140 rel_pat_inv_01_r89.dat Patent Sequences
141 rel_pat_mam_01_r89.dat Patent Sequences
142 rel_pat_mus_01_r89.dat Patent Sequences
143 rel_pat_phg_01_r89.dat Patent Sequences
144 rel_pat_pln_01_r89.dat Patent Sequences
145 rel_pat_pro_01_r89.dat Patent Sequences
146 rel_pat_rod_01_r89.dat Patent Sequences
147 rel_pat_syn_01_r89.dat Patent Sequences
148 rel_pat_unc_01_r89.dat Patent Sequences
149 rel_pat_unc_02_r89.dat Patent Sequences
150 rel_pat_vrl_01_r89.dat Patent Sequences
151 rel_pat_vrt_01_r89.dat Patent Sequences
152 rel_std_env_01_r89.dat Standard Sequences
153 rel_std_fun_01_r89.dat Standard Sequences
154 rel_std_hum_01_r89.dat Standard Sequences
155 rel_std_hum_02_r89.dat Standard Sequences
156 rel_std_hum_03_r89.dat Standard Sequences
157 rel_std_hum_04_r89.dat Standard Sequences
158 rel_std_hum_05_r89.dat Standard Sequences
159 rel_std_hum_06_r89.dat Standard Sequences
160 rel_std_hum_07_r89.dat Standard Sequences
161 rel_std_hum_08_r89.dat Standard Sequences
162 rel_std_hum_09_r89.dat Standard Sequences
163 rel_std_hum_10_r89.dat Standard Sequences
164 rel_std_hum_11_r89.dat Standard Sequences
165 rel_std_hum_12_r89.dat Standard Sequences
166 rel_std_hum_13_r89.dat Standard Sequences
167 rel_std_hum_14_r89.dat Standard Sequences
168 rel_std_hum_15_r89.dat Standard Sequences
169 rel_std_hum_16_r89.dat Standard Sequences
170 rel_std_hum_17_r89.dat Standard Sequences
171 rel_std_hum_18_r89.dat Standard Sequences
172 rel_std_hum_19_r89.dat Standard Sequences
173 rel_std_hum_20_r89.dat Standard Sequences
174 rel_std_hum_21_r89.dat Standard Sequences
175 rel_std_hum_22_r89.dat Standard Sequences
176 rel_std_hum_23_r89.dat Standard Sequences
177 rel_std_hum_24_r89.dat Standard Sequences
178 rel_std_hum_25_r89.dat Standard Sequences
179 rel_std_hum_26_r89.dat Standard Sequences
180 rel_std_hum_27_r89.dat Standard Sequences
181 rel_std_inv_01_r89.dat Standard Sequences
182 rel_std_inv_02_r89.dat Standard Sequences
183 rel_std_mam_01_r89.dat Standard Sequences
184 rel_std_mus_01_r89.dat Standard Sequences
185 rel_std_mus_02_r89.dat Standard Sequences
186 rel_std_mus_03_r89.dat Standard Sequences
187 rel_std_mus_04_r89.dat Standard Sequences
188 rel_std_phg_01_r89.dat Standard Sequences
189 rel_std_pln_01_r89.dat Standard Sequences
190 rel_std_pln_02_r89.dat Standard Sequences
191 rel_std_pln_03_r89.dat Standard Sequences
192 rel_std_pro_01_r89.dat Standard Sequences
193 rel_std_pro_02_r89.dat Standard Sequences
194 rel_std_rod_01_r89.dat Standard Sequences
195 rel_std_syn_01_r89.dat Standard Sequences
196 rel_std_tgn_01_r89.dat Standard Sequences
197 rel_std_unc_01_r89.dat Standard Sequences
198 rel_std_vrl_01_r89.dat Standard Sequences
199 rel_std_vrl_02_r89.dat Standard Sequences
200 rel_std_vrt_01_r89.dat Standard Sequences
201 rel_sts_fun_01_r89.dat STS Sequences
202 rel_sts_hum_01_r89.dat STS Sequences
203 rel_sts_inv_01_r89.dat STS Sequences
204 rel_sts_mam_01_r89.dat STS Sequences
205 rel_sts_mus_01_r89.dat STS Sequences
206 rel_sts_pln_01_r89.dat STS Sequences
207 rel_sts_pro_01_r89.dat STS Sequences
208 rel_sts_rod_01_r89.dat STS Sequences
209 rel_sts_vrt_01_r89.dat STS Sequences
210 rel_tpa_fun_01_r89.dat Third Party Annotation
211 rel_tpa_hum_01_r89.dat Third Party Annotation
212 rel_tpa_inv_01_r89.dat Third Party Annotation
213 rel_tpa_mam_01_r89.dat Third Party Annotation
214 rel_tpa_mus_01_r89.dat Third Party Annotation
215 rel_tpa_phg_01_r89.dat Third Party Annotation
216 rel_tpa_pln_01_r89.dat Third Party Annotation
217 rel_tpa_pro_01_r89.dat Third Party Annotation
218 rel_tpa_rod_01_r89.dat Third Party Annotation
219 rel_tpa_syn_01_r89.dat Third Party Annotation
220 rel_tpa_vrl_01_r89.dat Third Party Annotation
221 rel_tpa_vrt_01_r89.dat Third Party Annotation
222 rel_con_env_01_r89.dat Constructed Sequences
223 rel_con_fun_01_r89.dat Constructed Sequences
224 rel_con_hum_01_r89.dat Constructed Sequences
225 rel_con_inv_01_r89.dat Constructed Sequences
226 rel_con_mam_01_r89.dat Constructed Sequences
227 rel_con_mus_01_r89.dat Constructed Sequences
228 rel_con_pln_01_r89.dat Constructed Sequences
229 rel_con_pro_01_r89.dat Constructed Sequences
230 rel_con_rod_01_r89.dat Constructed Sequences
231 rel_con_vrt_01_r89.dat Constructed Sequences
232 wgs_aaaa_pln.dat WGS - Oryza sativa (indica cultivar-group)
233 wgs_aaab_inv.dat WGS - Anopheles gambiae strain PEST
234 wgs_aaac_pro.dat WGS - Bacillus anthracis A2012
235 wgs_aaah_pro.dat WGS - Chloroflexus aurantiacus
236 wgs_aaak_pro.dat WGS - Enterococcus faecium
237 wgs_aaal_pro.dat WGS - Xylella fastidiosa Dixon
238 wgs_aaam_pro.dat WGS - Xylella fastidiosa Ann-1
239 wgs_aaap_pro.dat WGS - Magnetospirillum magnetotacticum
240 wgs_aaau_pro.dat WGS - Azotobacter vinelandii
241 wgs_aaaw_pro.dat WGS - Desulfitobacterium hafniense
242 wgs_aaay_pro.dat WGS - Nostoc punctiforme
243 wgs_aabc_pro.dat WGS - Ferroplasma acidarmanus
244 wgs_aabf_pro.dat WGS - Fusobacterium nucleatum vincentii ATCC49256
245 wgs_aabg_pro.dat WGS - Clostridium thermocellum ATCC 27405
246 wgs_aabl_inv.dat WGS - Plasmodium yoelii yoelii
247 wgs_aabm_pro.dat WGS - Bifidobacterium longum DJO10A
248 wgs_aabr_rod.dat WGS - Rattus norvegicus
249 wgs_aabs_inv.dat WGS - Ciona intestinalis
250 wgs_aabt_fun.dat WGS - Aspergillus terreus
251 wgs_aabu_inv.dat WGS - Drosophila melanogaster strain y
252 wgs_aabw_pro.dat WGS - Rickettsia sibirica
253 wgs_aabx_fun.dat WGS - Neurospora crassa strain OR74A
254 wgs_aaby_fun.dat WGS - Saccharomyces paradoxus
255 wgs_aabz_fun.dat WGS - Saccharomyces mikatae
256 wgs_aaca_fun.dat WGS - Saccharomyces bayanus
257 wgs_aacb_inv.dat WGS - Giardia lamblia ATCC 50803
258 wgs_aacc_hum.dat WGS - Homo sapiens chromosome 7
259 wgs_aacd_fun.dat WGS - Aspergillus nidulans FGSC A4
260 wgs_aace_fun.dat WGS - Saccharomyces kluyveri NRRL Y-12651
261 wgs_aacf_fun.dat WGS - Saccharomyces castellii NRRL Y-12630
262 wgs_aacg_fun.dat WGS - Saccharomyces bayanus 623-6C
263 wgs_aach_fun.dat WGS - Saccharomyces mikatae IFO 1815
264 wgs_aaci_fun.dat WGS - Saccharomyces kudriavzevii IFO 1802
265 wgs_aacj_pro.dat WGS - Haemophilus somnus 2336
266 wgs_aack_pro.dat WGS - Actinobacillus pleuropneumoniae
267 wgs_aacm_fun.dat WGS - Gibberella zeae PH-1
268 wgs_aacn_mam.dat WGS - Canis familiaris
269 wgs_aaco_fun.dat WGS - Cryptococcus neoformans var. grubii H99
270 wgs_aacp_fun.dat WGS - Ustilago maydis 521
271 wgs_aacq_fun.dat WGS - Candida albicans SC5314
272 wgs_aacr_pro.dat WGS - Pasteuria nishizawae
273 wgs_aacs_fun.dat WGS - Coprinopsis cinerea okayama7#130
274 wgs_aact_inv.dat WGS - Ciona savignyi
275 wgs_aacu_fun.dat WGS - Magnaporthe grisea
276 wgs_aacv_pln.dat WGS - Oryza sativa (japonica cultivar-group)
277 wgs_aacw_fun.dat WGS - Rhizopus oryzae RA 99-880
278 wgs_aacy_env.dat WGS - multiple-organism environmental
279 wgs_aacz_mam.dat WGS - Pan troglodytes WU
280 wgs_aada_mam.dat WGS - Pan troglodytes
281 wgs_aadb_hum.dat WGS - Homo sapiens Celera WGA
282 wgs_aadc_hum.dat WGS - Homo sapiens Celera CSA
283 wgs_aadd_hum.dat WGS - Homo sapiens Celera WGSA
284 wgs_aade_inv.dat WGS - Drosophila pseudoobscura
285 wgs_aadg_inv.dat WGS - Apis mellifera
286 wgs_aadj_pro.dat WGS - Rickettsia rickettsii
287 wgs_aadk_inv.dat WGS - Bombyx mori
288 wgs_aadl_env.dat WGS - environmental-sampling
289 wgs_aadm_fun.dat WGS - Kluyveromyces waltii NCYC 2644
290 wgs_aadn_vrt.dat WGS - Gallus gallus
291 wgs_aado_pro.dat WGS - Haemophilus influenzae R2846
292 wgs_aadp_pro.dat WGS - Haemophilus influenzae R2866
293 wgs_aadq_pro.dat WGS - Listeria monocytogenes str. 1/2a
294 wgs_aadr_pro.dat WGS - Listeria monocytogenes str. 4b
295 wgs_aads_fun.dat WGS - Phanerochaete chrysosporium RP-78
296 wgs_aadv_pro.dat WGS - Crocosphaera watsonii WH 8501
297 wgs_aadw_pro.dat WGS - Exiguobacterium sp. 255-15
298 wgs_aaec_fun.dat WGS - Coccidioides immitis RS
299 wgs_aaee_inv.dat WGS - Cryptosporidium parvum chromosome 7
300 wgs_aaef_pro.dat WGS - Rubrobacter xylanophilus DSM 9941
301 wgs_aaeg_fun.dat WGS - Saccharomyces cerevisiae RM11-1a
302 wgs_aaeh_pro.dat WGS - Burkholderia cepacia R1808
303 wgs_aaek_pro.dat WGS - Bacillus cereus G9241
304 wgs_aael_inv.dat WGS - Cryptosporidium hominis strain TU502
305 wgs_aaem_pro.dat WGS - Rubrivivax gelatinosus PM1
306 wgs_aaen_pro.dat WGS - Bacillus anthracis str. CNEVA-9066
307 wgs_aaeo_pro.dat WGS - Bacillus anthracis str. A1055
308 wgs_aaep_pro.dat WGS - Bacillus anthracis str. Vollum
309 wgs_aaeq_pro.dat WGS - Bacillus anthracis str. Kruger B
310 wgs_aaer_pro.dat WGS - Bacillus anthracis str. USA6153
311 wgs_aaes_pro.dat WGS - Bacillus anthracis str. Australia 94
312 wgs_aaeu_inv.dat WGS - Drosophila yakuba
313 wgs_aaew_pro.dat WGS - Desulfuromonas acetoxidans DSM 684
314 wgs_aaex_mam.dat WGS - Canis familiaris
315 wgs_aaey_fun.dat WGS - Cryptococcus neoformans var. neoformans
316 wgs_aafa_pro.dat WGS - Streptococcus suis 89/1591
317 wgs_aafb_inv.dat WGS - Entamoeba histolytica HM-1:IMSS
318 wgs_aafc_mam.dat WGS - Bos taurus
319 wgs_aafd_pln.dat WGS - Thalassiosira pseudonana CCMP1335
320 wgs_aafe_pro.dat WGS - Rickettsia akari str. Hartford
321 wgs_aaff_pro.dat WGS - Rickettsia canadensis str. McKiel
322 wgs_aafi_inv.dat WGS - Dictyostelium discoideum
323 wgs_aafj_pro.dat WGS - Campylobacter upsaliensis RM3195
324 wgs_aafk_pro.dat WGS - Campylobacter lari RM2100
325 wgs_aafl_pro.dat WGS - Campylobacter coli RM2228
326 wgs_aafm_fun.dat WGS - Pichia guilliermondii ATCC 6260
327 wgs_aafn_fun.dat WGS - Candida tropicalis T1
328 wgs_aafo_fun.dat WGS - Candida albicans WO-1
329 wgs_aafp_fun.dat WGS - Cryptococcus neoformans R265
330 wgs_aafr_mam.dat WGS - Monodelphis domestica
331 wgs_aafs_inv.dat WGS - Drosophila pseudoobscura
332 wgs_aaft_fun.dat WGS - Clavispora lusitaniae ATCC 42720
333 wgs_aafu_fun.dat WGS - Chaetomium globosum CBS 148.51
334 wgs_aafv_pro.dat WGS - Streptococcus pyogenes M49 591
335 wgs_aafw_fun.dat WGS - Saccharomyces cerevisiae YJM789
336 wgs_aafx_env.dat WGS - environmental sequence
337 wgs_aafy_env.dat WGS - environmental sequence
338 wgs_aafz_env.dat WGS - environmental sequence
339 wgs_aaga_env.dat WGS - environmental sequence
340 wgs_aagb_pro.dat WGS - Wolbachia(Drosophila ananassae)
341 wgs_aagc_pro.dat WGS - Wolbachia(Drosophila simulans)
342 wgs_aagd_inv.dat WGS - Caenorhabditis remanei
343 wgs_aage_inv.dat WGS - Aedes aegypti
344 wgs_aagf_inv.dat WGS - Tetrahymena thermophila SB210
345 wgs_aagh_inv.dat WGS - Drosophila simulans
346 wgs_aagi_fun.dat WGS - Phaeosphaeria nodorum SN15
347 wgs_aagj_inv.dat WGS - Strongylocentrotus purpuratus
348 wgs_aagk_inv.dat WGS - Theileria parva
349 wgs_aagl_mam.dat WGS - Gorilla gorilla
350 wgs_aagm_mam.dat WGS - Pongo pygmaeus
351 wgs_aagn_mam.dat WGS - Macaca mulatta
352 wgs_aagp_pro.dat WGS - Brevibacterium linens BL2
353 wgs_aagt_fun.dat WGS - Sclerotinia sclerotiorum 1980
354 wgs_aagu_mam.dat WGS - Loxodonta africana
355 wgs_aagv_mam.dat WGS - Dasypus novemcinctus
356 wgs_aagw_mam.dat WGS - Oryctolagus cuniculus
357 wgs_aagx_pro.dat WGS - Mycoplasma genitalium G-37
358 wgs_aagy_pro.dat WGS - Streptococcus pneumoniae TIGR4
359 wgs_aagz_inv.dat WGS - Trypanosoma brucei
360 wgs_aaha_inv.dat WGS - Trypanosoma brucei
361 wgs_aahb_inv.dat WGS - Trypanosoma brucei
362 wgs_aahc_inv.dat WGS - Trichomonas vaginalis
363 wgs_aahf_fun.dat WGS - Aspergillus fumigatus Af293
364 wgs_aahj_pro.dat WGS - Chlorobium limicola DSM 245
365 wgs_aahk_inv.dat WGS - Trypanosoma cruzi
366 wgs_aahm_pro.dat WGS - Burkholderia mallei 10229
367 wgs_aahn_pro.dat WGS - Burkholderia mallei 10399
368 wgs_aaho_pro.dat WGS - Burkholderia mallei GB8 horse 4
369 wgs_aahp_pro.dat WGS - Burkholderia mallei NCTC 10247
370 wgs_aahq_pro.dat WGS - Burkholderia mallei SAVP1
371 wgs_aahr_pro.dat WGS - Burkholderia pseudomallei 1655
372 wgs_aahs_pro.dat WGS - Burkholderia pseudomallei 1710a
373 wgs_aahu_pro.dat WGS - Burkholderia pseudomallei 668
374 wgs_aahv_pro.dat WGS - Burkholderia pseudomallei Pasteur
375 wgs_aahw_pro.dat WGS - Burkholderia pseudomallei S13
376 wgs_aahx_rod.dat WGS - Rattus norvegicus
377 wgs_aahy_mus.dat WGS - Mus musculus
378 wgs_aaib_pro.dat WGS - Chlorobium phaeobacteroides DSM 266
379 wgs_aaic_pro.dat WGS - Chlorobium phaeobacteroides BS1
380 wgs_aaid_fun.dat WGS - Botryotinia fuckeliana B05.10
381 wgs_aaif_pro.dat WGS - Ehrlichia chaffeensis str. Sapulpa
382 wgs_aaih_fun.dat WGS - Aspergillus flavus NRRL3357
383 wgs_aaii_pro.dat WGS - Frankia sp. EAN1pec
384 wgs_aaij_pro.dat WGS - Prosthecochloris aestuarii DSM 271
385 wgs_aaik_pro.dat WGS - Pelodictyon phaeoclathratiforme BU-1
386 wgs_aail_fun.dat WGS - Trichoderma reesei QM9414
387 wgs_aaim_fun.dat WGS - Gibberella moniliformis 7600
388 wgs_aain_pro.dat WGS - Shewanella amazonensis SB2B
389 wgs_aaio_pro.dat WGS - Shewanella baltica OS155
390 wgs_aaiq_pro.dat WGS - Burkholderia mallei FMH
391 wgs_aair_pro.dat WGS - Burkholderia mallei JHU
392 wgs_aait_pro.dat WGS - Paracoccus denitrificans PD1222
393 wgs_aaiw_fun.dat WGS - Uncinocarpus reesii 1704
394 wgs_aaix_pro.dat WGS - Mycobacterium tuberculosis F11
395 wgs_aaiy_mam.dat WGS - Echinops telfairi
396 wgs_aaiz_inv.dat WGS - Drosophila persimilis
397 wgs_aajb_pro.dat WGS - Nocardioides sp. JS614
398 wgs_aajd_pro.dat WGS - Prosthecochloris vibrioformis DSM 265
399 wgs_aajh_pro.dat WGS - Pelobacter propionicus DSM 2379
400 wgs_aaji_fun.dat WGS - Ajellomyces capsulatus NAm1
401 wgs_aajj_inv.dat WGS - Tribolium castaneum
402 wgs_aajm_pro.dat WGS - Bacillus thuringiensis ATCC 35646
403 wgs_aajn_fun.dat WGS - Aspergillus terreus NIH2624
404 wgs_aajo_pro.dat WGS - Streptococcus agalactiae 18RS21
405 wgs_aajp_pro.dat WGS - Streptococcus agalactiae 515
406 wgs_aajq_pro.dat WGS - Streptococcus agalactiae CJB111
407 wgs_aajr_pro.dat WGS - Streptococcus agalactiae COH1
408 wgs_aajs_pro.dat WGS - Streptococcus agalactiae H36B
409 wgs_aajt_pro.dat WGS - Escherichia coli B7A
410 wgs_aaju_pro.dat WGS - Escherichia coli F11
411 wgs_aajv_pro.dat WGS - Escherichia coli E22
412 wgs_aajw_pro.dat WGS - Escherichia coli E110019
413 wgs_aajx_pro.dat WGS - Escherichia coli B171
414 wgs_aajy_pro.dat WGS - Escherichia coli HS
415 wgs_aajz_pro.dat WGS - Escherichia coli E24377A
416 wgs_aaka_pro.dat WGS - Shigella boydii BS512
417 wgs_aakb_pro.dat WGS - Escherichia coli 53638
418 wgs_aakc_pro.dat WGS - Actinobacillus succinogenes 130Z
419 wgs_aakd_fun.dat WGS - Aspergillus clavatus NRRL 1
420 wgs_aake_fun.dat WGS - Neosartorya fischeri NRRL 181
421 wgs_aakf_pro.dat WGS - Vibrio cholerae MO10
422 wgs_aakg_pro.dat WGS - Vibrio cholerae O395
423 wgs_aakh_pro.dat WGS - Vibrio cholerae RC385
424 wgs_aaki_pro.dat WGS - Vibrio cholerae V51
425 wgs_aakj_pro.dat WGS - Vibrio cholerae V52
426 wgs_aakk_pro.dat WGS - Vibrio sp. Ex25
427 wgs_aakl_pro.dat WGS - Ralstonia solanacearum UW551
428 wgs_aakm_inv.dat WGS - Plasmodium vivax
429 wgs_aakn_rod.dat WGS - Cavia porcellus
430 wgs_aako_inv.dat WGS - Drosophila sechellia
431 wgs_aakq_pro.dat WGS - Thermoanaerobacter ethanolicus ATCC 33223
432 wgs_aakr_pro.dat WGS - Mycobacterium tuberculosis C
433 wgs_aaks_pro.dat WGS - Yersinia pestis Angola
434 wgs_aakt_pro.dat WGS - Yersinia pseudotuberculosis IP 31758
435 wgs_aaku_pro.dat WGS - Alkaliphilus metalliredigenes QYMF
436 wgs_aakv_pro.dat WGS - Pseudomonas aeruginosa C3719
437 wgs_aakw_pro.dat WGS - Pseudomonas aeruginosa 2192
438 wgs_aakx_pro.dat WGS - Burkholderia cenocepacia PC184
439 wgs_aaky_pro.dat WGS - Burkholderia dolosa AUO158
440 wgs_aalb_pro.dat WGS - Shewanella putrefaciens CN-32
441 wgs_aalc_pro.dat WGS - Yersinia bercovieri ATCC 43970
442 wgs_aald_pro.dat WGS - Yersinia mollaretii ATCC 43969
443 wgs_aale_pro.dat WGS - Yersinia frederiksenii ATCC 33641
444 wgs_aalf_pro.dat WGS - Yersinia intermedia ATCC 29909
445 wgs_aalg_pro.dat WGS - Marinobacter aquaeolei VT8
446 wgs_aalj_pro.dat WGS - Bradyrhizobium sp. BTAi1
447 wgs_aall_pro.dat WGS - Bacillus cereus subsp. cytotoxis NVH 391-98
448 wgs_aalm_pro.dat WGS - Pseudomonas putida F1
449 wgs_aaln_pro.dat WGS - Shewanella sp. W3-18-1
450 wgs_aalo_pro.dat WGS - Clostridium beijerinckii NCIMB 8052
451 wgs_aalp_pro.dat WGS - Prochlorococcus marinus str. MIT 9211
452 wgs_aals_pro.dat WGS - Shewanella sp. PV-4
453 wgs_aalt_mam.dat WGS - Sorex araneus
454 wgs_aalv_pro.dat WGS - Sulfitobacter sp. EE-36
455 wgs_aalw_pro.dat WGS - Caldicellulosiruptor saccharolyticus
456 wgs_aaly_pro.dat WGS - Roseovarius nubinhibens ISM
457 wgs_aalz_pro.dat WGS - Sulfitobacter sp. NAS-14.1
458 wgs_aama_pro.dat WGS - Burkholderia pseudomallei 1106a
459 wgs_aamb_pro.dat WGS - Burkholderia pseudomallei 1106b
460 wgs_aamd_pro.dat WGS - Stigmatella aurantiaca DW4/3-1
461 wgs_aame_pro.dat WGS - Rhodobacter sphaeroides ATCC 17025
462 wgs_aamf_pro.dat WGS - Rhodobacter sphaeroides ATCC 17029
463 wgs_aamg_env.dat WGS - uncultured human fecal virus
464 wgs_aamh_env.dat WGS - uncultured human fecal virus
465 wgs_aami_env.dat WGS - uncultured human fecal virus
466 wgs_aamj_pro.dat WGS - Shigella dysenteriae 1012
467 wgs_aamk_pro.dat WGS - Escherichia coli 101-1
468 wgs_aaml_pro.dat WGS - Clostridium difficile QCD-32g58
469 wgs_aamm_pro.dat WGS - Burkholderia pseudomallei 406e
470 wgs_aamn_pro.dat WGS - Janibacter sp. HTCC2649
471 wgs_aamo_pro.dat WGS - Oceanicola batsensis HTCC2597
472 wgs_aamp_pro.dat WGS - Croceibacter atlanticus HTCC2559
473 wgs_aamq_pro.dat WGS - Oceanicaulis alexandrii HTCC2633
474 wgs_aamr_pro.dat WGS - Vibrio splendidus 12B01
475 wgs_aams_pro.dat WGS - Loktanella vestfoldensis SKA53
476 wgs_aamt_pro.dat WGS - Rhodobacterales bacterium HTCC2654
477 wgs_aamu_pro.dat WGS - Parvularcula bermudensis HTCC2503
478 wgs_aamv_pro.dat WGS - Roseovarius sp. 217
479 wgs_aamw_pro.dat WGS - Erythrobacter sp. NAP1
480 wgs_aamx_pro.dat WGS - Idiomarina baltica OS145
481 wgs_aamy_pro.dat WGS - Nitrobacter sp. Nb-311A
482 wgs_aamz_pro.dat WGS - Cellulophaga sp. MED134
483 wgs_aana_pro.dat WGS - Tenacibaculum sp. MED152
484 wgs_aanb_pro.dat WGS - Roseobacter sp. MED193
485 wgs_aanc_pro.dat WGS - Flavobacterium sp. MED217
486 wgs_aand_pro.dat WGS - Vibrio sp. MED222
487 wgs_aane_pro.dat WGS - Marinomonas sp. MED121
488 wgs_aanf_pro.dat WGS - Bartonella bacilliformis KC583
489 wgs_aang_mam.dat WGS - Felis catus
490 wgs_aanh_vrt.dat WGS - Gasterosteus aculeatus
491 wgs_aani_inv.dat WGS - Drosophila virilis
492 wgs_aanj_pro.dat WGS - Campylobacter jejuni subsp. jejuni CF93-6
493 wgs_aank_pro.dat WGS - Campylobacter jejuni subsp. jejuni 260.94
494 wgs_aanl_pro.dat WGS - Candidatus Sulcia muelleri str. Hc
495 wgs_aanm_pro.dat WGS - Polaromonas naphthalenivorans CJ2
496 wgs_aann_mam.dat WGS - Erinaceus europaeus
497 wgs_aano_pro.dat WGS - Synechococcus sp. WH 5701
498 wgs_aanp_pro.dat WGS - Synechococcus sp. RS9917
499 wgs_aanq_pro.dat WGS - Campylobacter jejuni subsp. jejuni HB93-13
500 wgs_aans_inv.dat WGS - Plasmodium falciparum HB3
501 wgs_aant_pro.dat WGS - Campylobacter jejuni subsp. jejuni 84-25
502 wgs_aanu_mam.dat WGS - Macaca mulatta
503 wgs_aanv_inv.dat WGS - Entamoeba dispar SAW760
504 wgs_aanw_inv.dat WGS - Entamoeba invadens IP1
505 wgs_aanx_pro.dat WGS - Burkholderia mallei 2002721280
506 wgs_aany_pro.dat WGS - Campylobacter jejuni subsp. jejuni 81-176
507 wgs_aanz_pro.dat WGS - Blastopirellula marina DSM 3645
508 wgs_aaoa_pro.dat WGS - gamma proteobacterium KT 71
509 wgs_aaob_pro.dat WGS - marine actinobacterium PHSC20C1
510 wgs_aaoc_pro.dat WGS - Flavobacteriales bacterium HTCC2170
511 wgs_aaod_pro.dat WGS - Alteromonas macleodii 'Deep ecotype'
512 wgs_aaoe_pro.dat WGS - Reinekea sp. MED297
513 wgs_aaof_pro.dat WGS - Nitrococcus mobilis Nb-231
514 wgs_aaog_pro.dat WGS - Polaribacter irgensii 23-P
515 wgs_aaoh_pro.dat WGS - Pseudoalteromonas tunicata D2
516 wgs_aaoi_pro.dat WGS - Robiginitalea biformata HTCC2501
517 wgs_aaoj_pro.dat WGS - Vibrio angustum S14
518 wgs_aaok_pro.dat WGS - Synechococcus sp. WH 7805
519 wgs_aaom_pro.dat WGS - Dehalococcoides sp. BAV1
520 wgs_aaon_pro.dat WGS - Geobacter uraniumreducens Rf4
521 wgs_aaoo_pro.dat WGS - Acidiphilium cryptum JF-5
522 wgs_aaop_pro.dat WGS - Desulfotomaculum reducens MI-1
523 wgs_aaoq_pro.dat WGS - Halorhodospira halophila SL1
524 wgs_aaos_pro.dat WGS - Yersinia pestis biovar Orientalis str. IP275
525 wgs_aaot_pro.dat WGS - Oceanicola granulosus HTCC2516
526 wgs_aaou_pro.dat WGS - Photobacterium sp. SKA34
527 wgs_aaov_pro.dat WGS - Lactobacillus reuteri JCM 1112
528 wgs_aaow_pro.dat WGS - Oceanospirillum sp. MED92
529 wgs_aaox_pro.dat WGS - Bacillus sp. NRRL B-14911
530 wgs_aaoy_pro.dat WGS - Bacillus weihenstephanensis KBAB4
531 wgs_aaoz_pro.dat WGS - Halothermothrix orenii H 168
532 wgs_aapa_pro.dat WGS - Mycobacterium flavescens PYR-GCK
533 wgs_aapc_pro.dat WGS - Xanthobacter autotrophicus Py2
534 wgs_aapd_pro.dat WGS - Flavobacteria bacterium BBFL7
535 wgs_aape_mam.dat WGS - Myotis lucifugus
536 wgs_aapf_pro.dat WGS - Mycobacterium vanbaalenii PYR-1
537 wgs_aapg_pro.dat WGS - Psychromonas sp. CNPT3
538 wgs_aaph_pro.dat WGS - Photobacterium profundum 3TCK
539 wgs_aapi_pro.dat WGS - marine gamma proteobacterium HTCC2207
540 wgs_aapj_pro.dat WGS - Aurantimonas sp. SI85-9A1
541 wgs_aapk_pro.dat WGS - Staphylococcus aureus subsp. aureus JH1
542 wgs_aapl_pro.dat WGS - Staphylococcus aureus subsp. aureus JH9
543 wgs_aapm_pro.dat WGS - Flavobacterium johnsoniae UW101
544 wgs_aapn_mam.dat WGS - Ornithorhynchus anatinus
545 wgs_aapo_fun.dat WGS - Lodderomyces elongisporus NRRL YB-4239
546 wgs_aapp_inv.dat WGS - Drosophila ananassae
547 wgs_aapq_inv.dat WGS - Drosophila erecta
548 wgs_aapr_pro.dat WGS - Psychroflexus torquis ATCC 700755
549 wgs_aaps_pro.dat WGS - Vibrio alginolyticus 12G01
550 wgs_aapt_inv.dat WGS - Drosophila grimshawi
551 wgs_aapu_inv.dat WGS - Drosophila mojavensis
552 wgs_aapv_pro.dat WGS - Candidatus Pelagibacter ubique HTCC1002
553 wgs_aapx_pro.dat WGS - Psychrobacter sp. PRwf-1
554 wgs_aapy_mam.dat WGS - Tupaia belangeri
555 wgs_aapz_pro.dat WGS - Lactobacillus reuteri 100-23
556 wgs_aaqb_inv.dat WGS - Drosophila willistoni
557 wgs_aaqc_pro.dat WGS - Mycobacterium sp. JLS
558 wgs_aaqd_pro.dat WGS - Mycobacterium sp. KMS
559 wgs_aaqe_pro.dat WGS - Pseudomonas aeruginosa PA7
560 wgs_aaqf_pro.dat WGS - delta proteobacterium MLMS-1
561 wgs_aaqg_pro.dat WGS - Sphingomonas sp. SKA58
562 wgs_aaqh_pro.dat WGS - Oceanobacter sp. RED65
563 wgs_aaqi_pro.dat WGS - Coxiella burnetii Dugway 7E9-12
564 wgs_aaqj_pro.dat WGS - Rickettsiella grylli
565 wgs_aaqk_env.dat WGS - environmental sequence
566 wgs_aaql_env.dat WGS - environmental sequence
567 wgs_aaqm_inv.dat WGS - Toxoplasma gondii RH
568 wgs_aaqn_pro.dat WGS - Xanthomonas oryzae pv. oryzicola BLS256
569 wgs_aaqo_pro.dat WGS - Coxiella burnetii RSA 331
570 wgs_aaqp_pro.dat WGS - Wolbachia endosymbiont of Drosophila willistoni
571 wgs_aaqq_rod.dat WGS - Spermophilus tridecemlineatus
572 wgs_aaqr_mam.dat WGS - Otolemur garnettii
573 wgs_aaqs_pro.dat WGS - Psychromonas ingrahamii 37
574 wgs_aaqt_pro.dat WGS - Clostridium phytofermentans ISDg
575 wgs_aaqu_pro.dat WGS - Roseiflexus sp. RS-1
576 wgs_aaqv_pro.dat WGS - Clostridium sp. OhILAs
577 wgs_aaqw_pro.dat WGS - Pseudomonas aeruginosa PACS2
578 wgs_aaqx_pln.dat WGS - Phytophthora ramorum
579 wgs_aaqy_pln.dat WGS - Phytophthora sojae
580 wgs_aaqz_pro.dat WGS - Campylobacter concisus 13826
581 wgs_aara_pro.dat WGS - Campylobacter curvus 525.92
582 wgs_aarb_pro.dat WGS - Campylobacter jejuni subsp. doylei 269.97
583 wgs_aarc_pro.dat WGS - Rickettsia bellii OSU 85-389
584 wgs_aare_fun.dat WGS - Ascosphaera apis USDA-ARSEF 7405
585 wgs_aarf_pro.dat WGS - Paenibacillus larvae subsp. larvae
586 wgs_aarg_pro.dat WGS - Fusobacterium nucleatum subsp. polymorphum
587 wgs_aarh_pln.dat WGS - Populus trichocarpa
588 wgs_aari_pro.dat WGS - Listeria monocytogenes FSL F2-515
589 wgs_aarj_pro.dat WGS - Listeria monocytogenes FSL J1-194
590 wgs_aark_pro.dat WGS - Listeria monocytogenes FSL J1-175
591 wgs_aarl_pro.dat WGS - Listeria monocytogenes FSL J1-208
592 wgs_aarm_pro.dat WGS - Listeria monocytogenes FSL J2-003
593 wgs_aaro_pro.dat WGS - Listeria monocytogenes FSL J2-064
594 wgs_aarq_pro.dat WGS - Listeria monocytogenes FSL N3-165
595 wgs_aarr_pro.dat WGS - Listeria monocytogenes FSL R2-503
596 wgs_aaru_pro.dat WGS - Listeria monocytogenes F6900
597 wgs_aarw_pro.dat WGS - Listeria monocytogenes J0161
598 wgs_aarx_pro.dat WGS - Listeria monocytogenes J2818
599 wgs_aary_pro.dat WGS - Listeria monocytogenes LO28
600 wgs_aarz_pro.dat WGS - Listeria monocytogenes 10403S
601 wgs_aasa_pro.dat WGS - Mannheimia haemolytica PHL213
602 wgs_aasc_inv.dat WGS - Aplysia californica
603 wgs_aasd_pro.dat WGS - Acidovorax sp. JS42
604 wgs_aase_pro.dat WGS - Chlorobium ferrooxidans DSM 13031
605 wgs_aasg_pln.dat WGS - Ricinus communis
606 wgs_aash_pro.dat WGS - Geobacter sp. FRC-32
607 wgs_aasi_pro.dat WGS - Methanoculleus marisnigri JR1
608 wgs_aasj_pro.dat WGS - Thermofilum pendens Hrk 5
609 wgs_aasl_pro.dat WGS - Campylobacter jejuni subsp. jejuni 81-176
610 wgs_aasm_inv.dat WGS - Plasmodium falciparum Dd2
611 wgs_aasn_pro.dat WGS - Mycobacterium tuberculosis str. Haarlem
612 wgs_aaso_fun.dat WGS - Coccidioides immitis H538.4
613 wgs_aasp_pro.dat WGS - desc to add
614 wgs_aasq_pro.dat WGS - Verminephrobacter eiseniae EF01-2
615 wgs_aasr_inv.dat WGS - Drosophila simulans
616 wgs_aass_inv.dat WGS - Drosophila simulans
617 wgs_aast_inv.dat WGS - Drosophila simulans
618 wgs_aasu_inv.dat WGS - Drosophila simulans
619 wgs_aasv_inv.dat WGS - Drosophila simulans
620 wgs_aasw_inv.dat WGS - Drosophila simulans
621 wgs_aasx_pro.dat WGS - Acidovorax avenae subsp. citrulli AAC00-1
622 wgs_aasz_env.dat WGS - environmental sequence
623 wgs_aatg_pro.dat WGS - Sinorhizobium medicae WSM419
624 wgs_aath_pro.dat WGS - Caulobacter sp. K31
625 wgs_aati_pro.dat WGS - Herpetosiphon aurantiacus ATCC 23779
626 wgs_aatj_pro.dat WGS - Salinispora tropica CNB-440
627 wgs_aatk_pro.dat WGS - Shewanella baltica OS195
628 wgs_aatl_pro.dat WGS - Listeria monocytogenes HPB2262
629 wgs_aatm_fun.dat WGS - Schizosaccharomyces japonicus yFS275
630 wgs_aatn_env.dat WGS - environmental sequence
631 wgs_aato_env.dat WGS - environmental sequence
632 wgs_aatp_pro.dat WGS - Fulvimarina pelagi HTCC2506
633 wgs_aatq_pro.dat WGS - Roseovarius sp. HTCC2601
634 wgs_aatr_pro.dat WGS - alpha proteobacterium HTCC2255
635 wgs_aats_pro.dat WGS - Mariprofundus ferrooxydans PV-1
636 wgs_aatt_fun.dat WGS - Batrachochytrium dendrobatidis JEL423
637 wgs_aatu_pln.dat WGS - Phytophthora infestans T30-4
638 wgs_aatv_pro.dat WGS - Thermoanaerobacter ethanolicus X514
639 wgs_aatw_pro.dat WGS - Desulfovibrio vulgaris subsp. vulgaris DP4
640 wgs_aatx_fun.dat WGS - Coccidioides immitis RMSCC 2394
641 wgs_aaty_pro.dat WGS - Vibrio cholerae AM-19226
642 wgs_aatz_pro.dat WGS - Synechococcus sp. BL107
643 wgs_aaua_pro.dat WGS - Synechococcus sp. RS9916
644 wgs_aaub_pro.dat WGS - Yersinia pestis FV-1
645 wgs_aauc_pro.dat WGS - Polynucleobacter sp. QLW-P1DMWA-1
646 wgs_aaue_pro.dat WGS - Bacillus cereus AH820
647 wgs_aauf_pro.dat WGS - Bacillus cereus AH187
648 wgs_aaug_pro.dat WGS - Burkholderia phymatum STM815
649 wgs_aauh_pro.dat WGS - Burkholderia phytofirmans PsJN
650 wgs_aaui_pro.dat WGS - Chloroflexus aggregans DSM 9485
651 wgs_aauj_pro.dat WGS - Comamonas testosteroni KF-1
652 wgs_aauk_pro.dat WGS - Fervidobacterium nodosum Rt17-B1
653 wgs_aaul_pro.dat WGS - Pseudomonas mendocina ymp
654 wgs_aaum_pro.dat WGS - Roseiflexus castenholzii DSM 13941
655 wgs_aaun_pro.dat WGS - Serratia proteamaculans 568
656 wgs_aauo_pro.dat WGS - Shewanella woodyi ATCC 51908
657 wgs_aaup_pro.dat WGS - Coxiella burnetii 'MSU Goat Q177'
658 wgs_aaur_pro.dat WGS - Vibrio cholerae 1587
659 wgs_aaus_pro.dat WGS - Vibrio cholerae MAK 757
660 wgs_aauu_pro.dat WGS - Vibrio cholerae MZO-3
661 wgs_aauv_pro.dat WGS - Oenococcus oeni ATCC BAA-1163
662 wgs_aauw_pro.dat WGS - Stappia aggregata IAM 12614
663 wgs_aaux_pro.dat WGS - Methylophilales bacterium HTCC2181
664 wgs_baab_inv.dat WGS - Bombyx mori
665 wgs_baac_pro.dat WGS - Pelotomaculum thermopropionicum SI
666 wgs_baad_pro.dat WGS - Bifidobacterium adolescentis
667 wgs_baaf_vrt.dat WGS - Oryzias latipes
668 wgs_caaa_mus.dat WGS - Mus musculus
669 wgs_caab_vrt.dat WGS - Fugu rubripes
670 wgs_caac_inv.dat WGS - Caenorhabditis briggsae
671 wgs_caae_vrt.dat WGS - Tetraodon nigroviridis
672 wgs_caai_inv.dat WGS - Plasmodium berghei
673 wgs_caaj_inv.dat WGS - Plasmodium chabaudi
674 wgs_caak_vrt.dat WGS - Danio rerio
675 wgs_caal_inv.dat WGS - Paramecium tetraurelia
676 wgs_caam_env.dat WGS - environmental sequence
677 wgs_caan_env.dat WGS - Neanderthal fossil environmental sequences
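The WGS entries in the listing above share a regular shape: a running number, a file name, and a description. As a minimal sketch, assuming the file names follow the pattern wgs_<4-letter prefix>_<3-letter division>.dat (inferred from the listing itself, not from a formal specification), one line can be split into its parts as shown below. The function name parse_wgs_line is hypothetical and not part of any EBI tool.

```python
import re

# Assumed pattern, inferred from the listing:
#   <number> wgs_<4-letter prefix>_<3-letter division>.dat <description>
WGS_LINE = re.compile(r"^(\d+)\s+(wgs_([a-z]{4})_([a-z]{3})\.dat)\s+(.*)$")


def parse_wgs_line(line):
    """Split one listing line into its fields; return None if it does not match."""
    match = WGS_LINE.match(line.strip())
    if match is None:
        return None
    number, filename, prefix, division, description = match.groups()
    # Most descriptions start with "WGS - "; drop that label when present.
    if description.startswith("WGS - "):
        description = description[len("WGS - "):]
    return {
        "number": int(number),
        "filename": filename,
        "prefix": prefix,
        "division": division,
        "description": description,
    }


if __name__ == "__main__":
    sample = "677 wgs_caan_env.dat WGS - Neanderthal fossil environmental sequences"
    print(parse_wgs_line(sample))
```

Lines that do not fit the assumed pattern, such as section headings, return None, so the function can be applied to the whole listing and the non-matching lines skipped.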
APPENDIX A
DATABASE GROWTH TABLE
The following table shows the growth of the EMBL Nucleotide Sequence Database
at each release.
Release Month Entries Nucleotides
1 06/1982 568 585433
2 04/1983 811 1114447
3 12/1983 1481 1654863
4 08/1984 1698 2147205
5 04/1985 2378 2874493
6 08/1985 4835 4567592
7 12/1985 5789 5622638
8 04/1986 6395 6353040
9 09/1986 7630 7813214
10 12/1986 8817 9766948
11 04/1987 11621 12189783
12 07/1987 12706 13638061
13 10/1987 14397 16023478
14 01/1988 15344 17272160
15 05/1988 17961 20318442
16 08/1988 19592 22625941
17 11/1988 20695 24211054
18 02/1989 22938 27249830
19 05/1989 24365 29066676
20 08/1989 26223 31240948
21 11/1989 28679 34748087
22 02/1990 31508 38165786
23 05/1990 34902 42923803
24 08/1990 37784 47354438
25 11/1990 41580 52900354
26 02/1991 43745 55859549
27 05/1991 46871 59915244
28 09/1991 54558 70448052
29 12/1991 57655 75400487
30 03/1992 63378 83574342
31 06/1992 72481 94390065
32 09/1992 79377 101292310
33 12/1992 89100 111413979
34 03/1993 99591 121420828
35 06/1993 108973 131880111
36 09/1993 127933 145401156
37 12/1993 146576 158171400
38 03/1994 167777 177550115
39 06/1994 182615 192195819
40 09/1994 209352 211017104
41 12/1994 230950 226259607
42 03/1995 303206 262559786
43 06/1995 420111 315840053
44 09/1995 506190 363273777
45 12/1995 622566 427620278
46 03/1996 701246 473691480
47 06/1996 827174 550739395
48 09/1996 928067 608931850
49 12/1996 1047263 696183789
50 03/1997 1187455 789755858
51 06/1997 1432941 931351601
52 10/1997 1787004 1181167498
53 12/1997 1917868 1281391651
54 03/1998 2125225 1427634373
55 06/1998 2330040 1607673907
56 09/1998 2689618 1904091473
57 12/1998 3046471 2164718256
58 03/1999 3272064 2355200790
59 06/1999 3952878 2924568545
60 09/1999 4719266 3543553093
61 12/1999 5303436 4508169737
62 03/2000 5865742 6120908677
63 06/2000 6760113 8255674441
64 09/2000 8344436 9650223037
65 12/2000 9549382 10710321435
66 03/2001 11169673 11916112872
67 06/2001 12044420 12821742622
68 09/2001 12964797 13727100206
69 12/2001 14366182 15383451165
70 03/2002 15851373 17807926047
71 06/2002 17226422 20020556107
72 09/2002 18324246 23090186146
73 12/2002 20857746 27903283528
74 03/2003 23234788 30356786718
75 06/2003 25214767 32195012823
76 09/2003 27248475 33885908155
77 12/2003 30351263 36042464651
78 03/2004 32631252 37984728579
79 06/2004 39214123 65185548741
80 09/2004 42312264 70222432184
81 12/2004 46105397 79271300840
82 03/2005 49474402 85134714382
83 06/2005 54491598 94996164558
84 09/2005 58758902 107562580723
85 12/2005 64739883 116106677726
86 03/2006 69783593 126401347060
87 06/2006 74034622 134602904495
88 09/2006 80591891 146595277574
89 12/2006 83666567 150163403742
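
As a rough illustration of how the growth table can be used, the following Python sketch parses a few rows copied from it and computes the change between consecutive releases. The function names parse_growth_rows and growth_between are hypothetical, chosen for this example only.

```python
# Illustrative sketch only; the function names are hypothetical.
# The sample rows are copied from the growth table (releases 87 to 89).
SAMPLE_ROWS = """\
Release Month Entries Nucleotides
87 06/2006 74034622 134602904495
88 09/2006 80591891 146595277574
89 12/2006 83666567 150163403742
"""


def parse_growth_rows(text):
    """Parse 'release month entries nucleotides' rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, blank lines, and anything that is not a data row.
        if len(parts) != 4 or not parts[0].isdigit():
            continue
        release, month, entries, nucleotides = parts
        rows.append({
            "release": int(release),
            "month": month,
            "entries": int(entries),
            "nucleotides": int(nucleotides),
        })
    return rows


def growth_between(prev, curr):
    """Return (added entries, added nucleotides, percent nucleotide growth)."""
    added_entries = curr["entries"] - prev["entries"]
    added_nucleotides = curr["nucleotides"] - prev["nucleotides"]
    percent = 100.0 * added_nucleotides / prev["nucleotides"]
    return added_entries, added_nucleotides, percent


if __name__ == "__main__":
    rows = parse_growth_rows(SAMPLE_ROWS)
    for prev, curr in zip(rows, rows[1:]):
        entries, nucleotides, percent = growth_between(prev, curr)
        print(f"Release {prev['release']} -> {curr['release']}: "
              f"+{entries} entries, +{nucleotides} nucleotides ({percent:.2f}%)")
```

Python integers have arbitrary precision, so the nucleotide totals (which exceed 32-bit range from Release 61 onwards) need no special handling.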