Some of the experiments and platforms in ArrayExpress have been imported from the Gene Expression Omnibus (GEO) at the NCBI.
We import data from GEO on a weekly basis. SOFT files and data files are downloaded, and we then extract the experiment description, sample annotation, experimental factor information and so on, using custom automated methods and text mining tools. Some experiments are also manually curated, particularly if they are selected for inclusion in the Expression Atlas.
Experiments imported from GEO have ArrayExpress accession numbers in the format of E-GEOD-n, where n is a number. They also have a secondary accession number (shown in the Links section in the detailed view) which is the original GEO identifier e.g. GSE35642. You can also query for the GEO identifier of an experiment in the ArrayExpress browse interface.
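If you need to convert between the two identifiers in a script, the accession format described above can be sketched as follows. This is a minimal illustration, not an official ArrayExpress tool: the function names are hypothetical, and it assumes that n in E-GEOD-n is the same number as in the GEO series identifier, as the GSE35642 example suggests.

```python
import re

# Accession formats described in the text. The helper names below are
# hypothetical and are not part of any ArrayExpress or GEO API.
_GEO_SERIES = re.compile(r"^GSE(\d+)$")
_AE_EXPERIMENT = re.compile(r"^E-GEOD-(\d+)$")


def geo_to_arrayexpress(gse_id):
    """Build the ArrayExpress accession for a GEO series identifier.

    Assumes the numeric part is carried over unchanged.
    """
    match = _GEO_SERIES.match(gse_id.strip())
    if match is None:
        raise ValueError(f"not a GEO series identifier: {gse_id!r}")
    return f"E-GEOD-{match.group(1)}"


def arrayexpress_to_geo(accession):
    """Recover the GEO series identifier from an E-GEOD-n accession."""
    match = _AE_EXPERIMENT.match(accession.strip())
    if match is None:
        raise ValueError(f"not a GEO-imported experiment: {accession!r}")
    return f"GSE{match.group(1)}"


if __name__ == "__main__":
    print(geo_to_arrayexpress("GSE35642"))
    print(arrayexpress_to_geo("E-GEOD-35642"))
```

Both functions raise ValueError for anything that does not match the expected format, so a mistyped identifier is reported rather than silently converted.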
Clicking on the GEO secondary accession number will take you to the original entry in GEO. Note that the GEO and ArrayExpress databases are not synchronized: if annotation and/or data files are updated in GEO, this will not necessarily be reflected in the corresponding ArrayExpress entry. We are currently working with GEO to develop a synchronization process.
GEO platform designs are imported in a similar way to experiments and have accession numbers in the format of A-GEOD-n, where n is a number. Affymetrix, Agilent and Illumina catalogue platform designs are not imported from GEO; instead, experiments that use them are linked to the ArrayExpress equivalent (for example, E-GEOD-35642).
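To tell GEO-imported experiments and platform designs apart programmatically, you can check the accession prefix. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, and A-GEOD-570 is a made-up accession used purely to show the format.

```python
import re

# E-GEOD-n marks an imported experiment, A-GEOD-n an imported platform design.
_GEO_IMPORT = re.compile(r"^([EA])-GEOD-(\d+)$")


def classify_geo_import(accession):
    """Return (kind, n) for a GEO-imported accession, or None otherwise.

    Hypothetical helper; not part of any ArrayExpress API.
    """
    match = _GEO_IMPORT.match(accession.strip())
    if match is None:
        return None
    kind = "experiment" if match.group(1) == "E" else "platform design"
    return kind, int(match.group(2))


if __name__ == "__main__":
    # A-GEOD-570 is an illustrative accession, not taken from the text above.
    for acc in ("E-GEOD-35642", "A-GEOD-570", "E-MTAB-1"):
        print(acc, classify_geo_import(acc))
```

Accessions that do not follow either GEO import format, such as those of experiments submitted directly to ArrayExpress, return None.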
If you have any further questions, please see our FAQ.