Frequently Asked Questions (FAQ)

Please contact us using the "Feedback" button at the top of this page if you cannot find the answer to your question.

1. Using ArrayExpress

Searching and understanding data
Data access policy - permissions and restrictions
FTP data download and programmatic access
Other general questions

2. Submitting data to ArrayExpress

Accession numbers
Keeping unpublished data private
Public release date
Which files to submit
Making changes to data already submitted to ArrayExpress

1. Using ArrayExpress

Searching and understanding data
  • How do I search the ArrayExpress database?
    For simple keyword searches and browsing, please visit the search help page. For more complex search queries, e.g. "Find all the microarray experiments on Agilent chip X studying diabetes in mouse", you will need to use some sophisticated search filters to limit the search space of some terms, e.g. ask the search engine to find studies where "diabetes" is the subject of investigation, and not merely mentioned in background information of a study. Please refer to the advanced search help for more details.
  • How do I link sample and data file information in the downloadable files?

    Top-level information about an experiment (title, authors, protocols etc.) is in a tab-delimited text file called the IDF (Investigation Description Format). Sample, extract, assay and data file information is in a file called the SDRF (Sample and Data Relationship Format), also in tab-delimited text format, and it links to the protocol information in the IDF. Tab-delimited text files can be opened in Microsoft Excel as spreadsheets.

    More information on IDF/SDRF can be found on the IDF help page and the SDRF help page.
  • I read a paper that referred to data in ArrayExpress but when I searched for the data I got no hits. Why?
    It could be that the submitter of the data has not yet told us that the experiment should be made public. If you come across this problem please email us at arrayexpress@ebi.ac.uk - tell us which paper you found the reference in and the ArrayExpress accession number quoted if any. We will make this data publicly available.
  • I have seen ArrayExpress experiment accessions with prefixes such as "E-MTAB", "E-GEOD", etc. What do the prefixes mean?
    The prefixes indicate the source and/or submission route from which the data came. The common ones are:
    • MEXP = data submitted via the MIAMExpress submission route (discontinued in July 2014)
    • TABM = data submitted via the Tab2MAGE submission route (discontinued in January 2012)
    • MTAB = data submitted via the MAGE-TAB submission route (discontinued in September 2014) or the Annotare submission route
    • GEOD = data imported from NCBI Gene Expression Omnibus
    See the accession codes help page for more information.
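
As an illustration of linking samples to data files, the following is a minimal Python sketch that reads an SDRF as tab-delimited text and collects the data files listed for each sample. The file content is made up for the example, and the column names "Source Name" and "Array Data File" are assumptions about which SDRF columns your file uses; check the header row of the SDRF you downloaded and adjust them as needed.

```python
import csv
import io

# Hypothetical, heavily trimmed SDRF content for illustration only.
# A real SDRF has many more columns (protocols, extracts, factor values etc.).
SDRF_TEXT = (
    "Source Name\tCharacteristics[organism]\tAssay Name\tArray Data File\n"
    "sample 1\tMus musculus\tassay 1\tsample1.CEL\n"
    "sample 2\tMus musculus\tassay 2\tsample2.CEL\n"
)


def files_by_sample(sdrf_handle, sample_col="Source Name", file_col="Array Data File"):
    """Map each sample name to the list of data files found on its SDRF rows.

    The default column names are assumptions; pass the headers used in your file.
    """
    reader = csv.DictReader(sdrf_handle, delimiter="\t")
    mapping = {}
    for row in reader:
        # One SDRF row describes one path from a sample to a data file,
        # so a sample can appear on several rows.
        mapping.setdefault(row[sample_col], []).append(row[file_col])
    return mapping


sample_files = files_by_sample(io.StringIO(SDRF_TEXT))
for sample, files in sample_files.items():
    print(sample, "->", ", ".join(files))
```

To run this on a downloaded file, open it with open(path, newline="", encoding="utf-8") and pass the file handle instead of the in-memory example. Note that csv.DictReader keeps only the last value when a header is repeated, so this sketch is only suitable for columns whose names occur once in the file.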

Data access policy - permissions and restrictions
  • Are there any restrictions on the use of microarray data obtained through ArrayExpress?
    No. All public data from ArrayExpress can be used by anyone, and our services are completely free of charge.
  • Do I need a login account to view or access data in ArrayExpress?
    For public data, no. We only provide login accounts to data submitters so that they, and their reviewers, can access pre-publication private data. All other data can be viewed by everyone.
  • I've lost my ArrayExpress website login details. What do I do?
    If you are viewing a curated experiment which you submitted to ArrayExpress, use the forgotten password reminder form in the login box. If you are not the data submitter (e.g. a reviewer), then please contact the submitter. We can only give ArrayExpress login details to the owners of the data directly.
  • I'm reviewing a paper. How do I get a login to view private data that the authors have deposited in ArrayExpress?
    Reviewer login details are sent to submitters on completion of the processing of their submission. Please contact the data submitter, via the journal editor, to request this login information. We cannot provide access to private data to anyone without first getting authorization from the submitter or journal.

FTP data download and programmatic access
  • Is there an FTP site to download data files in bulk?
    Yes, all the data and array designs in ArrayExpress are available for direct download in a number of different formats. For more information on what files are available and how to access them see our FTP files for download help.
  • Do you have any application programming interfaces (APIs) for accessing ArrayExpress?
    Yes, we have REST-style and WebService APIs for accessing ArrayExpress. Details can be found here: Programmatic access.

Other general questions
  • How much overlap is there between ArrayExpress and the NCBI Gene Expression Omnibus (GEO)?
    We import data on a weekly basis from NCBI Gene Expression Omnibus (GEO). All experiments imported from GEO have accession numbers in the format of E-GEOD-n, where n is a number. For example, GEO accession "GSE29080" would become "E-GEOD-29080" in ArrayExpress. For more information see the GEO data import page.
  • How do I cite ArrayExpress in my publication?

    You should include your experiment accession number and the URL of the ArrayExpress home page, www.ebi.ac.uk/arrayexpress, e.g. "Microarray data are available in the ArrayExpress database (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-xxxx." You can also include a direct link to your experiment or array design. For example, the link would be http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1234 for experiment "E-MTAB-1234", or http://www.ebi.ac.uk/arrayexpress/arrays/A-MTAB-567 for array design "A-MTAB-567".

    If you wish to include a citation for ArrayExpress then the following publication should be used: Kolesnikov N. et al. 2015 ArrayExpress update - simplifying data submissions. Nucleic Acids Res, doi: 10.1093/nar/gku1057. Pubmed ID 25361974.
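
The accession conventions described in the answers above (GEO series "GSE29080" becoming "E-GEOD-29080", and the direct-link patterns for experiments and array designs) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes that only GEO series accessions of the form GSE followed by digits are mapped this way.

```python
import re

# Base URL as quoted in the citation example above.
BASE_URL = "http://www.ebi.ac.uk/arrayexpress"


def geo_to_arrayexpress(geo_accession):
    """Convert a GEO series accession such as 'GSE29080' to 'E-GEOD-29080'.

    Assumption: only accessions of the form GSE<digits> are handled.
    """
    match = re.fullmatch(r"GSE(\d+)", geo_accession.strip())
    if match is None:
        raise ValueError(f"not a GEO series accession: {geo_accession!r}")
    return f"E-GEOD-{match.group(1)}"


def direct_link(accession):
    """Build a direct link for an experiment (E-...) or array design (A-...)."""
    if accession.startswith("E-"):
        return f"{BASE_URL}/experiments/{accession}"
    if accession.startswith("A-"):
        return f"{BASE_URL}/arrays/{accession}"
    raise ValueError(f"unrecognised accession: {accession!r}")


print(geo_to_arrayexpress("GSE29080"))
print(direct_link("E-MTAB-1234"))
print(direct_link("A-MTAB-567"))
```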

2. Submitting data to ArrayExpress

Accession numbers
  • I urgently need an accession number to include in my publication - how can I get one?
    Firstly, start your submission if you have not already done so! ArrayExpress does not provide accession numbers in advance of submission of data. This is in line with the NCBI Gene Expression Omnibus (GEO)'s policy of provision of accession numbers after deposition. If you have completed the submission in Annotare, you should receive an automated email with the accession number usually within 15 minutes. For other submissions you can email us at arrayexpress@ebi.ac.uk to let us know how urgently the accession number is required. We will reply to your email as quickly as possible but please note that we often have several submissions requiring urgent attention.
  • How long will it take to get an accession number?
    Using the Annotare submission tool, the majority of users submit within 3 hours, from creating a new submission account to submitting the experiment. This does not imply strictly 3 hours of hands-on time because information entered in Annotare is saved automatically, and a submitter is free to log out and log in again at any time. Annotare features a built-in validation tool to ensure all essential information is included in your submission so we can assign an accession number to your data set quickly upon submission, usually within 15 minutes. Sometimes we receive a high volume of submissions, which can delay accessioning by up to 24 hours. The accession number will NOT change throughout data curation and processing, so you can quote it in your manuscript.

    If you still haven't received an accession number within 24 hours of submission, please email us at arrayexpress@ebi.ac.uk citing your experiment's title, and we will investigate.

Keeping unpublished data private
  • Can my data be kept private after I submit it?
    Yes, all submitted information will be kept private until the release date that you set at the time of submission, or until a publication is released that contains the ArrayExpress accession number relating to your data. You can also change the release date to suit the peer review progress of your paper. Please see our data availability page for more information.
  • The journal I am submitting to has requested private access to my data - how do I get an ArrayExpress login for them?
    Firstly, start your ArrayExpress submission if you have not already done so! You will be sent login details for viewing your private experimental data on the ArrayExpress website only when curation is finished and when your data is loaded into the ArrayExpress database. If you are really pushed for time, reply to the email sent to you by Annotare or the ArrayExpress curator in charge of your submission, or email us at arrayexpress@ebi.ac.uk to let us know about your situation. We will try our best to fast-track your submission.
  • The journal I am submitting to supports "double-blind" peer review. Can ArrayExpress hide my name and contact details from the reviewers?
    Yes. If you're submitting a new data set via Annotare, simply check the Hide my identity from reviewers box on the Experiment Description page. Please see the submitter's guide on anonymity for more information.
    If you are the submitter of a previously non-anonymised private data set in ArrayExpress and would like to switch on anonymity, please write to us at arrayexpress@ebi.ac.uk.
  • I tried to log in to ArrayExpress to see some recently submitted private data but it said my username and password were invalid. Why?
    You may have tried to log in with the submitter account for one of our submission tools, such as Annotare. These "submitter" accounts can only be used for submitting data, not viewing it in ArrayExpress. We will provide you with an ArrayExpress data access login account after your submission has been curated and loaded into ArrayExpress. If you are using an ArrayExpress login which you have just received, it might not be working yet because ArrayExpress is updated only once a day at about 6am UK time. Try again after this time.

    If you've forgotten or lost your access account details, you can retrieve them using the forgotten password tool. If it still does not work, contact us at arrayexpress@ebi.ac.uk and tell us which login details you are trying to use and which page you tried them on.

Public release date
  • I don't know when my paper will be published so what should I put as the release date?
    Enter an estimated public release date up to 1 year in the future from the day of submission. The release date can be changed multiple times during or after curation to match the peer-review progress. Please see our release date change page for more information.
  • My paper is about to be published - how do I make the data public?
    For microarray data, you can change the release date using this self-service tool. Simply log in with the data access account details which we emailed you when curation was completed, double-click the private experiment whose release date you would like to change, and follow the on-screen instructions. You would normally have used these account details to log in to the ArrayExpress website and view your private experiment(s). Don't worry if you have forgotten the username and/or password: you can retrieve your login details using your submitter's email address and the experiment accession number. Your experiment will appear on the ArrayExpress website at about 6am UK time the following day, after an overnight website update.

    For sequencing data, please email us at arrayexpress@ebi.ac.uk with the experiment accession number and tell us when it should be made public. We will change the release date for you in both ArrayExpress and ENA (where the raw data files are stored) to make sure the records are in sync.

  • My paper has been published - how do I add the citation details to my experiment?
    You can add, remove or edit citation details for your experiment (microarray or sequencing alike) using this self-service tool. Simply log in with the data access account details which we emailed you when curation was completed, double-click the experiment that you would like to associate the publication with, and follow the on-screen instructions. You would normally have used these account details to log in to the ArrayExpress website and view your private experiment(s).

    Don't worry if you have forgotten the username and/or password: you can retrieve your login details using your submitter's email address and the experiment accession number. The changes you make will appear on the ArrayExpress website at about 6am UK time the following day, after an overnight website update.

Which files to submit
  • Which files do I need to submit for an Affymetrix experiment?
    For a MIAME-compliant submission we need Affymetrix .CEL files as raw data and some form of processed, probe set level data. The processed data is preferably a matrix file generated by a method such as robust multi-array average (RMA). A matrix file has a rather strict format, with the Affymetrix probe sets in the rows and processed intensity values per hybridisation in the columns (data matrix example).
  • What files do I need to submit for an Illumina experiment?
    Illumina raw data files are usually in either plain-text or binary format. Plain-text files are generated by the Illumina GenomeStudio software (previously known as "BeadStudio"), using the ArrayExpress Data Submission Report Plug-in. The binary "IDAT" files (short for "intensity data") are generated by the scanner and can be parsed using R/Bioconductor packages such as illuminaio. IDAT is the preferred file format, as it is a binary format containing all the information required to analyse the data. In contrast, plain-text files can be missing information such as which probes are the control probes; this is sometimes provided in a separate file, but not always, and heterogeneity in raw data file formats makes systematic analysis of data difficult. Another disadvantage of plain-text files is that they are susceptible to human-introduced errors, as it is easy for someone to open the file in a text editor or spreadsheet program and accidentally change its content.
  • How do I submit a high throughput sequencing (HTS) or next-generation sequencing (NGS) experiment?
    You will need to provide as much information as possible about the experiment's purpose, starting materials (samples), sequencing libraries, wet- and dry-lab protocols used, raw sequence data files for each sequencing assay, and optionally processed data files, e.g. BAM alignment files or RPKM/FPKM values for each gene/transcript. The Annotare submission tool will guide you step-by-step through the process.

    After you've submitted via Annotare, an ArrayExpress curator will check the metadata and data files you provided. If everything is fine, ArrayExpress will act as a data broker and transfer the raw read files to the Sequence Read Archive (SRA) at the European Nucleotide Archive (ENA) on your behalf, so you do not need to send the read files directly to the ENA or any other SRA partner. (The SRA is a collaboration between NCBI in the United States, EBI in the United Kingdom and DDBJ in Japan. The three partners exchange data daily.)

    Raw data files must be in a format accepted by the Sequence Read Archive (SRA) at the ENA, otherwise we will not be able to process your submission. We strongly recommend that you check out our sequencing submissions help page as it explains the requirements in a lot more detail.
  • I have MAGE-ML/MAGE-TAB files generated from in-house pipelines. Does ArrayExpress still accept them?
    ArrayExpress has not supported or accepted MAGE-ML files since 2011. Only MAGE-TAB files are supported. To submit metadata, please use the webform-based tool Annotare, which will generate MAGE-TAB files for you, so you don't have to learn the file format. All you need to do is fill in a series of webforms and upload data files within the tool. (If you are interested, here is the PDF version of the latest MAGE-TAB 1.1 specification for download. The file is 57Mb in size.)

    In exceptional cases, we accept the submission of correctly formatted MAGE-TAB spreadsheets uploaded via FTP. If you would like to update your pipeline to generate MAGE-TAB files, please write to us at arrayexpress@ebi.ac.uk and we will try our best to offer assistance.

Making changes to data already submitted to ArrayExpress

  • I hit "Submit" at the end of my submission but I haven't actually finished editing. I logged in to the submission tool but can't edit my experiment anymore. Help!
    Email us at annotare@ebi.ac.uk as soon as possible and ask us to "reopen" the submission and "assign" it back to you, so you can edit it again. (Submitted data sets are "read-only" to the submitter to avoid confusing, concurrent editing of the same submission by submitter and curator.) Remember to include your username and the experiment/array design title in your email so we can quickly find your submission.
  • My data has been loaded into ArrayExpress but I need to change/correct something - how do I do this?
    Email us at arrayexpress@ebi.ac.uk with the accession number and describe what corrections you would like to make. Please see this help page on updating experiments/array designs for further details.
  • My contact details have changed, how do I update them?
    Send your new contact details to us at arrayexpress@ebi.ac.uk and we will change them in ArrayExpress.
