Array Design Submission Guidelines
Overview
This page is about submitting microarray designs (array layout and annotation) to ArrayExpress. An array design describes how a microarray was manufactured, what was printed/synthesized at each position on the array, and what biological sequences these represent. The same array design can be used in many different hybridizations across many different experiments. Use the diagram below to decide if you need to submit an array design to us. Red text is a hyperlink to the relevant help section.
Checking if an array design is already in ArrayExpress
If the array design you used has already been described in ArrayExpress then you do not need to submit it. Many commercial and academic array designs from organizations such as Affymetrix, Agilent, Illumina, Nimblegen and Sanger are already loaded into ArrayExpress. Use these links to find array designs already in ArrayExpress:
If your array design is already in ArrayExpress then you can use it in your experiment submission as follows:
New commercial array designs
If you used a commercial catalogue array design and you cannot find it in ArrayExpress, please contact us and tell us the exact name of the array design you used and the manufacturer. We will let you know whether we can get the information directly from the manufacturer or whether we need you to submit the layout information to us.
Custom Affymetrix and Nimblegen array designs
If you have used a custom array design you will need to submit the array design as an Array Design Format (ADF) file - see next section. There are two exceptions:
Submitting an array design
If you need to submit an array design you can do this using the
MIAMExpress submissions tool. You will need to first create an
Array Design Format file to upload. Then go to the MIAMExpress
login page:
After submitting an array design you can continue with your experiment immediately - you do not need to wait for us to process the array submission. To use your new array design:
Although you can complete your experiment submission before the array design is processed, please be aware that the array design submission must be curated and loaded into ArrayExpress before we can process the experiment.
Creating an array design format (ADF) file
An array design format (ADF) file is simply a table with standardized column names describing what was printed/synthesized at each position on a microarray. The ADF file can be created in any spreadsheet application but must be saved as a tab delimited text file. An ADF file contains the following information:
The following sections show you how to create an ADF file containing this information. A .gal file or a data file can be a good starting point for your ADF file. The annotation you provide in the ADF file will be referenced by all data files that match the array. You can download a typical template header in tab delimited format and Excel format, and an example is also shown below, where the mandatory fields are in red bold font.
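As a rough illustration of the tab delimited layout described above, the Python sketch below writes a header row and one feature row with the standard csv module. The column names are those mentioned on this page and are not the complete official template; the helper `write_adf` and the sample values are hypothetical.

```python
import csv
import io

# Column names mentioned on this page; the official template may contain more.
ADF_COLUMNS = [
    "MetaColumn", "MetaRow", "Column", "Row",
    "Reporter Identifier", "Reporter Name",
    "Reporter Group [role]",
]

def write_adf(rows, handle):
    """Write rows (dicts keyed by column name) as tab delimited text."""
    writer = csv.DictWriter(handle, fieldnames=ADF_COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)

# Hypothetical feature, for illustration only.
example_rows = [{
    "MetaColumn": 1, "MetaRow": 1, "Column": 1, "Row": 1,
    "Reporter Identifier": "R0001", "Reporter Name": "example_clone",
    "Reporter Group [role]": "Experimental",
}]

buffer = io.StringIO()
write_adf(example_rows, buffer)
adf_text = buffer.getvalue()
```

To save to disk instead of an in-memory buffer, open the output file with `newline=""` and pass it as `handle`.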
1. Location and name
Each spot on the array is called a feature. The position of each feature is described by 4 coordinates: MetaColumn, MetaRow, Column, Row.
GAL files with Block, Column, Row coordinates
If you have a GAL file containing Block, Column, Row coordinates, you can convert these to MetaColumn, MetaRow, Column, Row coordinates using this tool: GAL to ADF converter tool
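The arithmetic behind such a conversion can be sketched in Python as follows. This is not the converter tool's implementation: it assumes that blocks are numbered from 1, left to right and then top to bottom, and that you know how many blocks sit in each row of the array (`blocks_per_row`, a hypothetical parameter). Check your GAL file before relying on this layout. The Column and Row values within a block carry over unchanged.

```python
def block_to_meta(block, blocks_per_row):
    """Convert a 1-based GAL block number to (MetaColumn, MetaRow).

    Assumes blocks are numbered left to right, then top to bottom.
    """
    if block < 1 or blocks_per_row < 1:
        raise ValueError("block and blocks_per_row must be positive")
    meta_column = (block - 1) % blocks_per_row + 1
    meta_row = (block - 1) // blocks_per_row + 1
    return meta_column, meta_row
```

For example, with 4 blocks per row, block 5 is the first block of the second row, so `block_to_meta(5, 4)` gives MetaColumn 1 and MetaRow 2.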
The Reporter Identifier and Reporter Name columns are mandatory. A reporter represents the sequence spotted on the array. The Reporter Identifier is used internally in the ADF file; the Reporter Name will be displayed and should be biologically meaningful, for example a gene or clone name. The identifier should be the same as the one you use for the reporter in final gene expression matrices and other normalized data files, because we use the Reporter Identifier values to link array annotation to the measurement values in data files.
If the same sequence is spotted at more than 1 Feature then the Reporter Identifier can be repeated.
If your Reporter doesn't have a name, for example if it is an unknown sequence, then repeat the Reporter Identifier in the Reporter Name column.
Don't reuse the same name for reporters that have different identifiers; if you do, we will ask you why. The example below is not acceptable.
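One way to catch reused names before submitting is sketched below in Python. It is an illustrative check only, not part of any ArrayExpress tool; the function name and the input shape (pairs of identifier and name) are assumptions. Repeating the same identifier with the same name is allowed, so only names shared by more than one identifier are reported.

```python
from collections import defaultdict

def find_reused_names(rows):
    """Return {name: sorted identifiers} for names shared by >1 identifier.

    rows is an iterable of (reporter_identifier, reporter_name) pairs.
    """
    ids_by_name = defaultdict(set)
    for identifier, name in rows:
        ids_by_name[name].add(identifier)
    return {name: sorted(ids)
            for name, ids in ids_by_name.items()
            if len(ids) > 1}
```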
2. Annotation
We need database entries or actual sequence to describe the sequences on your array. At least one database entry or sequence is needed for MIAME compliance. If you do not have database entries for all reporters then we may contact you for more information. We need to know which database these accession numbers are from and we ask you to supply a database code inside the [square brackets] in the header row. You can find a complete list of allowed databases here (use the values in the 'Name' column). A short list of common ones is below. If you supply entries from a database not already on this list then we will look at it to make sure it is suitable and we may ask you about it.
To provide a single database entry do this:
To show multiple database entries from the same database do this:
To show multiple database entries from different databases do this:
To include chromosome coordinates do this (change the part 'ucsc_hg17' to indicate the source of your coordinates):
If you have sequence-verified information for your array then you can provide it in the Reporter BioSequence [Actual Sequence] column. This is not mandatory unless you have oligos on your array. Only supply actual sequence if there are oligos or if you have sequence verified it yourself. If you have oligo arrays and the manufacturer will not allow you to add the sequences, continue with your submission; we may ask you about this later.
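Several column headers in this section carry a qualifier inside square brackets, such as the database code or `Actual Sequence`. The Python sketch below shows one way to separate the label from the bracketed part when inspecting your own header row. The function name and regular expression are illustrative assumptions, not an official parser.

```python
import re

# Matches headers such as "Reporter BioSequence [Actual Sequence]".
BRACKET_HEADER = re.compile(r"^(?P<label>.+?)\s*\[(?P<qualifier>[^\[\]]+)\]$")

def split_header(header):
    """Return (label, qualifier); qualifier is None if there are no brackets."""
    text = header.strip()
    match = BRACKET_HEADER.match(text)
    if match is None:
        return text, None
    return match.group("label"), match.group("qualifier").strip()
```

The extracted qualifier can then be compared against the list of allowed database names before you submit.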
3. Type of reporter
We need to know what Reporter Type you are using and how it was generated. These values go into the Reporter BioSequence Type and Reporter BioSequence Polymer Type columns. The Reporter BioSequence Type describes the material spotted on the array. A list of common allowed values to put in these columns is shown below.
4. Group of reporter
We need to know how the Reporters on the array are grouped. There are two groups: Experimental and Control. The mandatory Reporter Group [role] column specifies which group each Reporter belongs to. Allowed values for this column are: Experimental and Control.
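A simple check of the Reporter Group [role] values could look like the Python sketch below. It assumes each row has been read into a dict keyed by column header, and it treats the allowed values as case-sensitive, which is an assumption of this sketch rather than a documented rule.

```python
ALLOWED_ROLES = {"Experimental", "Control"}

def invalid_roles(rows, column="Reporter Group [role]"):
    """Return (row_number, value) pairs whose role is missing or not allowed.

    rows is a list of dicts keyed by column header; row numbers start at 1.
    """
    problems = []
    for number, row in enumerate(rows, start=1):
        value = (row.get(column) or "").strip()
        if value not in ALLOWED_ROLES:
            problems.append((number, value))
    return problems
```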
Describe what type of controls were used in the next column. If the spot is not a control then do not fill in anything in this column. The allowed values for this column are:
5. Additional information
Reporter comment
If you want to add free text information about a Reporter add this column:
Composite Sequences
If you want, you can provide an additional level of information showing which Reporters represent the same gene. For example, if you have different Reporters that correspond to two exons of a gene, you might want to supply different information on these exons and also information on the gene. This is done by providing extra columns that describe CompositeSequences. The columns used to add this information are shown below. Note that you need all the information in the example above as well as what is shown below in red bold.
CompositeSequence Identifier
A unique identifier for each CompositeSequence. This is just an id, not a database entry that we will link to. Ids can be duplicated if the same CompositeSequence is on the array multiple times. If you provide this column then the name and database entry columns are mandatory.
CompositeSequence Name
CompositeSequences need a name. If names are missing, copy the identifier into this field.
CompositeSequence Database Entry
One or more database entries (accession numbers) that describe the CompositeSequence. The database that the entries are from is provided inside [square brackets]. The example above is for LocusLink. A list of allowed values for databases is here (use values in 'Name' column). We will check that the id format you provide belongs to the database that you specified.
CompositeSequence Comment
A free text comment that describes the CompositeSequence.
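The dependency between these columns (name and database entry become mandatory once an identifier column is present) can be checked with a short Python sketch like the one below. The function name is hypothetical, and matching the database entry column by its header prefix is an assumption made because the bracketed database code varies.

```python
def missing_composite_columns(headers):
    """Return the required CompositeSequence columns absent from headers."""
    headers = [h.strip() for h in headers]
    if "CompositeSequence Identifier" not in headers:
        return []  # CompositeSequence columns are optional as a whole
    missing = []
    if "CompositeSequence Name" not in headers:
        missing.append("CompositeSequence Name")
    # The database code in [square brackets] varies, so match on the prefix.
    if not any(h.startswith("CompositeSequence Database Entry") for h in headers):
        missing.append("CompositeSequence Database Entry")
    return missing
```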
Example ADF files for download
Checking your ADF before submission
A tool is provided which will check your ADF for common formatting
errors. The tool will report any problems in the ADF - please try
to fix as many as you can before submitting the ADF as this will
speed up the processing of your array design submission: ADF checkList
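Before running the checker, you may find it useful to confirm locally that the header row contains the expected columns. The Python sketch below is only a rough pre-check and is not the ADF checker itself. Reporter Identifier, Reporter Name and Reporter Group [role] are stated as mandatory on this page; treating the four coordinate columns as required is an assumption of this sketch.

```python
import csv
import io

# The last three names are stated as mandatory on this page; requiring the
# four coordinate columns is an assumption of this sketch.
REQUIRED_COLUMNS = [
    "MetaColumn", "MetaRow", "Column", "Row",
    "Reporter Identifier", "Reporter Name", "Reporter Group [role]",
]

def missing_columns(adf_text):
    """Return required column names absent from the header row of adf_text."""
    reader = csv.reader(io.StringIO(adf_text), delimiter="\t")
    header = [cell.strip() for cell in next(reader, [])]
    return [name for name in REQUIRED_COLUMNS if name not in header]
```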
If you have any further questions, please see our FAQ.
