MAGE-TAB and MAGE-ML Pipeline Submission Guidelines
Overview
This page is aimed at software developers who support a database or an application capable of exporting data to ArrayExpress.
If you are not a software developer and want to submit data to ArrayExpress, please see our general submissions page:
ArrayExpress submissions help page >>
The ArrayExpress repository accepts MAGE-ML submissions as a deposition route for submitters who have an existing MAGE-ML pipeline. For new external resources we suggest using a MAGE-TAB based submission pipeline rather than MAGE-ML, which will be superseded by MAGE v2 in 2008.
If you need to contact us for any reason about any MAGE-ML or MAGE-TAB submissions please email
. Mail sent to this address is tracked and archived, and the mailbox is checked regularly by a duty curator.
New Pipelines
MAGE-TAB is now an accepted standard exchange format for representing microarray data, and we are developing a new ArrayExpress implementation which will be MAGE-TAB compatible. Submissions are already accepted in MAGE-TAB format via local ArrayExpress submission tools. MAGE-TAB can be used to represent CGH, ChIP-Chip, chip-based genotype data, protein array data, and re-sequencing data (Solexa, 454). If you need to contact us about how to format your data please email us
at
.
Instructions For Developing a MAGE-TAB Pipeline
- Read the MAGE-TAB specification.
As of June 2006, v1.0 is available and supported at ArrayExpress for Experiment submissions. Submission of array designs as MAGE-TAB ADF files is not yet supported. However, ADF files in a very similar format can currently be submitted via MIAMExpress.
- Map your local schema to MAGE-TAB.
ArrayExpress curators can assist with this process once a preliminary mapping has been established. Please contact us for advice and for feedback on schema mappings to MAGE-TAB.
- Make a test export in MAGE-TAB format.
Choose a small but typical sample data set from your local resource. Include raw and processed data. If you store both one-channel and two-channel data, please make a test export for each.
- Validate the MAGE-TAB files generated.
Download the Perl MAGE-TAB validation software developed at ArrayExpress for incoming MAGE-TAB submissions and use it to validate the exported MAGE-TAB files. Include the validation logs when you transfer data to us by FTP. Access the MAGE-TAB software here.
- FTP the MAGE-TAB files and supporting data files.
Package the files as a zip or tar archive and upload it to our incoming FTP site. Do not email us large data files. Name the file YOURNAME_MAGE-TAB_DATE_VERSION.ZIP or YOURNAME_MAGE-TAB_DATE_VERSION.tar.gz.
FTP Instructions:
December 2009: FTP password changed to 'aexpress1'
On Unix, connect to the FTP server using the command ftp ftp1.ebi.ac.uk. Username is aexpress and password is aexpress1. Use the put command to place your file (or mput for multiple files) into the default directory. Please ensure that you use unique file names and name the files according to the convention above. To exit FTP, type quit. On exiting you will get a message printed to screen to tell you whether your transfer was successful. [Note: you will not be able to list the files in the directory or download files from the FTP site to your directory.]
- An ArrayExpress curator will examine the file(s) and provide feedback.
If the submitted file is valid, a test load will be made and you will be assigned a login to ArrayExpress to see the loaded data. If the file is not valid, a curator will contact you with information on the changes needed.
- Operating MAGE-TAB pipelines.
Once tests are successfully completed you may deposit your real data submissions to the FTP site as described above. These will be tracked locally and assigned to a curator for curation. An accession will usually be provided within 4 working days.
All communication will be with the email address attached to the Person with the Person Role 'submitter' in the IDF file.
A login will be provided for this person, along with a unique reviewer login for each experiment submitted. A reminder email will be sent one month before the experiment is due to be released; the release date may be extended.
We may contact you to request publication details. If the experiment is published (accession number cited in an article) we will make the experiment public.
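The packaging and upload steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not an official ArrayExpress tool: the function names `archive_name`, `build_archive` and `upload` are hypothetical, and the YYYYMMDD date layout in the file name is an assumption, since the guidelines do not prescribe a date format. Only the naming convention, the host `ftp1.ebi.ac.uk` and the login details are taken from the instructions above.

```python
import tarfile
from datetime import date
from ftplib import FTP
from pathlib import Path


def archive_name(submitter, when, version, ext="tar.gz"):
    """Build a name following YOURNAME_MAGE-TAB_DATE_VERSION.<ext>.

    The YYYYMMDD date layout is an assumption for illustration.
    """
    return f"{submitter}_MAGE-TAB_{when:%Y%m%d}_{version}.{ext}"


def build_archive(files, submitter, when, version, out_dir="."):
    """Pack MAGE-TAB files, data files and validation logs into one tar.gz."""
    target = Path(out_dir) / archive_name(submitter, when, version)
    with tarfile.open(target, "w:gz") as tar:
        for f in files:
            # Store each file flat, without its local directory path.
            tar.add(f, arcname=Path(f).name)
    return target


def upload(archive, host="ftp1.ebi.ac.uk", user="aexpress", password="aexpress1"):
    """Send one archive to the default directory of the incoming FTP site.

    Listing and downloading are not permitted on this server, so the
    function only stores the file and then closes the session.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        with open(archive, "rb") as handle:
            ftp.storbinary(f"STOR {Path(archive).name}", handle)
```

A call such as `upload(build_archive(paths, "SMITH", date.today(), "v1"))` would then package and send one submission, where `paths` lists the exported files and `SMITH` and `v1` are placeholder values. Including the version in the name keeps file names unique across repeated uploads, which matters because files on the server cannot be listed.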
Existing Pipelines
ArrayExpress will continue to accept data submissions in MAGE-ML format from existing pipelines for the foreseeable future. However, should you decide to make significant changes to your MAGE-ML pipeline, we strongly suggest that you consider MAGE-TAB as an alternative format. We will not prioritize testing changes to existing MAGE-ML pipelines.
The previous 'Information for submitters' page is still available but is no longer being updated. If you have questions about your MAGE-ML pipeline please contact us.
If you have an existing pipeline with ArrayExpress you will be signed up for a new mailing list 'ae_pipe@ebi.ac.uk'. We will use this mailing list to update you about future changes to ArrayExpress.
Common MAGE-ML Pipeline Problems
- Identifiers are not unique between successive MAGE-ML submissions.
Identifiers must be globally unique for MAGE-ML Group, Map and Dimension objects. These objects are containers for other identifiers, and to avoid loading the same object more than once we refer instead to the previous occurrence of the identifier.
- Experiments are supplied without release dates.
If you supply an experiment without a release date the curator will automatically insert a release date of one year from submission. A reminder email will be sent to the Person with the Role 'submitter' one month before release. Please code release dates in MAGE-ML as a Name Value Type attached to the Experiment or PhysicalArrayDesign. Note that release dates may be at most one year after the date of submission.
E.g.
<Experiment identifier="E-SMDB-4088">
  <PropertySets_assnlist>
    <NameValueType value="2007-06-04"
                   name="ArrayExpressSubmissionDate">
    </NameValueType>
    <NameValueType value="2007-06-04"
                   name="ArrayExpressReleaseDate">
    </NameValueType>
  </PropertySets_assnlist>
</Experiment>
- Experiment and/or Array identifiers are pre-assigned by the submitting database.
This may only be done by prior arrangement with ArrayExpress. Any database with multiple installations, e.g. MAXD and BASE, will not be permitted to assign accession numbers at source. In cases where we find publications with accession numbers not assigned by us, we will immediately inform the journal that the data are not available.
- Essential information is encoded in non standard ways, e.g. as Name Value Types or descriptions attached to many objects.
NVTs can be ignored by MAGE-ML capable applications, and during the setup of a pipeline you will have been informed of the coding styles ArrayExpress prefers for optimal display of data in ArrayExpress. These are also available as a set of best practice examples. NVTs and Descriptions should not be used as a substitute for best practice coding.
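The release-date convention described above (Name Value Types on the Experiment, with release no more than one year after submission) can be generated programmatically. The following is a minimal sketch in Python, not part of any ArrayExpress software: the helper names `capped_release_date` and `date_property_xml` are hypothetical, and "one year" is assumed to mean the same calendar day in the following year, with 29 February mapped to 28 February. The element and attribute names are taken from the example above; a real MAGE-ML Experiment element would contain further content.

```python
from datetime import date
import xml.etree.ElementTree as ET


def capped_release_date(submitted, requested=None):
    """Return the release date, limited to one year after submission.

    With no requested date, the one-year default is used, mirroring the
    date a curator would insert.
    """
    try:
        limit = submitted.replace(year=submitted.year + 1)
    except ValueError:
        # Submitted on 29 February: fall back to 28 February (assumption).
        limit = submitted.replace(year=submitted.year + 1, day=28)
    if requested is None:
        return limit
    return min(requested, limit)


def date_property_xml(identifier, submitted, requested=None):
    """Build an Experiment fragment carrying the two date NameValueTypes."""
    experiment = ET.Element("Experiment", identifier=identifier)
    props = ET.SubElement(experiment, "PropertySets_assnlist")
    release = capped_release_date(submitted, requested)
    for name, value in (
        ("ArrayExpressSubmissionDate", submitted),
        ("ArrayExpressReleaseDate", release),
    ):
        # Dates are written as YYYY-MM-DD, as in the example above.
        ET.SubElement(props, "NameValueType", name=name, value=value.isoformat())
    return ET.tostring(experiment, encoding="unicode")
```

Capping the date at export time means a pipeline never emits a release date that a curator would have to correct by hand.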
If you have any further questions, please see our FAQ.