The IPD-MHC Database provides an FTP site for retrieving sequences in a number of pre-formatted files. The sequences are provided as FASTA files, together with an archive of the sequence alignments and a flat-file copy of the database. Each file type is described below. The same data is also available as a GitHub repository.
An FTP directory containing the latest version of the IPD-MHC Database is available at the following address: ftp://ftp.ebi.ac.uk/pub/databases/ipd/mhc/. The directory contains files in a number of file formats:
- Flat files (MHC.dat) - A flat file containing all the public alleles in the database is provided in the FTP directory. Please read the documentation available in the FTP directory or the IMGT/HLA online documentation for a description of the flat file format.
- XML files (MHC.xml) - An XML file containing all the public alleles in the database, together with additional information such as previous nomenclature and mature peptide sequences.
- FASTA files - A set of FASTA files containing all nucleotide and protein sequences. The files in the archive use the following naming conventions:
  - MHC_nuc.txt - nucleotide CDS sequences
  - MHC_gen.txt - genomic sequences
  - MHC_prot.txt - protein sequences
- Aln files - A set of files in CLUSTALW format containing nucleotide and protein alignments.
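To see which of the files described above are currently present, the FTP directory can be listed from a script. The following is a minimal Python sketch using only the standard library. It assumes that the server accepts anonymous logins; the names `IPD_MHC_FTP_URL`, `split_ftp_url` and `list_directory` are illustrative and are not part of any IPD tooling. No file names are hard-coded, because the exact layout of the directory (subdirectories, compressed archives) is not described here.

```python
from ftplib import FTP
from urllib.parse import urlparse

# Address of the FTP directory as given on this page.
IPD_MHC_FTP_URL = "ftp://ftp.ebi.ac.uk/pub/databases/ipd/mhc/"


def split_ftp_url(url):
    """Split an ftp:// URL into a (host, path) tuple."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path or "/"


def list_directory(url=IPD_MHC_FTP_URL, timeout=30):
    """Return the sorted entry names of the FTP directory (needs network access)."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()      # anonymous login; assumed to be permitted by the server
        ftp.cwd(path)
        return sorted(ftp.nlst())
```

Calling `list_directory()` returns the entry names as a list of strings. A single file could then be fetched over the same connection with `ftp.retrbinary`.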
The files are copyrighted by the IPD-IMGT/HLA Database (see the licence) and are distributed under the Creative Commons Attribution-NoDerivs License.
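Once a FASTA file such as MHC_prot.txt has been downloaded, its records can be read with a few lines of Python. The sketch below assumes only the general FASTA convention (a header line starting with `>` followed by one or more sequence lines). It makes no assumption about the layout of the IPD-MHC header fields, which is not described here, and the function name `read_fasta` is illustrative.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if header is not None:        # emit the previous record
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)           # sequence may span several lines
    if header is not None:                # emit the final record
        yield header, "".join(chunks)
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, for example `with open("MHC_prot.txt") as handle: records = list(read_fasta(handle))`, where the local path is an assumption about where the file was saved.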