IPI - International Protein Index - Gene Cross-References File Format
IPI gene cross-reference (ipi.genes.*.xrefs) files are tab-delimited files that map protein entries
from the IPI protein source databases to gene entries and their chromosomal locations from Ensembl.
These data are produced using mappings between the IPI protein source databases and
Ensembl, Entrez Gene, Vega (when available), and species-specific resources such as
Genew (the nomenclature database of the HGNC), MGI, RGD, ZFIN, TAIR (proteins and genes for
A. thaliana) and UniGene, as part of the process used to make IPI. They are available for
all species covered by IPI.
These files can be downloaded for the current release from the IPI FTP site.
Each file represents all the chromosomes of a single genome.
If a column contains a list of values, the values are separated by semicolons (';').
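As an illustration of the layout described above, the following minimal Python sketch splits one line into tab-separated columns and one column value into its semicolon-separated items. The function names `split_row` and `split_list_field` are hypothetical and not part of any IPI tool. The sketch assumes that a trailing semicolon, as seen in the example rows on this page, does not denote an extra empty value.

```python
def split_row(line: str) -> list[str]:
    """Split one line of an xrefs file into its tab-separated columns."""
    return line.rstrip("\r\n").split("\t")


def split_list_field(field: str) -> list[str]:
    """Split a semicolon-separated column value into its items.

    Empty items (for example, the one produced by a trailing ';')
    are dropped. This is an assumption about how such values are meant
    to be read, not a documented rule.
    """
    return [item.strip() for item in field.split(";") if item.strip()]


# Example using IPI identifiers taken from the sample rows on this page.
ipi_ids = split_list_field("IPI00000012;IPI00464958;")
```

Empty columns are kept by `split_row` as empty strings, so column positions stay stable even when a gene has no value for a given resource.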
Revised format from 04 April 2006
From 04 April 2006 onwards, which
corresponds to the human/mouse/rat 3.16, zebrafish 3.15, arabidopsis 3.14, chicken 3.10
and cow 3.02 releases, the Ensembl End co-ordinate and genomic strand information are
inserted as columns 4 and 5, directly after the Start co-ordinate (column 3).
Subsequent columns shift to the right.
e.g.
- Column 4 - End co-ordinate of gene (on Ensembl assembly) given in base pairs, as a global
(chromosomal) co-ordinate. Data taken from Ensembl.
- Column 5 - Strand of gene (on Ensembl assembly) corresponding to these co-ordinates.
1 for FORWARD or SENSE and -1 for REVERSE or ANTISENSE.
- Subsequent columns shifted to the right.
Previously:
1 216475253 1q41 ENSG00000196660 25355,SLC30A10 55532,SLC30A10 IPI00000012;IPI00464958;
Q49AL9;Q6XR72;Q9NPW0; ENSP00000349018; VALIDATED:NP_001004433;VALIDATED:NP_061183;
HIT000015673;HIT000022321;HIT000026337; Hs.519812; GI:52351208;GI:52351218;
OTTHUMG00000037434 OTTHUMP00000035563;
Revised format:
1 216475253 216489910 -1 1q41 ENSG00000196660
25355,SLC30A10 55532,SLC30A10 IPI00000012;IPI00464958;
Q49AL9;Q6XR72;Q9NPW0; ENSP00000349018; VALIDATED:NP_001004433;VALIDATED:NP_061183;
HIT000015673;HIT000022321;HIT000026337; Hs.519812; GI:52351208;GI:52351218;
OTTHUMG00000037434 OTTHUMP00000035563;
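A minimal Python sketch of reading the location columns of the revised format is shown here. It assumes the 1-based column numbers stated above (Start in column 3, End in column 4, Strand in column 5). The names `GeneLocation` and `parse_location` are hypothetical, and the first two fields of the sample row are placeholders, because columns 1 and 2 are not described in this section.

```python
from typing import NamedTuple


class GeneLocation(NamedTuple):
    start: int   # Start co-ordinate in base pairs (column 3)
    end: int     # End co-ordinate in base pairs (column 4)
    strand: int  # 1 = forward/sense, -1 = reverse/antisense (column 5)


def parse_location(columns: list[str]) -> GeneLocation:
    """Read Start, End and Strand from a row already split on tabs."""
    # 1-based columns 3, 4 and 5 are list indices 2, 3 and 4.
    start = int(columns[2])
    end = int(columns[3])
    strand = int(columns[4])
    if strand not in (1, -1):
        raise ValueError(f"unexpected strand value: {strand}")
    return GeneLocation(start, end, strand)


# Co-ordinates and strand taken from the revised-format example above;
# "col1" and "col2" are placeholder values, not real file content.
sample = "col1\tcol2\t216475253\t216489910\t-1".split("\t")
location = parse_location(sample)
is_reverse = location.strand == -1
```

Files produced before the 04 April 2006 format change have no End or Strand columns, so a reader of older releases would need a different column mapping.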
Revised format from 22 February 2006
From 22 February 2006 onwards, which
corresponds to the human/mouse/rat 3.15, zebrafish 3.14, arabidopsis 3.13, chicken 3.09
and cow 3.01 releases, Ensembl peptides are moved from column 13 to column 11, and a single
column is used for all RefSeq entries (subsequent columns shift to the left).
Entry status information is available for each referenced RefSeq entry.
e.g.
- Column 11 - All Ensembl peptide IDs associated with this gene (previously in column 13).
- Column 12 - List of RefSeq STATUS:ID pairs
(separated by a semicolon ';') associated with this gene
(RefSeq entry revision status details).
- Column 13 removed and subsequent columns shifted to the left.
Previously:
7 151579364 7q36.1 155100,LOC155100 IPI00045628;
NP_001025037; XP_379977; ENSP00000021776;[...]
Revised format:
7 151579364 7q36.1 155100,LOC155100 IPI00045628;
ENSP00000021776; INFERRED:NP_001025037;MODEL:XP_379977;[...]
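The STATUS:ID pairs in the RefSeq column can be separated with a few lines of Python. This is a minimal sketch, not an official parser: the function name `parse_refseq_field` is hypothetical, and returning an empty status for an entry without a `STATUS:` prefix is an assumption made here so that values in the older, status-free format do not raise an error.

```python
def parse_refseq_field(field: str) -> list[tuple[str, str]]:
    """Turn 'STATUS:ID;STATUS:ID;' into a list of (status, id) tuples."""
    pairs = []
    for item in field.split(";"):
        item = item.strip()
        if not item:
            # Skip the empty item left by a trailing ';'.
            continue
        status, sep, accession = item.partition(":")
        if not sep:
            # No status prefix present (assumed older format):
            # keep the accession and leave the status empty.
            status, accession = "", item
        pairs.append((status, accession))
    return pairs


# Value taken from the revised-format example above.
refseq = parse_refseq_field("INFERRED:NP_001025037;MODEL:XP_379977;")
model_ids = [acc for status, acc in refseq if status == "MODEL"]
```

Keeping the status next to each accession makes it straightforward to filter entries, for example to keep only those with a particular RefSeq revision status.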