IPI - International Protein Index - Protein Cross-References File Format
IPI cross-reference (.xrefs) files are convenient tab-delimited files of the major cross-references in each IPI data set.
They can be downloaded for the current release from the IPI FTP site.
Each row of the IPI cross-reference file represents one protein in IPI and consists of the following tab-delimited fields:
- Database from which the master entry of this IPI entry has been taken.
One of SP (UniProtKB/Swiss-Prot), TR (UniProtKB/TrEMBL), ENSEMBL (Ensembl), ENSEMBL_HAVANA (Ensembl Havana subset), REFSEQ_STATUS (where STATUS corresponds to the RefSeq entry revision status), VEGA (Vega), TAIR (TAIR Protein data set) or HINV (H-Invitational Database).
- UniProtKB accession number or Vega ID or Ensembl ID or RefSeq ID or TAIR Protein ID or H-InvDB ID.
- International Protein Index identifier.
- Supplementary UniProtKB/Swiss-Prot entries associated with this IPI entry.
- Supplementary UniProtKB/TrEMBL entries associated with this IPI entry.
- Supplementary Ensembl entries associated with this IPI entry. Havana curated transcripts are preceded by the key HAVANA: (e.g. HAVANA:ENSP00000237305;ENSP00000356824;).
- Supplementary list of RefSeq STATUS:ID pairs (separated by a semicolon ';') associated with this IPI entry (RefSeq entry revision status details).
- Supplementary TAIR Protein entries associated with this IPI entry.
- Supplementary H-Inv Protein entries associated with this IPI entry.
- Protein identifiers (cross reference to EMBL/Genbank/DDBJ nucleotide databases).
- List of HGNC number, HGNC official gene symbol pairs (separated by a semicolon ';') associated with this IPI entry.
- List of NCBI Entrez Gene gene number, Entrez Gene default gene symbol pairs (separated by a semicolon ';') associated with this IPI entry.
- UniParc identifier associated with the sequence of this IPI entry.
- UniGene identifiers associated with this IPI entry.
- CCDS identifiers associated with this IPI entry.
- RefSeq GI protein identifiers associated with this IPI entry.
- Supplementary Vega entries associated with this IPI entry.
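As an illustration of the layout above, the following minimal Python sketch maps one row onto the 17 listed fields. The field labels (`master_db`, `ipi_id` and so on) and the function names are hypothetical names chosen for this sketch and are not part of the IPI format. The sketch also assumes that multi-valued fields are semicolon-separated with a trailing semicolon, as the example rows on this page suggest.

```python
# Hypothetical labels for the 17 columns, in the order listed above.
XREF_FIELDS = [
    "master_db", "master_accession", "ipi_id", "swissprot", "trembl",
    "ensembl", "refseq", "tair", "hinv", "embl_protein_ids",
    "hgnc", "entrez_gene", "uniparc", "unigene", "ccds",
    "refseq_gi", "vega",
]


def split_list(value):
    """Split a semicolon-separated field, dropping empty items."""
    return [item.strip() for item in value.split(";") if item.strip()]


def parse_xref_line(line):
    """Map one tab-delimited row onto the column labels above."""
    columns = line.rstrip("\r\n").split("\t")
    # Pad short rows so that missing trailing columns become empty strings.
    columns += [""] * (len(XREF_FIELDS) - len(columns))
    return dict(zip(XREF_FIELDS, columns))


def read_xrefs(path):
    """Yield one dict per non-empty row of an .xrefs file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_xref_line(line)
```

Individual list-valued columns can then be expanded on demand, for example with `split_list(row["ensembl"])`.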
The mouse, rat, zebrafish, Arabidopsis, chicken and cow xref files have the following differences:
- Column 11 in the mouse file contains the MGI (Mouse Genome Informatics) identifier and symbol for the genes.
- Column 11 in the rat file contains the RGD (Rat Genome Database) identifier and symbol for the genes.
- Column 11 in the zebrafish file contains the ZFIN (Zebrafish Information Network) identifier and symbol for the genes.
- Column 11 in the Arabidopsis file contains the TAIR Gene (The Arabidopsis Information Resource) symbol and locus identifier for the genes.
- Column 11 does not contain any data for chicken and cow.
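The species-specific meaning of column 11 can be captured in a small lookup table. This is only a sketch: the species keys and the function name are hypothetical, and the `"human"` entry assumes that the default field list above (HGNC in column 11) describes the human file.

```python
# Source database for column 11 per species file; None means the column is empty.
COLUMN_11_SOURCE = {
    "human": "HGNC",        # assumption: the default layout is the human file
    "mouse": "MGI",
    "rat": "RGD",
    "zebrafish": "ZFIN",
    "arabidopsis": "TAIR Gene",
    "chicken": None,
    "cow": None,
}


def column_11_source(species):
    """Return the database behind column 11, or None if the column is unused."""
    key = species.strip().lower()
    if key not in COLUMN_11_SOURCE:
        raise ValueError(f"no column 11 rule known for species: {species!r}")
    return COLUMN_11_SOURCE[key]
```

A parser can consult this table before interpreting column 11, so that an MGI identifier is not mistaken for an HGNC number.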
Revised format from 22 February 2006
From 22 February 2006 onwards, which corresponds to the human/mouse/rat 3.15, zebrafish 3.14, Arabidopsis 3.13, chicken 3.09 and cow 3.01 releases, the separate columns for RefSeq NPs and XPs will be replaced by a single column for all RefSeq entries (subsequent columns will be shifted to the left).
Entry status information will be made available for each referenced RefSeq entry (for both masters and cluster members).
e.g.
- Column 1 - Database from which the master entry of this IPI entry has been taken.
One of SP (UniProtKB/Swiss-Prot), TR (UniProtKB/TrEMBL), ENSEMBL (Ensembl), REFSEQ_STATUS (where STATUS corresponds to the RefSeq entry revision status), VEGA (Vega), TAIR (TAIR Protein data set) or HINV (H-Invitational Database).
Previously:
REFSEQN NP_000338 IPI00216513 P11277-1; Q8WX82;Q71VF9;Q71VF8; ENSP00000284159;[...]
Revised format:
REFSEQ_VALIDATED NP_000338 IPI00216513 P11277-1; Q71VF8;Q71VF9;Q8WX82; ENSP00000284159;[...]
- Column 7 - Supplementary list of RefSeq STATUS:ID pairs (separated by a semicolon ';') associated with this IPI entry (RefSeq entry revision status details).
Column 8 is removed and subsequent columns are shifted to the left.
Previously:
SP P83876 IPI00216338 ENSP00000269601; NP_006692; XP_499552;[...]
Revised format:
SP P83876 IPI00216338 ENSP00000269601; PROVISIONAL:NP_006692;MODEL:XP_499552;[...]
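The revised column 1 and column 7 values can be taken apart as in the following Python sketch. Both function names are hypothetical. Treating an item without a `STATUS:` prefix as a pre-revision entry with no status is an assumption of this sketch, not something the format description states.

```python
def split_master_db(value):
    """Split a column 1 code such as 'REFSEQ_VALIDATED' into (database, status).

    Only the 'REFSEQ_' prefix carries a status; other codes, including
    'ENSEMBL_HAVANA', are returned unchanged with status None.
    """
    prefix = "REFSEQ_"
    if value.startswith(prefix):
        return "REFSEQ", value[len(prefix):]
    return value, None


def parse_refseq_pairs(field):
    """Return (status, refseq_id) tuples from a 'STATUS:ID;STATUS:ID;' field."""
    pairs = []
    for item in field.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the trailing semicolon
        status, sep, refseq_id = item.partition(":")
        if not sep:
            # Assumption: no colon means an old-style entry without a status.
            status, refseq_id = None, item
        pairs.append((status, refseq_id))
    return pairs


print(split_master_db("REFSEQ_VALIDATED"))
# -> ('REFSEQ', 'VALIDATED')
print(parse_refseq_pairs("PROVISIONAL:NP_006692;MODEL:XP_499552;"))
# -> [('PROVISIONAL', 'NP_006692'), ('MODEL', 'XP_499552')]
```

Matching on the full `REFSEQ_` prefix rather than splitting on the first underscore keeps codes such as ENSEMBL_HAVANA intact.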