IPI - International Protein Index - FASTA Format
Each IPI entry consists of a cluster of related entries from the constituent
databases, together with a sequence and a description line taken from a master entry. This
data is presented in FASTA format. See this page for more
details on how IPI is produced, how entries are clustered and how master entries are chosen.
FASTA Example
A sample IPI entry is shown below. In this entry, one UniProtKB/Swiss-Prot
entry has been associated with one UniProtKB/TrEMBL
sequence, one Ensembl sequence, one RefSeq sequence, one H-Inv sequence and one VEGA sequence:
>IPI:IPI00000023.4|SWISS-PROT:P18507|TREMBL:Q9UDB3|
ENSEMBL:ENSP00000354651|REFSEQ:NP_000807|
H-INV:HIT000321503|VEGA:OTTHUMP00000160874
Tax_Id=9606 Gene_Symbol=GABRG2 Gamma-aminobutyric-acid
receptor subunit gamma-2 precursor
MSSPNIWSTGSSVYSTPVFSQKMTVWILLLLSLYPGFTSQKSDDDYEDYASNKTWVLTPK
VPEGDVTVILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINMEYTIDIFFAQT
WYDRRLKFNSTIKVLRLNSNMVGKIWIPDTFFRNSKKADAHWITTPNRMLRIWNDGRVLY
TLRLTIDAECQLQLHNFPMDEHSCPLEFSSYGYPREEIVYQWKRSSVEVGDTRSWRLYQF
SFVGLRNTTEVVKTTSGDYVVMSVYFDLSRRMGYFTIQTYIPCTLIVVLSWVSFWINKDA
VPARTSLGITTVLTMTTLSTIARKSLPKVSYVTAMDLFVSVCFIFVFSALVEYGTLHYFV
SNRKPSKDKDKKKKNPAPTIDIRPRSATIQMNNATHLQERDEEYGYECLDGKDCASFFCC
FEDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNLVYWVSYLYL
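The header layout shown above can be split into its parts with a few lines of Python. This is a minimal sketch, not an official IPI tool. It assumes that the header is a single line in the actual file (the wrapping above appears to be for display only), that the cross-reference block ends at the first space, and that multiple identifiers for one database are separated by semicolons. The function name `parse_ipi_header` is hypothetical.

```python
def parse_ipi_header(header):
    """Split an IPI FASTA header into cross-references and a description.

    Assumes a single-line header: '|'-separated DB:ID fields, then a
    space, then the free-text description (assumption, see lead-in).
    """
    header = header.lstrip(">").strip()
    # Everything before the first space is the cross-reference block.
    xref_part, _, description = header.partition(" ")
    xrefs = {}
    for field in xref_part.split("|"):
        if not field:
            continue  # tolerate a trailing '|'
        db, _, ids = field.partition(":")
        # One database code may carry several ';'-separated identifiers.
        xrefs[db] = ids.split(";")
    return xrefs, description.strip()


header = (
    ">IPI:IPI00000023.4|SWISS-PROT:P18507|TREMBL:Q9UDB3|"
    "ENSEMBL:ENSP00000354651|REFSEQ:NP_000807|"
    "H-INV:HIT000321503|VEGA:OTTHUMP00000160874 "
    "Tax_Id=9606 Gene_Symbol=GABRG2 Gamma-aminobutyric-acid "
    "receptor subunit gamma-2 precursor"
)
xrefs, description = parse_ipi_header(header)
print(xrefs["SWISS-PROT"], xrefs["VEGA"])
print(description)
```

Keeping every identifier list as a Python list, even when only one identifier is present, lets callers treat all database codes uniformly.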
Extension from 26 March 2007
From 26 March 2007 onwards, which corresponds to the human/mouse/rat 3.27,
zebrafish 3.26, arabidopsis 3.25, chicken 3.21 and cow 3.13 releases, the
format is extended to include gene symbols, given in a Gene_Symbol field.
e.g.
Previously:
>IPI:IPI00000023.4|SWISS-PROT:P18507|TREMBL:Q9UDB3|
ENSEMBL:ENSP00000354651|REFSEQ:NP_000807|
H-INV:HIT000321503|VEGA:OTTHUMP00000160874
Tax_Id=9606 Gamma-aminobutyric-acid receptor subunit
gamma-2 precursor [...]
Extended format:
>IPI:IPI00000023.4|SWISS-PROT:P18507|TREMBL:Q9UDB3|
ENSEMBL:ENSP00000354651|REFSEQ:NP_000807|
H-INV:HIT000321503|VEGA:OTTHUMP00000160874
Tax_Id=9606 Gene_Symbol=GABRG2 Gamma-aminobutyric-acid
receptor subunit gamma-2 precursor [...]
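A parser that has to read releases from both before and after this change can treat Gene_Symbol as optional. The sketch below is illustrative only: it assumes the description always starts with `Tax_Id=<digits>`, that the Gene_Symbol value is a single token without spaces, and that the remainder is the protein name. The names `parse_ipi_description` and `_DESC_RE` are hypothetical.

```python
import re

# Tax_Id is required; Gene_Symbol is optional so that pre-extension
# descriptions still parse (assumed layout, see lead-in).
_DESC_RE = re.compile(
    r"^Tax_Id=(?P<tax_id>\d+)\s+"
    r"(?:Gene_Symbol=(?P<gene_symbol>\S+)\s+)?"
    r"(?P<name>.*)$"
)


def parse_ipi_description(description):
    """Return tax id, gene symbol (or None) and protein name."""
    match = _DESC_RE.match(description.strip())
    if match is None:
        raise ValueError("unrecognised IPI description: %r" % description)
    return {
        "tax_id": int(match.group("tax_id")),
        "gene_symbol": match.group("gene_symbol"),  # None in the old format
        "name": match.group("name"),
    }


old_style = parse_ipi_description(
    "Tax_Id=9606 Gamma-aminobutyric-acid receptor subunit gamma-2 precursor"
)
new_style = parse_ipi_description(
    "Tax_Id=9606 Gene_Symbol=GABRG2 Gamma-aminobutyric-acid "
    "receptor subunit gamma-2 precursor"
)
print(old_style["gene_symbol"], new_style["gene_symbol"])
```

Both calls yield the same protein name; only the `gene_symbol` value differs between the two formats.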
Revised format from 22 February 2006
From 22 February 2006 onwards, which
corresponds to the human/mouse/rat 3.15, zebrafish 3.14, arabidopsis 3.13, chicken 3.09
and cow 3.01 releases, all RefSeq entries are listed after a single REFSEQ
database code, rather than under separate REFSEQ_NP and REFSEQ_XP codes.
e.g.
Previously:
>IPI:IPI00000005.1|SWISS-PROT:P01111-3|TREMBL:P54111|
REFSEQ_NP:NP_002515|REFSEQ_XP:XP_032698;XP_001317|
ENSEMBL:ENSP00000261444|H-INV:HIT000032298 [...]
Revised format:
>IPI:IPI00000005.1|SWISS-PROT:P01111-3|TREMBL:P54111|
REFSEQ:NP_002515;XP_032698;XP_001317|
ENSEMBL:ENSP00000261444|H-INV:HIT000032298 [...]
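Older files can be brought in line with the revised format by folding the REFSEQ_NP and REFSEQ_XP fields into one REFSEQ field. The following is a rough sketch under the assumption that the cross-reference block is a single '|'-separated string and that the merged field should take the position of the first RefSeq field encountered. The function name `merge_refseq_fields` is hypothetical.

```python
def merge_refseq_fields(xref_part):
    """Rewrite REFSEQ_NP/REFSEQ_XP fields as a single REFSEQ field."""
    merged = []
    refseq_ids = []
    refseq_pos = None
    for field in xref_part.split("|"):
        db, _, ids = field.partition(":")
        if db in ("REFSEQ", "REFSEQ_NP", "REFSEQ_XP"):
            if refseq_pos is None:
                # Reserve the slot of the first RefSeq field.
                refseq_pos = len(merged)
                merged.append(None)
            refseq_ids.extend(ids.split(";"))
        else:
            merged.append(field)
    if refseq_pos is not None:
        merged[refseq_pos] = "REFSEQ:" + ";".join(refseq_ids)
    return "|".join(merged)


old = (
    "IPI:IPI00000005.1|SWISS-PROT:P01111-3|TREMBL:P54111|"
    "REFSEQ_NP:NP_002515|REFSEQ_XP:XP_032698;XP_001317|"
    "ENSEMBL:ENSP00000261444|H-INV:HIT000032298"
)
new = merge_refseq_fields(old)
print(new)
```

A block that is already in the revised format passes through unchanged, so the function can be applied to files of either vintage.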