
IPI - International Protein Index - Fasta Format


Each IPI entry consists of a cluster of related entries from the constituent databases, together with a sequence and a description line taken from a master entry. This data is presented in FASTA format. See this page for more details on how IPI is produced, how entries are clustered and how master entries are chosen.

FASTA Example

A sample IPI entry is shown below. In this entry, one UniProtKB/Swiss-Prot entry has been associated with one UniProtKB/TrEMBL sequence, one Ensembl sequence, one RefSeq sequence, one H-Inv sequence and one VEGA sequence:

            >IPI:IPI00000023.4|SWISS-PROT:P18507|TREMBL:Q9UDB3|
            ENSEMBL:ENSP00000354651|REFSEQ:NP_000807|
            H-INV:HIT000321503|VEGA:OTTHUMP00000160874 
            Tax_Id=9606 Gene_Symbol=GABRG2 Gamma-aminobutyric-acid 
            receptor subunit gamma-2 precursor
            MSSPNIWSTGSSVYSTPVFSQKMTVWILLLLSLYPGFTSQKSDDDYEDYASNKTWVLTPK
            VPEGDVTVILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINMEYTIDIFFAQT
            WYDRRLKFNSTIKVLRLNSNMVGKIWIPDTFFRNSKKADAHWITTPNRMLRIWNDGRVLY
            TLRLTIDAECQLQLHNFPMDEHSCPLEFSSYGYPREEIVYQWKRSSVEVGDTRSWRLYQF
            SFVGLRNTTEVVKTTSGDYVVMSVYFDLSRRMGYFTIQTYIPCTLIVVLSWVSFWINKDA
            VPARTSLGITTVLTMTTLSTIARKSLPKVSYVTAMDLFVSVCFIFVFSALVEYGTLHYFV
            SNRKPSKDKDKKKKNPAPTIDIRPRSATIQMNNATHLQERDEEYGYECLDGKDCASFFCC
            FEDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNLVYWVSYLYL
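
The header layout above can be split into its parts with a few lines of Python. This is only a minimal sketch, not an official IPI parser. It assumes that in the actual file the header sits on a single line (the sample above is wrapped for display), that the cross-reference block ends at the first space, and that `KEY=VALUE` tokens such as `Tax_Id` and `Gene_Symbol` come before the free-text description. The function name `parse_ipi_header` is hypothetical.

```python
import re

def parse_ipi_header(header):
    """Split an IPI FASTA header into cross-references, tags and description."""
    if not header.startswith(">"):
        raise ValueError("not a FASTA header line")
    # Assumption: the cross-reference block runs up to the first space.
    xref_block, _, rest = header[1:].strip().partition(" ")
    xrefs = {}
    for field in xref_block.split("|"):
        if not field:
            continue  # tolerate a trailing '|'
        db, _, ids = field.partition(":")
        # Several identifiers for one database are separated by ';'.
        xrefs[db] = ids.split(";")
    tags = {}
    words = rest.split()
    # Leading KEY=VALUE tokens precede the description text.
    while words and re.fullmatch(r"[A-Za-z_]+=\S*", words[0]):
        key, _, value = words.pop(0).partition("=")
        tags[key] = value
    return xrefs, tags, " ".join(words)


header = (">IPI:IPI00000023.4|SWISS-PROT:P18507|TREMBL:Q9UDB3|"
          "ENSEMBL:ENSP00000354651|REFSEQ:NP_000807|"
          "H-INV:HIT000321503|VEGA:OTTHUMP00000160874 "
          "Tax_Id=9606 Gene_Symbol=GABRG2 Gamma-aminobutyric-acid "
          "receptor subunit gamma-2 precursor")
xrefs, tags, description = parse_ipi_header(header)
```

Here `xrefs` maps each database code to a list of identifiers, so a database with several entries in the cluster needs no special handling.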
       		

Extension from 26 March 2007

From 26 March 2007 onwards, corresponding to the human/mouse/rat 3.27, zebrafish 3.26, arabidopsis 3.25, chicken 3.21 and cow 3.13 releases, the format is extended to include gene symbols in a Gene_Symbol field, placed between Tax_Id and the description.

e.g.

Previously:

            >IPI:IPI00000023.4|SWISS-PROT:P18507|TREMBL:Q9UDB3|
            ENSEMBL:ENSP00000354651|REFSEQ:NP_000807|
            H-INV:HIT000321503|VEGA:OTTHUMP00000160874 
            Tax_Id=9606 Gamma-aminobutyric-acid receptor subunit 
            gamma-2 precursor [...]
       		

Extended format:

            >IPI:IPI00000023.4|SWISS-PROT:P18507|TREMBL:Q9UDB3|
            ENSEMBL:ENSP00000354651|REFSEQ:NP_000807|
            H-INV:HIT000321503|VEGA:OTTHUMP00000160874 
            Tax_Id=9606 Gene_Symbol=GABRG2 Gamma-aminobutyric-acid 
            receptor subunit gamma-2 precursor [...]
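
Code that reads files from releases on both sides of this change has to treat the gene symbol as optional. The following is a small sketch under that assumption. The helper name `gene_symbol` is hypothetical, and the pattern assumes the value is a run of non-space characters directly after `Gene_Symbol=`.

```python
import re

# Assumed tag layout: whitespace, then "Gene_Symbol=", then a non-space value.
GENE_SYMBOL = re.compile(r"\sGene_Symbol=(\S+)")

def gene_symbol(header):
    """Return the Gene_Symbol value, or None for a header in the earlier format."""
    match = GENE_SYMBOL.search(header)
    return match.group(1) if match else None


old_header = (">IPI:IPI00000023.4|SWISS-PROT:P18507 "
              "Tax_Id=9606 Gamma-aminobutyric-acid receptor subunit gamma-2 precursor")
new_header = (">IPI:IPI00000023.4|SWISS-PROT:P18507 "
              "Tax_Id=9606 Gene_Symbol=GABRG2 Gamma-aminobutyric-acid "
              "receptor subunit gamma-2 precursor")
symbols = (gene_symbol(old_header), gene_symbol(new_header))
```

Returning `None` rather than an empty string keeps "no field present" distinct from any value the field might carry.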
       		

Revised format from 22 February 2006

From 22 February 2006 onwards, corresponding to the human/mouse/rat 3.15, zebrafish 3.14, arabidopsis 3.13, chicken 3.09 and cow 3.01 releases, all RefSeq entries are listed after a single REFSEQ database code, instead of being split between the REFSEQ_NP and REFSEQ_XP codes.

e.g.

Previously:

            >IPI:IPI00000005.1|SWISS-PROT:P01111-3|TREMBL:P54111|
            REFSEQ_NP:NP_002515|REFSEQ_XP:XP_032698;XP_001317|
            ENSEMBL:ENSP00000261444|H-INV:HIT000032298 [...]
       		

Revised format:

            >IPI:IPI00000005.1|SWISS-PROT:P01111-3|TREMBL:P54111|
            REFSEQ:NP_002515;XP_032698;XP_001317|
            ENSEMBL:ENSP00000261444|H-INV:HIT000032298 [...]
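
When older and newer files are processed together, it may help to bring the earlier headers into the revised form first. The sketch below is one possible way to do that, not part of IPI itself. The function name `merge_refseq` is hypothetical, and it assumes the cross-reference block has already been separated from the rest of the header.

```python
def merge_refseq(xref_block):
    """Rewrite REFSEQ_NP / REFSEQ_XP fields as one REFSEQ field."""
    merged = []         # output fields, in their original order
    refseq_ids = []     # accessions gathered from every RefSeq field
    refseq_slot = None  # position in `merged` reserved for the REFSEQ field
    for field in xref_block.split("|"):
        db, _, ids = field.partition(":")
        if db in ("REFSEQ", "REFSEQ_NP", "REFSEQ_XP"):
            if refseq_slot is None:
                # Reserve the place of the first RefSeq field encountered.
                refseq_slot = len(merged)
                merged.append(None)
            refseq_ids.extend(ids.split(";"))
        else:
            merged.append(field)
    if refseq_slot is not None:
        merged[refseq_slot] = "REFSEQ:" + ";".join(refseq_ids)
    return "|".join(merged)


old_block = ("IPI:IPI00000005.1|SWISS-PROT:P01111-3|TREMBL:P54111|"
             "REFSEQ_NP:NP_002515|REFSEQ_XP:XP_032698;XP_001317|"
             "ENSEMBL:ENSP00000261444|H-INV:HIT000032298")
new_block = merge_refseq(old_block)
```

A block that is already in the revised format passes through unchanged, so the function can be applied to every header regardless of release.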
       		
