PANDIT
Protein and Associated Nucleotide Domains with Inferred Trees

PANDIT flat file description


Note
All of the information in the PANDIT database is contained within the downloadable flat file. The information for each family is arranged as follows:

File description
field  no. lines permitted  description

Block repeated for each family in PANDIT (FAM through //):

FAM    1                    family number = Pfam accession number
PID    1                    Pfam ID = short name for family
DES    >= 1                 description = longer descriptive name
IPR    0 | 1                InterPro accession number, if available
COM    >= 0                 comments = more information on domain
ANO    1                    number of seqs in AA seq alignment (PANDIT-aa)
ALN    1                    length (in AAs, incl gaps) of AA seq alignments (PANDIT-aa & PANDIT-aa-restricted)
AID    1                    PANDIT-aa ave AA pairwise identity
APH    1                    PANDIT-aa AA seq alignment phylogeny (tree of evolutionary relationships + branch lengths)
ATP    1                    PANDIT-aa AA seq alignment topology (tree of evolutionary relationships, no branch lengths)
ATL    1                    PANDIT-aa AA seq alignment tree length (sum of branch lengths in APH field)
DNO    1                    number of seqs in DNA (and restricted AA) seq alignments (PANDIT-dna & PANDIT-aa-restricted)
DLN    1                    length (in bp, incl gaps) of DNA seq alignment (should equal 3*ALN)
DID    1                    PANDIT-dna ave DNA pairwise identity
DPH    1                    PANDIT-dna DNA seq alignment phylogeny (tree of evolutionary relationships + branch lengths)
DTP    1                    PANDIT-dna DNA seq alignment topology (tree of evolutionary relationships, no branch lengths)
DTL    1                    PANDIT-dna DNA seq alignment tree length (sum of branch lengths in DPH field)
RID    1                    PANDIT-aa-restricted ave AA pairwise identity
RPH    1                    PANDIT-aa-restricted AA seq alignment phylogeny (tree of evolutionary relationships + branch lengths)
RTP    1                    PANDIT-aa-restricted AA seq alignment topology (tree of evolutionary relationships, no branch lengths)
RTL    1                    PANDIT-aa-restricted AA seq alignment tree length (sum of branch lengths in RPH field)
LNK    ANO                  list of correspondences linking seq names NAM (see below) to their respective SWISS-PROT accession numbers (each followed by :) and (if available) EMBL accession numbers
AMK    1                    amino acid sequence mask, indicating which sites of AA seq alignments (PANDIT-aa & PANDIT-aa-restricted) are deemed reliable ('x') and unreliable ('.') by HMM analysis
DMK    1                    DNA sequence mask, indicating which sites of DNA seq alignments (PANDIT-dna) are deemed reliable ('x') and unreliable ('.') by HMM analysis

Block repeated ANO times within each family (NAM through TRN):

NAM    1                    seq name (see below for explanation of seq names)
ASQ    1                    aligned AA seq
DSQ    0 | 1                aligned DNA seq, if available
TRN    0 | 1                info regarding comparisons of different genetic code translations of DSQ with ASQ, if DSQ is available; used to check for correct and accurate DNA seq corresp to each AA seq; contact pandit@ebi.ac.uk for further details

End of family:

//     1                    indicates end of family
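
The layout above can be read with a small parser. The following is a minimal sketch in Python, not an official PANDIT tool. It assumes that each line of the flat file starts with the field code, followed by whitespace and then the value; the exact line layout is not stated here, so check it against a real file before relying on this. The function names parse_pandit and check_family are hypothetical, as is the data in the usage example below.

```python
from collections import defaultdict

# Fields that follow a NAM line inside the per-sequence block
# (the block that is repeated ANO times).
SEQ_FIELDS = {"ASQ", "DSQ", "TRN"}


def parse_pandit(lines):
    """Yield one dict per family from an iterable of flat-file lines.

    Assumption: every line is "<field code><whitespace><value>".
    """
    fields, seqs = defaultdict(list), []
    for raw in lines:
        parts = raw.split(None, 1)
        if not parts:
            continue  # skip blank lines
        code = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if code == "//":
            # End of family: emit what was collected and start afresh.
            yield {"fields": dict(fields), "seqs": seqs}
            fields, seqs = defaultdict(list), []
        elif code == "NAM":
            seqs.append({"NAM": value})
        elif code in SEQ_FIELDS:
            if not seqs:
                raise ValueError(f"{code} line found before any NAM line")
            seqs[-1][code] = value
        else:
            # Family-level fields; DES, COM and LNK may span several lines.
            fields[code].append(value)


def check_family(family):
    """Return a list of inconsistencies with the field descriptions."""
    fields, seqs = family["fields"], family["seqs"]
    problems = []
    if "ANO" in fields and int(fields["ANO"][0]) != len(seqs):
        problems.append("ANO does not match the number of NAM blocks")
    if "ALN" in fields and "DLN" in fields:
        if int(fields["DLN"][0]) != 3 * int(fields["ALN"][0]):
            problems.append("DLN is not 3 * ALN")
    for mask in ("AMK", "DMK"):
        for value in fields.get(mask, []):
            if set(value) - {"x", "."}:
                problems.append(f"{mask} contains characters other than 'x' and '.'")
    return problems
```

A usage example on made-up data (the accession, names and sequences are invented for illustration only):

```python
sample = [
    "FAM   PF00000",
    "ANO   2",
    "ALN   4",
    "DLN   12",
    "AMK   xx.x",
    "NAM   ABC_HUMAN/10-13",
    "ASQ   MK-L",
    "DSQ   ATGAAA---CTG",
    "NAM   ABC_MOUSE/12-15",
    "ASQ   MKAL",
    "//",
]
for family in parse_pandit(sample):
    print(family["fields"]["FAM"], len(family["seqs"]), check_family(family))
```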

Comment
The NAM fields give the name of each sequence in the family. Each name consists of the name of the SWISS-PROT entry from which the AA sequence in Pfam is derived, followed by a /, followed by the starting and ending residue positions of the sequence in that SWISS-PROT entry, separated by a -.
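
The naming scheme can be split back into its parts as sketched below. This is an illustrative Python helper, not part of PANDIT; the names SeqName and parse_nam are hypothetical, and ABC_HUMAN/10-13 is an invented sequence name used only to show the shape of the input.

```python
from typing import NamedTuple


class SeqName(NamedTuple):
    entry: str  # name of the SWISS-PROT entry
    start: int  # starting residue position in that entry
    end: int    # ending residue position in that entry


def parse_nam(name):
    """Split a NAM value of the form ENTRY/start-end into its parts."""
    # Split on the last "/" so the entry name is kept whole.
    entry, slash, span = name.strip().rpartition("/")
    if not slash or not entry:
        raise ValueError(f"no '/' separator in sequence name: {name!r}")
    start_text, dash, end_text = span.partition("-")
    if not dash:
        raise ValueError(f"no '-' between residue positions: {name!r}")
    # int() tolerates surrounding spaces, so "10 - 13" also parses.
    return SeqName(entry, int(start_text), int(end_text))


parsed = parse_nam("ABC_HUMAN/10-13")
print(parsed.entry, parsed.start, parsed.end)
```

If the two positions are inclusive, which is an assumption here, the ungapped length of the domain sequence would be end - start + 1.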