pdbe.org/sifts
SIFTS

Quick Access

The FTP site provides access to data from the SIFTS initiative. The public FTP account is maintained by the European Bioinformatics Institute.

Data for an individual entry can be found either in a path like this:

ftp://ftp.ebi.ac.uk/pub/databases/msd/sifts/xml/1xyz.xml.gz - where 1xyz is the actual PDB id code

or in a path like this:

ftp://ftp.ebi.ac.uk/pub/databases/msd/sifts/split_xml/xy/1xyz.xml.gz - where 'xy' are the second and third characters of the PDB id code and 1xyz is the actual PDB id code.
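
The two path patterns above can be built from a PDB id code with a small helper. This is only a sketch: the function name sifts_xml_urls is hypothetical, and lowercasing the code is an assumption based on the lowercase example (1xyz) shown above.

```python
SIFTS_BASE = "ftp://ftp.ebi.ac.uk/pub/databases/msd/sifts"


def sifts_xml_urls(pdb_id: str) -> dict:
    """Return both per-entry XML URLs for a four-character PDB id code."""
    # Assumption: file names use the lowercase form of the id code.
    code = pdb_id.strip().lower()
    if len(code) != 4 or not (code.isascii() and code.isalnum()):
        raise ValueError(f"not a four-character PDB id code: {pdb_id!r}")
    # The split_xml subdirectory is named after the second and third characters.
    middle = code[1:3]
    return {
        "xml": f"{SIFTS_BASE}/xml/{code}.xml.gz",
        "split_xml": f"{SIFTS_BASE}/split_xml/{middle}/{code}.xml.gz",
    }


if __name__ == "__main__":
    for name, url in sifts_xml_urls("1xyz").items():
        print(name, url)
```

Returning both URLs leaves the choice of directory layout to the caller.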

Residue-level cross-reference data from the new PDBe database is available in the xml_remediated directory. As before, this directory contains one file per PDB entry, in the same format. Note that the tab-delimited ASCII text files and the PDBe view web pages correspond to the xml_remediated files.

The legacy residue-level cross-reference data is available in XML format in the xml directory. This directory contains a file for each PDB entry.
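
Since each per-entry file is gzip-compressed XML (.xml.gz), a downloaded file can be inspected with the Python standard library alone. The sketch below makes no assumption about the SIFTS schema: it only decompresses, parses, and counts element names. The helper names and the element names in the demo are hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET


def parse_gzipped_xml(data: bytes) -> ET.Element:
    """Decompress gzip bytes and parse them as an XML document."""
    return ET.fromstring(gzip.decompress(data))


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def count_elements(root: ET.Element) -> dict:
    """Count how often each element name occurs, including the root."""
    counts = {}
    for element in root.iter():
        name = local_name(element.tag)
        counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical stand-in document, not the real SIFTS schema.
    demo = gzip.compress(b"<entry><entity><residue/><residue/></entity></entry>")
    print(count_elements(parse_gzipped_xml(demo)))
```

For a file on disk, the bytes can be read with open(path, "rb").read() and passed to parse_gzipped_xml.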

Chain-level mapping data for all PDB chains is also available in two formats: tab-delimited ASCII text, located in the "text" directory, and comma-separated values (CSV), located in the "csv" directory.

Both the "text" and "csv" directories contain a number of flat files exported from the PDBe database. The contents of each file are as follows:



pdb_chain_uniprot.lst, pdb_chain_uniprot.csv: A summary of the PDBe to UniProt residue-level mapping, showing the start and end residues of the mapping using SEQRES, PDB sequence and UniProt numbering.
pdb_chain_taxonomy.lst, pdb_chain_taxonomy.csv: A summary of the NCBI tax_id(s), scientific_name(s) and chain type for each PDB chain that has been processed.
pdb_pubmed.lst, pdb_pubmed.csv: A summary of the PubMed id(s) associated with each PDB entry, together with an ordinal number.
pdb_chain_enzyme.lst, pdb_chain_enzyme.csv: A summary of the EC number(s) (derived via the UniProt mapping) for each PDB chain that has been processed.
pdb_chain_go.lst, pdb_chain_go.csv: A summary of the GO identifier(s) (derived via the UniProt mapping) for each PDB chain that has been processed.
pdb_chain_interpro.lst, pdb_chain_interpro.csv: A summary of the InterPro identifier(s) (derived via the UniProt mapping) for each PDB chain that has been processed.
pdb_chain_pfam.lst, pdb_chain_pfam.csv: A summary of the Pfam domain identifier(s) (derived via the UniProt mapping) for each PDB chain that has been processed.
pdb_chain_cath_uniprot.lst, pdb_chain_cath_uniprot.csv: A summary of the CATH identifier(s) and UniProt primary accession number(s) for each PDB chain that has been processed.
pdb_chain_scop_uniprot.lst, pdb_chain_scop_uniprot.csv: A summary of the SCOP identifier(s) and UniProt primary accession number(s) for each PDB chain that has been processed.
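
As a rough illustration of how one of the CSV flat files listed above might be read, the sketch below uses the standard csv module. It rests on assumptions not stated on this page: that a file may begin with comment lines starting with "#", and that the first remaining line is a header row. The column names and values in the sample (PDB, CHAIN, SP_PRIMARY, P00000) are hypothetical placeholders and should be checked against the header of the file you actually download.

```python
import csv
import io


def read_flat_csv(handle):
    """Yield one dict per data row, keyed by the file's own header row.

    Assumption: lines starting with '#' are comments and can be skipped.
    """
    data_lines = (line for line in handle if not line.startswith("#"))
    yield from csv.DictReader(data_lines)


def group_rows(rows, key_columns):
    """Group rows by the values found in the given (caller-chosen) columns."""
    groups = {}
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        groups.setdefault(key, []).append(row)
    return groups


# Hypothetical sample; real column names and values may differ.
SAMPLE = """# hypothetical comment line
PDB,CHAIN,SP_PRIMARY
1xyz,A,P00000
1xyz,B,P00000
"""

if __name__ == "__main__":
    rows = list(read_flat_csv(io.StringIO(SAMPLE)))
    for key, members in group_rows(rows, ["PDB"]).items():
        print(key, len(members))
```

Taking the column names from the header row, rather than hard-coding them, keeps the reader usable across the different flat files.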

