|
PICR - Protein Identifier Cross-Reference Service
Using PICR is very simple with very few options that need setting.
Main Search Page Options
The main search form is divided into four main sections:
PICR can map either protein identifiers or protein sequences, so set the data type
selector in the Input Data section accordingly. You can paste either a list
of protein identifiers (one per line) or protein sequences in FASTA format.
Alternatively, you can upload a file containing this data.
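The two input styles described above (one identifier per line, or sequences in FASTA format) can be told apart mechanically. The following is a minimal sketch, not PICR's own code: the function name `parse_input` and its return shape are assumptions made for illustration, and it treats any input containing a line that starts with `>` as FASTA.

```python
def parse_input(text):
    """Classify pasted input as identifiers or FASTA sequences.

    Hypothetical helper, not part of PICR. Returns a tuple:
      ("identifier", [id, ...])             for one identifier per line
      ("sequence", [(header, seq), ...])    for FASTA input
    """
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    # FASTA records begin with a '>' header line.
    if any(line.startswith(">") for line in lines):
        records = []
        header, chunks = None, []
        for line in lines:
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                # Sequence data may be wrapped over several lines.
                chunks.append(line)
            # Lines before the first header are ignored.
        if header is not None:
            records.append((header, "".join(chunks)))
        return "sequence", records

    return "identifier", lines
```

A caller could use the first element of the result to decide how the data type selector should be set before submitting.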
The Input Parameters section can be used to refine your search. By default, PICR does not restrict mappings based on taxonomic information. If you want to obtain mappings for a specific organism, select it from the pull-down list. If the organism you wish to limit to is not in the list, you can type a partial name in the space provided and query the NEWT taxonomy using the Ontology Lookup Service (OLS). A list of matching organisms should appear. Any value selected here overrides the choice made in the species list above.
Select which databases you wish to map to from the Mapping Databases section. You can map to any number of databases. Note that a single choice can sometimes refer to more than one database. For example, selecting Ensembl will attempt to map to all species-specific Ensembl releases; the same applies to Vega, Trome and RefSeq. Selecting SwissProt and TrEMBL will also include the splice variant databases of each source database.
Executing A Search
Once all search parameters have been selected, choose the desired output format and submit the search.
Searches will try to collate information from multiple databases and may involve SOAP queries to the NCBI. While your search is being executed, a progress bar is displayed and refreshed every 2 seconds. Once your search is done, the appropriate result page is shown.
Understanding The Results
Simple HTML view
The table is organized such that each row is a submitted accession or sequence and each column represents a selected mapping database. An empty cell means that no mappings could be found to the corresponding database for the search parameters you entered.
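The table layout described above (one row per submitted accession or sequence, one column per selected database, empty cells where nothing was found) can be sketched as follows. This is an illustration only: `build_result_table` and the nested-dictionary shape of `mappings` are assumptions, and the database names and mapped accessions in the usage example are placeholders rather than real results.

```python
def build_result_table(submitted, databases, mappings):
    """Arrange mappings as rows (submitted entries) by columns (databases).

    Hypothetical helper, not part of PICR.
    submitted: list of submitted accessions or sequence labels
    databases: list of selected mapping databases (column order)
    mappings:  dict of submitted entry -> dict of database -> list of accessions
    """
    rows = [["Submitted"] + list(databases)]
    for entry in submitted:
        per_db = mappings.get(entry, {})
        # An empty string stands for an empty cell: no mapping was found
        # to that database for this entry.
        cells = [", ".join(per_db.get(db, [])) for db in databases]
        rows.append([entry] + cells)
    return rows


if __name__ == "__main__":
    # Placeholder data, not real PICR output.
    table = build_result_table(
        ["P29375", "UNKNOWN_1"],
        ["DB_A", "DB_B"],
        {"P29375": {"DB_A": ["EXAMPLE_1", "EXAMPLE_2"]}},
    )
    for row in table:
        print("\t".join(row))
```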
By default, PICR only returns mappings to active database entries, though many more might
be available. PICR queries the UniProt Archive (UniParc), a historical archive
of all known protein entries from over 60 protein sequence databases. When entries are
deleted or made obsolete in the source databases, they are not deleted from UniParc but
are marked as inactive. PICR can optionally include these inactive mappings in the results.
Entries that map to an active SwissProt or TrEMBL entry may also have additional mappings, which will be shown in blue. These mappings are obtained from the UniProt Knowledgebase and, while valid, might not have 100% sequence identity to the submitted accession.
Once a search has completed, results can be saved in CSV format or another search can be started. When saving, a dialog box will prompt you to save or open your file.
If the submitted accession or sequence is not present in the UniProt Archive, it cannot be mapped at this time.
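The behaviour described above (lookup in an archive keyed by exact sequence, inactive entries hidden unless requested, and no mapping at all when the sequence is absent) can be sketched with a small in-memory model. This is an assumption-laden illustration, not how PICR or UniParc is implemented: `CrossRef`, `map_sequence` and the dictionary-based `archive` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrossRef:
    """One database entry attached to an archived sequence (hypothetical model)."""
    database: str
    accession: str
    active: bool


def map_sequence(archive, sequence, databases, only_active=True):
    """Look up a sequence by exact identity and collect its cross-references.

    archive:   dict of sequence -> list of CrossRef (stand-in for the archive)
    databases: databases to report on
    Returns a dict of database -> list of accessions, or None when the
    sequence is not in the archive and therefore cannot be mapped.
    """
    xrefs = archive.get(sequence.upper())
    if xrefs is None:
        return None

    result = {db: [] for db in databases}
    for xref in xrefs:
        # Inactive entries stay in the archive but are skipped by default.
        if xref.database in result and (xref.active or not only_active):
            result[xref.database].append(xref.accession)
    return result
```

Passing `only_active=False` in this sketch corresponds to asking for inactive mappings as well; an empty list for a database corresponds to an empty cell in the result table.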
The detailed HTML view contains additional information not shown in the simple HTML view. Mappings are done on the basis of 100% sequence identity. As such, one protein accession (P29375 in this example) can map to more than one protein sequence. Each sequence will have a UPI (UniParc Protein Identifier) as well as multiple cross-references. Each cross-reference will contain:
|