
From disease to protein to variant

UniProt makes it easy to identify and retrieve disease-related proteins and the disease-causing variants that they contain. One way to do this is to follow the steps below. In this example, we look up spinal muscular atrophy 2, which is also known as SMA2.

On the UniProt website, select ‘Human diseases’ from the drop-down menu next to the search box, enter the disease name ‘sma2’ in the search box, and click on the ‘Search’ button (Figure 70).

Figure 70 ‘Human diseases’ search for ‘sma2’.

The search returns a link to a page which gives an overview of the disease and links to resources which can tell you more (Figure 71). 

Figure 71 Search result for ‘sma2’.

Click on the ‘View proteins’ link to access a list of all proteins known to be genetically associated with the disease. In this case, only one protein is currently linked: survival motor neuron protein (Figure 72).

Figure 72 ‘spinal muscular atrophy 2’ disease mapped to UniProtKB results.

Click on the ‘Q16637’ accession number in the results table to open the entry, then click on the ‘Disease & Variants’ link in the menu on the left side of the page to jump directly to that section (Figure 73). The protein is associated with a number of diseases, including SMA2, and the entry describes each disease as well as the variants in the protein that are associated with it.

Figure 73 ‘Disease & Variants’ section of UniProtKB entry ‘Q16637’ describing SMA2.
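
The same disease annotation can also be read programmatically. The sketch below is only an illustration, not an official client: the URL pattern and the JSON field names (‘comments’, ‘commentType’, ‘disease’, ‘acronym’, ‘diseaseId’) are assumptions about the UniProt REST service and should be checked against the current API documentation. The helper names ‘fetch_entry’ and ‘disease_names’ are hypothetical.

```python
import json
import urllib.request

# Assumed URL pattern for a single UniProtKB entry in JSON format.
ENTRY_URL = "https://rest.uniprot.org/uniprotkb/{accession}.json"


def fetch_entry(accession):
    """Download one UniProtKB entry as a dict (requires network access)."""
    url = ENTRY_URL.format(accession=accession)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def disease_names(entry):
    """Return (acronym, name) pairs for each disease comment in an entry.

    Assumes disease annotations are stored as comments whose
    'commentType' is 'DISEASE', each with a nested 'disease' object.
    """
    pairs = []
    for comment in entry.get("comments", []):
        if comment.get("commentType") != "DISEASE":
            continue
        disease = comment.get("disease", {})
        pairs.append((disease.get("acronym"), disease.get("diseaseId")))
    return pairs
```

With network access, ‘disease_names(fetch_entry("Q16637"))’ would list the diseases annotated on the entry, provided the assumed field names match the service's response.
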

In addition to the website, UniProt provides variation information through FTP downloads. The humsavar.txt file is an index of manually curated human polymorphisms and disease mutations in UniProtKB/Swiss-Prot. Additional files, provided for a number of species including human, list variants that are not in UniProtKB and come from sources such as 1000 Genomes, Ensembl and COSMIC (v71). These files cover variants that modify the protein sequence, including nonsynonymous or missense variants, stop lost, stop gained and initiator codon variants.
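
Once humsavar.txt has been downloaded, its variant lines can be filtered by accession with a few lines of Python. This is a minimal sketch under stated assumptions: it assumes each data line holds seven whitespace-separated columns (gene name, accession, variant feature identifier beginning with ‘VAR_’, amino acid change, category, dbSNP identifier, disease name) and that header and footer lines do not match that pattern. The column layout has changed between releases, so check the header of the file you download. The names ‘Variant’, ‘parse_variant_line’ and ‘variants_for’ are hypothetical.

```python
from collections import namedtuple

# Assumed column layout of a humsavar.txt data line.
Variant = namedtuple(
    "Variant", "gene accession ftid aa_change category dbsnp disease"
)


def parse_variant_line(line):
    """Parse one data line into a Variant, or return None for other lines."""
    # Split on whitespace at most six times so that a disease name
    # containing spaces stays together in the final field.
    fields = line.split(None, 6)
    # Data lines are recognised by a feature identifier starting with "VAR_".
    if len(fields) < 6 or not fields[2].startswith("VAR_"):
        return None
    if len(fields) == 6:
        fields.append("-")  # no disease name given on this line
    return Variant(*(field.strip() for field in fields))


def variants_for(lines, accession):
    """Return all parsed variants whose accession matches."""
    parsed = (parse_variant_line(line) for line in lines)
    return [v for v in parsed if v is not None and v.accession == accession]
```

For example, ‘variants_for(open("humsavar.txt"), "Q16637")’ would collect the curated variants listed for the survival motor neuron protein, as long as the file matches the assumed layout.
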