Help - Relationship between DNA, RNA and Protein Sequences


DNA Strands

This option lets you choose which DNA strand is searched when you compare a DNA query sequence against the DNA databanks. The default, 'both', searches both strands. 'top' searches with the sequence exactly as it is entered in the form. 'bottom' searches with the reverse complement of your input sequence.

A gene is composed of DNA, which in eukaryotic cells is located mainly in the nucleus. DNA is a double helix consisting of two strands, so many tools offer options to search against the top strand, the bottom strand or both.




e.g.   CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG      Top Strand
       |||||||||||||||||||||||||||||||||||||||||||||||
       GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC      Bottom Strand



This bonding between the strands is known as base pairing. A base pair is simply a pair of bases that form hydrogen bonds with each other. Only two base pairs are found in DNA: adenine (A) pairs with thymine (T), and cytosine (C) pairs with guanine (G).
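The pairing rule above can be checked mechanically. The following is a minimal sketch, assuming the two strands are written aligned base for base as in the diagram; the names `PAIRS` and `is_base_paired` are illustrative and not part of any tool described here.

```python
# The two DNA base pairs described above: A with T, and C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_base_paired(top: str, bottom: str) -> bool:
    """Return True if every base in `top` pairs with the aligned base in `bottom`."""
    if len(top) != len(bottom):
        return False
    return all(PAIRS.get(t) == b for t, b in zip(top.upper(), bottom.upper()))


top = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
bottom = "GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC"
print(is_base_paired(top, bottom))
```

Any character other than A, C, G or T is treated as unpaired in this sketch, so ambiguity codes would need extra handling.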

This piece of hypothetical DNA could produce two different RNA sequences, depending on which strand is used as the template. Each RNA is complementary to its template strand, so it reads like the opposite DNA strand except that uracil (U) replaces the thymine (T) found in DNA.



e.g.   CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG  Top Strand
       |||||||||||||||||||||||||||||||||||||||||||||||


       GUUACCGAUCCAUGAUACAUACUCUAGUACUAGAAAUGUUUAGGCUC  RNA


       CAAUGGCUAGGUACUAUGUAUGAGAUCAUGAUCUUUACAAAUCCGAG  RNA

       |||||||||||||||||||||||||||||||||||||||||||||||
       GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC  Bottom Strand
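As a rough illustration of how the two RNA sequences in the diagram arise, the sketch below builds the RNA that pairs with a given DNA template, position by position and in the same left-to-right alignment as the diagram. The names `RNA_PAIRS` and `transcribe` are hypothetical, and strand direction (5' to 3') is deliberately ignored here.

```python
# RNA base that pairs with each DNA template base (U is used in place of T).
RNA_PAIRS = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template: str) -> str:
    """Return the RNA complementary to a DNA template, aligned base for base."""
    return "".join(RNA_PAIRS[base] for base in template.upper())


top = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
bottom = "GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC"

print(transcribe(top))     # RNA made from the top strand as template
print(transcribe(bottom))  # RNA made from the bottom strand as template
```

The RNA made from one strand has the same letters as the other strand with U in place of T, which is why the two RNA lines in the diagram mirror the two DNA strands.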

If your query sequence was as follows:


       GUUACCGAUCCAUGAUACAUACUCUAGUACUAGAAAUGUUUAGGCUC 

This would match the top strand by base pairing.


e.g.   CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG  Top Strand
       |||||||||||||||||||||||||||||||||||||||||||||||
       GUUACCGAUCCAUGAUACAUACUCUAGUACUAGAAAUGUUUAGGCUC  Original sequence 

If the database entry consisted of the sequence in the bottom strand, no match would occur. In that case you can calculate the reverse complement of your query sequence (written here aligned base for base with the original):


e.g.   GUUACCGAUCCAUGAUACAUACUCUAGUACUAGAAAUGUUUAGGCUC  Original sequence


       CAAUGGCUAGGUACUAUGUAUGAGAUCAUGAUCUUUACAAAUCCGAG  Complement sequence

This sequence would then match the bottom strand:


       CAAUGGCUAGGUACUAUGUAUGAGAUCAUGAUCUUUACAAAUCCGAG  Complement sequence
       |||||||||||||||||||||||||||||||||||||||||||||||
       GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC  Bottom Strand
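The 'top', 'bottom' and 'both' strand options can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, exact substring matching stands in for a real similarity search, and an RNA query is handled by reading U as T.

```python
# Complement of each base; U is accepted so RNA queries can be used as well.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse, giving the other strand read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def search(query: str, entry: str, strand: str = "both") -> list:
    """Return which strands of the query ('top', 'bottom') occur in the entry."""
    query = query.upper().replace("U", "T")
    entry = entry.upper()
    hits = []
    if strand in ("top", "both") and query in entry:
        hits.append("top")
    if strand in ("bottom", "both") and reverse_complement(query) in entry:
        hits.append("bottom")
    return hits


entry = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
print(search("CAATGGCTAG", entry))           # query as entered
print(search("CTCGGATTTG", entry))           # found only via its reverse complement
print(search("CTCGGATTTG", entry, "top"))    # restricted to the top strand
```

Searching with 'both' simply tries the query as entered and its reverse complement, so a query taken from either strand of the entry can be found.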



Reading Frames

Once the RNA has been transcribed, it travels from the DNA template to a ribosome, often on the endoplasmic reticulum, to be translated during protein synthesis. Every three bases in the RNA sequence (a codon) code for one amino acid. Because you may not know at which position translation starts, you could begin at any of the first three positions from either end of the sequence. There are therefore six possible predicted protein sequences for such a piece of code, known as the six reading frames: three forward frames and three reverse frames.



e.g. 5' CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG  3'  DNA

Forward Frames

       CAA TGG CTA GGT ACT ATG TAT GAG ATC ATG ATC TTT ACA AAT CCG AG DNA
        Q   W   L   G   T   M   Y   E   I   M   I   F   T   N   P     Amino Acids

     C AAT GGC TAG GTA CTA TGT ATG AGA TCA TGA TCT TTA CAA ATC CGA G  DNA
        N   G   *   V   L   C   M   R   S   *   S   L   Q   I   R     Amino Acids

    CA ATG GCT AGG TAC TAT GTA TGA GAT CAT GAT CTT TAC AAA TCC GAG    DNA
        M   A   R   Y   Y   V   *   D   H   D   L   Y   K   S   E     Amino Acids

Reverse Frames

Here we take the reverse-complementary (bottom) strand and reverse it so that it starts with the 5' end.
 
   5' CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG  3'  DNA Top Strand
      |||||||||||||||||||||||||||||||||||||||||||||||
   3' GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC  5'  DNA Bottom (Complementary) Strand

   5' CTCGGATTTGTAAAGATCATGATCTCATACATAGTACCTAGCCATTG  3'  DNA Bottom (Complementary) Strand Reversed 


       CTC GGA TTT GTA AAG ATC ATG ATC TCA TAC ATA GTA CCT AGC CAT TG DNA
        L   G   F   V   K   I   M   I   S   Y   I   V   P   S   H   X Amino Acids

     C TCG GAT TTG TAA AGA TCA TGA TCT CAT ACA TAG TAC CTA GCC ATT G  DNA
        S   D   L   *   R   S   *   S   H   T   *   Y   L   A   I  X  Amino Acids

    CT CGG ATT TGT AAA GAT CAT GAT CTC ATA CAT AGT ACC TAG CCA TTG    DNA
        R   I   C   K   D   H   D   L   I   H   S   T   *   P   L     Amino Acids

Convert a nucleotide sequence to a protein sequence with Transeq
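The six frames worked through above can be reproduced with a short script. This is a minimal sketch, not how Transeq itself is implemented: it assumes the standard genetic code, accepts only the bases A, C, G and T, and silently drops an incomplete codon at the end of a frame (the reverse frames above mark these with X instead). All function names are illustrative.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read from its 5' end."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> dict:
    """Return the three forward (F1-F3) and three reverse (R1-R3) translations."""
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"F{offset + 1}"] = translate(dna[offset:])
        frames[f"R{offset + 1}"] = translate(reverse[offset:])
    return frames


dna = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
for name, protein in six_frames(dna).items():
    print(name, protein)
```

Each frame differs only in its starting offset (0, 1 or 2), applied either to the sequence as given or to its reverse complement.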





Molecule Types

Code     Description
Protein  Proteins are macromolecules made up of 20 different amino acids, also referred to as residues.
DNA      This is genomic DNA, a sequence derived directly from the DNA of an organism.
RNA      This is genomic RNA, a sequence derived directly from the genomic RNA of certain organisms.
preRNA   This is precursor RNA: an RNA transcript before it is processed into mRNA, rRNA, tRNA or another cellular RNA species, that is, any RNA that is not yet the mature RNA product.
mRNA     This is messenger RNA, a copy of the information carried by a gene on the DNA. The role of mRNA is to carry the information contained in DNA to the translation machinery (ribosomes).
rRNA     This is ribosomal RNA, a component of the ribosomes, the protein-synthesis factories of the cell.
tRNA     This is transfer RNA, the information adapter molecule. It is the direct interface between the amino-acid sequence of a protein and the information in DNA: it decodes the information carried from the DNA by the mRNA.
snRNA    This is small nuclear RNA, a term that covers a number of small RNA molecules found in the nucleus. These RNA molecules are important in several processes, including RNA splicing and maintenance of the telomeres, or chromosome ends. They are always found associated with specific proteins, and the complexes are referred to as small nuclear ribonucleoproteins (snRNPs).



























