Help - Relationship between DNA, RNA and Protein Sequences
DNA Strands
This option lets you choose which DNA strand to search with when you compare a DNA sequence against the DNA databanks. The default is 'both', which searches both strands. 'top' means the sequence is searched exactly as it was input into the form. 'bottom' means the reverse complement of your input sequence is searched against the database entries.
A gene is composed of DNA, which in eukaryotic cells is located in the nucleus. DNA is a double helix consisting of 2 strands held together by bonds between their bases. Many tools have options that let you search against the top, bottom or both strands of DNA.
Each such bond between the strands is known as a base pair. A base pair is simply a pair of bases which form bonds with each other. There are only two base pairs found in DNA: adenine(A) and thymine(T) form one base pair, and cytosine(C) and guanine(G) form the other.
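The pairing rule can be checked with a minimal Python sketch. The function name `strands_pair` is illustrative only, and both strands are assumed to be written left to right in aligned order (top strand 5' to 3', bottom strand 3' to 5'), as in the examples on this page.

```python
# Base pairs found in DNA: A pairs with T, C pairs with G.
BASE_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def strands_pair(top: str, bottom: str) -> bool:
    """Return True if every base in `top` pairs with the aligned base in `bottom`."""
    if len(top) != len(bottom):
        return False
    return all(BASE_PAIRS.get(t) == b for t, b in zip(top.upper(), bottom.upper()))


# The hypothetical DNA used in the examples on this page.
top = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
bottom = "GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC"
print(strands_pair(top, bottom))
```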
This piece of hypothetical DNA could produce 2 RNA sequences, depending on which strand is used as the template. Each RNA is complementary to its template strand, so it resembles the opposite DNA strand except that uracil(U) replaces the thymine(T) found in DNA.
e.g. CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG Top Strand
|||||||||||||||||||||||||||||||||||||||||||||||
GUUACCGAUCCAUGAUACAUACUCUAGUACUAGAAAUGUUUAGGCUC RNA
CAAUGGCUAGGUACUAUGUAUGAGAUCAUGAUCUUUACAAAUCCGAG RNA
|||||||||||||||||||||||||||||||||||||||||||||||
GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC Bottom Strand
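The two RNA sequences in the example above can be reproduced with a short Python sketch. It assumes each RNA is written in the same left-to-right order as the template strand it pairs with, as in the diagram, so no reversal is applied. The function name `rna_from_template` is illustrative and not part of any tool described here.

```python
# Pairing of a DNA template base with the RNA base built opposite it:
# A -> U, C -> G, G -> C, T -> A (uracil takes the place of thymine in RNA).
DNA_TO_RNA_PAIR = str.maketrans("ACGT", "UGCA")


def rna_from_template(template: str) -> str:
    """Return the RNA that pairs with a DNA template, in the same left-to-right order."""
    return template.upper().translate(DNA_TO_RNA_PAIR)


top = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
bottom = "GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC"

print(rna_from_template(top))     # RNA paired with the top strand
print(rna_from_template(bottom))  # RNA paired with the bottom strand
```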
If your query sequence was as follows:
GUUACCGAUCCAUGAUACAUACUCUAGUACUAGAAAUGUUUAGGCUC
This would match (pair with) the top strand.
e.g. CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG Top Strand
|||||||||||||||||||||||||||||||||||||||||||||||
GUUACCGAUCCAUGAUACAUACUCUAGUACUAGAAAUGUUUAGGCUC Original sequence
If the database entry consisted of the bottom strand instead, no match would occur. To cover this case, the reverse complement of your query sequence can be calculated:
e.g. GUUACCGAUCCAUGAUACAUACUCUAGUACUAGAAAUGUUUAGGCUC Original sequence
CAAUGGCUAGGUACUAUGUAUGAGAUCAUGAUCUUUACAAAUCCGAG Complement sequence
This sequence would then match the bottom strand:
CAAUGGCUAGGUACUAUGUAUGAGAUCAUGAUCUUUACAAAUCCGAG Complement sequence
|||||||||||||||||||||||||||||||||||||||||||||||
GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC Bottom Strand
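The 'top', 'bottom' and 'both' options can be sketched in Python as follows. This is only an illustration of the idea, assuming DNA sequences written 5' to 3' and a plain exact substring search. `queries_for_strand` and `entry_matches` are hypothetical names, and real search tools score approximate alignments rather than exact substrings. The diagrams above write the bottom strand 3' to 5', so the reversal is implicit there; the code reverses explicitly.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse so the result reads 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def queries_for_strand(query: str, strand: str = "both") -> list:
    """Return the sequence(s) that would be searched for a given strand option."""
    query = query.upper()
    if strand == "top":
        return [query]
    if strand == "bottom":
        return [reverse_complement(query)]
    if strand == "both":
        return [query, reverse_complement(query)]
    raise ValueError("strand must be 'top', 'bottom' or 'both'")


def entry_matches(query: str, entry: str, strand: str = "both") -> bool:
    """Assumed exact-match search: True if any searched sequence occurs in the entry."""
    return any(q in entry.upper() for q in queries_for_strand(query, strand))


# Database entry holding the top strand; query taken from the bottom strand (5' to 3').
entry = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
query = "CTCGGATTTGTAAAG"

for option in ("top", "bottom", "both"):
    print(option, entry_matches(query, entry, option))
```

With a query taken from the bottom strand, only the 'bottom' and 'both' options find the entry, which is why 'both' is a sensible default when the orientation of the query is unknown.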
Reading Frames
Once the RNA has been transcribed, it travels from the DNA template to the ribosome on the endoplasmic reticulum to be translated for protein synthesis. Each group of 3 bases (a codon) in the RNA sequence codes for 1 amino acid. Because you may not know at which position to start when predicting the protein sequence produced by this code, you can start at any of 3 positions from either end of the sequence. Thus there are 6 possible predicted protein sequences resulting from such a piece of code. These are known as the 6 possible reading frames: 3 forward frames and 3 reverse sense frames.
e.g. 5' CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG 3' DNA
Forward Frames
CAA TGG CTA GGT ACT ATG TAT GAG ATC ATG ATC TTT ACA AAT CCG AG DNA
Q W L G T M Y E I M I F T N P Amino Acids
C AAT GGC TAG GTA CTA TGT ATG AGA TCA TGA TCT TTA CAA ATC CGA G DNA
N G * V L C M R S * S L Q I R Amino Acids
CA ATG GCT AGG TAC TAT GTA TGA GAT CAT GAT CTT TAC AAA TCC GAG DNA
M A R Y Y V * D H D L Y K S E Amino Acids
Reverse Frames
Here we take the complementary (bottom) strand and reverse it so that it starts with the 5' end.
5' CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG 3' DNA Top Strand
   |||||||||||||||||||||||||||||||||||||||||||||||
3' GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC 5' DNA Bottom (Complementary) Strand
5' CTCGGATTTGTAAAGATCATGATCTCATACATAGTACCTAGCCATTG 3' DNA Bottom (Complementary) Strand Reversed
CTC GGA TTT GTA AAG ATC ATG ATC TCA TAC ATA GTA CCT AGC CAT TG DNA
L G F V K I M I S Y I V P S H X Amino Acids
C TCG GAT TTG TAA AGA TCA TGA TCT CAT ACA TAG TAC CTA GCC ATT G DNA
S D L * R S * S H T * Y L A I X Amino Acids
CT CGG ATT TGT AAA GAT CAT GAT CTC ATA CAT AGT ACC TAG CCA TTG DNA
R I C K D H D L I H S T * P L Amino Acids
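The six translations above can be reproduced with a minimal Python sketch using the standard genetic code. It is an illustration, not the Transeq implementation: the function names and the frame labels `+1` to `-3` are assumptions made here and need not follow Transeq's own frame numbering. A trailing incomplete codon is dropped rather than reported as X, and `*` marks a stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse so the result reads 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str, offset: int = 0) -> str:
    """Translate complete codons starting at `offset`; a partial last codon is dropped."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3))


def six_frames(dna: str) -> dict:
    """Return the 3 forward and 3 reverse translations, keyed by an assumed frame label."""
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna, offset)
        frames[f"-{offset + 1}"] = translate(reverse, offset)
    return frames


dna = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
for label, protein in six_frames(dna).items():
    print(label, protein)
```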
Convert a nucleotide sequence to a protein sequence with Transeq
Molecule Types
| Code | Description |
|---|---|
| Protein | Proteins are macromolecules made up from 20 different amino acids, also referred to as residues. |
| DNA | This is genomic DNA, a sequence derived directly from the DNA of an organism. |
| RNA | This is genomic RNA, a sequence derived directly from the genomic RNA of certain organisms. |
| preRNA | This is precursor RNA: an RNA transcript before it is processed into mRNA, rRNA, tRNA, or another cellular RNA species. It covers any RNA species that is not yet the mature RNA product. |
| mRNA | This is messenger RNA, a copy of the information carried by a gene on the DNA. The role of mRNA is to move the information contained in DNA to the translation machinery (ribosomes). |
| rRNA | This is ribosomal RNA, a component of the ribosomes, the protein synthesis factories of the cell. |
| tRNA | This is transfer RNA, the information adapter molecule. It is the direct interface between the amino-acid sequence of a protein and the information in DNA, and therefore decodes the information in DNA. |
| snRNA | This is small nuclear RNA and refers to a number of small RNA molecules found in the nucleus. These RNA molecules are important in several processes, including RNA splicing and maintenance of the telomeres, or chromosome ends. They are always found associated with specific proteins, and the complexes are referred to as small nuclear ribonucleoproteins (snRNPs). |
e.g. CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG Top Strand
|||||||||||||||||||||||||||||||||||||||||||||||
GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC Bottom Strand