
2Can Support Portal - Nucleotide Analysis


EMBOSS Transeq example

Once the RNA has been transcribed from the DNA template, it travels to the ribosome on the endoplasmic reticulum to be translated into protein. Every 3 bases in the RNA sequence (a codon) code for 1 amino acid. Because you may not know which position translation starts at, you can begin reading at any of 3 positions from the 5' end of either strand. A piece of sequence therefore has 6 possible predicted protein sequences. These are known as the 6 possible reading frames: 3 forward frames and 3 reverse frames.
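As a minimal sketch of what a reading frame is, the Python below splits a sequence into codons from each of the 3 forward start positions. The function name `split_codons` and the short example sequence are illustrative assumptions and are not part of EMBOSS.

```python
def split_codons(seq, offset):
    """Return the complete 3-base codons of seq, reading from a 0-based offset."""
    # Stop early so that a trailing 1 or 2 bases never form a short codon.
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


example = "CAATGGCTAGG"  # short illustrative sequence
for offset in range(3):
    print(f"frame {offset + 1}:", " ".join(split_codons(example, offset)))
```

Shifting the start position by one base changes every codon that follows, which is why each frame can predict a completely different protein.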


Consider the following EMBOSS Transeq submission form:

We will use Sequence 6 as an example.


Go to the main EBI website

N.B. This tool can also be programmatically accessed as a web service.


Here I have chosen to show the output of the amino acid sequence in all 6 reading frames and have coloured it. The program will calculate the 6 possible outputs as follows:


e.g. 5' CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG  3'  DNA

Forward Frames

       CAA TGG CTA GGT ACT ATG TAT GAG ATC ATG ATC TTT ACA AAT CCG AG DNA
        Q   W   L   G   T   M   Y   E   I   M   I   F   T   N   P   X Amino Acids

     C AAT GGC TAG GTA CTA TGT ATG AGA TCA TGA TCT TTA CAA ATC CGA G  DNA
        N   G   *   V   L   C   M   R   S   *   S   L   Q   I   R  X  Amino Acids

    CA ATG GCT AGG TAC TAT GTA TGA GAT CAT GAT CTT TAC AAA TCC GAG    DNA
        M   A   R   Y   Y   V   *   D   H   D   L   Y   K   S   E     Amino Acids

Reverse Frames

Here we take the complementary (bottom) strand and reverse it so that it reads from its 5' end.
 
   5' CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG  3'  DNA Top Strand
      |||||||||||||||||||||||||||||||||||||||||||||||
   3' GTTACCGATCCATGATACATACTCTAGTACTAGAAATGTTTAGGCTC  5'  DNA Bottom (Complementary) Strand

   5' CTCGGATTTGTAAAGATCATGATCTCATACATAGTACCTAGCCATTG  3'  DNA Bottom (Complementary) Strand Reversed
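Building the reversed bottom strand can be sketched in Python as follows. The function name `reverse_complement` is an illustrative choice, and the sketch assumes the sequence contains only the upper-case bases A, C, G and T.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse so the result reads 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


top = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
print(reverse_complement(top))
```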


       CTC GGA TTT GTA AAG ATC ATG ATC TCA TAC ATA GTA CCT AGC CAT TG DNA
        L   G   F   V   K   I   M   I   S   Y   I   V   P   S   H   X Amino Acids

     C TCG GAT TTG TAA AGA TCA TGA TCT CAT ACA TAG TAC CTA GCC ATT G  DNA
        S   D   L   *   R   S   *   S   H   T   *   Y   L   A   I  X  Amino Acids

    CT CGG ATT TGT AAA GAT CAT GAT CTC ATA CAT AGT ACC TAG CCA TTG    DNA
        R   I   C   K   D   H   D   L   I   H   S   T   *   P   L     Amino Acids

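The output from the transeq program contains the same six translations, although the reverse frames are listed in a different order from the one used above. It reads as follows: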

Genetic Code table used: [0] -> Standard Genetic Code
Frames: All Six Frames 

>nucleotide_1 sequence
QWLGTMYEIMIFTNPX
>nucleotide_2 sequence
NG*VLCMRS*SLQIRX
>nucleotide_3 sequence
MARYYV*DHDLYKSE
>nucleotide_4 sequence
RICKDHDLIHST*PL
>nucleotide_5 sequence
SDL*RS*SHT*YLAIX
>nucleotide_6 sequence
LGFVKIMISYIVPSHX
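The listing above can be reproduced with a short Python sketch of a six-frame translation. The reverse-frame offsets are an assumption chosen to match the order of this output (reverse frame k reuses the codon positions of forward frame k); the names `translate` and `six_frames` are illustrative and are not part of EMBOSS.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq, offset=0):
    """Translate from a 0-based offset; incomplete codons are written as X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(offset, len(seq), 3))


def six_frames(seq):
    """Return three forward then three reverse translations."""
    revcomp = seq.translate(COMPLEMENT)[::-1]
    frames = [translate(seq, k) for k in range(3)]
    # Assumption: the bases left over at the end of forward frame k give the
    # start offset of the matching reverse frame on the reversed strand.
    frames += [translate(revcomp, (len(seq) - k) % 3) for k in range(3)]
    return frames


dna = "CAATGGCTAGGTACTATGTATGAGATCATGATCTTTACAAATCCGAG"
for number, protein in enumerate(six_frames(dna), start=1):
    print(f">nucleotide_{number} sequence")
    print(protein)
```

For this 47-base sequence the reverse frames start at offsets 2, 1 and 0 of the reversed bottom strand, which is why they appear in the opposite order to the worked example.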


Next, we will look at changing the options for the EMBOSS Transeq tool and at its documentation.


