
2Can Support Portal - About Optimal Alignments

An optimal alignment is the alignment that is the best, given a defined set of rules and parameter values for comparing different alignments. There is no such thing as a single best alignment, since optimality always depends on the assumptions on which the alignment is based. For example, what penalty should gaps carry? All sequence alignment procedures make some such assumptions.

Global alignment

An alignment that assumes that the two proteins are basically similar over their entire length. The alignment attempts to match them to each other from end to end, even though parts of the alignment may not be very convincing. A tiny example:


        NLGPSTKDFGKISESREFDNQ
          ||  ||  ||  
        QLNQLERSFGKINMRLEDALV
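
How the gap penalty changes which end-to-end alignment counts as optimal can be illustrated with a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch approach). The function name `global_alignment` and the scoring values (match, mismatch, linear gap penalty) are assumptions chosen for illustration; they are not the parameters of any tool named on this page, and real protein aligners normally use a substitution matrix and affine gap penalties.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch sketch with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalised, because the alignment must run end to end.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


# The same pair of sequences aligned under two different gap penalties.
for penalty in (-1, -2):
    s, top, bottom = global_alignment("AAGT", "AGTT", gap=penalty)
    print(penalty, s, top, bottom)
```

With the cheaper gap penalty the optimal alignment of this toy pair contains gaps; with the more expensive one it does not, which is the sense in which optimality depends on the chosen parameters.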
                  

Local alignment

An alignment that searches for segments of the two sequences that match well. There is no attempt to force entire sequences into an alignment, just those parts that appear to have good similarity, according to some criterion. For a pair of sequences like those above, one could get:


        NLGPSTKDDFGKILGPSTKDDQ
                  ||  || 
        QNQLERSSNFGKINQLERSSNN
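
A minimal sketch of local alignment by dynamic programming (the Smith-Waterman approach) shows how only the best-matching segment is reported. The function name `local_alignment` and the scoring values are assumptions for illustration, not the settings of any tool named on this page.

```python
def local_alignment(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman sketch with a linear gap penalty.

    Returns (score, segment_a, segment_b) for one best-scoring local alignment.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Scores never drop below zero, so a poor region simply ends the segment.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(local_alignment("NLGPSTKDDFGKILGPSTKDDQ", "QNQLERSSNFGKINQLERSSNN"))
```

Under these illustrative scores, the two example sequences share only a short well-matching segment, and the rest of each sequence is left out of the result rather than being forced into the alignment.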

It may seem that one should always use local alignments. However, it may be difficult to spot an overall similarity, as opposed to just a domain-to-domain similarity, if one uses only local alignment, so global alignment is useful in some cases. You can produce either a global or a local alignment with the EMBOSS pairwise alignment tool. You can search sequence databases, producing local alignments of your query sequence against known sequences, with the programs BLAST and FASTA. ClustalW2 is a general purpose multiple sequence alignment program for DNA or proteins. Use this if you wish to compare your sequences against each other. If you do not like the positions chosen by the programs, you can edit the alignments you have obtained using the tool Jalview.

Example alignment with ClustalW2


For example, a multiple sequence alignment was done with ClustalW2 using FOSB_MOUSE, FOSB_HUMAN, FOS_CHICK, FOS_RAT and FOS_MOUSE. Sequences were input in the FASTA format:


>FOSB_MOUSE Protein fosB
     MFQAFPGDYD SGSRCSSSPS AESQYLSSVD SFGSPPTAAA SQECAGLGEM PGSFVPTVTA 
     ITTSQDLQWL VQPTLISSMA QSQGQPLASQ PPAVDPYDMP GTSYSTPGLS AYSTGGASGS
     GGPSTSTTTS GPVSARPARA RPRRPREETL TPEEEEKRRV RRERNKLAAA KCRNRRRELT
     DRLQAETDQL EEEKAELESE IAELQKEKER LEFVLVAHKP GCKIPYEEGP GPGPLAEVRD
     LPGSTSAKED GFGWLLPPPP PPPLPFQSSR DAPPNLTASL FTHSEVQVLG DPFPVVSPSY
     TSSFVLTCPE VSAFAGAQRT SGSEQPSDPL NSPSLLAL  
                     
>FOSB_HUMAN Protein fosB
     MFQAFPGDYD SGSRCSSSPS AESQYLSSVD SFGSPPTAAA SQECAGLGEM PGSFVPTVTA
     ITTSQDLQWL VQPTLISSMA QSQGQPLASQ PPVVDPYDMP GTSYSTPGMS GYSSGGASGS 
     GGPSTSGTTS GPGPARPARA RPRRPREETL TPEEEEKRRV RRERNKLAAA KCRNRRRELT 
     DRLQAETDQL EEEKAELESE IAELQKEKER LEFVLVAHKP GCKIPYEEGP GPGPLAEVRD
     LPGSAPAKED GFSWLLPPPP PPPLPFQTSQ DAPPNLTASL FTHSEVQVLG DPFPVVNPSY
     TSSFVLTCPE VSAFAGAQRT SGSDQPSDPL NSPSLLAL 
                       
>FOS_CHICK Proto-oncogene protein c-fos 
     MMYQGFAGEY EAPSSRCSSA SPAGDSLTYY PSPADSFSSM GSPVNSQDFC TDLAVSSANF 
     VPTVTAISTS PDLQWLVQPT LISSVAPSQN RGHPYGVPAP APPAAYSRPA VLKAPGGRGQ 
     SIGRRGKVEQ LSPEEEEKRR IRRERNKMAA AKCRNRRREL TDTLQAETDQ LEEEKSALQA  
     EIANLLKEKE KLEFILAAHR PACKMPEELR FSEELAAATA LDLGAPSPAA AEEAFALPLM 
     TEAPPAVPPK EPSGSGLELK AEPFDELLFS AGPREASRSV PDMDLPGASS FYASDWEPLG 
     AGSGGELEPL CTPVVTCTPC PSTYTSTFVF TYPEADAFPS CAAAHRKGSS SNEPSSDSLS 
     SPTLLAL  

>FOS_RAT Proto-oncogene protein c-fos 
     MMFSGFNADY EASSSRCSSA SPAGDSLSYY HSPADSFSSM GSPVNTQDFC ADLSVSSANF  
     IPTVTAISTS PDLQWLVQPT LVSSVAPSQT RAPHPYGLPT PSTGAYARAG VVKTMSGGRA  
     QSIGRRGKVE QLSPEEEEKR RIRRERNKMA AAKCRNRRRE LTDTLQAETD QLEDEKSALQ  
     TEIANLLKEK EKLEFILAAH RPACKIPNDL GFPEEMSVTS LDLTGGLPEA TTPESEEAFT 
     LPLLNDPEPK PSLEPVKNIS NMELKAEPFD DFLFPASSRP SGSETARSVP DVDLSGSFYA
     ADWEPLHSSS LGMGPMVTEL EPLCTPVVTC TPSCTTYTSS FVFTYPEADS FPSCAAAHRK
     GSSSNEPSSD SLSSPTLLAL  

>FOS_MOUSE Proto-oncogene protein c-fos 
     MMFSGFNADY EASSSRCSSA SPAGDSLSYY HSPADSFSSM GSPVNTQDFC ADLSVSSANF
     IPTVTAISTS PDLQWLVQPT LVSSVAPSQT RAPHPYGLPT QSAGAYARAG MVKTVSGGRA
     QSIGRRGKVE QLSPEEEEKR RIRRERNKMA AAKCRNRRRE LTDTLQAETD QLEDEKSALQ
     TEIANLLKEK EKLEFILAAH RPACKIPDDL GFPEEMSVAS LDLTGGLPEA STPESEEAFT
     LPLLNDPEPK PSLEPVKSIS NVELKAEPFD DFLFPASSRP SGSETSRSVP DVDLSGSFYA
     ADWEPLHSNS LGMGPMVTEL EPLCTPVVTC TPGCTTYTSS FVFTYPEADS FPSCAAAHRK
     GSSSNEPSSD SLSSPTLLAL
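
Each FASTA record above is a header line starting with `>` followed by sequence lines, here grouped in blocks of ten residues. A minimal sketch of reading such input into plain sequences is given below; the function name `parse_fasta` is hypothetical, and taking the first word of the header as the identifier is an assumption.

```python
def parse_fasta(text):
    """Return {identifier: sequence}; spaces inside sequence lines are discarded."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Assumption: the first word after '>' identifies the record.
            name = header[0] if header else ""
            chunks[name] = []
        elif name is not None:
            chunks[name].append("".join(line.split()))
    return {key: "".join(parts) for key, parts in chunks.items()}


sample = """>FOSB_MOUSE Protein fosB
     MFQAFPGDYD SGSRCSSSPS AESQYLSSVD
     TSSFVLTCPE VSAFAGAQRT
"""
print(parse_fasta(sample))
```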
		            

Output was in the format:


FOS_RAT         MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_MOUSE       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_CHICK       MMYQGFAGEYEAPSSRCSSASPAGDSLTYYPSPADSFSSMGSPVNSQDFCTDLAVSSANF 60
FOSB_MOUSE      -MFQAFPGDYDS-GSRCSS-SPSAESQ--YLSSVDSFGSPPTAAASQE-CAGLGEMPGSF 54
FOSB_HUMAN      -MFQAFPGDYDS-GSRCSS-SPSAESQ--YLSSVDSFGSPPTAAASQE-CAGLGEMPGSF 54
                 *:..* .:*:: .***** **:.:*   * *..***.*  :.. :*: *:.*.  ...*

FOS_RAT         IPTVTAISTSPDLQWLVQPTLVSSVAPSQ-------TRAPHPYGLPTPS-TGAYARAGVV 112
FOS_MOUSE       IPTVTAISTSPDLQWLVQPTLVSSVAPSQ-------TRAPHPYGLPTQS-AGAYARAGMV 112
FOS_CHICK       VPTVTAISTSPDLQWLVQPTLISSVAPSQ-------NRG-HPYGVPAPAPPAAYSRPAVL 112
FOSB_MOUSE      VPTVTAITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTS----YSTPGLS 110
FOSB_HUMAN      VPTVTAITTSQDLQWLVQPTLISSMAQSQGQPLASQPPVVDPYDMPGTS----YSTPGMS 110
                :******:** **********:**:* **... ::.    .**.:*  :    *: ..: 

FOS_RAT         KTMSGGRAQSIG--------------------RRGKVEQLSPEEEEKRRIRRERNKMAAA 152
FOS_MOUSE       KTVSGGRAQSIG--------------------RRGKVEQLSPEEEEKRRIRRERNKMAAA 152
FOS_CHICK       KAP-GGRGQSIG--------------------RRGKVEQLSPEEEEKRRIRRERNKMAAA 151
FOSB_MOUSE      AYSTGGASGSGGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAA 170
FOSB_HUMAN      GYSSGGASGSGGPSTSGTTSGPGPARPARARPRRPREETLTPEEEEKRRVRRERNKLAAA 170
                   :** . * *.::: :::.. .: .: : .** : * *:********:******:***

FOS_RAT         KCRNRRRELTDTLQAETDQLEDEKSALQTEIANLLKEKEKLEFILAAHRPACKIPNDLGF 212
FOS_MOUSE       KCRNRRRELTDTLQAETDQLEDEKSALQTEIANLLKEKEKLEFILAAHRPACKIPDDLGF 212
FOS_CHICK       KCRNRRRELTDTLQAETDQLEEEKSALQAEIANLLKEKEKLEFILAAHRPACKMPEELRF 211
FOSB_MOUSE      KCRNRRRELTDRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEG- 229
FOSB_HUMAN      KCRNRRRELTDRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEG- 229
                *********** *********:**: *::***:* ****:***:*.**:*.**:* :   

FOS_RAT         PEEMSVTS-LDLTGGLPEATTPESEEAFTLPLLNDPEPK-PSLEPVKNISNMELKAEPFD 270
FOS_MOUSE       PEEMSVAS-LDLTGGLPEASTPESEEAFTLPLLNDPEPK-PSLEPVKSISNVELKAEPFD 270
FOS_CHICK       SEELAAATALDLG----APSPAAAEEAFALPLMTEAPPAVPPKEPSG--SGLELKAEPFD 265
FOSB_MOUSE      PGPGPLAEVRDLPG-----STSAKEDGFGWLLPPPPPPP-----------------LPFQ 267
FOSB_HUMAN      PGPGPLAEVRDLPG-----SAPAKEDGFSWLLPPPPPPP-----------------LPFQ 267
                .   . :   ** .     :..  *:.*   *   . *                   **:

FOS_RAT         DFLFPASSRPSGSETARSVPDVDLSG--SFYAADWEPLHSSSLGMGPMVTELEPLCTPVV 328
FOS_MOUSE       DFLFPASSRPSGSETSRSVPDVDLSG--SFYAADWEPLHSNSLGMGPMVTELEPLCTPVV 328
FOS_CHICK       ELLFSAGPR----EASRSVPDMDLPGASSFYASDWEPLGAGSGG------ELEPLCTPVV 315
FOSB_MOUSE      --------------SSRDAP-PNLTA--SLFTHS----------------EVQVLGDPFP 294
FOSB_HUMAN      --------------TSQDAP-PNLTA--SLFTHS----------------EVQVLGDPFP 294
                              :::..*  :*..  *::: .                *:: *  *. 

FOS_RAT         TCTPSCTTYTSSFVFTYPEADSFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL 380
FOS_MOUSE       TCTPGCTTYTSSFVFTYPEADSFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL 380
FOS_CHICK       TCTPCPSTYTSTFVFTYPEADAFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL 367
FOSB_MOUSE      VVSP---SYTSSFVLTCPEVSAF---AGAQR--TSGSEQPSDPLNSPSLLAL 338
FOSB_HUMAN      VVNP---SYTSSFVLTCPEVSAF---AGAQR--TSGSDQPSDPLNSPSLLAL 338
                . .*   :***:**:* **..:*   *.*:*  :*..: .**.*.**:****
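
In this output each block carries one row per sequence (name, aligned residues, running residue count) followed by a consensus line, in which `*` marks a fully conserved column. The sketch below joins the blocks back into one aligned string per sequence and counts the fully conserved columns. The function names are hypothetical, and the parser assumes that consensus lines begin with whitespace, as they do above.

```python
def parse_clustal_blocks(text):
    """Return {name: aligned sequence} from ClustalW-style alignment blocks."""
    rows = {}
    for line in text.splitlines():
        # Skip blank lines and consensus lines (assumed to start with whitespace).
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2 or fields[0].upper().startswith("CLUSTAL"):
            continue  # not an alignment row (for example a program banner)
        name, chunk = fields[0], fields[1]
        rows[name] = rows.get(name, "") + chunk
    return rows


def conserved_columns(rows):
    """Count columns in which every sequence has the same residue (no gaps)."""
    return sum(1 for column in zip(*rows.values())
               if len(set(column)) == 1 and column[0] != "-")


block = "\n".join([
    "FOSB_MOUSE      VVSP---SYTSSFVLTCPEVSAF---AGAQR--TSGSEQPSDPLNSPSLLAL 338",
    "FOSB_HUMAN      VVNP---SYTSSFVLTCPEVSAF---AGAQR--TSGSDQPSDPLNSPSLLAL 338",
])
rows = parse_clustal_blocks(block)
print(conserved_columns(rows), "identical columns out of", len(rows["FOSB_MOUSE"]))
```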


                    


Example alignments with DbClustal


DbClustal addresses the important problem of the automatic multiple alignment of the top scoring full-length sequences detected by a database similarity search. By combining the advantages of both local and global alignment algorithms into a single system, DbClustal is able to provide accurate global alignments of highly divergent, complex sequence sets. Local alignment information is incorporated into a ClustalW2 global alignment in the form of a list of anchor points between pairs of sequences.

Reference:

J. D. Thompson, F. Plewniak, J.-C. Thierry and O. Poch (2000).
DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches.
Nucleic Acids Research, Vol. 28, No. 15, 2919-2926.

Example:

The sequence for FOSB_MOUSE was queried with NCBI-BLASTp (or WU-BLAST2) against UniProt. The alignments below were produced with DbClustal against proteins that the NCBI BLAST output reported as identical or similar to the query.

(a) Query (FOSB_MOUSE) vs FOSB_MOUSE (identical)




YOURQUERY       MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
FOSB_MOUSE      MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
                ************************************************************

YOURQUERY       ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120
FOSB_MOUSE      ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120
                ************************************************************

YOURQUERY       GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
FOSB_MOUSE      GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
                ************************************************************

YOURQUERY       DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
FOSB_MOUSE      DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
                ************************************************************

YOURQUERY       LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300
FOSB_MOUSE      LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300
                ************************************************************

YOURQUERY       TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
FOSB_MOUSE      TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
                **************************************
                  

(b) Query(FOSB_MOUSE) vs FOSB_HUMAN (similar)


YOURQUERY       MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
FOSB_HUMAN      MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
                ************************************************************

YOURQUERY       ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120
FOSB_HUMAN      ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPVVDPYDMPGTSYSTPGMSGYSSGGASGS 120
                ********************************.***************:*.**:******

YOURQUERY       GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
FOSB_HUMAN      GGPSTSGTTSGPGPARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
                ****** ***** .**********************************************

YOURQUERY       DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
FOSB_HUMAN      DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
                ************************************************************

YOURQUERY       LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300
FOSB_HUMAN      LPGSAPAKEDGFSWLLPPPPPPPLPFQTSQDAPPNLTASLFTHSEVQVLGDPFPVVNPSY 300
                ****:.******.**************:*:**************************.***

YOURQUERY       TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
FOSB_HUMAN      TSSFVLTCPEVSAFAGAQRTSGSDQPSDPLNSPSLLAL 338
                ***********************:**************
                  

(c) Query(FOSB_MOUSE) vs FOS_RAT (Less similar)

YOURQUERY      -MFQAFPGDYDSGS-RCSSS-PSAESQ--YLSSVDSFGSPPTAAASQE-CAGLGEMPGSF 54
FOS_RAT        MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
                **..* .**::.* ****: *:.:*   * *..***.*  :.. :*: **.*.  ...*

YOURQUERY      VPTVTAITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYST 114
FOS_RAT        IPTVTAISTSPDLQWLVQPTLVSSVAPSQTR-------APHPYGLP-----TPSTGAYAR 108
               :******:** **********:**:* ** :       * .**.:*     **. .**: 

YOURQUERY      GGASGSGGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRN 174
FOS_RAT        AGVVKT------------MSGGRAQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRN 156
               .*.  :            :*.  *::  ** : * *:********:******:*******

YOURQUERY      RRRELTDRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGP 234
FOS_RAT        RRRELTDTLQAETDQLEDEKSALQTEIANLLKEKEKLEFILAAHRPACKIPNDLGFPEEM 216
               ******* *********:**: *::***:* ****:***:*.**:*.**** : *     

YOURQUERY      LAEVRDLPG-----STSAKEDGFGWLLP--PPPPPPLP-------------------FQS 268
FOS_RAT        SVTSLDLTGGLPEATTPESEEAFTLPLLNDPEPKPSLEPVKNISNMELKAEPFDDFLFPA 276
                .   **.*     :*. .*:.*   *   * * *.*                    * :

YOURQUERY      S---------RDAP-PNLTASLFT------HS----------EVQVLGDPFPVVSP---S 299
FOS_RAT        SSRPSGSETARSVPDVDLSGSFYAADWEPLHSSSLGMGPMVTELEPLCTPVVTCTPSCTT 336
               *         *..*  :*:.*:::      **          *:: *  *. . :*   :

YOURQUERY      YTSSFVLTCPEVSAF---AGAQR--TSGSEQPSDPLNSPSLLAL 338
FOS_RAT        YTSSFVFTYPEADSFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL 380
               ******:* **..:*   *.*:*  :*..* .**.*.**:****
                  

Example alignment in Edinburgh Format


Edinburgh format places the two aligned sequences directly on top of each other, with `*' to show identities and `.' to show conservative replacements above the aligned pair.

Example: FOSB_MOUSE(Qy) vs FOSB_HUMAN (Db)
RESULT    2
ID   FOSB_HUMAN     STANDARD;      PRT;   338 AA.
DE   Protein fosB (G0/G1 switch regulatory protein 3).

  DB  1;  Score    2775;  Match 95.9%;  QryMatch 97.0%;  Pred. No. 0.00e+00;
  Matches   324;  Conservative   11;  Mismatches   3;  Indels   0;  Gaps   0;

          ************************************************************
Db      1 MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
Qy      1 MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60

          ******************************** ***************.*.**.******
Db     61 ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPVVDPYDMPGTSYSTPGMSGYSSGGASGS 120
Qy     61 ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120

          ****** ***** .**********************************************
Db    121 GGPSTSGTTSGPGPARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
Qy    121 GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180

          ************************************************************
Db    181 DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
Qy    181 DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240

          ****..******.**************.*.**************************.***
Db    241 LPGSAPAKEDGFSWLLPPPPPPPLPFQTSQDAPPNLTASLFTHSEVQVLGDPFPVVNPSY 300
Qy    241 LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300

          ***********************.**************
Db    301 TSSFVLTCPEVSAFAGAQRTSGSDQPSDPLNSPSLLAL 338
Qy    301 TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338

Example alignment in Intelligenetics Format


Intelligenetics format uses `|' to show identities and `:' to show conservative replacements and places these indicators between the two aligned sequences.

Example: FOSB_MOUSE(Qy) vs FOSB_HUMAN (Db)
RESULT    2
ID   FOSB_HUMAN     STANDARD;      PRT;   338 AA.
DE   Protein fosB (G0/G1 switch regulatory protein 3).

  DB  1;  Score    2775;  Match 95.9%;  QryMatch 97.0%;  Pred. No. 0.00e+00;
  Matches   324;  Conservative   11;  Mismatches   3;  Indels   0;  Gaps   0;

Db      1 MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Qy      1 MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60

Db     61 ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPVVDPYDMPGTSYSTPGMSGYSSGGASGS 120
          |||||||||||||||||||||||||||||||| |||||||||||||||:|:||:||||||
Qy     61 ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120

Db    121 GGPSTSGTTSGPGPARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
          |||||| ||||| :||||||||||||||||||||||||||||||||||||||||||||||
Qy    121 GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180

Db    181 DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Qy    181 DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240

Db    241 LPGSAPAKEDGFSWLLPPPPPPPLPFQTSQDAPPNLTASLFTHSEVQVLGDPFPVVNPSY 300
          ||||::||||||:||||||||||||||:|:||||||||||||||||||||||||||:|||
Qy    241 LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300

Db    301 TSSFVLTCPEVSAFAGAQRTSGSDQPSDPLNSPSLLAL 338
          |||||||||||||||||||||||:||||||||||||||
Qy    301 TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
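
The Edinburgh and Intelligenetics formats differ only in the symbols used and in where the indicator line is placed. A minimal sketch of building that indicator line is given below. The function name `marker_line` is hypothetical, and the `CONSERVATIVE` set is an assumption: it lists only the residue pairs marked as conservative in the FOSB_MOUSE vs FOSB_HUMAN example above, whereas a real program derives them from a substitution matrix.

```python
# Residue pairs shown as conservative replacements in the example above.
# This short list is an illustrative assumption, not a full substitution matrix.
CONSERVATIVE = {frozenset(pair) for pair in
                ("LM", "AG", "ST", "PS", "AT", "GS", "QR", "NS", "DE")}


def marker_line(seq_a, seq_b, identity="|", similar=":", conservative=CONSERVATIVE):
    """Build the indicator line for two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    marks = []
    for x, y in zip(seq_a, seq_b):
        if x == y and x != "-":
            marks.append(identity)          # identical residues
        elif frozenset((x, y)) in conservative:
            marks.append(similar)           # conservative replacement
        else:
            marks.append(" ")               # mismatch or gap
    return "".join(marks)


db = "GGPSTSGTTSGPGPARP"
qy = "GGPSTSTTTSGPVSARP"

# Intelligenetics style: indicators between the two sequences.
print(db)
print(marker_line(db, qy))
print(qy)

# Edinburgh style: indicators above the pair, using '*' and '.'.
print(marker_line(db, qy, identity="*", similar="."))
print(db)
print(qy)
```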

Other Alignment Formats


pairwise
Aligns your query sequence and database matches in pairs. Matches are connected with a "|" symbol. Mismatches are marked with a space. Gaps are shown with a "-" symbol.
e.g.

Query: 12   caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
            |||||| ||||||||||  |||||||||||||||||||||||||||||||||||||||||
Sbjct: 5319 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 5262


Query: 12   caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
            |||||| ||||||||||  |||||||||||||||||||||||||||||||||||||||||
Sbjct: 4684 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 4627
                  

M/S with identities
The database alignments are anchored to (shown in relation to) your query sequence.
Identities are displayed as dots (.).
Mismatches are displayed as single letter nucleotide abbreviations (c,t,a or g).
Gaps are introduced with a "-" symbol.
e.g.

QUERY         12   caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
U93717        5319 ......t..........--......................................... 5262
U93716        4684 ......t..........--......................................... 4627
U93715        4684 ......t..........--......................................... 4627
U93714        4684 ......t..........--......................................... 4627
U93713        4546 ......t..........--......................................... 4489
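
The "with identities" display can be derived from an ordinary aligned row by replacing every position that matches the query with a dot. A minimal sketch, assuming the query and the database row are already aligned to the same length (the function name `with_identities` is hypothetical):

```python
def with_identities(query, hit):
    """Show positions identical to the query as dots; keep mismatches and gaps."""
    if len(query) != len(hit):
        raise ValueError("rows must be aligned to the same length")
    return "".join("." if h == q and h != "-" else h
                   for q, h in zip(query, hit))


query = "caacatctccgtgtcgctaccttattcc"
hit = "caacatttccgtgtcgc--ccttattcc"
print(query)
print(with_identities(query, hit))
```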

M/S without identities

The database alignments are anchored to (shown in relation to) your query sequence.
Identities are shown as single letter nucleotide abbreviations.
Mismatches are displayed as single letter nucleotide abbreviations (c,t,a or g).
Gaps are introduced with a "-" symbol.
e.g.
QUERY            12   caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
CVE278280        8191 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 8134
CVE278279        7803 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 7746
CVE278278        7987 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 7930
CVE277959        800  caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 857
CVE272004        1168 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 1225


Flat Query-anchored with identities

The 'flat' display shows inserts as deletions on the query.
Identities are displayed as dots.
Mismatches are displayed as single letter nucleotide abbreviations (c,t,a or g).
Gaps are introduced with a "-" symbol.
e.g.

QUERY         12   caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
U93717        5319 ......t..........--......................................... 5262
U93716        4684 ......t..........--......................................... 4627
U93715        4684 ......t..........--......................................... 4627
U93714        4684 ......t..........--......................................... 4627
U93713        4546 ......t..........--......................................... 4489

Flat Query-anchored without identities

The 'flat' display shows inserts as deletions on the query.
Identities are displayed as single letter nucleotide abbreviations (c,t,a or g).
Mismatches are displayed as single letter nucleotide abbreviations (c,t,a or g).
Gaps are introduced with a "-" symbol.
e.g.
QUERY            12   caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
CVE278280        8191 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 8134
CVE278279        7803 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 7746
CVE278278        7987 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 7930
CVE277959        800  caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 857
CVE272004        1168 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 1225


EDITING AN ALIGNMENT


You can edit the alignment using Jalview.

PHYLOGENETIC TREE


A phylogram is a branching diagram (tree) assumed to be an estimate of a phylogeny, in which branch lengths are proportional to the amount of inferred evolutionary change. A cladogram is a branching diagram (tree) assumed to be an estimate of a phylogeny in which the branches are of equal length; cladograms therefore show common ancestry but do not indicate the amount of evolutionary "time" separating taxa. Tree distances can be shown: click on the diagram to get a menu of options.


