Help - About Optimal Sequence Alignments
An optimal alignment is the alignment that is best, given a defined set of
rules and parameter values for comparing different alignments. There is
no such thing as the single best alignment, since optimality
always depends on the assumptions on which one bases the alignment. For
example, what penalty should gaps carry? All sequence alignment
procedures make some such assumptions.
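To illustrate how the optimum depends on such parameters, here is a minimal sketch of a global dynamic-programming alignment (Needleman-Wunsch style) with a linear gap penalty. The function name, the toy DNA strings and the scoring values (match +1, mismatch -1) are assumptions chosen for illustration; they are not the settings of any tool mentioned on this page.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment.

    Scoring values are illustrative assumptions, not tool defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Same sequences, two different gap penalties.
    for gap in (-1, -5):
        print(gap, needleman_wunsch("ACGT", "CGTA", gap=gap))
```

With a mild gap penalty (-1) the best alignment opens two gaps so that CGT lines up; with a harsh gap penalty (-5) the ungapped alignment becomes the optimum, even though every column is a mismatch.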
Global alignment
An alignment that assumes that the two proteins are basically similar over their entire length. The alignment attempts to match them to each other from end to end, even though parts of the alignment may not be very convincing. A tiny example:
Local alignment
An alignment that searches for segments of the two sequences that match well. There is no attempt to force entire sequences into an alignment, just those parts that appear to have good similarity, according to some criterion. Using the same sequences as above, one could get:
NLGPSTKDDFGKILGPSTKDDQ
         ||||
QNQLERSSNFGKINQLERSSNN
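A local alignment of this kind can be sketched with a Smith-Waterman style dynamic programme, in which scores are never allowed to drop below zero and the traceback starts from the highest-scoring cell. The function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions for illustration only; real tools use substitution matrices and their own gap penalties.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, segment_a, segment_b) for the best local alignment.

    Scoring values are illustrative assumptions, not tool defaults.
    """
    n, m = len(a), len(b)
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                      # start a fresh local alignment
                h[i - 1][j - 1] + sub,  # extend with a match or mismatch
                h[i - 1][j] + gap,      # gap in b
                h[i][j - 1] + gap,      # gap in a
            )
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif h[i][j] == h[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("NLGPSTKDDFGKILGPSTKDDQ", "QNQLERSSNFGKINQLERSSNN"))
```

Under these assumed scores the two sequences from the example yield the segment FGKI in both, which corresponds to the four match bars shown above.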
It may seem that one should always use local alignments. However, if one uses only local alignment it may be difficult to spot an overall similarity, as opposed to just a domain-to-domain similarity, so global alignment is useful in some cases. You can produce a global or a local alignment with the Emboss Pairwise global and local alignment tool. You can search sequence databases, producing local alignments of your query sequence against known sequences, with the programs BLAST and FASTA. ClustalW2 is a general-purpose multiple sequence alignment program for DNA or proteins; use it if you wish to compare your sequences against each other. If you do not like the positions chosen by the programs, you can edit the alignments you have obtained using the tool JalView.
Example alignment with ClustalW2
For example, a multiple sequence alignment was produced with ClustalW2 from five Fos-family protein sequences (FOSB_MOUSE, FOSB_HUMAN, FOS_CHICK, FOS_RAT and FOS_MOUSE). The sequences were input in FASTA format:
>FOSB_MOUSE Protein fosB
MFQAFPGDYD SGSRCSSSPS AESQYLSSVD SFGSPPTAAA SQECAGLGEM PGSFVPTVTA
ITTSQDLQWL VQPTLISSMA QSQGQPLASQ PPAVDPYDMP GTSYSTPGLS AYSTGGASGS
GGPSTSTTTS GPVSARPARA RPRRPREETL TPEEEEKRRV RRERNKLAAA KCRNRRRELT
DRLQAETDQL EEEKAELESE IAELQKEKER LEFVLVAHKP GCKIPYEEGP GPGPLAEVRD
LPGSTSAKED GFGWLLPPPP PPPLPFQSSR DAPPNLTASL FTHSEVQVLG DPFPVVSPSY
TSSFVLTCPE VSAFAGAQRT SGSEQPSDPL NSPSLLAL
>FOSB_HUMAN Protein fosB
MFQAFPGDYD SGSRCSSSPS AESQYLSSVD SFGSPPTAAA SQECAGLGEM PGSFVPTVTA
ITTSQDLQWL VQPTLISSMA QSQGQPLASQ PPVVDPYDMP GTSYSTPGMS GYSSGGASGS
GGPSTSGTTS GPGPARPARA RPRRPREETL TPEEEEKRRV RRERNKLAAA KCRNRRRELT
DRLQAETDQL EEEKAELESE IAELQKEKER LEFVLVAHKP GCKIPYEEGP GPGPLAEVRD
LPGSAPAKED GFSWLLPPPP PPPLPFQTSQ DAPPNLTASL FTHSEVQVLG DPFPVVNPSY
TSSFVLTCPE VSAFAGAQRT SGSDQPSDPL NSPSLLAL
>FOS_CHICK Proto-oncogene protein c-fos
MMYQGFAGEY EAPSSRCSSA SPAGDSLTYY PSPADSFSSM GSPVNSQDFC TDLAVSSANF
VPTVTAISTS PDLQWLVQPT LISSVAPSQN RGHPYGVPAP APPAAYSRPA VLKAPGGRGQ
SIGRRGKVEQ LSPEEEEKRR IRRERNKMAA AKCRNRRREL TDTLQAETDQ LEEEKSALQA
EIANLLKEKE KLEFILAAHR PACKMPEELR FSEELAAATA LDLGAPSPAA AEEAFALPLM
TEAPPAVPPK EPSGSGLELK AEPFDELLFS AGPREASRSV PDMDLPGASS FYASDWEPLG
AGSGGELEPL CTPVVTCTPC PSTYTSTFVF TYPEADAFPS CAAAHRKGSS SNEPSSDSLS
SPTLLAL
>FOS_RAT Proto-oncogene protein c-fos
MMFSGFNADY EASSSRCSSA SPAGDSLSYY HSPADSFSSM GSPVNTQDFC ADLSVSSANF
IPTVTAISTS PDLQWLVQPT LVSSVAPSQT RAPHPYGLPT PSTGAYARAG VVKTMSGGRA
QSIGRRGKVE QLSPEEEEKR RIRRERNKMA AAKCRNRRRE LTDTLQAETD QLEDEKSALQ
TEIANLLKEK EKLEFILAAH RPACKIPNDL GFPEEMSVTS LDLTGGLPEA TTPESEEAFT
LPLLNDPEPK PSLEPVKNIS NMELKAEPFD DFLFPASSRP SGSETARSVP DVDLSGSFYA
ADWEPLHSSS LGMGPMVTEL EPLCTPVVTC TPSCTTYTSS FVFTYPEADS FPSCAAAHRK
GSSSNEPSSD SLSSPTLLAL
>FOS_MOUSE Proto-oncogene protein c-fos
MMFSGFNADY EASSSRCSSA SPAGDSLSYY HSPADSFSSM GSPVNTQDFC ADLSVSSANF
IPTVTAISTS PDLQWLVQPT LVSSVAPSQT RAPHPYGLPT QSAGAYARAG MVKTVSGGRA
QSIGRRGKVE QLSPEEEEKR RIRRERNKMA AAKCRNRRRE LTDTLQAETD QLEDEKSALQ
TEIANLLKEK EKLEFILAAH RPACKIPDDL GFPEEMSVAS LDLTGGLPEA STPESEEAFT
LPLLNDPEPK PSLEPVKSIS NVELKAEPFD DFLFPASSRP SGSETSRSVP DVDLSGSFYA
ADWEPLHSNS LGMGPMVTEL EPLCTPVVTC TPGCTTYTSS FVFTYPEADS FPSCAAAHRK
GSSSNEPSSD SLSSPTLLAL
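Input like the above, where each record starts with a ">" header line and the residues are written in space-separated blocks of ten, can be read with a few lines of code. The following is a minimal sketch; the function name `parse_fasta` is hypothetical, and the sample text is a shortened excerpt of the FOSB_MOUSE record, not the full sequence.

```python
def parse_fasta(text):
    """Return a dict mapping each record identifier to its sequence.

    The identifier is the first word after '>'. Spaces and line breaks
    inside the sequence are removed.
    """
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            current = words[0] if words else ""
            records[current] = []
        elif current is not None:
            # Drop the spaces that separate the blocks of ten residues.
            records[current].append("".join(line.split()))
    return {name: "".join(parts) for name, parts in records.items()}


# Shortened excerpt for illustration only (not the complete record).
SAMPLE = """>FOSB_MOUSE Protein fosB
MFQAFPGDYD SGSRCSSSPS AESQYLSSVD
ITTSQDLQWL VQPTLISSMA
"""

if __name__ == "__main__":
    for name, seq in parse_fasta(SAMPLE).items():
        print(name, len(seq), seq)
```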
Output was in the format:
FOS_RAT MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_MOUSE MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_CHICK MMYQGFAGEYEAPSSRCSSASPAGDSLTYYPSPADSFSSMGSPVNSQDFCTDLAVSSANF 60
FOSB_MOUSE -MFQAFPGDYDS-GSRCSS-SPSAESQ--YLSSVDSFGSPPTAAASQE-CAGLGEMPGSF 54
FOSB_HUMAN -MFQAFPGDYDS-GSRCSS-SPSAESQ--YLSSVDSFGSPPTAAASQE-CAGLGEMPGSF 54
*:..* .:*:: .***** **:.:* * *..***.* :.. :*: *:.*. ...*
FOS_RAT IPTVTAISTSPDLQWLVQPTLVSSVAPSQ-------TRAPHPYGLPTPS-TGAYARAGVV 112
FOS_MOUSE IPTVTAISTSPDLQWLVQPTLVSSVAPSQ-------TRAPHPYGLPTQS-AGAYARAGMV 112
FOS_CHICK VPTVTAISTSPDLQWLVQPTLISSVAPSQ-------NRG-HPYGVPAPAPPAAYSRPAVL 112
FOSB_MOUSE VPTVTAITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTS----YSTPGLS 110
FOSB_HUMAN VPTVTAITTSQDLQWLVQPTLISSMAQSQGQPLASQPPVVDPYDMPGTS----YSTPGMS 110
:******:** **********:**:* **... ::. .**.:* : *: ..:
FOS_RAT KTMSGGRAQSIG--------------------RRGKVEQLSPEEEEKRRIRRERNKMAAA 152
FOS_MOUSE KTVSGGRAQSIG--------------------RRGKVEQLSPEEEEKRRIRRERNKMAAA 152
FOS_CHICK KAP-GGRGQSIG--------------------RRGKVEQLSPEEEEKRRIRRERNKMAAA 151
FOSB_MOUSE AYSTGGASGSGGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAA 170
FOSB_HUMAN GYSSGGASGSGGPSTSGTTSGPGPARPARARPRRPREETLTPEEEEKRRVRRERNKLAAA 170
:** . * *.::: :::.. .: .: : .** : * *:********:******:***
FOS_RAT KCRNRRRELTDTLQAETDQLEDEKSALQTEIANLLKEKEKLEFILAAHRPACKIPNDLGF 212
FOS_MOUSE KCRNRRRELTDTLQAETDQLEDEKSALQTEIANLLKEKEKLEFILAAHRPACKIPDDLGF 212
FOS_CHICK KCRNRRRELTDTLQAETDQLEEEKSALQAEIANLLKEKEKLEFILAAHRPACKMPEELRF 211
FOSB_MOUSE KCRNRRRELTDRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEG- 229
FOSB_HUMAN KCRNRRRELTDRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEG- 229
*********** *********:**: *::***:* ****:***:*.**:*.**:* :
FOS_RAT PEEMSVTS-LDLTGGLPEATTPESEEAFTLPLLNDPEPK-PSLEPVKNISNMELKAEPFD 270
FOS_MOUSE PEEMSVAS-LDLTGGLPEASTPESEEAFTLPLLNDPEPK-PSLEPVKSISNVELKAEPFD 270
FOS_CHICK SEELAAATALDLG----APSPAAAEEAFALPLMTEAPPAVPPKEPSG--SGLELKAEPFD 265
FOSB_MOUSE PGPGPLAEVRDLPG-----STSAKEDGFGWLLPPPPPPP-----------------LPFQ 267
FOSB_HUMAN PGPGPLAEVRDLPG-----SAPAKEDGFSWLLPPPPPPP-----------------LPFQ 267
. . : ** . :.. *:.* * . * **:
FOS_RAT DFLFPASSRPSGSETARSVPDVDLSG--SFYAADWEPLHSSSLGMGPMVTELEPLCTPVV 328
FOS_MOUSE DFLFPASSRPSGSETSRSVPDVDLSG--SFYAADWEPLHSNSLGMGPMVTELEPLCTPVV 328
FOS_CHICK ELLFSAGPR----EASRSVPDMDLPGASSFYASDWEPLGAGSGG------ELEPLCTPVV 315
FOSB_MOUSE --------------SSRDAP-PNLTA--SLFTHS----------------EVQVLGDPFP 294
FOSB_HUMAN --------------TSQDAP-PNLTA--SLFTHS----------------EVQVLGDPFP 294
:::..* :*.. *::: . *:: * *.
FOS_RAT TCTPSCTTYTSSFVFTYPEADSFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL 380
FOS_MOUSE TCTPGCTTYTSSFVFTYPEADSFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL 380
FOS_CHICK TCTPCPSTYTSTFVFTYPEADAFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL 367
FOSB_MOUSE VVSP---SYTSSFVLTCPEVSAF---AGAQR--TSGSEQPSDPLNSPSLLAL 338
FOSB_HUMAN VVNP---SYTSSFVLTCPEVSAF---AGAQR--TSGSDQPSDPLNSPSLLAL 338
. .* :***:**:* **..:* *.*:* :*..: .**.*.**:****
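The line of symbols under each block summarises how well each column is conserved. In ClustalW output, "*" marks a column in which every sequence has the same residue, while ":" and "." mark conserved and semi-conserved substitutions. The sketch below reproduces only the "*" marks, because the residue groups behind ":" and "." are not described on this page; the function name is hypothetical.

```python
def identity_marks(rows):
    """Return a string with '*' under fully identical columns, else ' '.

    rows: aligned sequences of equal length, using '-' for gaps.
    Columns that contain a gap are never marked.
    """
    if not rows:
        return ""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        same = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if same else " ")
    return "".join(marks)


if __name__ == "__main__":
    # First six columns of the five aligned sequences shown above.
    block = ["MMFSGF", "MMFSGF", "MMYQGF", "-MFQAF", "-MFQAF"]
    for row in block:
        print(row)
    print(identity_marks(block))
```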
Example alignments with DbClustal
DbClustal addresses the important problem of the automatic multiple alignment of the top scoring full-length sequences detected by a database similarity search. By combining the advantages of both local and global alignment algorithms into a single system, DbClustal is able to provide accurate global alignments of highly divergent, complex sequence sets. Local alignment information is incorporated into a ClustalW2 global alignment in the form of a list of anchor points between pairs of sequences.
Reference:
J. D. Thompson, F. Plewniak, J.-C. Thierry and O. Poch. (2000)
DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches.
Nucleic Acids Research, Vol. 28, No. 15, 2919-2926.
Example:
The sequence for FOSB_MOUSE was queried with NCBI-blastp (or WU-BLAST2) against UniProt. The DbClustal alignments below show the query against identical and similar proteins found in the NCBI BLAST output.
(a) Query (FOSB_MOUSE) vs FOSB_MOUSE (identical)
YOURQUERY MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
FOSB_MOUSE MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
************************************************************
YOURQUERY ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120
FOSB_MOUSE ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120
************************************************************
YOURQUERY GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
FOSB_MOUSE GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
************************************************************
YOURQUERY DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
FOSB_MOUSE DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
************************************************************
YOURQUERY LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300
FOSB_MOUSE LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300
************************************************************
YOURQUERY TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
FOSB_MOUSE TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
**************************************
(b) Query (FOSB_MOUSE) vs FOSB_HUMAN (similar)
YOURQUERY MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
FOSB_HUMAN MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
************************************************************
YOURQUERY ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120
FOSB_HUMAN ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPVVDPYDMPGTSYSTPGMSGYSSGGASGS 120
********************************.***************:*.**:******
YOURQUERY GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
FOSB_HUMAN GGPSTSGTTSGPGPARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
****** ***** .**********************************************
YOURQUERY DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
FOSB_HUMAN DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
************************************************************
YOURQUERY LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300
FOSB_HUMAN LPGSAPAKEDGFSWLLPPPPPPPLPFQTSQDAPPNLTASLFTHSEVQVLGDPFPVVNPSY 300
****:.******.**************:*:**************************.***
YOURQUERY TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
FOSB_HUMAN TSSFVLTCPEVSAFAGAQRTSGSDQPSDPLNSPSLLAL 338
***********************:**************
(c) Query (FOSB_MOUSE) vs FOS_RAT (less similar)
YOURQUERY -MFQAFPGDYDSGS-RCSSS-PSAESQ--YLSSVDSFGSPPTAAASQE-CAGLGEMPGSF 54
FOS_RAT MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
**..* .**::.* ****: *:.:* * *..***.* :.. :*: **.*. ...*
YOURQUERY VPTVTAITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYST 114
FOS_RAT IPTVTAISTSPDLQWLVQPTLVSSVAPSQTR-------APHPYGLP-----TPSTGAYAR 108
:******:** **********:**:* ** : * .**.:* **. .**:
YOURQUERY GGASGSGGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRN 174
FOS_RAT AGVVKT------------MSGGRAQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRN 156
.*. : :*. *:: ** : * *:********:******:*******
YOURQUERY RRRELTDRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGP 234
FOS_RAT RRRELTDTLQAETDQLEDEKSALQTEIANLLKEKEKLEFILAAHRPACKIPNDLGFPEEM 216
******* *********:**: *::***:* ****:***:*.**:*.**** : *
YOURQUERY LAEVRDLPG-----STSAKEDGFGWLLP--PPPPPPLP-------------------FQS 268
FOS_RAT SVTSLDLTGGLPEATTPESEEAFTLPLLNDPEPKPSLEPVKNISNMELKAEPFDDFLFPA 276
. **.* :*. .*:.* * * * *.* * :
YOURQUERY S---------RDAP-PNLTASLFT------HS----------EVQVLGDPFPVVSP---S 299
FOS_RAT SSRPSGSETARSVPDVDLSGSFYAADWEPLHSSSLGMGPMVTELEPLCTPVVTCTPSCTT 336
* *..* :*:.*::: ** *:: * *. . :* :
YOURQUERY YTSSFVLTCPEVSAF---AGAQR--TSGSEQPSDPLNSPSLLAL 338
FOS_RAT YTSSFVFTYPEADSFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL 380
******:* **..:* *.*:* :*..* .**.*.**:****
Example alignment in Edinburgh Format
Edinburgh format places the two aligned sequences directly on top of each other, with `*' to show identities and `.' to show conservative replacements on a line above the aligned pair.
Example: FOSB_MOUSE(Qy) vs FOSB_HUMAN (Db)
RESULT 2
ID FOSB_HUMAN STANDARD; PRT; 338 AA.
DE Protein fosB (G0/G1 switch regulatory protein 3).
DB 1; Score 2775; Match 95.9%; QryMatch 97.0%; Pred. No. 0.00e+00;
Matches 324; Conservative 11; Mismatches 3; Indels 0; Gaps 0;
************************************************************
Db 1 MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
Qy 1 MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
******************************** ***************.*.**.******
Db 61 ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPVVDPYDMPGTSYSTPGMSGYSSGGASGS 120
Qy 61 ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120
****** ***** .**********************************************
Db 121 GGPSTSGTTSGPGPARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
Qy 121 GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
************************************************************
Db 181 DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
Qy 181 DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
****..******.**************.*.**************************.***
Db 241 LPGSAPAKEDGFSWLLPPPPPPPLPFQTSQDAPPNLTASLFTHSEVQVLGDPFPVVNPSY 300
Qy 241 LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300
***********************.**************
Db 301 TSSFVLTCPEVSAFAGAQRTSGSDQPSDPLNSPSLLAL 338
Qy 301 TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
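The summary figures in the header can be related to the aligned rows by counting identical positions. In this example the reported "Match 95.9%" is consistent with 324 matches over the 338 aligned positions; that reading is an inference from the numbers shown, not a documented definition. A minimal sketch, with a hypothetical function name:

```python
def percent_identity(db, qy):
    """Return (matches, percent) for two aligned rows of equal length.

    A position counts as a match when both rows carry the same residue
    and that residue is not a gap character.
    """
    if len(db) != len(qy):
        raise ValueError("aligned rows must have the same length")
    if not db:
        return 0, 0.0
    matches = sum(1 for x, y in zip(db, qy) if x == y and x != "-")
    return matches, 100.0 * matches / len(db)


if __name__ == "__main__":
    # Last row of the example (positions 301-338).
    db = "TSSFVLTCPEVSAFAGAQRTSGSDQPSDPLNSPSLLAL"
    qy = "TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL"
    print(percent_identity(db, qy))
    # Whole alignment, using the counts reported in the header.
    print(round(100.0 * 324 / 338, 1))
```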
Example alignment in Intelligenetics Format
Intelligenetics format uses `|' to show identities and `:' to show conservative replacements and places these indicators between the two aligned sequences.
Example: FOSB_MOUSE(Qy) vs FOSB_HUMAN (Db)
RESULT 2
ID FOSB_HUMAN STANDARD; PRT; 338 AA.
DE Protein fosB (G0/G1 switch regulatory protein 3).
DB 1; Score 2775; Match 95.9%; QryMatch 97.0%; Pred. No. 0.00e+00;
Matches 324; Conservative 11; Mismatches 3; Indels 0; Gaps 0;
Db 1 MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Qy 1 MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECAGLGEMPGSFVPTVTA 60
Db 61 ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPVVDPYDMPGTSYSTPGMSGYSSGGASGS 120
|||||||||||||||||||||||||||||||| |||||||||||||||:|:||:||||||
Qy 61 ITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSYSTPGLSAYSTGGASGS 120
Db 121 GGPSTSGTTSGPGPARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
|||||| ||||| :||||||||||||||||||||||||||||||||||||||||||||||
Qy 121 GGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRERNKLAAAKCRNRRRELT 180
Db 181 DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Qy 181 DRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGCKIPYEEGPGPGPLAEVRD 240
Db 241 LPGSAPAKEDGFSWLLPPPPPPPLPFQTSQDAPPNLTASLFTHSEVQVLGDPFPVVNPSY 300
||||::||||||:||||||||||||||:|:||||||||||||||||||||||||||:|||
Qy 241 LPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLFTHSEVQVLGDPFPVVSPSY 300
Db 301 TSSFVLTCPEVSAFAGAQRTSGSDQPSDPLNSPSLLAL 338
|||||||||||||||||||||||:||||||||||||||
Qy 301 TSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL 338
Other Alignment Formats
pairwise
Aligns your query sequence and database matches in pairs. Matches are connected with a "|" symbol. Mismatches are marked with a space. Gaps are shown with a "-" symbol.
e.g.
Query: 12   caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
            |||||| ||||||||||  |||||||||||||||||||||||||||||||||||||||||
Sbjct: 5319 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 5262
Query: 12   caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
            |||||| ||||||||||  |||||||||||||||||||||||||||||||||||||||||
Sbjct: 4684 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 4627
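The middle line of each pair follows directly from the rule described above: a "|" where the two rows agree, and a space for a mismatch or a gap. A minimal sketch (the function name is hypothetical, and the sequences are the ones from the example):

```python
QUERY = "caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg"
SBJCT = "caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg"


def match_line(query, subject):
    """Return the '|' / space line shown between two aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(query, subject)
    )


if __name__ == "__main__":
    print(QUERY)
    print(match_line(QUERY, SBJCT))
    print(SBJCT)
```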
M/S with identities
The database alignments are anchored to (shown in relation to) your query sequence.
Identities are displayed as dots (.).
Mismatches are displayed as single letter nucleotide abbreviations (c, t, a or g).
Gaps are shown with a "-" symbol.
e.g.
QUERY 12 caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
U93717 5319 ......t..........--......................................... 5262
U93716 4684 ......t..........--......................................... 4627
U93715 4684 ......t..........--......................................... 4627
U93714 4684 ......t..........--......................................... 4627
U93713 4546 ......t..........--......................................... 4489
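The dotted rows can be derived from an ordinary aligned row by comparing it with the query position by position. The sketch below assumes the hit has already been aligned to the query (same length, "-" for gaps); the function name is hypothetical.

```python
QUERY = "caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg"
HIT = "caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg"


def anchor_with_identities(query, hit):
    """Render a hit relative to the query.

    Identical positions become '.', mismatches keep the hit's letter,
    and gaps in the hit stay as '-'.
    """
    if len(query) != len(hit):
        raise ValueError("aligned rows must have the same length")
    return "".join("." if h == q else h for q, h in zip(query, hit))


if __name__ == "__main__":
    print(QUERY)
    print(anchor_with_identities(QUERY, HIT))
```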
M/S without identities
The database alignments are anchored to (shown in relation to) your query sequence.
Identities are shown as single letter nucleotide abbreviations.
Mismatches are displayed as single letter nucleotide abbreviations (c, t, a or g).
Gaps are shown with a "-" symbol.
e.g.
QUERY 12 caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
CVE278280 8191 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 8134
CVE278279 7803 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 7746
CVE278278 7987 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 7930
CVE277959 800 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 857
CVE272004 1168 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 1225
Flat Query-anchored with identities
The 'flat' display shows inserts as deletions on the query.
Identities are displayed as dots.
Mismatches are displayed as single letter nucleotide abbreviations (c, t, a or g).
Gaps are shown with a "-" symbol.
e.g.
QUERY 12 caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
U93717 5319 ......t..........--......................................... 5262
U93716 4684 ......t..........--......................................... 4627
U93715 4684 ......t..........--......................................... 4627
U93714 4684 ......t..........--......................................... 4627
U93713 4546 ......t..........--......................................... 4489
Flat Query-anchored without identities
The 'flat' display shows inserts as deletions on the query.
Identities are displayed as single letter nucleotide abbreviations (c, t, a or g).
Mismatches are displayed as single letter nucleotide abbreviations (c, t, a or g).
Gaps are shown with a "-" symbol.
e.g.
QUERY 12 caacatctccgtgtcgctaccttattcccttttttgcggcattttgccttcctgtttttg 71
CVE278280 8191 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 8134
CVE278279 7803 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 7746
CVE278278 7987 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 7930
CVE277959 800 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 857
CVE272004 1168 caacatttccgtgtcgc--ccttattcccttttttgcggcattttgccttcctgtttttg 1225
EDITING AN ALIGNMENT
You can edit the alignment using JalView. Click on the button below to view the above alignment.
PHYLOGENETIC TREE
A phylogram is a branching diagram (tree) assumed to be an estimate of a phylogeny, in which branch lengths are proportional to the amount of inferred evolutionary change. A cladogram is a branching diagram (tree) assumed to be an estimate of a phylogeny in which the branches are of equal length; cladograms therefore show common ancestry but do not indicate the amount of evolutionary "time" separating taxa. Tree distances can be shown: click on the diagram to get a menu of options.
example:
NLGPSTKDFGKISESREFDNQ
 |      ||||    |
QLNQLERSFGKINMRLEDALV