Help with the alignment tool < IMGT/HLA < IPD

Sequence Alignment Help

Using the Sequence Alignment Tool

The sequence alignment form contains the following options:

Select Locus - this option allows the user to choose which of the HLA or HLA-related genes to align from the drop-down menu. The drop-down menu also includes a number of special choices, like multiple sequence alignments for all the DRB1, 3, 4 & 5 alleles or all the DRB pseudogene alleles.
Select the feature to align - this option provides a list of alignments available for the locus selected. The alignments available include CDS alignments, individual exon alignments or alignments of combined regions. If an option listed in the drop-down menu for one locus is not listed for another locus, then that alignment is either not possible or currently unavailable for that locus.
Enter any specific sequences required - this allows the user to view alignments of specific sequences, either by entering common nomenclature or by listing allele names. For example, to align DRB1*01:01, DRB1*01:02:01 and DRB1*01:02:02, the user could enter 01 or 01:0 into this box as the common nomenclature, and this will match the desired alleles. Alternatively, the user could enter "01:01, 01:02:01, 01:02:02" in the box, separating each allele name with either a comma or a new line, and this will also match the desired alleles. The wildcard character (*) may also be used in the allele name.
Enter the reference sequence - the alignment tool allows the user to select an alternative reference sequence. This is optional; if no reference is entered, the tool uses the default sequence for the locus, as listed below. To use an alternative reference sequence, enter the full numerical code in the box provided. Please note that incorrect codes will cause errors in the alignment. For example, 01:01 is not a valid code for specifying A*01:01:01:01 as the reference sequence; the full numerical code must be entered. To use a consensus sequence as the reference, type "CONSENSUS" into the reference box. The consensus sequence is derived from the alleles specified for the alignment.
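
The matching described under "Enter any specific sequences required" can be illustrated with a minimal Python sketch. It is not the tool's own code: the function name select_alleles is hypothetical, and treating each entry as a prefix of the numeric part of the allele name (with * as a wildcard) is an assumption based on the examples in the text above.

```python
from fnmatch import fnmatchcase


def select_alleles(query, allele_names):
    """Return the allele names matched by a comma- or newline-separated query.

    Assumption: each term is compared against the text after the '*' in the
    allele name and is treated as a prefix; a '*' inside a term is a wildcard.
    """
    terms = [t.strip() for t in query.replace("\n", ",").split(",") if t.strip()]
    selected = []
    for name in allele_names:
        fields = name.split("*", 1)[-1]  # "DRB1*01:02:01" -> "01:02:01"
        if any(fnmatchcase(fields, term + "*") for term in terms):
            selected.append(name)
    return selected


# A small made-up selection of DRB1 allele names for illustration.
alleles = ["DRB1*01:01", "DRB1*01:02:01", "DRB1*01:02:02", "DRB1*03:01:01"]

print(select_alleles("01:0", alleles))                       # common nomenclature
print(select_alleles("01:01, 01:02:01, 01:02:02", alleles))  # explicit list
print(select_alleles("01:02:*", alleles))                    # wildcard
```

Under these assumptions, the first two queries select the same three DRB1*01 alleles, and the wildcard query selects only the two DRB1*01:02 alleles.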

Official Reference Sequences
Locus	Allele	Acc. No.
HLA-A	01:01:01:01	HLA00001
HLA-B	07:02:01	HLA00132
HLA-C	01:02:01	HLA00401
HLA-E	01:01:01:01	HLA00934
HLA-F	01:01:01:01	HLA01096
HLA-G	01:01:01:01	HLA00939
HLA-H	01:01:01:01	HLA02546
HLA-J	01:01:01:01	HLA02626
HLA-K	01:01:01:01	HLA02654
HLA-L	01:01:01:01	HLA02655
HLA-P	01:01:01:01	HLA02742
HLA-V	01:01:01:01	HLA02801
HLA-Y	01:01	HLA13320
HLA-DMA	01:01:01:01	HLA00485
HLA-DMB	01:01:01:01	HLA00489
HLA-DOA	01:01:01	HLA00494
HLA-DOB	01:01:01:01	HLA01098
HLA-DPA1	01:03:01:01	HLA00499
HLA-DPB1	01:01:01	HLA00514
HLA-DPB2	01:01:01	HLA14837
HLA-DQA1	01:01:01	HLA00601
HLA-DQB1	05:01:01:01	HLA00638
HLA-DRA	01:01:01:01	HLA00662
HLA-DRB1	01:01:01	HLA00664
HLA-DRB2	01:01	HLA01028
HLA-DRB3	01:01:01	HLA00886
HLA-DRB4	01:01:01:01	HLA00905
HLA-DRB5	01:01:01	HLA00915
HLA-DRB6	01:01	HLA00929
HLA-DRB7	01:01:01	HLA00932
HLA-DRB8	01:01	HLA01029
HLA-DRB9	01:01	HLA01030
HFE	001:01:01	HLA14067
MICA	001	HLA01013
MICB	001	HLA02033
TAP1	01:01:01:01	HLA00953
TAP2	01:01:01:01	HLA00959

Note - in the DQB1 alignments, the DQB1*05 alleles are displayed first.
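
The reference-selection rules described above (default per locus, full numerical code for an alternative, or "CONSENSUS") can be sketched in Python as follows. This is an illustrative sketch only: resolve_reference and DEFAULT_REFERENCE are hypothetical names, only a few rows of the table are copied in, and raising an error for an incomplete code is an assumption about how a client might guard against the alignment errors the text warns about.

```python
# A few defaults copied from the Official Reference Sequences table;
# the remaining loci are omitted to keep the sketch short.
DEFAULT_REFERENCE = {
    "HLA-A": "01:01:01:01",
    "HLA-B": "07:02:01",
    "HLA-DQB1": "05:01:01:01",
    "MICA": "001",
}


def resolve_reference(locus, entered, known_alleles):
    """Choose the reference sequence for an alignment (hypothetical helper).

    Empty input selects the locus default, "CONSENSUS" selects consensus
    mode, and anything else must be the full numerical code of a known allele.
    """
    entered = entered.strip()
    if not entered:
        return DEFAULT_REFERENCE[locus]
    if entered == "CONSENSUS":
        return "CONSENSUS"
    if entered not in known_alleles:
        # Assumption: reject partial codes up front instead of letting
        # them cause errors in the alignment.
        raise ValueError(f"{entered!r} is not a full allele code for {locus}")
    return entered


# Allele codes taken from the examples on this page.
hla_a = {"01:01:01:01", "01:02"}

print(resolve_reference("HLA-A", "", hla_a))
print(resolve_reference("HLA-A", "01:02", hla_a))
```

Entering 01:01 for HLA-A would be rejected in this sketch, because the full code for that allele is 01:01:01:01.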

Alignment Display Options

Mismatches - this option selects whether to display the full sequence for all alleles in the alignment, or to only display those bases that mismatch the reference, e.g.:

- Show mismatches between sequences:

A*01:01:01:01 CGGGGGCCCT GGCCCTGACC
A*01:02 ---------- -------C--

- Show all bases:

A*01:01:01:01 CGGGGGCCCT GGCCCTGACC
A*01:02 CGGGGGCCCT GGCCCTGCCC
Numbering - depending on the alignment type, different numbering formats can be selected. For nucleotide sequences, the alignments can be displayed in blocks of 10 nucleotides or in blocks of 3 nucleotides to represent the amino acid codons. Genomic alignments are only displayed in blocks of 10 nucleotides, and protein alignments are always displayed in blocks of 10 amino acids. For either format, it may be necessary to increase the width of your browser window (or zoom out) to fully view the alignment. Full details of how sequences are numbered are explained here.

The use of an alternative reference sequence may affect the numbering used; standardised numbering of positions can only be relied upon if the official reference sequence is used. Indels are only included in the alignment when they affect the alleles selected; spacing for indels found only in non-selected alleles will not be visible.
Alleles unsequenced in selected region - the user can omit the alleles that are not sequenced over the region of interest from their alignment. This will reduce the time taken to perform the alignment. For some loci, genomic alignments can contain over 1.5 million bases if all sequences are selected. When non-coding regions are selected, all alleles which contain unsequenced regions are removed from the alignment by default. Where possible, select only the sequences needed as this will reduce the loading time and make the alignments easier to view.
Output - to aid printing of the alignments, the user can select a text only version of the output. This removes all interactive tags and is easier to cut and paste into applications like Microsoft Word.
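
The Mismatches and Numbering options above can be illustrated with a short Python sketch that reproduces the A*01:02 example. It is not the tool's implementation: format_row is a hypothetical helper, it assumes the sequences are already aligned to equal length, and it simply chunks from the first base rather than following the tool's codon numbering.

```python
def format_row(reference, sequence, show_all=False, block=10):
    """Render one aligned sequence against the reference.

    Positions that match the reference are shown as '-' unless show_all
    is True. The result is split into space-separated blocks of `block`
    characters (10 for nucleotide blocks, 3 for codon-style blocks).
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to equal length")
    if show_all:
        shown = sequence
    else:
        shown = "".join(
            "-" if s == r else s for r, s in zip(reference, sequence)
        )
    return " ".join(shown[i:i + block] for i in range(0, len(shown), block))


# The two sequences from the Mismatches example, without the spacing.
ref = "CGGGGGCCCTGGCCCTGACC"
a0102 = "CGGGGGCCCTGGCCCTGCCC"

print("A*01:01:01:01", format_row(ref, ref, show_all=True))
print("A*01:02      ", format_row(ref, a0102))
print("A*01:02      ", format_row(ref, a0102, block=3))
```

With the default settings, the A*01:02 row is rendered as "---------- -------C--", matching the mismatch display shown above.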

Further Information

For more information about the database, for queries (including queries about the website), or to subscribe to the IPD mailing lists, please contact IPD Support.

Please see our licence for our terms of use.
