IPD-KIR Database

A Community Standard Reporting Format for KIR Genotyping Data

Martin Maiers1, Rebecca Cullen2, Raja Rajalingam3, Harriet Noreen4, Neng Yu5, Elaine Reed3, Steven GE Marsh6, Stephen Spellman2, Libby Guethlein7, Elisabeth Trachtenberg8, Sarah Cooley4
KIR genes encode activating and inhibitory receptors that regulate the function of natural killer cells, which may be important in donor selection for stem cell transplantation. Current KIR typing methodologies cannot resolve the extensive allelic variation of the 17 KIR genes. Although an XML standard for KIR genotype reporting is being developed, most laboratories need a reliable data format for sharing data electronically via spreadsheets. Our proposed format reports any level of allelic resolution, distinguishes between haploid and diploid ambiguities, and reports the observed number of loci. It is also easily translated into XML and parsed for downstream storage and analysis. By using a combination of 4 symbols with hierarchical precedence as defined in the table, the typing result for each KIR gene is entered into a single spreadsheet cell with no ambiguities in interpretation. For example, “001+007|006+010” represents a 3DL2 result with two possible genotypes: 001+007 or 006+010.
Alternatively, the string “001+002/008” represents a 3DL2 genotype result showing heterozygosity, with the allele 001 on one chromosome and either 002 or 008 on the other. If haplotype “phase” is known between two genes (based on the output of segregation analysis, haplotype estimation programs or typing methods that separate haplotypes) this can be represented with a specific “in cis” symbol. We have used this scheme successfully to transmit genotyping results for two large-scale KIR typing projects, finding that it balances the need for rich data representation standards with the accessibility and ubiquity of spreadsheets. We propose this format to the community as a standard for representing KIR allele typing data.

Spreadsheets have been used to report KIR typing results to the NMDP for the High Resolution KIR typing project. The current spreadsheet format for the NMDP KIR Typing project has one row per locus. The format uses the separator symbols that are described below and allows a clear and unambiguous interpretation to be programmed easily for downstream storage and analysis.
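As an illustration of how such an interpretation might be programmed, the following is a minimal sketch of a parser for the contents of a single spreadsheet cell. It is not the NMDP's own tooling. The function name `parse_kir_cell` is hypothetical, and the precedence it assumes (“|” splits first, then “+”, then “/”) is inferred from the examples in the text rather than taken from the symbol table. The “~” phase symbol, which links alleles of different genes, is deliberately left out.

```python
def parse_kir_cell(cell: str) -> list[list[list[str]]]:
    """Parse one cell into: possible genotypes -> gene copies -> candidate alleles.

    Assumed precedence (loosest to tightest): "|" then "+" then "/".
    """
    genotypes = []
    for genotype in cell.strip().split("|"):      # alternative genotypes
        copies = []
        for copy in genotype.split("+"):          # one entry per observed gene copy
            alleles = [a.strip() for a in copy.split("/")]  # alternative alleles
            if any(not a for a in alleles):
                raise ValueError(f"empty allele name in {cell!r}")
            copies.append(alleles)
        genotypes.append(copies)
    return genotypes


# Heterozygous result: 001 on one chromosome, 002 or 008 on the other.
print(parse_kir_cell("001+002/008"))
# [[['001'], ['002', '008']]]
```

The nested-list result keeps the three levels of the string separate, so downstream code can tell an ambiguity between whole genotypes apart from an ambiguity at a single gene copy.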
In order to represent a typing with two possible genotypes (e.g. 001+002 or 004+006), the “|” character is used, resulting in the following string: 001+002|004+006.
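To make the meaning of “|” concrete, the sketch below expands a cell string into the explicit list of genotypes it allows. The helper name `expand_genotypes` is hypothetical, and the splitting order (“|”, then “+”, then “/”) is an assumption based on the examples given here, not a published grammar.

```python
from itertools import product


def expand_genotypes(cell: str) -> list[tuple[str, ...]]:
    """List every explicit genotype permitted by a cell string.

    Each genotype is a tuple with one allele name per observed gene copy.
    """
    results: list[tuple[str, ...]] = []
    for genotype in cell.strip().split("|"):
        # Each "+"-separated copy may itself list alternative alleles with "/".
        copies = [copy.split("/") for copy in genotype.split("+")]
        for combo in product(*copies):
            if combo not in results:   # drop exact repeats, keep first-seen order
                results.append(combo)
    return results


print(expand_genotypes("001+002|004+006"))
# [('001', '002'), ('004', '006')]
```

This sketch treats allele order within a genotype as significant, so 001+002 and 002+001 would be listed separately. A real importer would probably normalise the order before comparing genotypes.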
Note: representing >2 copies of a gene

If three copies of a gene are observed (3 different alleles), this can be represented simply with two “+” symbols: 001+002+003

Note: representing haplotype phase

If haplotype “phase” is known between two genes (based on the output of pedigree analysis, haplotype estimation programs or typing methods that separate haplotypes) this can be represented with the “~” symbol: 2DL2*001~2DS5*003

Note: conversion to/from XML

During the past year, the NMDP began using the same symbol system proposed here (with “,” as a synonym for “+”) to represent HLA allele, gene and genotype lists. Tools have been developed for conversion between this “string” format and XML. Web tools and utilities for accepting this format have been developed and have been used successfully in a number of projects. Using this same system for KIR will allow reuse of these tools and methods and facilitate much easier data conversion, communication and database import/export.

An example spreadsheet with annotations of the above points is available to download.

References

Maiers M, Spellman S, Marsh SGE, Parham P, Rajalingam R, Reed E, Noreen H, Yu N, Cooley S. A community standard reporting format for KIR genotyping data. Human Immunology (2007) 68 S105

Further Information

For more information about the database, queries (including website) please contact IPD Support. Please see our licence for terms of use.