Understanding VCF format
VCF (Variant Call Format) is the standard file format for storing variation data. It is used by large-scale variant mapping projects such as IGSR. It is also the standard output of variant calling software such as GATK and the standard input for variant analysis tools such as the VEP and for variation archives such as EVA.
VCF is a preferred format because it is unambiguous, scalable and flexible: extra information can be added to the INFO field, and many millions of variants can be stored in a single VCF file.
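To illustrate the flexibility of the INFO field, here is a minimal Python sketch that splits an INFO string into its entries. It assumes the usual layout of semicolon-separated entries, where each entry is either `key=value` (with comma-separated values) or a bare flag. The function name `parse_info` and the example string are made up for illustration; they are not taken from a real file or library.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a dictionary (illustrative sketch)."""
    # A single "." conventionally means that the field is empty.
    if info == ".":
        return {}
    parsed = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        # Entries without "=" are flags; the others hold one or more
        # comma-separated values, kept here as strings.
        parsed[key] = value.split(",") if sep else True
    return parsed


# Made-up INFO string, for illustration only.
example_info = "AC=2;AF=0.5;DB"
print(parse_info(example_info))
```

Values are kept as lists of strings because their types and meanings are declared in the header of each VCF file rather than being fixed by the format.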
VCF files are tab-delimited text files. Here is an example of a variant in VCF (Figure 12) as viewed in a spreadsheet:
Figure 12 VCF file structure.
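Because VCF is plain tab-delimited text, a record can also be read with a few lines of code. The following is a minimal Python sketch, not a full parser: it assumes the eight fixed columns described in the VCF specifications (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO), skips header lines starting with `#`, and ignores any per-sample genotype columns. The helper names and the example record are invented for illustration.

```python
# The eight fixed columns of a VCF data line, in order.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line: str) -> dict:
    """Turn one tab-delimited VCF data line into a dictionary."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])        # position is an integer
    record["ALT"] = record["ALT"].split(",")  # several ALT alleles are comma-separated
    return record


def read_vcf(lines):
    """Yield one record per data line, skipping header and blank lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_vcf_line(line)


# Made-up example content, for illustration only.
example_lines = [
    "##fileformat=VCFv4.3\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t12345\t.\tA\tG,T\t50\tPASS\tAC=2;AF=0.5\n",
]

records = list(read_vcf(example_lines))
print(records[0])
```

For real analyses, an established VCF library or tool is a safer choice, since it also handles the header definitions, genotype columns and compressed files that this sketch leaves out.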
To learn more, take a look at the VCF specifications.