HilbertVis: Visualization of genomic data with the Hilbert curve


Gallery of sample images


This rainbow demonstrates how the Hilbert curve winds through the square.
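
The winding can also be traced numerically. The following is a minimal sketch of the textbook mapping from a position along the curve to pixel coordinates. The function name hilbert_d2xy is hypothetical and is not part of HilbertVis, and the orientation used by HilbertVis itself may differ.

```r
# Convert a 0-based position d along a Hilbert curve into 0-based (x, y)
# coordinates in an n-by-n grid, where n is a power of two.
# (Hypothetical helper for illustration; not a HilbertVis function.)
hilbert_d2xy <- function( n, d ) {
   x <- 0
   y <- 0
   t <- d
   s <- 1
   while( s < n ) {
      # pick the quadrant at the current scale from two bits of t
      rx <- ( t %/% 2 ) %% 2
      ry <- ( t + rx ) %% 2
      # rotate or flip the sub-square so that consecutive pieces connect
      if( ry == 0 ) {
         if( rx == 1 ) {
            x <- s - 1 - x
            y <- s - 1 - y
         }
         tmp <- x; x <- y; y <- tmp
      }
      x <- x + s * rx
      y <- y + s * ry
      t <- t %/% 4
      s <- s * 2
   }
   c( x = x, y = y )
}

# Coordinates of all 16 positions in a 4-by-4 square, one row per position
coords <- t( sapply( 0:15, function( d ) hilbert_d2xy( 4, d ) ) )
```

Consecutive rows of coords differ by exactly one step in x or y, which is why neighbouring positions on the chromosome end up as neighbouring pixels.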


For this example, I re-analyzed raw data from the 2007 Cell paper by Barski et al. on ChIP-Seq profiling of histone modifications. (The raw reads are available from the NCBI Short Read Archive under accession number SRA000206.)

Both figures show human chromosome 10: the left one shows the signal for H3K4me1 (mono-methylation of lysine 4 of histone H3), the right one the signal for H3K4me3 (tri-methylation). The qualitative differences in the distribution and shape of these two histone marks are easy to see: mono-methylation is broad and smeared out, while tri-methylation is very well localized. Each me3 peak encompasses only one to three pixels, corresponding to at most a few kilobases.
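
The pixel-to-kilobase correspondence can be checked with a rough calculation. This sketch assumes a chromosome length of about 135 million base pairs for human chromosome 10 and an image of 512 by 512 pixels. The image size is an assumption made for illustration, not a figure stated on this page.

```r
# Rough estimate of how many base pairs one pixel represents.
chr_length <- 135e6   # approximate length of human chromosome 10 (assumption)
side       <- 512     # assumed image side length in pixels
bp_per_pixel <- chr_length / side^2

# Span of a peak that covers one to three pixels, in kilobases
peak_kb <- c( 1, 3 ) * bp_per_pixel / 1000
```

Under these assumptions one pixel stands for roughly half a kilobase, so a peak of one to three pixels spans on the order of one kilobase.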

Do these marks coincide? Are they near genes? The following picture lets us judge this:

Here, the me1 picture above is overlaid with the me3 picture, the former appearing in red, the latter in green. (The resolution was decreased to make the colours appear clearer.) Where marks coincide, their colours merge to yellow. We can see that this rarely happens, but the marks are nevertheless close to each other. The areas devoid of marks are also far from genes. To show this, all the areas containing exons have been shaded blue.
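
The colour mixing described above can be sketched with base R's rgb() function. The vectors me1 and me3 below are made-up signals scaled to the range 0 to 1, and this is only an illustration of the principle, not necessarily how the picture was produced.

```r
# Toy signals for four pixels, scaled to [0, 1] (made-up values)
me1 <- c( 1, 0, 1, 0 )   # goes into the red channel
me3 <- c( 0, 1, 1, 0 )   # goes into the green channel

# Combine the two signals into one colour per pixel
overlay <- rgb( red = me1, green = me3, blue = 0 )
# pixel 1 is pure red, pixel 2 pure green, pixel 3 (both marks) yellow,
# and pixel 4 (no mark) black
```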


An example of the use of Hilbert visualization in quality assessment for ChIP-chip:

These three pictures are from a ChIP-chip experiment, performed by Fischer et al. (Genomics, Vol. 91, 2008, p. 41), that studied the histone modification H4ac in a mouse muscle cell line. They depict the log ratio of immuno-precipitated vs input material for three technical replicates. The depicted area corresponds to the whole length of chromosome 9. As the custom NimbleGen tiling arrays used in the study only cover selected regions, all non-interrogated parts of the chromosome are shown in gray.

The left-hand and the right-hand images are from successful hybridizations. They agree well in which regions show enhancement (deep red) and which do not (white). Only very few regions in the left and some regions in the right figure show a negative log ratio (blue). The middle picture, on the other hand, is of low quality: most patches are mixtures of red and blue pixels, i.e., contradictory signals, and too many ratios are negative (blue). This array was discarded in the analysis.
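
The visual impression of "too many negative ratios" can be complemented by a simple number. The sketch below computes the fraction of interrogated positions with a negative log ratio. The helper frac_negative is hypothetical and the two vectors are made-up toy data, not values from the study.

```r
# Fraction of interrogated positions (non-NA) with a negative log ratio.
# (Hypothetical helper for illustration.)
frac_negative <- function( w ) mean( w < 0, na.rm = TRUE )

# Toy log-ratio vectors; NA marks positions not covered by probes
good <- c( NA, 1.2,  0.8, NA, 0.1, -0.1,  1.5, 0.9 )
bad  <- c( NA, 1.2, -0.8, NA, 0.7, -0.9, -1.1, 0.4 )

frac_negative( good )   # one of six interrogated positions is negative
frac_negative( bad )    # three of six interrogated positions are negative
```

Such a summary does not replace the picture, since it cannot show whether red and blue pixels are intermixed within a patch.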


This is a visualization from an Array-CGH comparison of two Arabidopsis thaliana ecotypes, depicting chromosome 2.

Thanks to Michael Seifert, IPK Gatersleben, for sharing this data. Please see his web page for more information on the data and his HMM-based tool to analyze them.

Conservation scores

These images illustrate how little of the highly conserved part of the human genome is coding. Depicted is the 44-way phastCons conservation score track from the UCSC Genome Browser for human chromosome 10, together with the position of the exons. For non-exonic regions, the conservation score is depicted on a scale from black (score 0) to green (score 1); for exons, the colour ranges from blue (score 0) via purple to red (score 1).
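
One way to express such a two-part colour scale in R is sketched below. The function score_colour is hypothetical, and the linear interpolation between the end colours is an assumption. The actual images may use a different interpolation.

```r
# Map a conservation score in [0, 1] to a colour:
#   non-exonic: black (score 0) to green (score 1)
#   exonic:     blue (score 0) via purple to red (score 1)
# (Hypothetical helper; linear interpolation is an assumption.)
score_colour <- function( score, is_exon ) {
   rgb( red   = ifelse( is_exon, score,     0     ),
        green = ifelse( is_exon, 0,         score ),
        blue  = ifelse( is_exon, 1 - score, 0     ) )
}

# End points of both scales
score_colour( c( 0, 1, 0, 1 ), c( FALSE, FALSE, TRUE, TRUE ) )
```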

The right-hand image shows the same data, but the colours have been inverted and hue-rotated by 180°. When viewed on screen, the colours of the left-hand image are more distinguishable, while the right-hand version looks clearer when printed on paper.
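
The effect of this transformation can be sketched with base R colour functions. The helper invert_and_rotate is hypothetical, and it interprets "hue-rotated by 180°" as adding half a turn to the hue in HSV colour space. Image editors may define hue rotation slightly differently.

```r
# Invert each colour, then rotate its hue by 180 degrees in HSV space.
# (Hypothetical helper for illustration.)
invert_and_rotate <- function( colours ) {
   m   <- col2rgb( colours )   # 3-row matrix with values 0..255
   inv <- 255 - m              # colour inversion
   h   <- rgb2hsv( inv )       # rows "h", "s", "v", each in [0, 1]
   hsv( ( h[ "h", ] + 0.5 ) %% 1, h[ "s", ], h[ "v", ] )
}

# Apply it to the two ends of the non-exonic scale (black and green)
invert_and_rotate( c( "#000000", "#00FF00" ) )
```

With this definition the black background turns white while fully saturated colours keep their hue, which is consistent with the right-hand version being better suited for print.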

R code

Here is the R code used to generate these images.


library( HilbertVisGUI )

hilbertDisplay( 1:1e6, palettePos = rainbow( 1e5, end=5/6 ) )

ChIP-Seq example

See the vignette in the ShortRead package.

ChIP-chip example

readGFF <- function( file, chr ) {

   # read in a GFF file (columns: V1 = sequence name, V4 = start,
   # V5 = end, V6 = score):
   a <- read.table( file )

   # select the desired chromosome by comparing with column 1:
   b <- subset( a, V1==chr )

   # make a wiggle vector with the scores
   w <- makeWiggleVector( b$V4, b$V5, b$V6, max(b$V5) + 10 )

   # make another wiggle vector, this time ignoring the score and simply
   # putting a 1 everywhere. This leaves 0 at all base pairs not
   # interrogated by probes:
   wm <- makeWiggleVector( b$V4, b$V5, rep( 1, nrow(b) ), max(b$V5)+10 )

   # Replace all the non-interrogated base pairs with NA
   w[ wm == 0 ] <- NA
   # return this
   w
}

# Read the data:
w6.9 <- readGFF( "66028_ratio.gff", "chr9" )
w7.9 <- readGFF( "71157_ratio.gff", "chr9" )
w8.9 <- readGFF( "85777_ratio.gff", "chr9" )

# Display it
hilbertDisplay( w6.9, w7.9, w8.9 )
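
The masking step in readGFF can be tried out without HilbertVis installed. The sketch below uses a hypothetical base-R stand-in, toy_wiggle, which is assumed to mimic what makeWiggleVector does for non-overlapping intervals. The probe positions and scores are made up.

```r
# Hypothetical stand-in for makeWiggleVector: fill a vector of length
# 'len' with 'value[i]' at positions start[i]..end[i], and 0 elsewhere.
toy_wiggle <- function( start, end, value, len ) {
   w <- numeric( len )
   for( i in seq_along( start ) )
      w[ start[i]:end[i] ] <- value[i]
   w
}

# Two made-up probes on a "chromosome" of 12 base pairs
start <- c( 3, 8 )
end   <- c( 4, 9 )
score <- c( 0.5, -1.2 )

w  <- toy_wiggle( start, end, score, 12 )
wm <- toy_wiggle( start, end, rep( 1, length( start ) ), 12 )

# Positions not covered by any probe become NA
w[ wm == 0 ] <- NA
```

Using the separate mask wm rather than testing w == 0 keeps a probe whose score happens to be exactly zero from being treated as non-interrogated.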

Array-CGH example

This image was made with the stand-alone version, simply loading the GFF file with the data.

Simon Anders, EMBL-EBI; last change: 2009-02-16