Supplementary S1. Hilbert curve preserves the locality

Author: Zuguang Gu ( z.gu@dkfz.de )

Date: 2016-02-27


A Hilbert curve folds the one-dimensional axis into a two-dimensional space while still preserving the locality of the data points. This means that two data points which are close to each other in one-dimensional space are also close to each other after folding. To visualize this property, we first generate a Hilbert curve with level 5. In the following animation, the point traverses the Hilbert curve from lowest to highest x value of the original 1D axis.

library(HilbertCurve)
for(i in 1:1024) {
    hc = HilbertCurve(1, 1024, level = 5, reference = TRUE, arrow = FALSE)
    hc_points(hc, x1 = i, np = NULL, pch = 16, size = unit(2, "mm"))
}

video of chunk move-on-hilbert-curve

Next, we calculate the pairwise distance between data points in the 2D space and visualize it in a heatmap. The x-axis (top) and the y-axis (left) indicate the position of the data points on the original 1D axis.

library(HilbertVis)
pos = HilbertVis::hilbertCurve(5)
mat = as.matrix(dist(pos))
library(ComplexHeatmap)

ht = Heatmap(mat, name = "dist", cluster_rows = FALSE, cluster_columns = FALSE, 
    show_row_names = FALSE, show_column_names = FALSE, 
    heatmap_legend_param = list(title = "euclidean_dist"))
draw(ht, padding = unit(c(5, 5, 5, 2), "mm"))
decorate_heatmap_body("dist", {
    grid.segments(c(0.25, 0.5, 0.75, 0, 0, 0), c(0, 0, 0, 0.25, 0.5, 0.75), 
          c(0.25, 0.5, 0.75, 1, 1, 1), c(1, 1, 1, 0.25, 0.5, 0.75), gp = gpar(lty = 2))
    grid.text(rev(c(256, 512, 768, 1024)), 0, c(0, 256, 512, 768)/1024, just = "bottom", 
        rot = 90, gp = gpar(fontsize = 10))
    grid.text(c(1, 256, 512, 768, 1024), c(1, 256, 512, 768, 1024)/1024, 1, just = "bottom",
        gp = gpar(fontsize = 10))
})

plot of chunk unnamed-chunk-2

The blue area around the diagonal shows that data points which are close on the 1D axis are still close in the 2D space. However, as evident from the blue areas which are apart from the diagonal, due to the folding of the curve there are some data points which are close in the two-dimensional space but distant on the one-dimensional axis (e.g, 1 ~ 256 which are close to data points in 768 ~ 1024; bottom left or top right area in the heatmap).