Author: Zuguang Gu ( z.gu@dkfz.de )
Date: 2016-02-27
Colors are an important graphical representation of the values associated with input regions, and a well-chosen color scheme makes the Hilbert curve easier to interpret.
Generally, colors should be passed to the low-level graphic functions, e.g. hc_points() and hc_layer(), as a vector.
The HilbertCurve package does not itself generate colors. However, in this supplementary file, we give some solutions that make color mapping simple.
Below, we only discuss color mapping for continuous values.
There are several ways to map continuous values to colors in R, such as the colorRamp() function, but here we recommend colorRamp2() from the circlize package, which makes color mapping simple and robust.
colorRamp2() generates a color mapping function. It needs two arguments: a vector of break values and a vector of corresponding colors. The color space and transparency can be set as well.
Colors are obtained by linear interpolation between the corresponding breaks.
For example, to generate a color mapping function which maps methylation values (which always range between 0 and 1):
library(circlize)
col_fun = colorRamp2(c(0, 0.5, 1), c("blue", "white", "red"))
class(col_fun)
## [1] "function"
Then, col_fun() can be applied to methylation values to get the corresponding colors.
col_fun(0.25)
## [1] "#B38BFFFF"
col_fun(0.75)
## [1] "#FF9E81FF"
x = seq(0, 1, length = 100)
color = col_fun(x)
plot(seq_along(x), pch = 16, col = color, ann = FALSE)
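To make the idea of interpolating between breaks concrete, here is a minimal base R sketch. It is not the circlize implementation: the helper name linear_col_fun() is hypothetical, and it interpolates in plain RGB, whereas colorRamp2() uses the LAB color space by default, so the hex codes it returns will differ from those shown above.

```r
# Illustrative re-implementation, NOT the circlize code: interpolate each RGB
# channel linearly between breaks and clamp values outside the break range.
linear_col_fun = function(breaks, colors) {
    rgb_mat = t(col2rgb(colors)) / 255   # one row per break; columns: red, green, blue
    function(x) {
        # rule = 2 makes approx() return the boundary value outside the breaks
        ch = sapply(1:3, function(i) approx(breaks, rgb_mat[, i], xout = x, rule = 2)$y)
        ch = matrix(ch, ncol = 3)        # keep a matrix even when length(x) == 1
        rgb(ch[, 1], ch[, 2], ch[, 3])
    }
}

f = linear_col_fun(c(0, 0.5, 1), c("blue", "white", "red"))
f(c(0, 0.25, 0.5, 1, 5))   # the last value lies beyond the breaks and is clamped
```

The returned closure plays the same role as col_fun() above: it takes a numeric vector and returns one color per value.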
It is also easy to change to another color palette.
library(RColorBrewer)
col_fun = colorRamp2(seq(0, 1, length = 11),           # 11 breaks
                     rev(brewer.pal(11, "Spectral")))  # 11 colors
plot(seq_along(x), pch = 16, col = col_fun(x), ann = FALSE)
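The optional color space and transparency settings of colorRamp2() can be sketched in the same way. The example below assumes the space and transparency arguments of a current circlize release; the variable names col_lab and col_rgb_semi are chosen only for illustration.

```r
library(circlize)

breaks = c(0, 0.5, 1)
colors = c("blue", "white", "red")

# Default: interpolate in the LAB color space, fully opaque
col_lab = colorRamp2(breaks, colors)
# Interpolate in RGB instead, with 50% transparency
col_rgb_semi = colorRamp2(breaks, colors, transparency = 0.5, space = "RGB")

col_lab(0.25)
col_rgb_semi(0.25)
```

The two functions generally return different colors for values between breaks, since the interpolation path depends on the color space.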
colorRamp2() is also useful for generating a color mapping function that is robust to outliers.
Values beyond the maximal or minimal break are assigned the color of the maximal or minimal break, respectively. In the following example, the first element of x is modified to be an outlier. The plot shows that, because the breaks span the full range of x including the outlier, the majority of the data points have colors very close to white.
set.seed(123)
x = abs(rnorm(100))
x[1] = 30
col_fun = colorRamp2(range(x), c("white", "red"))
plot(seq_along(x), pch = 16, col = col_fun(x))
With colorRamp2(), the maximal break can be set to the 95th percentile so that the outlier does not affect the color mapping. We use this feature to generate colors for histone modification signals in Supplementary File S5.
col_fun = colorRamp2(quantile(x, c(0, 0.95)), c("white", "red"))
plot(seq_along(x), pch = 16, col = col_fun(x))
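The effect of the two break choices can also be checked numerically. The base R sketch below uses a hypothetical helper, position(), that computes where each value falls between the lower and upper break (0 = white end, 1 = red end) with clamping; it only mimics the clamping behavior described above and does not call circlize.

```r
set.seed(123)
x = abs(rnorm(100))
x[1] = 30   # the outlier

# Relative position of each value between two breaks, clamped to [0, 1]
position = function(x, lo, hi) pmin(pmax((x - lo) / (hi - lo), 0), 1)

p_range = position(x, min(x), max(x))                       # breaks at the full range
p_q95   = position(x, min(x), unname(quantile(x, 0.95)))    # upper break at 95th percentile

median(p_range)   # close to 0: most points sit near the white end
median(p_q95)     # noticeably larger: the gradient is actually used
sum(p_q95 == 1)   # values above the upper break all share the "red" end
```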
In the following examples, we apply colors to a real Hilbert curve.
library(HilbertCurve)
gr = generateRandomBed(nr = 100, fun = rnorm)
col_fun = colorRamp2(c(-2, 0, 2), c("green", "white", "red"))
hc = GenomicHilbertCurve(mode = "pixel", level = 9)
hc_layer(hc, gr, col = col_fun(gr[, 4]))
hc_map(hc, fill = NA, border = "#CCCCCC", add = TRUE)
With the color mapping function generated by colorRamp2(), it is also possible to generate a legend.
library(ComplexHeatmap)
cm = ColorMapping(col_fun = col_fun)
lgd = color_mapping_legend(cm, title = "rnorm", plot = FALSE)
hc = GenomicHilbertCurve(mode = "pixel", level = 9, legend = lgd)
hc_layer(hc, gr, col = col_fun(gr[, 4]))
hc_map(hc, fill = NA, border = "#CCCCCC", add = TRUE)
Another advantage of using colorRamp2() is that the colors generated by the color mapping function can be converted back to the original values by the companion function col2value().
A new color theme can then be generated by a new color mapping function.
This functionality is particularly useful for changing the color theme only in the overlapping regions when overlaying a new layer on the curve, in order to highlight the correspondence between two sources of information.
As a demonstration, in the following plot, the coloring for chromosomes 1 to 5 is changed to blue-white-red.
hc = GenomicHilbertCurve(mode = "pixel", level = 9)
hc_layer(hc, gr, col = col_fun(gr[, 4]))
chr_selected = hc@background[1:5]
col_fun_new = colorRamp2(c(-2, 0, 2), c("blue", "white", "red"))
hc_layer(hc, chr_selected, col = "#00000020",
    overlay = function(r0, g0, b0, r, g, b, alpha) {
        # non-white areas
        l = !(r0 == 1 & g0 == 1 & b0 == 1)
        # original value
        v = col2value(r0[l], g0[l], b0[l], col_fun = col_fun)
        # new color theme
        col_new = col_fun_new(v, return_rgb = TRUE)
        r0[l] = col_new[, 1]
        g0[l] = col_new[, 2]
        b0[l] = col_new[, 3]
        default_overlay(r0, g0, b0, r, g, b, alpha)
    })
hc_map(hc, fill = NA, border = "#CCCCCC", add = TRUE)
In Figure 1C in the manuscript as well as in Supplementary File S5, the overlapping regions between gene bodies and histone modifications are shown in a white-purple theme to highlight the enrichment of these two types of genomic features.
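The round trip from values to colors and back can be tried in isolation, outside of any overlay function. The sketch below assumes a circlize version in which col2value() accepts a vector of color strings as its first argument; the test values in v are made up for illustration. Because colors are stored with 8 bits per channel, the recovered values should be close to, but not necessarily identical to, the originals.

```r
library(circlize)

col_fun = colorRamp2(c(-2, 0, 2), c("green", "white", "red"))

v = c(-1.5, -0.3, 0, 0.8, 1.7)                 # illustrative values within the breaks
col = col_fun(v)                               # values -> colors
v_back = col2value(col, col_fun = col_fun)     # colors -> approximate values

cbind(original = v, recovered = v_back)
```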
sessionInfo()
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-pc-linux-gnu (64-bit)
##
## locale:
## [1] C
##
## attached base packages:
## [1] stats4 parallel methods grid stats graphics grDevices
## [8] utils datasets base
##
## other attached packages:
## [1] RColorBrewer_1.1-2 ComplexHeatmap_1.6.0 circlize_0.3.4
## [4] HilbertCurve_1.1.3 GenomicRanges_1.22.3 GenomeInfoDb_1.6.3
## [7] IRanges_2.4.6 S4Vectors_0.9.19 BiocGenerics_0.16.1
##
## loaded via a namespace (and not attached):
## [1] whisker_0.3-2 knitr_1.12.3 XVector_0.10.0
## [4] magrittr_1.5 zlibbioc_1.16.0 colorspace_1.2-6
## [7] lattice_0.20-33 rjson_0.2.15 stringr_1.0.0
## [10] tools_3.2.2 png_0.1-7 formatR_1.2.1
## [13] HilbertVis_1.28.0 GlobalOptions_0.0.8 dendextend_1.1.2
## [16] shape_1.4.2 evaluate_0.8 stringi_1.0-1
## [19] GetoptLong_0.1.1