In this post I compare the running time of four heatmap functions: gplots::heatmap.2(), heatmap() (which is natively supported in R), ComplexHeatmap::Heatmap() and pheatmap::pheatmap().
The test data is a 1000 x 1000 random matrix:
library(ComplexHeatmap)
library(pheatmap)
library(gplots)
library(microbenchmark)
set.seed(123)
n = 1000
mat = matrix(rnorm(n*n), nrow = n)
First I test drawing the heatmaps together with the dendrograms, with clustering applied:
t1 = microbenchmark(
    "heatmap()" = {
        pdf(NULL)
        heatmap(mat)
        dev.off()
    },
    "heatmap.2()" = {
        pdf(NULL)
        heatmap.2(mat, trace = "none")
        dev.off()
    },
    "Heatmap()" = {
        pdf(NULL)
        draw(Heatmap(mat))
        dev.off()
    },
    "pheatmap()" = {
        pdf(NULL)
        pheatmap(mat)
        dev.off()
    },
    times = 5
)
print(t1, unit = "s")
## Unit: seconds
##         expr   min    lq  mean median    uq   max neval
##    heatmap() 15.93 16.03 17.05  16.13 17.25 19.90     5
##  heatmap.2() 16.15 17.06 17.09  17.19 17.38 17.69     5
##    Heatmap() 20.75 21.55 22.27  21.90 21.96 25.17     5
##   pheatmap() 15.66 15.89 19.77  16.21 16.64 34.44     5
The running times of all four heatmap functions look similar, probably because clustering takes up most of the running time. Heatmap() runs the longest, perhaps because it applies additional manipulations to the dendrograms, such as dendrogram reordering.
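One rough way to check how much of the time goes to clustering is to time the clustering step on its own. The sketch below is only an illustration and not part of the benchmark above: it uses a smaller matrix (the size `n_small` is an arbitrary assumption) so that it runs quickly, and it uses `dist()` and `hclust()` with their default settings, which are the defaults of `heatmap()`.

```r
# Time clustering alone versus a full heatmap() call on the same matrix.
# n_small is an arbitrary size chosen so the example runs quickly.
set.seed(123)
n_small = 300
m = matrix(rnorm(n_small * n_small), nrow = n_small)

# Clustering of rows and columns with the defaults used by heatmap()
t_cluster = system.time({
    row_hc = hclust(dist(m))
    col_hc = hclust(dist(t(m)))
})["elapsed"]

# Full heatmap() call (clustering plus drawing) on a null PDF device
t_full = system.time({
    pdf(NULL)
    heatmap(m)
    dev.off()
})["elapsed"]

# Elapsed seconds for the two steps, side by side
print(c(clustering = unname(t_cluster), full = unname(t_full)))
```

If the two numbers are close, most of the running time of the full call is indeed spent on clustering rather than on drawing.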
Next I suppress the clustering on both rows and columns, so that no dendrograms are drawn.
t2 = microbenchmark(
    "heatmap()" = {
        pdf(NULL)
        heatmap(mat, Rowv = NA, Colv = NA)
        dev.off()
    },
    "heatmap.2()" = {
        pdf(NULL)
        heatmap.2(mat, dendrogram = "none", trace = "none")
        dev.off()
    },
    "Heatmap()" = {
        pdf(NULL)
        draw(Heatmap(mat, cluster_rows = FALSE, cluster_columns = FALSE))
        dev.off()
    },
    "pheatmap()" = {
        pdf(NULL)
        pheatmap(mat, cluster_rows = FALSE, cluster_cols = FALSE)
        dev.off()
    },
    times = 5
)
print(t2, unit = "s")
## Unit: seconds
##         expr     min     lq    mean  median      uq     max neval
##    heatmap()  0.2546  0.266  0.3192  0.2683  0.3141  0.4931     5
##  heatmap.2() 15.0519 15.315 15.3524 15.4163 15.4787 15.5001     5
##    Heatmap()  2.7637  2.841  2.9421  2.9303  2.9693  3.2059     5
##   pheatmap()  1.1940  1.225  4.3730  1.2677  1.3535 16.8250     5
Now heatmap.2() is the slowest when only the heatmap bodies are drawn.
Next I perform the clustering in advance and pass the clustering objects to the heatmap functions. In this setting, the dendrograms are still drawn along with the heatmaps.
row_hc = hclust(dist(mat))
col_hc = hclust(dist(t(mat)))
t3 = microbenchmark(
    "heatmap()" = {
        pdf(NULL)
        heatmap(mat, Rowv = as.dendrogram(row_hc), Colv = as.dendrogram(col_hc))
        dev.off()
    },
    "heatmap.2()" = {
        pdf(NULL)
        heatmap.2(mat, Rowv = row_hc, Colv = col_hc, trace = "none")
        dev.off()
    },
    "Heatmap()" = {
        pdf(NULL)
        draw(Heatmap(mat, cluster_rows = row_hc, cluster_columns = col_hc))
        dev.off()
    },
    "pheatmap()" = {
        pdf(NULL)
        pheatmap(mat, cluster_rows = row_hc, cluster_cols = col_hc)
        dev.off()
    },
    times = 5
)
print(t3, unit = "s")
## Unit: seconds
##         expr    min     lq   mean median     uq    max neval
##    heatmap()  1.462  1.473  1.503  1.475  1.506  1.599     5
##  heatmap.2() 15.864 15.888 16.165 16.163 16.327 16.585     5
##    Heatmap()  5.777  5.803  5.956  6.003  6.066  6.130     5
##   pheatmap()  1.308  1.321  4.413  1.488  1.544 16.406     5
Finally I put the mean running time into a table for easy comparison:
| | heatmap() | heatmap.2() | Heatmap() | pheatmap() |
|---|---|---|---|---|
| do clustering, draw dendrograms | 17.05s | 17.09s | 22.27s | 19.77s |
| no clustering, no dendrogram | 0.32s | 15.35s | 2.94s | 4.37s |
| only draw dendrograms | 1.50s | 16.17s | 5.96s | 4.41s |
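Because each setting was run only five times, a single slow run can shift the mean considerably: the pheatmap() means of 4.37s and 4.41s each include one run of about 16s, while the corresponding medians are below 1.5s. As a complementary view, the sketch below tabulates the medians reported in the outputs above and expresses each one as a multiple of the fastest function in the same row. The object names `medians` and `relative` are only illustrative.

```r
# Median running times (seconds), copied from the microbenchmark outputs above
medians = rbind(
    "do clustering, draw dendrograms" = c(16.13, 17.19, 21.90, 16.21),
    "no clustering, no dendrogram"    = c(0.2683, 15.4163, 2.9303, 1.2677),
    "only draw dendrograms"           = c(1.475, 16.163, 6.003, 1.488)
)
colnames(medians) = c("heatmap()", "heatmap.2()", "Heatmap()", "pheatmap()")

# Each median as a multiple of the fastest function in the same row
# (the row minima are recycled down the columns, so row i is divided by min i)
relative = round(medians / apply(medians, 1, min), 1)
print(relative)
```

Reading the table by medians rather than means mainly changes the picture for pheatmap(), which then sits much closer to heatmap() in the two settings without clustering.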
The following plots illustrate the mean running time for the four matrices with different dimensions.
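A benchmark across matrix sizes could be set up roughly as follows. This is a minimal sketch and not the code behind the plots: the sizes in `sizes`, the helper name `time_heatmap` and the use of `system.time()` instead of microbenchmark are all assumptions, and only the base R `heatmap()` is timed so that the example has no package dependencies. The other three functions could be added to the same loop.

```r
# Time heatmap() on square random matrices of increasing size.
# time_heatmap() is a hypothetical helper; sizes are kept small for speed.
time_heatmap = function(n, times = 3) {
    mat = matrix(rnorm(n * n), nrow = n)
    elapsed = vapply(seq_len(times), function(i) {
        pdf(NULL)              # draw to a null device, no file is written
        on.exit(dev.off())     # close the device when this run finishes
        system.time(heatmap(mat))["elapsed"]
    }, numeric(1))
    mean(elapsed)              # mean elapsed seconds over the repeated runs
}

set.seed(123)
sizes = c(50, 100, 200)
scaling = data.frame(
    n = sizes,
    mean_seconds = vapply(sizes, time_heatmap, numeric(1))
)
print(scaling)
```

Plotting `mean_seconds` against `n` for each function and each of the three settings would give curves of the kind described above.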
Session info:
sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] cowplot_1.1.0 ggplot2_3.3.2
## [3] microbenchmark_1.4-7 gplots_3.1.0
## [5] pheatmap_1.0.12 ComplexHeatmap_2.7.1.1003
## [7] GetoptLong_1.0.4 knitr_1.30
##
## loaded via a namespace (and not attached):
## [1] circlize_0.4.12.1004 shape_1.4.5 gtools_3.8.2
## [4] tidyselect_1.1.0 xfun_0.19 purrr_0.3.4
## [7] colorspace_2.0-0 vctrs_0.3.4 generics_0.1.0
## [10] htmltools_0.5.0 stats4_4.0.2 yaml_2.2.1
## [13] rlang_0.4.8 pillar_1.4.6 withr_2.3.0
## [16] glue_1.4.2 BiocGenerics_0.34.0 RColorBrewer_1.1-2
## [19] matrixStats_0.57.0 lifecycle_0.2.0 stringr_1.4.0
## [22] munsell_0.5.0 blogdown_0.17 gtable_0.3.0
## [25] GlobalOptions_0.1.2 caTools_1.18.0 evaluate_0.14
## [28] labeling_0.4.2 IRanges_2.22.2 Cairo_1.5-12.2
## [31] parallel_4.0.2 highr_0.8 Rcpp_1.0.5
## [34] KernSmooth_2.23-18 scales_1.1.1 S4Vectors_0.26.1
## [37] magick_2.5.2 farver_2.0.3 rjson_0.2.20
## [40] png_0.1-7 digest_0.6.27 stringi_1.5.3
## [43] bookdown_0.21 dplyr_1.0.2 clue_0.3-57
## [46] tools_4.0.2 bitops_1.0-6 magrittr_2.0.1
## [49] tibble_3.0.4 cluster_2.1.0 crayon_1.3.4
## [52] pkgconfig_2.0.3 ellipsis_0.3.1 rmarkdown_2.5
## [55] R6_2.5.0 compiler_4.0.2