4 min read

Which heatmap function is faster?

In this post I test the performance (the running time) of four heatmap functions: gplots::heatmap.2(), heatmap() which is natively supported in R, ComplexHeatmap::Heatmap() and pheatmap::pheatmap().

We generate a 1000x1000 random matrix.

library(ComplexHeatmap)
library(pheatmap)
library(gplots)
library(microbenchmark)

set.seed(123)
n = 1000
mat = matrix(rnorm(n*n), nrow = n)

First I test drawing heatmaps as well as drawing dendrograms (with applying clustering):

t1 = microbenchmark(
    "heatmap()" = {
        pdf(NULL) 
        heatmap(mat) 
        dev.off()
    },
    "heatmap.2()" = {
        pdf(NULL) 
        heatmap.2(mat, trace = "none") 
        dev.off()
    },
    "Heatmap()" = {
        pdf(NULL)
        draw(Heatmap(mat)) 
        dev.off()
    },
    "pheatmap()" = {
        pdf(NULL) 
        pheatmap(mat)
        dev.off()
    },
    times = 5
)
print(t1, unit = "s")
## Unit: seconds
##         expr   min    lq  mean median    uq   max neval
##    heatmap() 15.93 16.03 17.05  16.13 17.25 19.90     5
##  heatmap.2() 16.15 17.06 17.09  17.19 17.38 17.69     5
##    Heatmap() 20.75 21.55 22.27  21.90 21.96 25.17     5
##   pheatmap() 15.66 15.89 19.77  16.21 16.64 34.44     5

The running time for all four heatmap functions looks similar, it might due to that clustering uses most of the running time. Heatmap() runs the longest, perhaps because Heatmap() applies additional manipulations on the dendrograms such as dendrogram reordering.

Next I suppress the clustering on both rows and columns and with no dendrogram.

t2 = microbenchmark(
    "heatmap()" = {
        pdf(NULL)
        heatmap(mat, Rowv = NA, Colv = NA)
        dev.off()
    },
    "heatmap.2()" = {
        pdf(NULL)
        heatmap.2(mat, dendrogram = "none", trace = "none")
        dev.off()
    },
    "Heatmap()" = {
        pdf(NULL)
        draw(Heatmap(mat, cluster_rows = FALSE, cluster_columns = FALSE))
        dev.off()
    },
    "pheatmap()" = {
        pdf(NULL)
        pheatmap(mat, cluster_rows = FALSE, cluster_cols = FALSE)
        dev.off()
    },
    times = 5
)
print(t2, unit = "s")
## Unit: seconds
##         expr     min     lq    mean  median      uq     max neval
##    heatmap()  0.2546  0.266  0.3192  0.2683  0.3141  0.4931     5
##  heatmap.2() 15.0519 15.315 15.3524 15.4163 15.4787 15.5001     5
##    Heatmap()  2.7637  2.841  2.9421  2.9303  2.9693  3.2059     5
##   pheatmap()  1.1940  1.225  4.3730  1.2677  1.3535 16.8250     5

Now heatmap.2() now is the slowest if only draw the heatmap bodies.

Next I perform clustering in advance and send the clustering objects to the heatmap functions. In this setting, dendrograms are also drawn along with the heatmaps.

row_hc = hclust(dist(mat))
col_hc = hclust(dist(t(mat)))
t3 = microbenchmark(
    "heatmap()" = {
        pdf(NULL)
        heatmap(mat, Rowv = as.dendrogram(row_hc), Colv = as.dendrogram(col_hc))
        dev.off()
    },
    "heatmap.2()" = {
        pdf(NULL)
        heatmap.2(mat, Rowv = row_hc, Colv = col_hc, trace = "none")
        dev.off()
    },
    "Heatmap()" = {
        pdf(NULL)
        draw(Heatmap(mat, cluster_rows = row_hc, cluster_columns = col_hc))
        dev.off()
    },
    "pheatmap()" = {
        pdf(NULL)
        pheatmap(mat, cluster_rows = row_hc, cluster_cols = col_hc)
        dev.off()
    },
    times = 5
)
print(t3, unit = "s")
## Unit: seconds
##         expr    min     lq   mean median     uq    max neval
##    heatmap()  1.462  1.473  1.503  1.475  1.506  1.599     5
##  heatmap.2() 15.864 15.888 16.165 16.163 16.327 16.585     5
##    Heatmap()  5.777  5.803  5.956  6.003  6.066  6.130     5
##   pheatmap()  1.308  1.321  4.413  1.488  1.544 16.406     5

Finally I put the mean running time into a table for easy comparison:

heatmap() heatmap.2() Heatmap() pheatmap()
do clustering, draw dendrograms 17.05s 17.09s 22.27s 19.77s
no clusteirng, no dendrogram 0.32s 15.35s 2.94s 4.37s
only draw dendrograms 1.50s 16.17s 5.96s 4.41s

The following plots illustrate the mean running time for the four matrices with different dimensions.

Session info:

sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] cowplot_1.1.0             ggplot2_3.3.2            
## [3] microbenchmark_1.4-7      gplots_3.1.0             
## [5] pheatmap_1.0.12           ComplexHeatmap_2.7.1.1003
## [7] GetoptLong_1.0.4          knitr_1.30               
## 
## loaded via a namespace (and not attached):
##  [1] circlize_0.4.12.1004 shape_1.4.5          gtools_3.8.2        
##  [4] tidyselect_1.1.0     xfun_0.19            purrr_0.3.4         
##  [7] colorspace_2.0-0     vctrs_0.3.4          generics_0.1.0      
## [10] htmltools_0.5.0      stats4_4.0.2         yaml_2.2.1          
## [13] rlang_0.4.8          pillar_1.4.6         withr_2.3.0         
## [16] glue_1.4.2           BiocGenerics_0.34.0  RColorBrewer_1.1-2  
## [19] matrixStats_0.57.0   lifecycle_0.2.0      stringr_1.4.0       
## [22] munsell_0.5.0        blogdown_0.17        gtable_0.3.0        
## [25] GlobalOptions_0.1.2  caTools_1.18.0       evaluate_0.14       
## [28] labeling_0.4.2       IRanges_2.22.2       Cairo_1.5-12.2      
## [31] parallel_4.0.2       highr_0.8            Rcpp_1.0.5          
## [34] KernSmooth_2.23-18   scales_1.1.1         S4Vectors_0.26.1    
## [37] magick_2.5.2         farver_2.0.3         rjson_0.2.20        
## [40] png_0.1-7            digest_0.6.27        stringi_1.5.3       
## [43] bookdown_0.21        dplyr_1.0.2          clue_0.3-57         
## [46] tools_4.0.2          bitops_1.0-6         magrittr_2.0.1      
## [49] tibble_3.0.4         cluster_2.1.0        crayon_1.3.4        
## [52] pkgconfig_2.0.3      ellipsis_0.3.1       rmarkdown_2.5       
## [55] R6_2.5.0             compiler_4.0.2