When developing R packages, we should try to avoid depending directly on “heavy packages”. The “heaviness” of a package is the number of additional dependency packages it brings in. If your package depends directly on a heavy package, this has several consequences. For instance, many more namespaces are loaded into the R session together with your package (you can see them with sessionInfo()).
In the DESCRIPTION file of your package, the “direct dependency packages” are those listed in the Depends, Imports and LinkingTo fields. There are also “indirect dependency packages”, which are found recursively from each of the direct dependency packages. What we call “dependency packages” here is the union of the direct and indirect dependency packages.
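To make the distinction between the DESCRIPTION fields concrete, the following is a minimal sketch in base R that reads the direct dependencies from a DESCRIPTION file. The file content, the package names (mypkg, A, B, C, D, E) and the helper get_field() are hypothetical and only serve as illustration; this is not how pkgndep itself parses the file.

```r
# Write a hypothetical DESCRIPTION file to a temporary location.
desc_file = tempfile()
writeLines(c(
    "Package: mypkg",
    "Version: 0.1.0",
    "Depends: R (>= 3.6.0), methods",
    "Imports: A (>= 1.0), B",
    "LinkingTo: E",
    "Suggests: C, D"
), desc_file)

# Hypothetical helper: read one dependency field and strip version requirements.
get_field = function(file, field) {
    x = read.dcf(file, fields = field)[1, 1]
    if (is.na(x)) return(character(0))      # field not present
    x = strsplit(x, ",")[[1]]
    x = trimws(sub("\\(.*\\)", "", x))      # drop "(>= ...)" parts
    setdiff(x[x != ""], "R")                # "R" itself is not a package
}

# Direct dependency packages: Depends, Imports and LinkingTo.
direct = unique(unlist(lapply(c("Depends", "Imports", "LinkingTo"),
    function(f) get_field(desc_file, f))))
direct

# Packages that are not enforced at installation time.
optional = get_field(desc_file, "Suggests")
optional
```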
There are also packages listed in the Suggests and Enhances fields of the DESCRIPTION file, but they are not required to be installed when installing your package. Of course, they also have “indirect dependency packages”. To get rid of heavy packages that are rarely used in your package, it is better to move them to the Suggests/Enhances fields and to load/install them only when they are needed.
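One common way to load a suggested package only when it is needed is to check for it with requireNamespace() at the point of use. The sketch below assumes a hypothetical helper named use_suggested(); the stats package is used in the demonstration call only because it ships with R.

```r
# Hypothetical helper: call a function from a package listed in Suggests,
# and fail with an informative message if the package is not installed.
use_suggested = function(pkg, fun, ...) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        stop("Package '", pkg, "' is needed for this functionality. ",
             "Please install it first.", call. = FALSE)
    }
    # Look up the exported function without attaching the package.
    f = getExportedValue(pkg, fun)
    f(...)
}

# 'stats' is always available, so this call works in any R session.
use_suggested("stats", "median", c(1, 5, 3))
```

With this pattern the heavy package is loaded only when the corresponding functionality is actually called, so users who never call it do not need to install it.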
The pkgndep package checks the heaviness of the dependency packages of your package. For each package listed in the Depends, Imports, LinkingTo and Suggests/Enhances fields of the DESCRIPTION file, pkgndep checks how many additional packages your package requires. The dependency summary is visualized as a customized heatmap.
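The idea of counting the packages that each direct dependency pulls in can be sketched with a toy dependency graph. Everything below (the graph, the package names and the helper all_deps()) is a hypothetical illustration in base R, not the implementation used by pkgndep.

```r
# Toy dependency graph: each name maps to its direct dependencies.
# All package names here are hypothetical.
deps = list(
    mypkg = c("A", "B"),
    A     = c("C"),
    B     = c("C", "D"),
    C     = character(0),
    D     = c("E"),
    E     = character(0)
)

# Collect the direct and indirect dependencies of `pkg` by graph traversal.
all_deps = function(pkg, deps) {
    found = character(0)
    queue = deps[[pkg]]
    while (length(queue) > 0) {
        p = queue[1]
        queue = queue[-1]
        if (!(p %in% found)) {
            found = c(found, p)
            queue = c(queue, deps[[p]])
        }
    }
    sort(found)
}

# All dependency packages of mypkg (direct and indirect).
all_deps("mypkg", deps)

# Number of packages each direct dependency requires, itself included.
n_required = sapply(deps[["mypkg"]], function(p) length(all_deps(p, deps)) + 1)
n_required
```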
As an example, I am developing a package called cola which depends on a lot of other packages. The dependency heatmap looks as follows:
In the heatmap, rows are the packages listed in the Depends, Imports and Suggests fields, and columns are the additional dependency packages required for each row package. The barplots on the right show the number of required packages, the number of imported functions/methods/classes (parsed from the NAMESPACE file) and the quantitative measure “heaviness” (the definition of heaviness will be introduced later).
We can see that if all the packages are put in the Depends or Imports field (i.e. moving all suggested packages to Imports), 248 packages are required in total, which is really a lot. Actually, some of the heavy packages such as WGCNA, clusterProfiler and ReactomePA (the last three packages in the heatmap rows) are not used very frequently in cola. Moving them to the Suggests field and using them only when they are needed greatly helps to reduce the heaviness of cola. Now the number of required packages is reduced to only 64.
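The effect of moving one heavy package to Suggests can be illustrated with a small set computation. The data and the helper n_required_total() below are hypothetical; the sketch only shows how the total number of required packages changes, and it is not meant as the formal heaviness definition used by pkgndep.

```r
# Hypothetical data: the packages required by each direct dependency
# (the dependency itself plus everything it pulls in).
required = list(
    A = c("A", "X"),
    B = c("B", "X", "Y"),
    H = c("H", "X", "P", "Q", "R", "S")   # a "heavy" package
)

# Total number of distinct packages needed for a set of direct dependencies.
n_required_total = function(direct, required) {
    length(unique(unlist(required[direct])))
}

n_all  = n_required_total(c("A", "B", "H"), required)  # H kept in Imports
n_lite = n_required_total(c("A", "B"), required)       # H moved to Suggests

# Packages saved by moving H to Suggests. X is shared with A and B,
# so it is still required and does not count as saved.
n_all - n_lite
```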
Gu Z. et al., pkgndep: a tool for analyzing dependency heaviness of R packages. Bioinformatics 2022. https://doi.org/10.1093/bioinformatics/btac449
Gu Z., On the Dependency Heaviness of CRAN/Bioconductor Ecosystem. arXiv 2022. https://doi.org/10.48550/arXiv.2208.11674
To use this package:
library(pkgndep)
pkg = pkgndep("package-name")
dependency_heatmap(pkg)
or
pkg = pkgndep("path-of-the-package")
dependency_heatmap(pkg)
An executable example:
library(pkgndep)
pkg = pkgndep("ComplexHeatmap")
pkg
## ComplexHeatmap, version 2.9.4
## 30 additional packages are required for installing 'ComplexHeatmap'
## 117 additional packages are required if installing packages listed in all fields in DESCRIPTION
dependency_heatmap(pkg)