Apply functions on every node in a dendrogram

Usage

dend_node_apply(dend, fun)

edit_node(dend, fun = function(d, index) d)

Arguments

dend

A dendrogram object.

fun

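A user-defined function. Its first argument is the sub-dendrogram at a node; an optional second argument receives the index of that node in the complete dendrogram.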

Value

dend_node_apply() returns a vector or a list, depending on whether fun returns a scalar or a more complex value.

edit_node() returns a dendrogram object.

Details

dend_node_apply() returns a vector or a list with the same length as the number of nodes in the dendrogram.

The user-defined function can take a single argument, which is the sub-dendrogram at a given node. For example, to get the number of members at every node:

dend_node_apply(dend, function(d) attr(d, "members"))

The user-defined function can take a second argument, which is the index of the current sub-dendrogram in the complete dendrogram. For example, dend[[1]] is the first child node of the complete dendrogram, dend[[c(1, 2)]] is the second child node of dend[[1]], and so on. With the index, a function working on a given node can also get information from its child nodes and its parent node.

dend_node_apply(dend, function(d, index) {
    dend[[c(index, 1)]] # is the first child node of d, or simply d[[1]]
    dend[[index[-length(index)]]] # is the parent node of d
    ...
})

Note that for the top node, the value of index is NULL.
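The index convention described above can be tried in base R alone. The following is a minimal sketch, not the package's implementation: walk_nodes() and get_parent() are hypothetical helpers written for illustration, and the pre-order visiting order is an assumption. get_parent() handles the top node and its direct children as explicit cases, so it never subsets with an empty index.

```r
# Hypothetical helpers for illustration; they are not part of the package.

# Visit every node in pre-order and call fun(d, index) on it.
# index is NULL at the top node, following the convention described above.
walk_nodes = function(dend, fun) {
    out = list()
    visit = function(d, index) {
        out[[length(out) + 1]] <<- fun(d, index)
        if(!is.leaf(d)) {
            for(i in seq_along(d)) visit(d[[i]], c(index, i))
        }
    }
    visit(dend, NULL)
    out
}

# Parent of the node at `index`, with the two edge cases handled explicitly.
get_parent = function(dend, index) {
    if(is.null(index)) return(NULL)       # top node: no parent
    if(length(index) == 1) return(dend)   # parent is the complete dendrogram
    dend[[index[-length(index)]]]
}

dend = as.dendrogram(hclust(dist(matrix(1:20, 5))))

# depth of every node (0 for the top node)
depth = unlist(walk_nodes(dend, function(d, index) length(index)))

# number of members of the parent node, NA for the top node
parent_members = unlist(walk_nodes(dend, function(d, index) {
    p = get_parent(dend, index)
    if(is.null(p)) NA_integer_ else attr(p, "members")
}))
```

Here depth and parent_members each hold one value per node, so they can be compared position by position.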

In edit_node(), if fun has only one argument, it is basically the same as stats::dendrapply(). However, fun can also take a second argument, the index of the node in the dendrogram, which makes it possible to get information from the child nodes and the parent node of a specific node.
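Because the one-argument form is described as basically the same as stats::dendrapply(), that case can be sketched with base R only. The attribute name n_children below is made up for illustration and is not used by the package.

```r
set.seed(1)
dend = as.dendrogram(hclust(dist(matrix(rnorm(100), 10))))

# Attach a new attribute to every node and return the modified node,
# the same pattern a one-argument fun follows in edit_node().
dend2 = dendrapply(dend, function(d) {
    attr(d, "n_children") = if(is.leaf(d)) 0L else length(d)
    d
})

attr(dend2, "n_children")       # children of the top node
attr(dend2[[1]], "n_children")  # children of its first child
```

The function passed to dendrapply() only ever sees the node itself, which is why a second index argument is needed once a node has to look at its parent.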

As an example, we first assign random values to every node in the dendrogram:

mat = matrix(rnorm(100), 10)
dend = as.dendrogram(hclust(dist(mat)))
dend = edit_node(dend, function(d) {attr(d, 'score') = runif(1); d})

Then, for every node, we take the maximal absolute difference between its score and the scores of its child nodes and its parent node, and store it as the attribute abs_diff.

dend = edit_node(dend, function(d, index) {
    n = length(index)
    s = attr(d, "score")
    if(is.null(index)) {  # d is the top node
        s_children = sapply(d, function(x) attr(x, "score"))
        s_parent = NULL
    } else if(is.leaf(d)) { # d is a leaf
        s_children = NULL
        s_parent = attr(dend[[index[-n]]], "score")
    } else {
        s_children = sapply(d, function(x) attr(x, "score"))
        s_parent = attr(dend[[index[-n]]], "score")
    }
    abs_diff = max(abs(s - c(s_children, s_parent)))
    attr(d, "abs_diff") = abs_diff
    return(d)
})

Examples

mat = matrix(rnorm(100), 10)
dend = as.dendrogram(hclust(dist(mat)))
# number of members on every node
dend_node_apply(dend, function(d) attr(d, "members"))
#>  [1] 10  2  1  1  8  1  7  2  1  1  5  2  1  1  3  1  2  1  1
# the depth on every node
dend_node_apply(dend, function(d, index) length(index))
#>  [1] 0 1 2 2 1 2 2 3 4 4 3 4 5 5 4 5 5 6 6