Cluster terms based on their similarity matrix
Usage
cluster_terms(
  mat,
  method = "binary_cut",
  control = list(),
  verbose = se_opt$verbose
)
cluster_by_kmeans(mat, max_k = max(2, min(round(nrow(mat)/5), 100)), ...)
cluster_by_pam(mat, max_k = max(2, min(round(nrow(mat)/10), 100)), ...)
cluster_by_dynamicTreeCut(mat, minClusterSize = 5, ...)
cluster_by_fast_greedy(mat, ...)
cluster_by_leading_eigen(mat, ...)
cluster_by_louvain(mat, ...)
cluster_by_walktrap(mat, ...)
cluster_by_mclust(mat, G = seq_len(max(2, min(round(nrow(mat)/5), 100))), ...)
cluster_by_apcluster(mat, s = apcluster::negDistMat(r = 2), ...)
cluster_by_hdbscan(mat, minPts = 5, ...)
cluster_by_MCL(mat, addLoops = TRUE, ...)
Arguments
- mat
A similarity matrix.
- method
The clustering method. The value should be one of those returned by
all_clustering_methods().
- control
A list of parameters passed to the corresponding clustering function.
- verbose
Whether to print messages.
- max_k
Maximal k for k-means/PAM clustering. K-means/PAM clustering is applied from k = 2 to k = max_k.
- ...
Other arguments.
- minClusterSize
Minimal number of objects in a cluster. Passed to
dynamicTreeCut::cutreeDynamic().
- G
Passed to the G argument in mclust::Mclust(), which is the number of clusters.
- s
Passed to the s argument in apcluster::apcluster().
- minPts
Passed to the minPts argument in dbscan::hdbscan().
- addLoops
Passed to the addLoops argument in MCL::mcl().
Details
New clustering methods can be registered by register_clustering_methods().
Note that it is better to call cluster_terms() directly rather than the individual cluster_by_* functions,
because cluster_terms() performs additional cluster label adjustment.
By default, there are the following clustering methods and corresponding clustering functions:
- kmeans: see cluster_by_kmeans().
- dynamicTreeCut: see cluster_by_dynamicTreeCut().
- mclust: see cluster_by_mclust().
- apcluster: see cluster_by_apcluster().
- hdbscan: see cluster_by_hdbscan().
- fast_greedy: see cluster_by_fast_greedy().
- louvain: see cluster_by_louvain().
- walktrap: see cluster_by_walktrap().
- MCL: see cluster_by_MCL().
- binary_cut: see binary_cut().
Additional arguments of the individual clustering functions can be set with the control argument
of cluster_terms().
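As an illustration of the control argument, the following is a minimal sketch rather than an official example. It assumes these functions come from the simplifyEnrichment package (the section does not name the package), that cluster_terms() returns one cluster label per term, and it uses a made-up toy similarity matrix with hypothetical term names. The call is guarded so the snippet still runs when the package is not installed.

```r
# Toy similarity matrix: three blocks of mutually similar terms (made-up data).
set.seed(1)
block = rep(1:3, each = 10)
mat = outer(block, block, function(a, b) ifelse(a == b, 0.8, 0.1))
diag(mat) = 1
rownames(mat) = colnames(mat) = paste0("term_", seq_len(nrow(mat)))  # hypothetical IDs

cl = NULL
# Assumption: the functions documented here live in simplifyEnrichment.
if (requireNamespace("simplifyEnrichment", quietly = TRUE)) {
  # Entries of `control` are handed to the function behind the chosen method,
  # here cluster_by_kmeans(); max_k caps the largest k that is tried.
  cl = simplifyEnrichment::cluster_terms(
    mat,
    method = "kmeans",
    control = list(max_k = 5),
    verbose = FALSE
  )
}
```

Going through cluster_terms() instead of calling cluster_by_kmeans(mat, max_k = 5) keeps the label adjustment mentioned above.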
cluster_by_kmeans(): The best k for k-means clustering is determined by the "elbow" (or "knee") method applied to
the within-cluster sum of squares (WSS) obtained for each k. All other arguments are passed
from ... to stats::kmeans().
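The elbow idea can be sketched in base R as follows. This is only an illustration under stated assumptions: the elbow is taken as the point farthest from the straight line joining the first and last (k, WSS) pairs, which is one common rule and not necessarily the exact one used by cluster_by_kmeans(). The toy matrix and the helper elbow_k() are hypothetical.

```r
# Toy similarity matrix with three noisy blocks (made-up data).
set.seed(123)
block = rep(1:3, each = 10)
mat = outer(block, block, function(a, b) ifelse(a == b, 0.8, 0.1))
mat = mat + matrix(runif(length(mat), 0, 0.05), nrow = nrow(mat))
mat = (mat + t(mat)) / 2   # keep the matrix symmetric
diag(mat) = 1

# Same default as in the usage above: k runs from 2 to max_k.
max_k = max(2, min(round(nrow(mat) / 5), 100))
ks = 2:max_k

# Total within-cluster sum of squares (WSS) for each k.
wss = sapply(ks, function(k) {
  stats::kmeans(mat, centers = k, nstart = 10)$tot.withinss
})

# Hypothetical helper: pick the k whose (k, WSS) point lies farthest from
# the line through the first and last points of the curve.
elbow_k = function(k, y) {
  if (length(k) < 3) return(k[1])   # too few points to define an elbow
  x1 = k[1]; y1 = y[1]
  x2 = k[length(k)]; y2 = y[length(y)]
  d = abs((y2 - y1) * k - (x2 - x1) * y + x2 * y1 - y2 * x1) /
    sqrt((y2 - y1)^2 + (x2 - x1)^2)
  k[which.max(d)]
}

best_k = elbow_k(ks, wss)
cl = stats::kmeans(mat, centers = best_k, nstart = 10)$cluster
```

Plotting wss against ks is a quick way to check visually whether the selected k sits at the bend of the curve.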
cluster_by_pam(): PAM is applied by fpc::pamk() which can automatically select the best k.
All other arguments are passed from ... to fpc::pamk().
cluster_by_dynamicTreeCut(): All other arguments are passed from ... to dynamicTreeCut::cutreeDynamic().
cluster_by_fast_greedy(): All other arguments are passed from ... to igraph::cluster_fast_greedy().
cluster_by_leading_eigen(): All other arguments are passed from ... to igraph::cluster_leading_eigen().
cluster_by_louvain(): All other arguments are passed from ... to igraph::cluster_louvain().
cluster_by_walktrap(): All other arguments are passed from ... to igraph::cluster_walktrap().
cluster_by_mclust(): All other arguments are passed from ... to mclust::Mclust().
cluster_by_apcluster(): All other arguments are passed from ... to apcluster::apcluster().
cluster_by_hdbscan(): All other arguments are passed from ... to dbscan::hdbscan().
cluster_by_MCL(): All other arguments are passed from ... to MCL::mcl().