register_partition_methods.Rd
Register user-defined partitioning methods
register_partition_methods(..., scale_method = c("z-score", "min-max", "none"))
A named list of functions.
Normally, the data matrix is scaled by rows before it is sent to the partition function. The default scaling is applied by scale(). However, some partition functions do not accept the negative values that scale() produces. In that case, scale_method can be set to min-max, which scales rows by (x - min)/(max - min). Note that scale_method only specifies how rows are scaled. When scale_rows is set to FALSE in consensus_partition or run_all_consensus_partition_methods, no row scaling is performed before partitioning. The value of scale_method can be a vector if the user specifies more than one partition function.
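The difference between the two row-scaling options can be illustrated with base R alone. This is a minimal sketch on a made-up matrix, not the package's internal code; the names mat, z_scaled and minmax_scaled are chosen here for illustration.

```r
# Toy matrix: 3 rows (features) x 5 columns (samples)
mat <- matrix(c(1, 2, 3, 4, 5,
                10, 20, 30, 40, 50,
                5, 3, 8, 1, 9), nrow = 3, byrow = TRUE)

# z-score by rows: scale() works on columns, so transpose before and after
z_scaled <- t(scale(t(mat)))

# min-max by rows: (x - min) / (max - min), computed separately for each row
minmax_scaled <- t(apply(mat, 1, function(x) (x - min(x)) / (max(x) - min(x))))

any(z_scaled < 0)      # z-scores contain negative values
range(minmax_scaled)   # min-max values stay within [0, 1]
```

A partition function that requires non-negative input would therefore be a candidate for scale_method = "min-max".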
The user-defined function should accept at least two arguments. The first two arguments are the data matrix and the number of subgroups. The third, optional argument should always be ... so that parameters for the partition function can be passed through partition_param from consensus_partition.
If users forget to add ..., it is added internally.
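How a missing ... could be added around a two-argument function can be sketched as follows. The helper add_dots is hypothetical and is not part of the package; it only illustrates the idea, and the actual internal mechanism may differ.

```r
# Hypothetical helper (not part of cola): wrap a function that lacks `...`
add_dots <- function(fun) {
  if ("..." %in% names(formals(fun))) {
    return(fun)
  }
  # Extra arguments are accepted by the wrapper but not forwarded
  function(mat, k, ...) fun(mat, k)
}

# A two-argument partition function without `...`
random_partition <- function(mat, k) sample(k, ncol(mat), replace = TRUE)
wrapped <- add_dots(random_partition)

mat <- matrix(rnorm(40), nrow = 4)      # 4 rows x 10 columns
labels <- wrapped(mat, 3, nstart = 10)  # the extra argument no longer raises an error
length(labels)                          # one label per column
```

Note that in this sketch the extra parameters are silently dropped, so a function that should actually use them still needs to declare ... itself.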
The function should return a vector of partitions (or class labels) or an object that can be recognized by cl_membership.
The partition function should be applied to columns. (Users should be careful here, because some R functions operate on rows and others operate on columns.) For example, the kmeans partition method is registered as follows:
register_partition_methods(
    kmeans = function(mat, k, ...) {
        # mat is transposed because kmeans() clusters rows
        # return the class labels, one per column of mat
        kmeans(t(mat), centers = k, ...)$cluster
    }
)
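Whether a partition function is really applied to columns can be checked before registering it. The following is a small sanity check on simulated data, using only base R; kmeans_partition and the toy matrix are names made up for this sketch.

```r
set.seed(1)
mat <- matrix(rnorm(6 * 20), nrow = 6)  # 6 rows (features) x 20 columns (samples)

# Standalone version of a kmeans-based partition function
kmeans_partition <- function(mat, k, ...) {
  # the cluster component of kmeans() holds one label per clustered object
  kmeans(t(mat), centers = k, ...)$cluster
}

labels <- kmeans_partition(mat, 2, nstart = 5)
length(labels) == ncol(mat)   # labels refer to columns

# Without the transpose, kmeans() would cluster the 6 rows instead
length(kmeans(mat, centers = 2)$cluster) == nrow(mat)
```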
The registered partitioning methods will be used as defaults in run_all_consensus_partition_methods.
To remove a partitioning method, use remove_partition_methods.
The following partitioning methods are available by default:
hclust: hierarchical clustering with Euclidean distance; columns are then partitioned by cutree. To use another distance metric or clustering method, consider registering a new partitioning method, e.g. register_partition_methods(hclust_cor = function(mat, k) cutree(hclust(as.dist(1 - cor(mat))), k)).
kmeans: by kmeans.
skmeans: by skmeans.
pam: by pam.
mclust: by Mclust. mclust is applied to the first three principal components from rows.
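A correlation-based alternative to the default hierarchical clustering can be tried out on its own before it is registered. This is a sketch under the assumption that 1 - correlation is an acceptable dissimilarity for the data; the simulated matrix and the name hclust_cor are illustrative only.

```r
hclust_cor <- function(mat, k, ...) {
  # cor() correlates columns; 1 - cor turns the similarity into a dissimilarity
  d <- as.dist(1 - cor(mat))
  # `...` is forwarded to hclust(), e.g. to choose the linkage method
  cutree(hclust(d, ...), k = k)
}

set.seed(42)
# Two groups of 5 columns, each built around its own row profile
base1 <- rnorm(30)
base2 <- rnorm(30)
mat <- cbind(sapply(1:5, function(i) base1 + rnorm(30, sd = 0.1)),
             sapply(1:5, function(i) base2 + rnorm(30, sd = 0.1)))

labels <- hclust_cor(mat, 2, method = "average")
table(labels)
```

Passing the function as hclust_cor = hclust_cor to register_partition_methods would then make it available under that name.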
Users can register two other pre-defined partitioning methods with register_NMF and register_SOM.
No value is returned.
all_partition_methods lists all registered partitioning methods.
all_partition_methods()
#> [1] "hclust" "kmeans" "skmeans" "pam" "mclust"
#> attr(,"scale_method")
#> [1] "z-score" "z-score" "z-score" "z-score" "z-score"
register_partition_methods(
random = function(mat, k) sample(k, ncol(mat), replace = TRUE)
)
all_partition_methods()
#> [1] "hclust" "kmeans" "skmeans" "pam" "mclust" "random"
#> attr(,"scale_method")
#> [1] "z-score" "z-score" "z-score" "z-score" "z-score" "z-score"
remove_partition_methods("random")