Register user-defined partitioning methods

register_partition_methods(..., scale_method = c("z-score", "min-max", "none"))

Arguments

...

A named list of functions.

scale_method

Normally, the data matrix is scaled by rows before it is sent to the partition function. The default scaling ("z-score") is applied by scale. However, some partition functions may not accept the negative values that scale produces. In that case scale_method can be set to "min-max", which scales rows by (x - min)/(max - min). Note that scale_method only selects the method used to scale rows. When scale_rows is set to FALSE in consensus_partition or run_all_consensus_partition_methods, no row scaling is applied during partitioning. The value of scale_method can be a vector if the user specifies more than one partition function.
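The difference between the two row-scaling methods can be illustrated in base R. This is a minimal sketch of the two formulas described above, not the package's internal code; the toy matrix and the variable names (mat, z, mm) are made up for illustration.

```r
# Toy matrix: 2 rows (features) x 3 columns (samples)
mat <- matrix(c(1, 2, 3,
                10, 20, 60), nrow = 2, byrow = TRUE)

# z-score by rows: scale() works on columns, so transpose before and after
z <- t(scale(t(mat)))

# min-max by rows: (x - min)/(max - min), applied to each row
mm <- t(apply(mat, 1, function(x) (x - min(x)) / (max(x) - min(x))))

any(z < 0)   # z-scores contain negative values
range(mm)    # min-max values lie within [0, 1]
```

Min-max scaling is therefore the safer choice for partition functions that require non-negative input.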

Details

The user-defined function should accept at least two arguments: the data matrix and the number of subgroups, in that order. An optional third argument should always be ..., so that parameters for the partition function can be passed through partition_param in consensus_partition. If ... is omitted, it is added internally.

The function should return a vector of partitions (or class labels) or an object which can be recognized by cl_membership.

The partition function should be applied to columns. Users should be careful here because some R functions operate on rows and others on columns. For example, the kmeans partition method is registered as follows:


  register_partition_methods(
      kmeans = function(mat, k, ...) {
          # mat is transposed because kmeans() clusters rows;
          # $cluster holds the class label of each column of mat
          kmeans(t(mat), centers = k, ...)$cluster
      }
  )
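Before registering a function, it can help to check on a small matrix that it returns one class label per column. The following is a minimal base R sketch; the name kmeans_partition and the random test matrix are hypothetical, and nstart is simply an example of a parameter forwarded through ... to kmeans().

```r
set.seed(123)
# 20 rows (features) x 10 columns (samples) of random data
mat <- matrix(rnorm(200), nrow = 20, ncol = 10)

# Same shape as a user-defined partition function: (mat, k, ...)
kmeans_partition <- function(mat, k, ...) {
    # transpose so that kmeans() clusters the columns of mat
    kmeans(t(mat), centers = k, ...)$cluster
}

# extra arguments such as nstart are forwarded through ...
cl <- kmeans_partition(mat, 3, nstart = 5)

length(cl) == ncol(mat)  # one label per column
```

A function that returns a vector of length nrow(mat) instead is most likely partitioning rows and needs a transpose.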

The registered partitioning methods will be used as defaults in run_all_consensus_partition_methods.

To remove a partitioning method, use remove_partition_methods.

The following default partitioning methods are available:

"hclust"

hierarchical clustering with Euclidean distance; the columns are then partitioned by cutree. If users want to use another distance metric or clustering method, they should consider registering a new partitioning method, e.g. register_partition_methods(hclust_cor = function(mat, k) cutree(hclust(as.dist(1 - cor(mat))), k)).
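A correlation-based variant can be tried out in base R before it is registered. This is a sketch under two assumptions: 1 - cor(mat) is used as the dissimilarity between columns, and k is passed to cutree so that exactly k groups are returned. The name hclust_cor and the test matrix are illustrative only.

```r
set.seed(42)
# 12 rows (features) x 10 columns (samples) of random data
mat <- matrix(rnorm(120), nrow = 12, ncol = 10)

hclust_cor <- function(mat, k, ...) {
    # cor() correlates columns, so 1 - cor(mat) is a column dissimilarity
    d <- as.dist(1 - cor(mat))
    # ... is forwarded to hclust(), e.g. method = "average"
    cutree(hclust(d, ...), k)
}

cl <- hclust_cor(mat, 3, method = "average")
table(cl)  # sizes of the three groups
```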

"kmeans"

by kmeans.

"skmeans"

by skmeans.

"pam"

by pam.

"mclust"

by Mclust. mclust is applied to the first three principal components from rows.

Users can register two other pre-defined partitioning methods by register_NMF and register_SOM.

Value

No value is returned.

See also

all_partition_methods lists all registered partitioning methods.

Author

Zuguang Gu <z.gu@dkfz.de>

Examples

all_partition_methods()
#> [1] "hclust"  "kmeans"  "skmeans" "pam"     "mclust" 
#> attr(,"scale_method")
#> [1] "z-score" "z-score" "z-score" "z-score" "z-score"
register_partition_methods(
    random = function(mat, k) sample(k, ncol(mat), replace = TRUE)
)
all_partition_methods()
#> [1] "hclust"  "kmeans"  "skmeans" "pam"     "mclust"  "random" 
#> attr(,"scale_method")
#> [1] "z-score" "z-score" "z-score" "z-score" "z-score" "z-score"
remove_partition_methods("random")