run_all_consensus_partition_methods.Rd
Consensus partitioning for all combinations of methods
run_all_consensus_partition_methods(data,
top_value_method = all_top_value_methods(),
partition_method = all_partition_methods(),
max_k = 6, k = NULL,
top_n = NULL,
mc.cores = 1, cores = mc.cores, anno = NULL, anno_col = NULL,
sample_by = "row", p_sampling = 0.8, partition_repeat = 50,
scale_rows = NULL, verbose = TRUE, help = cola_opt$help)
A numeric matrix. Subgroups are detected on the columns.
Methods used to extract the top n rows. Allowed methods are listed in all_top_value_methods
and new methods can be added by register_top_value_methods.
Methods used to partition samples. Allowed methods are listed in all_partition_methods
and new methods can be added by register_partition_methods.
Maximal number of subgroups to try. The function will try 2:max_k subgroups.
Alternatively, you can directly specify a vector of values with k.
Number of rows with top values. The value can be a vector with length > 1. When n > 5000, the function only randomly samples 5000 rows from the top n rows. If top_n
is a vector, partitioning is applied to every value in top_n
and the consensus partition is summarized from all partitions.
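What "top n rows" means can be sketched in base R (the matrix, the value of n, and the use of SD in place of a generic top-value method are illustrative, not cola internals):

```r
# Illustrative only: extract the top n rows of a matrix by row SD,
# which is what a top-value method does before partitioning.
set.seed(1)
m = matrix(rnorm(200), nrow = 20)
row_score = apply(m, 1, sd)                      # SD score per row
top_n = 5
top_rows = order(row_score, decreasing = TRUE)[seq_len(top_n)]
m_top = m[top_rows, , drop = FALSE]              # matrix sent to partitioning
```

With top_n = c(20, 30, 40), this selection is simply performed three times and the resulting partitions all feed into the same consensus.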
Number of cores to use. This argument will be removed in future versions.
Number of cores, or a cluster object returned by makeCluster.
A data frame with known annotation of columns.
A list of colors (a color is defined as a named vector) for the annotations. If anno
is a data frame, anno_col should be a named list where the names correspond to the column names in anno.
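A minimal sketch of the expected structure (the annotation names and colors below are made up for illustration):

```r
# Hypothetical annotation for 40 columns: 'anno_col' holds one named color
# vector per annotation column, in a list named after the columns of 'anno'.
anno = data.frame(
    type   = rep(c("tumor", "normal"), each = 20),
    gender = rep(c("F", "M"), times = 20)
)
anno_col = list(
    type   = c("tumor" = "red",    "normal" = "blue"),
    gender = c("F"     = "orange", "M"      = "purple")
)
```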
Whether to randomly sample the matrix by rows or by columns.
Proportion of the top n rows to sample.
Number of repeats for the random sampling.
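One sampling round can be sketched in base R as follows (illustrative matrix and values; in cola this subset-and-partition step is repeated partition_repeat times per method and per k):

```r
# Illustrative only: one random-sampling step with sample_by = "row".
set.seed(1)
m = matrix(rnorm(400), nrow = 40)
p_sampling = 0.8
ind = sample(nrow(m), floor(nrow(m) * p_sampling))   # 80% of the rows
m_sub = m[ind, , drop = FALSE]   # this subset goes into one partition run
```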
Whether to scale rows. If it is TRUE, the scaling method defined in register_partition_methods is used.
Whether to print messages.
Whether to print help messages.
The function performs consensus partitioning by consensus_partition
for all combinations of top-value methods and partitioning methods.
It also adjusts the subgroup labels across all methods and all k to make them as consistent as possible.
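The label-adjustment idea can be illustrated with a toy base-R function (this is not cola's actual implementation, only a brute-force sketch that is feasible for small k): relabel one partition so it agrees with a reference partition on as many samples as possible.

```r
# Toy sketch of class-label adjustment: search all label permutations
# and keep the relabeling with the highest agreement with 'ref'.
relabel_to_ref = function(ref, p) {
    labs = sort(unique(p))
    perms = function(v) {               # all permutations of v (small k only)
        if (length(v) <= 1) return(list(v))
        out = list()
        for (i in seq_along(v)) {
            for (rest in perms(v[-i])) out[[length(out) + 1]] = c(v[i], rest)
        }
        out
    }
    best = p; best_agree = sum(ref == p)
    for (pm in perms(labs)) {
        mapped = pm[match(p, labs)]     # apply one relabeling
        a = sum(ref == mapped)
        if (a > best_agree) { best = mapped; best_agree = a }
    }
    best
}

ref = c(1, 1, 2, 2, 3, 3)
p   = c(2, 2, 3, 3, 1, 1)   # same grouping as ref, different labels
relabel_to_ref(ref, p)      # relabels to 1 1 2 2 3 3, matching 'ref'
```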
A ConsensusPartitionList-class object. Simply type the object in an interactive R session
to see which functions can be applied to it.
# \dontrun{
set.seed(123)
m = cbind(rbind(matrix(rnorm(20*20, mean =  1), nrow = 20),
                matrix(rnorm(20*20, mean = -1), nrow = 20)),
          rbind(matrix(rnorm(20*20, mean = -1), nrow = 20),
                matrix(rnorm(20*20, mean =  1), nrow = 20))
    ) + matrix(rnorm(40*40), nrow = 40)
rl = run_all_consensus_partition_methods(data = m, top_n = c(20, 30, 40))
#> * on a 40x40 matrix.
#> * calculate top-values.
#> - calculate SD score for 40 rows.
#> - calculate CV score for 40 rows.
#> - calculate MAD score for 40 rows.
#> - calculate ATC score for 40 rows.
#> ------------------------------------------------------------
#> * running partition by SD:skmeans. 1/20
#> * run SD:skmeans on a 40x40 matrix.
#> * SD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by SD method
#> * get top 30 rows by SD method
#> * get top 40 rows by SD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * SD:skmeans used 1.007467 mins.
#> ------------------------------------------------------------
#> * running partition by CV:skmeans. 2/20
#> * run CV:skmeans on a 40x40 matrix.
#> * CV values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by CV method
#> * get top 30 rows by CV method
#> * get top 40 rows by CV method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * CV:skmeans used 55.825 secs.
#> ------------------------------------------------------------
#> * running partition by MAD:skmeans. 3/20
#> * run MAD:skmeans on a 40x40 matrix.
#> * MAD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by MAD method
#> * get top 30 rows by MAD method
#> * get top 40 rows by MAD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * MAD:skmeans used 57.137 secs.
#> ------------------------------------------------------------
#> * running partition by ATC:skmeans. 4/20
#> * run ATC:skmeans on a 40x40 matrix.
#> * ATC values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by ATC method
#> * get top 30 rows by ATC method
#> * get top 40 rows by ATC method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * ATC:skmeans used 56.117 secs.
#> ------------------------------------------------------------
#> * running partition by SD:mclust. 5/20
#> * run SD:mclust on a 40x40 matrix.
#> * SD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by SD method
#> * get top 30 rows by SD method
#> * get top 40 rows by SD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * SD:mclust used 10.391 secs.
#> ------------------------------------------------------------
#> * running partition by CV:mclust. 6/20
#> * run CV:mclust on a 40x40 matrix.
#> * CV values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by CV method
#> * get top 30 rows by CV method
#> * get top 40 rows by CV method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * CV:mclust used 10.819 secs.
#> ------------------------------------------------------------
#> * running partition by MAD:mclust. 7/20
#> * run MAD:mclust on a 40x40 matrix.
#> * MAD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by MAD method
#> * get top 30 rows by MAD method
#> * get top 40 rows by MAD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * MAD:mclust used 10.229 secs.
#> ------------------------------------------------------------
#> * running partition by ATC:mclust. 8/20
#> * run ATC:mclust on a 40x40 matrix.
#> * ATC values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by ATC method
#> * get top 30 rows by ATC method
#> * get top 40 rows by ATC method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * ATC:mclust used 10.183 secs.
#> ------------------------------------------------------------
#> * running partition by SD:pam. 9/20
#> * run SD:pam on a 40x40 matrix.
#> * SD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by SD method
#> * get top 30 rows by SD method
#> * get top 40 rows by SD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * SD:pam used 2.237 secs.
#> ------------------------------------------------------------
#> * running partition by CV:pam. 10/20
#> * run CV:pam on a 40x40 matrix.
#> * CV values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by CV method
#> * get top 30 rows by CV method
#> * get top 40 rows by CV method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * CV:pam used 2.291 secs.
#> ------------------------------------------------------------
#> * running partition by MAD:pam. 11/20
#> * run MAD:pam on a 40x40 matrix.
#> * MAD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by MAD method
#> * get top 30 rows by MAD method
#> * get top 40 rows by MAD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * MAD:pam used 2.295 secs.
#> ------------------------------------------------------------
#> * running partition by ATC:pam. 12/20
#> * run ATC:pam on a 40x40 matrix.
#> * ATC values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by ATC method
#> * get top 30 rows by ATC method
#> * get top 40 rows by ATC method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * ATC:pam used 2.267 secs.
#> ------------------------------------------------------------
#> * running partition by SD:kmeans. 13/20
#> * run SD:kmeans on a 40x40 matrix.
#> * SD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by SD method
#> * get top 30 rows by SD method
#> * get top 40 rows by SD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * SD:kmeans used 2.461 secs.
#> ------------------------------------------------------------
#> * running partition by CV:kmeans. 14/20
#> * run CV:kmeans on a 40x40 matrix.
#> * CV values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by CV method
#> * get top 30 rows by CV method
#> * get top 40 rows by CV method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * CV:kmeans used 2.507 secs.
#> ------------------------------------------------------------
#> * running partition by MAD:kmeans. 15/20
#> * run MAD:kmeans on a 40x40 matrix.
#> * MAD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by MAD method
#> * get top 30 rows by MAD method
#> * get top 40 rows by MAD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * MAD:kmeans used 2.501 secs.
#> ------------------------------------------------------------
#> * running partition by ATC:kmeans. 16/20
#> * run ATC:kmeans on a 40x40 matrix.
#> * ATC values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by ATC method
#> * get top 30 rows by ATC method
#> * get top 40 rows by ATC method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * ATC:kmeans used 2.504 secs.
#> ------------------------------------------------------------
#> * running partition by SD:hclust. 17/20
#> * run SD:hclust on a 40x40 matrix.
#> * SD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by SD method
#> * get top 30 rows by SD method
#> * get top 40 rows by SD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * SD:hclust used 2.143 secs.
#> ------------------------------------------------------------
#> * running partition by CV:hclust. 18/20
#> * run CV:hclust on a 40x40 matrix.
#> * CV values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by CV method
#> * get top 30 rows by CV method
#> * get top 40 rows by CV method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * CV:hclust used 2.128 secs.
#> ------------------------------------------------------------
#> * running partition by MAD:hclust. 19/20
#> * run MAD:hclust on a 40x40 matrix.
#> * MAD values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by MAD method
#> * get top 30 rows by MAD method
#> * get top 40 rows by MAD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * MAD:hclust used 2.119 secs.
#> ------------------------------------------------------------
#> * running partition by ATC:hclust. 20/20
#> * run ATC:hclust on a 40x40 matrix.
#> * ATC values have already been calculated. Get from cache.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 20 rows by ATC method
#> * get top 30 rows by ATC method
#> * get top 40 rows by ATC method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * ATC:hclust used 2.12 secs.
#> ------------------------------------------------------------
#> * adjust class labels according to the consensus classifications from all methods.
#> - get reference class labels from all methods, all k.
#> - adjust class labels for each single method, each single k.
#> ------------------------------------------------------------
# }