consensus_partition.Rd
Consensus partition
consensus_partition(data,
top_value_method = "ATC",
top_n = NULL,
partition_method = "skmeans",
max_k = 6,
k = NULL,
sample_by = "row",
p_sampling = 0.8,
partition_repeat = 50,
partition_param = list(),
anno = NULL,
anno_col = NULL,
scale_rows = NULL,
verbose = TRUE,
mc.cores = 1, cores = mc.cores,
prefix = "",
.env = NULL,
help = cola_opt$help)
A numeric matrix where subgroups are found by columns.
A single top-value method. Available methods are listed in all_top_value_methods. Use register_top_value_methods to add a new top-value method.
Number of rows with top values. The value can be a vector of length > 1. When n > 5000, the function randomly samples only 5000 rows from the top n rows. If top_n is a vector, partitioning is applied to every value in top_n and the consensus partition is summarized from all partitions.
A single partitioning method. Available methods are listed in all_partition_methods. Use register_partition_methods to add a new partitioning method.
Maximal number of subgroups to try. The function tries 2:max_k subgroups.
Alternatively, you can specify a vector k.
Whether to randomly sample the matrix by rows or by columns.
Proportion of the top-n-row submatrix to sample.
Number of repeats for the random sampling.
Parameters for the partitioning method, passed to ... in a registered partitioning method. See register_partition_methods for details.
A data frame with known annotations of samples. The annotations are plotted in heatmaps and their correlation to the predicted subgroups is tested.
A list of colors (each color is defined as a named vector) for the annotations. If anno is a data frame, anno_col should be a named list where names correspond to the column names in anno.
Whether to scale rows. If it is TRUE, the scaling method defined in register_partition_methods is used.
Whether to print messages.
Multiple cores to use. This argument will be removed in future versions.
Number of cores, or a cluster object returned by makeCluster.
Internally used.
An environment, internally used.
Whether to print help messages.
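The interplay of sample_by and p_sampling can be illustrated with a small base R sketch. This is not cola's internal code: the matrix top_mat, the use of round(), and the variable names are assumptions made for illustration only.

```r
set.seed(42)

# Hypothetical stand-in for the submatrix holding the top n rows (n = 50),
# with 8 samples in columns.
top_mat <- matrix(rnorm(50 * 8), nrow = 50)
p_sampling <- 0.8

# sample_by = "row": keep a random 80% of the rows, all columns
# (rounding with round() is an assumption of this sketch).
row_idx <- sample(nrow(top_mat), round(p_sampling * nrow(top_mat)))
sub_by_row <- top_mat[row_idx, , drop = FALSE]

# sample_by = "column": keep a random 80% of the columns, all rows.
col_idx <- sample(ncol(top_mat), round(p_sampling * ncol(top_mat)))
sub_by_col <- top_mat[, col_idx, drop = FALSE]

dim(sub_by_row)  # 40 rows, 8 columns
dim(sub_by_col)  # 50 rows, 6 columns
```

Each of the partition_repeat runs would draw a fresh submatrix of this kind before partitioning.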
The function performs the analysis in the following steps:
calculate scores for rows by the top-value method,
for each top_n value, take the top n rows,
randomly sample a proportion p_sampling of the rows from the top-n-row matrix and perform partitioning, repeated partition_repeat times,
collect all individual partitions and summarize them into a consensus partition.
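The steps above can be sketched in base R. This is a simplified illustration and not the implementation used by consensus_partition: row standard deviation stands in for the ATC score, kmeans stands in for skmeans, and the final consensus is derived from a co-clustering matrix with hierarchical clustering, all of which are assumptions of this sketch.

```r
set.seed(1)

# Toy data: 40 rows x 12 columns; columns 1-6 differ from 7-12 in rows 1-20.
mat <- matrix(rnorm(40 * 12), nrow = 40)
mat[1:20, 1:6] <- mat[1:20, 1:6] + 3

# Step 1: score rows (row SD here, a stand-in for a top-value method).
score <- apply(mat, 1, sd)

top_n <- c(10, 20)
p_sampling <- 0.8
partition_repeat <- 10
k <- 2

n_col <- ncol(mat)
co_count <- matrix(0, n_col, n_col)  # how often two columns share a cluster
n_runs <- 0

for (n in top_n) {
  # Step 2: take the n highest-scoring rows.
  top_idx <- order(score, decreasing = TRUE)[seq_len(n)]
  for (i in seq_len(partition_repeat)) {
    # Step 3: sample a proportion of those rows, then partition the columns.
    rows <- sample(top_idx, round(p_sampling * n))
    cl <- kmeans(t(mat[rows, , drop = FALSE]), centers = k, nstart = 5)$cluster
    co_count <- co_count + outer(cl, cl, "==")
    n_runs <- n_runs + 1
  }
}

# Step 4: summarize all partitions into a consensus matrix and one grouping.
consensus <- co_count / n_runs
final <- cutree(hclust(as.dist(1 - consensus), method = "average"), k = k)
```

Here consensus[i, j] is the fraction of runs in which columns i and j were assigned to the same subgroup, and final holds one subgroup label per column.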
A ConsensusPartition-class object. Simply type the object name in an interactive R session to see which functions can be applied to it.
run_all_consensus_partition_methods runs consensus partitioning with multiple top-value methods and multiple partitioning methods.
set.seed(123)
# simulate a 60x60 matrix with three column groups, then add noise
m = cbind(rbind(matrix(rnorm(20*20, mean = 1, sd = 0.5), nrow = 20),
                matrix(rnorm(20*20, mean = 0, sd = 0.5), nrow = 20),
                matrix(rnorm(20*20, mean = 0, sd = 0.5), nrow = 20)),
          rbind(matrix(rnorm(20*20, mean = 0, sd = 0.5), nrow = 20),
                matrix(rnorm(20*20, mean = 1, sd = 0.5), nrow = 20),
                matrix(rnorm(20*20, mean = 0, sd = 0.5), nrow = 20)),
          rbind(matrix(rnorm(20*20, mean = 0.5, sd = 0.5), nrow = 20),
                matrix(rnorm(20*20, mean = 0.5, sd = 0.5), nrow = 20),
                matrix(rnorm(20*20, mean = 1, sd = 0.5), nrow = 20))
         ) + matrix(rnorm(60*60, sd = 0.5), nrow = 60)
res = consensus_partition(m, partition_repeat = 10, top_n = c(10, 20, 50))
#> * run ATC:skmeans on a 60x60 matrix.
#> * calculating ATC values.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 10 rows by ATC method
#> Loading required package: foreach
#> Loading required package: rngtools
#> * get top 20 rows by ATC method
#> * get top 50 rows by ATC method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * ATC:skmeans used 12.832 secs.
res
#> A 'ConsensusPartition' object with k = 2, 3, 4, 5, 6.
#> On a matrix with 60 rows and 60 columns.
#> Top rows (10, 20, 50) are extracted by 'ATC' method.
#> Subgroups are detected by 'skmeans' method.
#> Performed in total 150 partitions by row resampling.
#> Best k for subgroups seems to be 2.
#>
#> Following methods can be applied to this 'ConsensusPartition' object:
#> [1] "cola_report" "collect_classes"
#> [3] "collect_plots" "collect_stats"
#> [5] "colnames" "compare_partitions"
#> [7] "compare_signatures" "consensus_heatmap"
#> [9] "dimension_reduction" "functional_enrichment"
#> [11] "get_anno" "get_anno_col"
#> [13] "get_classes" "get_consensus"
#> [15] "get_matrix" "get_membership"
#> [17] "get_param" "get_signatures"
#> [19] "get_stats" "is_best_k"
#> [21] "is_stable_k" "membership_heatmap"
#> [23] "ncol" "nrow"
#> [25] "plot_ecdf" "predict_classes"
#> [27] "rownames" "select_partition_number"
#> [29] "show" "suggest_best_k"
#> [31] "test_to_known_factors" "top_rows_heatmap"