Consensus partitioning only with a subset of columns

consensus_partition_by_down_sampling(data,
    top_value_method = "ATC",
    top_n = NULL,
    partition_method = "skmeans",
    max_k = 6, k = NULL,
    subset = min(round(ncol(data)*0.2), 250), pre_select = TRUE,
    verbose = TRUE, prefix = "", anno = NULL, anno_col = NULL,
    predict_method = "centroid",
    dist_method = c("euclidean", "correlation", "cosine"),
    .env = NULL, .predict = TRUE, mc.cores = 1, cores = mc.cores, ...)

Arguments

data

A numeric matrix where subgroups are found by columns.

top_value_method

A single top-value method. Available methods are in all_top_value_methods. Use register_top_value_methods to add a new top-value method.

top_n

Number of rows with top values. The value can be a vector with length > 1. When n > 5000, the function randomly samples only 5000 rows from the top n rows. If top_n is a vector, partitioning is applied to every value in top_n and the consensus partition is summarized from all partitions.

partition_method

A single partitioning method. Available methods are in all_partition_methods. Use register_partition_methods to add a new partition method.

max_k

Maximal number of subgroups to try. The function will try 2:max_k subgroups.

k

Alternatively, you can specify a vector of k values to try.

subset

Number of columns to randomly sample, or a vector of selected indices.

pre_select

Whether to pre-select by k-means.

verbose

Whether to print messages.

prefix

Internally used.

anno

Annotation data frame.

anno_col

Annotation colors.

predict_method

Method for predicting class labels. Possible values are "centroid", "svm" and "randomForest".

dist_method

Distance method used when predicting the classes of the other columns. Possible values are "euclidean", "correlation" and "cosine".

.env

An environment, internally used.

.predict

Internally used.

mc.cores

Number of cores. This argument will be removed in future versions.

cores

Number of cores, or a cluster object returned by makeCluster.

...

All passed to consensus_partition.
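
To make the subset argument concrete, the following base R sketch evaluates the default expression from the usage above for a few matrix widths, and shows the alternative of drawing explicit column indices. The helper default_subset and the toy matrix are hypothetical and exist only for illustration; they are not part of the package.

```r
# Hypothetical helper: mirrors the default `subset` expression in the usage
default_subset <- function(n_col) min(round(n_col * 0.2), 250)

default_subset(72)     # 20% of 72 columns, rounded
default_subset(1000)   # 20% of 1000 columns
default_subset(5000)   # capped at 250 columns

# Alternatively, choose the columns yourself; such a vector of indices
# is the second form that `subset` accepts
set.seed(123)
m <- matrix(rnorm(20 * 100), nrow = 20)   # toy matrix with 100 columns
ind <- sample(ncol(m), default_subset(ncol(m)))
length(ind)
```

Passing indices instead of a count keeps the selection reproducible and under your control, for example when certain samples must be part of the partitioning step.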

Details

The function performs consensus partitioning with only a small subset of columns, and the classes of the other columns are predicted by predict_classes,ConsensusPartition-method.
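
The idea of partitioning a subset and then predicting the remaining columns can be illustrated with base R alone. The sketch below is not the package's implementation: plain kmeans stands in for consensus partitioning, and a simple nearest-centroid rule with Euclidean distance stands in for predict_classes. The toy data and all variable names are assumptions made for the example.

```r
set.seed(123)

# Toy data: 50 rows, 200 columns, two groups of columns shifted apart
grp <- rep(1:2, each = 100)
m <- matrix(rnorm(50 * 200), nrow = 50)
m[, grp == 2] <- m[, grp == 2] + 3

# Step 1: partition only a random subset of columns
# (plain k-means is a stand-in for consensus partitioning)
ind <- sample(ncol(m), 40)
km <- kmeans(t(m[, ind]), centers = 2, nstart = 10)

# Step 2: assign every column to its nearest centroid
centroids <- t(km$centers)                      # rows x k
d <- sapply(seq_len(ncol(centroids)), function(j) {
  colSums((m - centroids[, j])^2)               # squared Euclidean distance
})
pred <- apply(d, 1, which.min)                  # one label per column

# Compare predicted labels with the known toy groups
table(predicted = pred, truth = grp)
```

Only 40 of the 200 columns enter the clustering step, which is where the saving comes from when the partitioning method is expensive; the prediction step is a single pass over all columns.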

Examples

# \dontrun{
data(golub_cola)
m = get_matrix(golub_cola)

set.seed(123)
golub_cola_ds = consensus_partition_by_down_sampling(m, subset = 50,
  anno = get_anno(golub_cola), anno_col = get_anno_col(golub_cola),
  top_value_method = "SD", partition_method = "kmeans")
#> * apply consensus_partition_by_down_sampling() with 50 columns.
#> * run SD:kmeans on a 4116x50 matrix.
#> * calculating SD values.
#> * rows are scaled before sent to partition, method: 'z-score' (x - mean)/sd
#> * get top 368 rows by SD method
#> * wrap results for k = 2
#> * wrap results for k = 3
#> * wrap results for k = 4
#> * wrap results for k = 5
#> * wrap results for k = 6
#> * adjust class labels between different k.
#> * SD:kmeans used 1.28 secs.
#> * predict class for 72 samples with k = 2
#>   * take top 500/689 most significant signatures for prediction.
#> * predict class for 72 samples with k = 3
#>   * take top 500/787 most significant signatures for prediction.
#> * predict class for 72 samples with k = 4
#>   * take top 500/1018 most significant signatures for prediction.
#> * predict class for 72 samples with k = 5
#>   * take top 500/1005 most significant signatures for prediction.
#> * predict class for 72 samples with k = 6
#>   * take top 500/887 most significant signatures for prediction.
# }