Predict classes for new samples based on cola classification

# S4 method for ConsensusPartition
predict_classes(object, k, mat,
    silhouette_cutoff = 0.5,
    fdr_cutoff = cola_opt$fdr_cutoff,
    group_diff = cola_opt$group_diff,
    scale_rows = object@scale_rows,
    diff_method = "Ftest",
    method = "centroid",
    dist_method = c("euclidean", "correlation", "cosine"), nperm = 1000,
    p_cutoff = 0.05, plot = TRUE, col_fun = NULL,
    split_by_sigatures = FALSE, force = FALSE,
    verbose = TRUE, help = TRUE, prefix = "",
    mc.cores = 1, cores = mc.cores)

Arguments

object: A ConsensusPartition-class object.
k: Number of subgroups for which the classification is used.
mat: The new matrix whose sample classes are to be predicted. It should have the same number of rows as the original matrix used in the cola analysis, in the same row order. The scaling of mat should also be the same as that applied in the cola analysis.
silhouette_cutoff: Passed to get_signatures,ConsensusPartition-method for determining signatures.
fdr_cutoff: Passed to get_signatures,ConsensusPartition-method for determining signatures.
group_diff: Passed to get_signatures,ConsensusPartition-method for determining signatures.
scale_rows: Passed to get_signatures,ConsensusPartition-method for determining signatures.
diff_method: Passed to get_signatures,ConsensusPartition-method for determining signatures.
method: Method for predicting class labels. Possible values are "centroid", "svm" and "randomForest".
dist_method: Distance method. The value should be "euclidean", "correlation" or "cosine". Passed to predict_classes,matrix-method.
nperm: Number of permutations. It is used when dist_method is set to "euclidean" or "cosine". Passed to predict_classes,matrix-method.
p_cutoff: Cutoff for the p-values for determining class assignment. Passed to predict_classes,matrix-method.
plot: Whether to draw the plot that visualizes the prediction process. Passed to predict_classes,matrix-method.
col_fun: A color mapping function generated by colorRamp2. It is applied to both heatmaps.
split_by_sigatures: Whether the heatmaps are split by k-means clustering on the main heatmap or by the patterns of the signature heatmap.
force: If TRUE and get_signatures,ConsensusPartition-method fails internally, the top 1000 rows with the highest between-group mean difference are used to construct the signature centroid matrix. Mainly for internal use.
verbose: Whether to print messages. Passed to predict_classes,matrix-method.
help: Whether to print help messages.
prefix: Used internally.
mc.cores: Number of cores. This argument will be removed in future versions.
cores: Number of cores, or a cluster object returned by makeCluster.
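
The requirements on mat (same rows, same row order, same scaling as in the cola analysis) can be checked before calling predict_classes. The following is a minimal sketch in base R under stated assumptions: orig_mat and new_mat are hypothetical toy matrices, not objects provided by cola, and z-score row scaling is assumed because that is the scaling used in the Examples section.

```r
# Hypothetical toy data: orig_mat stands in for the matrix used in the cola
# analysis, new_mat for the new samples (rows deliberately shuffled).
set.seed(1)
orig_mat = matrix(rnorm(20), nrow = 5,
    dimnames = list(paste0("gene", 1:5), paste0("s", 1:4)))
new_mat = matrix(rnorm(15), nrow = 5,
    dimnames = list(paste0("gene", c(3, 1, 5, 2, 4)), paste0("n", 1:3)))

# the new matrix must contain exactly the rows of the original matrix
stopifnot(setequal(rownames(new_mat), rownames(orig_mat)))

# reorder the rows so that they match the original matrix
new_mat = new_mat[rownames(orig_mat), , drop = FALSE]

# apply row-wise z-score scaling (only if rows were scaled in the cola analysis)
new_mat_scaled = t(scale(t(new_mat)))
```

If rows were not scaled in the cola analysis, the last step would be skipped and new_mat passed unchanged.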

Details

The prediction is based on the signature centroid matrix from cola classification. The processes are as follows:

1. For the provided ConsensusPartition-class object and a selected k, the signatures that discriminate the classes are extracted by get_signatures,ConsensusPartition-method. If there are more than 2000 signatures, 2000 of them are randomly sampled.
2. The signature centroid matrix is a k-column matrix in which each column is the centroid of the samples in the corresponding class, i.e. the mean across samples. If rows were scaled in the cola analysis, the centroids are means of the scaled values; otherwise they are means of the unscaled values. Samples with a silhouette score below silhouette_cutoff are excluded when calculating the centroids.
3. With the signature centroid matrix and the new matrix, predict_classes,matrix-method is called to perform the prediction; see its help page for details. Note that the new matrix must be on the same scale as the matrix used in the cola analysis.
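
The centroid construction and the nearest-centroid assignment can be illustrated with a small base R sketch. This is not the cola implementation: the toy matrix, the silhouette vector and the helper predict_nearest are all hypothetical, only Euclidean distance is shown, and the permutation-based p-values are omitted.

```r
set.seed(123)
# toy signature matrix: 6 signatures x 9 samples, three classes of 3 samples
class = rep(1:3, each = 3)
mat = matrix(rnorm(6 * 9, sd = 0.2), nrow = 6)
mat[1:2, class == 1] = mat[1:2, class == 1] + 3
mat[3:4, class == 2] = mat[3:4, class == 2] + 3
mat[5:6, class == 3] = mat[5:6, class == 3] + 3

# hypothetical silhouette scores; samples below the cutoff are excluded
silhouette = c(0.9, 0.8, 0.3, 0.9, 0.7, 0.8, 0.9, 0.6, 0.9)
keep = silhouette >= 0.5

# signature centroid matrix: one column per class,
# each column is the row mean across the kept samples of that class
centroid = sapply(sort(unique(class)), function(i) {
    rowMeans(mat[, keep & class == i, drop = FALSE])
})
colnames(centroid) = sort(unique(class))

# hypothetical helper: assign each column of new_mat to the class
# whose centroid is nearest in Euclidean distance
predict_nearest = function(new_mat, centroid) {
    apply(new_mat, 2, function(x) {
        d = sqrt(colSums((centroid - x)^2))
        as.numeric(colnames(centroid)[which.min(d)])
    })
}

predicted = predict_nearest(mat, centroid)
```

Here the toy matrix itself is used as the "new" matrix, so predicted can be compared directly with class.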

Value

A data frame with two columns: the class labels (numeric) and the corresponding p-values.

Examples

# \donttest{
data(golub_cola)
res = golub_cola["ATC:skmeans"]
mat = get_matrix(res)
# note scaling should be applied here because the matrix was scaled in the cola analysis
mat2 = t(scale(t(mat)))
cl = predict_classes(res, k = 3, mat2)
#> The matrix has been scaled in cola analysis, thus the new matrix should
#> also be scaled with the same method ('z-score'). Please double check.
#> Set `help = FALSE` to suppress this message. 
#> 
#> * take top 500/2058 most significant signatures for prediction.

# compare the real classification and the predicted classification
data.frame(cola_class = get_classes(res, k = 3)[, "class"],
           predicted = cl[, "class"])
#>    cola_class predicted
#> 1           3         3
#> 2           3         3
#> 3           1         1
#> 4           1         1
#> 5           1         1
#> 6           3         3
#> 7           1         1
#> 8           1         1
#> 9           1         1
#> 10          1         1
#> 11          1         1
#> 12          1         1
#> 13          1         1
#> 14          1         1
#> 15          1         1
#> 16          1         1
#> 17          1         1
#> 18          3         3
#> 19          3         3
#> 20          1         1
#> 21          2         2
#> 22          2         2
#> 23          2         2
#> 24          2         2
#> 25          1         1
#> 26          2         2
#> 27          2         2
#> 28          1         1
#> 29          2         2
#> 30          2         2
#> 31          1         1
#> 32          2         2
#> 33          2         2
#> 34          2         2
#> 35          3         3
#> 36          1         1
#> 37          3         3
#> 38          3         3
#> 39          1         1
#> 40          3         3
#> 41          3         3
#> 42          3         3
#> 43          1         1
#> 44          1         1
#> 45          1         1
#> 46          2         2
#> 47          1         1
#> 48          1         1
#> 49          1         1
#> 50          1         1
#> 51          1         1
#> 52          3         3
#> 53          1         1
#> 54          1         1
#> 55          1         1
#> 56          2         2
#> 57          3         3
#> 58          1         1
#> 59          2         2
#> 60          1         1
#> 61          3         3
#> 62          2         2
#> 63          2         2
#> 64          2         2
#> 65          2         2
#> 66          2         2
#> 67          2         2
#> 68          1         1
#> 69          2         2
#> 70          2         2
#> 71          2         2
#> 72          2         2
# change to correlation method
cl = predict_classes(res, k = 3, mat2, dist_method = "correlation")
#> The matrix has been scaled in cola analysis, thus the new matrix should
#> also be scaled with the same method ('z-score'). Please double check.
#> Set `help = FALSE` to suppress this message. 
#> 
#> * take top 500/2058 most significant signatures for prediction.

# compare the real classification and the predicted classification
data.frame(cola_class = get_classes(res, k = 3)[, "class"],
           predicted = cl[, "class"]) 
#>    cola_class predicted
#> 1           3         3
#> 2           3         3
#> 3           1         1
#> 4           1         1
#> 5           1         1
#> 6           3         3
#> 7           1         1
#> 8           1         1
#> 9           1         1
#> 10          1         1
#> 11          1         1
#> 12          1         1
#> 13          1         1
#> 14          1         1
#> 15          1         1
#> 16          1         1
#> 17          1         1
#> 18          3         3
#> 19          3         3
#> 20          1         1
#> 21          2         2
#> 22          2         2
#> 23          2         2
#> 24          2         2
#> 25          1         1
#> 26          2         2
#> 27          2         2
#> 28          1         1
#> 29          2         2
#> 30          2         2
#> 31          1         1
#> 32          2         2
#> 33          2         2
#> 34          2         2
#> 35          3         3
#> 36          1         1
#> 37          3         3
#> 38          3         3
#> 39          1         1
#> 40          3         3
#> 41          3         3
#> 42          3         3
#> 43          1         1
#> 44          1         1
#> 45          1         1
#> 46          2         2
#> 47          1         1
#> 48          1         1
#> 49          1         1
#> 50          1         1
#> 51          1         1
#> 52          3         3
#> 53          1         1
#> 54          1         1
#> 55          1         1
#> 56          2         2
#> 57          3         3
#> 58          1         1
#> 59          2         2
#> 60          1         1
#> 61          3         3
#> 62          2         2
#> 63          2         2
#> 64          2         2
#> 65          2         2
#> 66          2         2
#> 67          2         2
#> 68          1         1
#> 69          2         2
#> 70          2         2
#> 71          2         2
#> 72          2         2
# }
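
The side-by-side listings above can be condensed into a contingency table and an agreement rate. The sketch below uses short mock label vectors rather than the golub_cola results; with real output, cola_class and predicted would be taken from get_classes(res, k = 3)[, "class"] and cl[, "class"] as in the example.

```r
# mock label vectors standing in for the cola classes and the predicted classes
cola_class = c(3, 3, 1, 1, 2, 2, 1)
predicted  = c(3, 3, 1, 1, 2, 2, 1)

# contingency table: rows are cola classes, columns are predicted classes
tb = table(cola_class, predicted)

# proportion of samples whose predicted class matches the cola class
agreement = mean(cola_class == predicted)
```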
