Compare the Numbers of Samplings

This document reports the results of testing the number of samplings used for consensus partitioning on two datasets (the TCGA GBM microarray dataset and the HSMM single cell RNASeq dataset). Four numbers of random samplings were tested: 25, 50, 100 and 200. Both row sampling and column sampling were tested. For each combination of parameters, cola was run 100 times. The scripts for the analysis can be found here.
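The design described above can be laid out as a simple parameter grid. The following is a minimal sketch, not the authors' analysis script: the dictionary keys (`n_samplings`, `sampling_mode`, `repeat`) are hypothetical names chosen for illustration and are not cola arguments. It only enumerates the runs implied by the text for one dataset.

```python
from itertools import product

# Values taken from the text above.
N_SAMPLINGS = [25, 50, 100, 200]    # numbers of random samplings tested
SAMPLING_MODES = ["row", "column"]  # row sampling and column sampling
N_REPEATS = 100                     # cola runs per parameter combination


def build_run_grid():
    """Enumerate every (number of samplings, sampling mode, repeat) run.

    The key names are hypothetical and used only for this illustration.
    """
    return [
        {"n_samplings": n, "sampling_mode": mode, "repeat": rep}
        for n, mode, rep in product(
            N_SAMPLINGS, SAMPLING_MODES, range(1, N_REPEATS + 1)
        )
    ]


run_grid = build_run_grid()
print(len(run_grid))  # 4 sampling numbers x 2 modes x 100 repeats
```

Under these assumptions each dataset involves 800 runs in total, 100 for each of the eight parameter combinations.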

For each dataset, there are four types of plots:

Scatter plots showing the variability of the consensus partitioning metrics. Three metrics (1-PAC, mean silhouette and concordance scores) are shown.
Line plots showing the mean concordance among the 100 cola runs.
Barplots showing the mean concordance of the consensus partitions between 25 samplings and 200 samplings.
Scatter plots showing the relation between 1-PAC under 25/200 samplings and the concordance.
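To make two of the metrics in the plots above concrete, here is a minimal sketch in plain Python. It rests on several assumptions that the text does not confirm: PAC is taken as the fraction of off-diagonal consensus values in the interval (0.1, 0.9], concordance between two partitions is taken as the best-matching fraction of identically labelled samples after relabeling, and the mean concordance among runs is taken as the average over all pairs of runs. cola's own implementation may differ in detail, and all function names here are hypothetical.

```python
from itertools import combinations, permutations


def one_minus_pac(consensus, lower=0.1, upper=0.9):
    """Return 1-PAC for a square consensus matrix given as a list of lists.

    PAC is taken here as the fraction of off-diagonal consensus values in the
    ambiguous interval (lower, upper]; both thresholds are assumptions.
    """
    n = len(consensus)
    values = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    ambiguous = sum(1 for v in values if lower < v <= upper)
    return 1.0 - ambiguous / len(values)


def concordance(labels_a, labels_b):
    """Fraction of samples given the same class after the best relabeling.

    Class labels are arbitrary, so every relabeling of labels_b is tried and
    the best agreement is kept (brute force, fine for a handful of classes).
    """
    classes = sorted(set(labels_a) | set(labels_b))
    best = 0
    for perm in permutations(classes):
        mapping = dict(zip(classes, perm))
        agree = sum(1 for a, b in zip(labels_a, labels_b) if mapping[b] == a)
        best = max(best, agree)
    return best / len(labels_a)


def mean_pairwise_concordance(partitions):
    """Average concordance over all pairs of runs (an assumed definition)."""
    pairs = list(combinations(partitions, 2))
    return sum(concordance(a, b) for a, b in pairs) / len(pairs)


# Toy inputs: a 3-sample consensus matrix and three runs on four samples.
toy_consensus = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
toy_runs = [[1, 1, 2, 2], [2, 2, 1, 1], [1, 2, 2, 2]]

print(one_minus_pac(toy_consensus))
print(mean_pairwise_concordance(toy_runs))
```

In the toy example the first two runs are identical up to a swap of class labels, so their concordance is 1, which is why the labels are aligned before partitions are compared.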

TCGA GBM microarray dataset

1-PAC, by row

Figure S7.1A. Variability of 1-PAC scores from the 100 cola runs. Consensus partitionings were applied by row sampling.

1-PAC, by column

Figure S7.1B. Variability of 1-PAC scores from the 100 cola runs. Consensus partitionings were applied by column sampling.

Mean silhouette, by row

Figure S7.1C. Variability of mean silhouette scores from the 100 cola runs. Consensus partitionings were applied by row sampling.

Mean silhouette, by column

Figure S7.1D. Variability of mean silhouette scores from the 100 cola runs. Consensus partitionings were applied by column sampling.

Concordance, by row

Figure S7.1E. Variability of concordance scores from the 100 cola runs. Consensus partitionings were applied by row sampling.

Concordance, by column

Figure S7.1F. Variability of concordance scores from the 100 cola runs. Consensus partitionings were applied by column sampling.

by row

Figure S7.2A. Mean concordance in the 100 cola runs. Consensus partitionings were applied by row sampling.

by column

Figure S7.2B. Mean concordance in the 100 cola runs. Consensus partitionings were applied by column sampling.

by row

Figure S7.3A. Mean concordance of consensus partitioning with 25 and 200 samplings. Consensus partitionings were applied by row sampling.

by column

Figure S7.3B. Mean concordance of consensus partitioning with 25 and 200 samplings. Consensus partitionings were applied by column sampling.

by row

Figure S7.4A. Relations between mean 1-PAC from 25/200 samplings and concordance. Consensus partitionings were applied by row sampling.

by column

Figure S7.4B. Relations between mean 1-PAC from 25/200 samplings and concordance. Consensus partitionings were applied by column sampling.

HSMM single cell RNASeq dataset

1-PAC, by row

Figure S7.5A. Variability of 1-PAC scores from the 100 cola runs. Consensus partitionings were applied by row sampling.

1-PAC, by column

Figure S7.5B. Variability of 1-PAC scores from the 100 cola runs. Consensus partitionings were applied by column sampling.

Mean silhouette, by row

Figure S7.5C. Variability of mean silhouette scores from the 100 cola runs. Consensus partitionings were applied by row sampling.

Mean silhouette, by column

Figure S7.5D. Variability of mean silhouette scores from the 100 cola runs. Consensus partitionings were applied by column sampling.

Concordance, by row

Figure S7.5E. Variability of concordance scores from the 100 cola runs. Consensus partitionings were applied by row sampling.

Concordance, by column

Figure S7.5F. Variability of concordance scores from the 100 cola runs. Consensus partitionings were applied by column sampling.

by row

Figure S7.6A. Mean concordance in the 100 cola runs. Consensus partitionings were applied by row sampling.

by column

Figure S7.6B. Mean concordance in the 100 cola runs. Consensus partitionings were applied by column sampling.

by row

Figure S7.7A. Mean concordance of consensus partitioning with 25 and 200 samplings. Consensus partitionings were applied by row sampling.

by column

Figure S7.7B. Mean concordance of consensus partitioning with 25 and 200 samplings. Consensus partitionings were applied by column sampling.

by row

Figure S7.8A. Relations between mean 1-PAC from 25/200 samplings and concordance. Consensus partitionings were applied by row sampling.

by column

Figure S7.8B. Relations between mean 1-PAC from 25/200 samplings and concordance. Consensus partitionings were applied by column sampling.

Zuguang Gu (z.gu@dkfz.de)

2020-07-07
