cola_examples

MCF10CA single cell dataset

The dataset is from ENA project PRJEB26737 (https://www.ebi.ac.uk/ena/data/view/PRJEB26737). Only samples ERS2487269 (https://www.ebi.ac.uk/ena/data/view/ERS2487269) and ERS2487270 (https://www.ebi.ac.uk/ena/data/view/ERS2487270) are used.

The raw reads are aligned with STAR and counted with htseq-count. Genes with low counts and non-protein-coding genes are filtered out. The resulting TPM table is MCF10CA_scRNAseq_tpm.rds.

RDS files generated by cola (load into R >= 3.6.0 with readRDS()):

MCF10CA_scRNAseq_subgroup.rds

HTML reports for cola analysis:

MCF10CA_scRNAseq_subgroup_cola_report

The following code performs the analysis.

Prepare the input matrix:

library(cola)

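tpm = readRDS("MCF10CA_scRNAseq_tpm.rds")
# log-transform the TPM values
m = log2(tpm + 1)

# label each cell as "round" or "aberrant" based on its column name
cell_type = ifelse(grepl("round", colnames(m)), "round", "aberrant")
# colors for the two labels (kept separate from cell_type)
cell_col = c("aberrant" = "red", "round" = "blue")

# impute missing values and remove rows with very low variance
m = adjust_matrix(m)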
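
The annotation step relies on the column names: any cell whose name contains "round" is labelled "round", and every other cell is labelled "aberrant". The sketch below runs the same logic on made-up column names (the names in `cells` are hypothetical and not taken from the dataset) to show that the result has one label per cell and that every label has a color.

```r
# Hypothetical column names, for illustration only
cells = c("round_1", "round_2", "aberrant_1", "aberrant_2", "aberrant_3")

# Same labelling rule as in the analysis code
cell_type = ifelse(grepl("round", cells), "round", "aberrant")
cell_col = c("aberrant" = "red", "round" = "blue")

# Number of cells per label
table(cell_type)

# Sanity checks: one label per cell, and every label has a color
stopifnot(length(cell_type) == length(cells))
stopifnot(all(cell_type %in% names(cell_col)))

# Color of each cell, in column order
unname(cell_col[cell_type])
```

Running the same two `stopifnot()` checks against `colnames(m)` before starting the partitioning is a cheap way to catch a mismatch between the annotation and the matrix.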

Perform the consensus partitioning:

register_NMF()

set.seed(123)
rl = run_all_consensus_partition_methods(
    m, 
    mc.cores = 4,
    anno = data.frame(cell_type = cell_type), 
    anno_col = list(cell_type = cell_col)
)

saveRDS(rl, file = "MCF10CA_scRNAseq_subgroup.rds")
cola_report(rl, output_dir = "MCF10CA_scRNAseq_subgroup_cola_report", mc.cores = 4)
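
Once the run has finished, the saved result can be reloaded and inspected without repeating the analysis. The following is a minimal sketch, assuming MCF10CA_scRNAseq_subgroup.rds is in the working directory. The method combination "ATC:skmeans" and k = 2 are picked only for illustration and are not recommendations from this analysis.

```r
library(cola)

# Assumes the result file written by the analysis is in the working directory
rl = readRDS("MCF10CA_scRNAseq_subgroup.rds")

# Suggested number of subgroups for every method combination
suggest_best_k(rl)

# Extract a single combination; "ATC:skmeans" is an illustrative choice
res = rl["ATC:skmeans"]

# Subgroup labels per cell for k = 2 (k chosen for illustration)
cl = get_classes(res, k = 2)
head(cl)
```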