library(rGREAT)
library(eulerr)
In this document, we will compare the enrichment results from online GREAT and local GREAT. The four datasets are all from UCSC table browser. Parameters are:
clade = Mammal
genome = Human
assembly = GRCh37/hg19
group = Regulation
track = ENCODE 3 TFBS
table: A549 JUN, A549 ELF1, H1-hESC RXRA, GM12878 MYB
And in the “Retrieve and display data” section:
output format = BED - browser extensible data
Then click the button “get output”.
We first read the files into GRanges
objects:
read_bed = function(f) {
df = read.table(f)
df = df[df[, 1] %in% paste0("chr", c(1:22, "X", "Y")), ]
GRanges(seqnames = df[, 1], ranges = IRanges(df[, 2] + 1, df[, 3]))
}
grl = list()
grl$A549_JUN = read_bed("data/tb_encTfChipPkENCFF708LCH_A549_JUN_hg19.bed")
grl$A549_ELF1 = read_bed("data/tb_encTfChipPkENCFF533NIV_A549_ELF1_hg19.bed")
grl$H1_hESC_RXRA = read_bed("data/tb_encTfChipPkENCFF369JAI_H1_hESC_RXRA_hg19.bed")
grl$GM12878_MYB = read_bed("data/tb_encTfChipPkENCFF215YWS_GM12878_MYB_hg19.bed")
sapply(grl, length)
## A549_JUN A549_ELF1 H1_hESC_RXRA GM12878_MYB
## 1726 11577 2092 3748
Apply both online and local GREAT analysis. Note online GREAT exclude gap regions, and in local GREAT, by default gap regions are removed as well.
gr = grl$A549_JUN
job = submitGreatJob(gr)
tbl = getEnrichmentTables(job)
tb1 = tbl[["GO Biological Process"]]
res = great(gr, "GO:BP", "hg19")
tb2 = getEnrichmentTable(res)
tb1
and tb2
contain the full table of all GO terms under test. First we take the common GO terms in the two result tables.
cn = intersect(tb1$ID, tb2$id)
length(cn)
## [1] 4567
rownames(tb1) = tb1$ID
rownames(tb2) = tb2$id
tb1 = tb1[cn, ]
tb2 = tb2[cn, ]
The significant GO terms from the two tables.
lt2 = list(online = tb1$ID[tb1$Binom_Adjp_BH < 0.001],
local = tb2$id[tb2$p_adjust < 0.001])
plot(eule