A general example

In this document, we discuss the use of background regions. We first demonstrate it with a ChIP-seq TFBS dataset downloaded from the UCSC Table Browser, using the following parameters.

In the “Select dataset” section:

clade = Mammal
genome = Human
assembly = GRCh37/hg19
group = Regulation
track = ENCODE 3 TFBS
table = GM12878 MYB

And in the “Retrieve and display data” section:

output format = BED - browser extensible data

Then click the button “get output”.

We read the downloaded BED file into a GRanges object.

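library(rGREAT)
# Read the BED file exported from the Table Browser
df = read.table("data/tb_encTfChipPkENCFF215YWS_GM12878_MYB_hg19.bed")
# Keep only the canonical chromosomes chr1-chr22, chrX and chrY
df = df[df[, 1] %in% paste0("chr", c(1:22, "X", "Y")), ]
# BED starts are 0-based, so add 1 to get the 1-based start used by GRanges
gr = GRanges(seqnames = df[, 1], ranges = IRanges(df[, 2] + 1, df[, 3]))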
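
The `+ 1` on the start column reflects the difference between BED coordinates (0-based start, exclusive end) and the 1-based, closed intervals used by IRanges and GRanges. The following is a minimal base-R sketch of that conversion; the data frame `bed` and its values are hypothetical and are not taken from the MYB dataset.

```r
# Toy BED-like records (hypothetical values, for illustration only)
bed = data.frame(
    chr   = c("chr1", "chr2"),
    start = c(0, 99),    # BED starts are 0-based
    end   = c(10, 200)   # BED ends are exclusive
)

# Convert to 1-based, closed intervals: shift the start, keep the end
start_1based = bed$start + 1
end_1based   = bed$end

# The interval width is unchanged by the conversion
width_bed = bed$end - bed$start
width_gr  = end_1based - start_1based + 1
stopifnot(all(width_bed == width_gr))
```

Because only the start is shifted, the number of bases covered by each region is the same in both coordinate systems.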

The next two GREAT analyses both use the whole genome as background: the first keeps the gap regions (exclude = NULL) and the second excludes them (exclude = "gap").

res1 = great(gr, "GO:BP", "hg19", exclude = NULL)
res2 = great(gr, "GO:BP", "hg19", exclude = "gap")

We then compare the GO terms that are significant (adjusted p-value < 0.001) in each analysis:

tb1 = getEnrichmentTable(res1)
tb2 = getEnrichmentTable(res2)

library(eulerr)
lt = list(
    genome = tb1$id[tb1$p_adjust < 0.001],
    exclude_gap = tb2$id[tb2$p_adjust < 0.001]
)
plot(euler(lt), quantities = TRUE)
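
The numbers drawn in the Euler diagram can also be obtained directly with base-R set operations. The sketch below uses placeholder term IDs so that it runs on its own; in the analysis above, the two vectors would instead be `lt$genome` and `lt$exclude_gap`.

```r
# Placeholder term IDs, for illustration only (not results of the analysis)
genome      = c("term_a", "term_b", "term_c")
exclude_gap = c("term_b", "term_c", "term_d")

# Sizes of the three regions of a two-set Euler diagram
overlap_counts = c(
    genome_only      = length(setdiff(genome, exclude_gap)),
    shared           = length(intersect(genome, exclude_gap)),
    exclude_gap_only = length(setdiff(exclude_gap, genome))
)
print(overlap_counts)
```

A large `shared` count relative to the two other entries would indicate that the choice of background has little effect on which terms are called significant.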