Get enrichment tables from the GREAT web server

# S4 method for GreatJob
getEnrichmentTables(object, ontology = NULL, category = "GO",
    request_interval = 10, max_tries = 100, download_by = c("json", "tsv"),
    verbose = TRUE)

Arguments

object

A GreatJob-class object returned by submitGreatJob.

ontology

Ontology names. Valid values are in availableOntologies. If both ontology and category are set, ontology takes precedence over category.

category

Pre-defined ontology categories. One category can contain more than one ontology. Valid values are in availableCategories.

request_interval

Time interval between two requests, in seconds. Default is 10 seconds.

max_tries

Maximum number of attempts to automatically reconnect to the GREAT web server.

download_by

Internally used. The complete enrichment table is provided as JSON data on the website, but it contains no gene-region association information. Setting download_by = 'tsv' invokes another GREAT URL whose output also details which genes are associated with each input region; because of the size of that output, only the top 500 terms are returned. If you do not need the gene-region association column, keep the default value of this argument. The columns that contain statistics are identical in both cases.

verbose

Whether to print messages.
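
As a hedged illustration of how the ontology and category arguments interact, the sketch below lists the valid values for the example job shipped with the package and then passes both arguments at once. It assumes that availableCategories and availableOntologies accept the job object (and, for the latter, a category argument) and that the cached example job can be queried without contacting the server.

```r
library(rGREAT)

# Example job shipped with the package (same file as in the Examples section)
job = readRDS(system.file("extdata", "GreatJob.rds", package = "rGREAT"))

# Valid values for the `category` argument
availableCategories(job)

# Valid values for the `ontology` argument, restricted to one category
# (assumption: availableOntologies takes a `category` argument)
availableOntologies(job, category = "GO")

# Both arguments set: `ontology` is expected to take precedence,
# so `category` should be ignored here
tbl = getEnrichmentTables(job, ontology = "GO Molecular Function",
    category = "GO", verbose = FALSE)
names(tbl)
```

Checking names(tbl) is a quick way to confirm which ontologies were actually retrieved.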

Value

A list of data frames, one per requested ontology. The structure of each data frame is the same as that of the tables available on the GREAT website.
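
A minimal sketch of working with the returned value, assuming the column names shown in the Examples section (ID, name, Hyper_Fold_Enrichment, Hyper_Adjp_BH). The cutoff is an arbitrary illustrative threshold, not a recommendation from the package.

```r
library(rGREAT)

job = readRDS(system.file("extdata", "GreatJob.rds", package = "rGREAT"))
tbl = getEnrichmentTables(job, verbose = FALSE)

# Pick one ontology by name
mf = tbl[["GO Molecular Function"]]

# Keep terms below an illustrative adjusted p-value cutoff (hypothetical value)
cutoff = 0.7
cols = c("ID", "name", "Hyper_Fold_Enrichment", "Hyper_Adjp_BH")
sig = mf[which(mf$Hyper_Adjp_BH < cutoff), cols]

# Order by adjusted p-value, most significant first
sig = sig[order(sig$Hyper_Adjp_BH), ]
head(sig)
```

Using which() drops rows whose adjusted p-value is missing instead of returning NA rows.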

Author

Zuguang Gu <z.gu@dkfz.de>

Examples

job = readRDS(system.file("extdata", "GreatJob.rds", package = "rGREAT"))
tbl = getEnrichmentTables(job)
#> The default enrichment table does not contain informatin of associated
#> genes for each input region. You can set `download_by = 'tsv'` to
#> download the complete table, but note only the top 500 regions can be
#> retreived. See the following link:
#> 
#> https://great-help.atlassian.net/wiki/spaces/GREAT/pages/655401/Export#Export-GlobalExport
#> 
#> Except the additional gene-region association column if taking 'tsv' as
#> the source of result, all other columns are the same if you choose
#> 'json' (the default) as the source. Or you can try the local GREAT
#> analysis with the function `great()`.
names(tbl)
#> [1] "GO Molecular Function" "GO Biological Process" "GO Cellular Component"
head(tbl[[1]])
#>           ID
#> 1 GO:0070696
#> 2 GO:0033612
#> 3 GO:0070700
#> 4 GO:0039706
#> 5 GO:0043997
#> 6 GO:0016628
#>                                                                                    name
#> 1                        transmembrane receptor protein serine/threonine kinase binding
#> 2                                              receptor serine/threonine kinase binding
#> 3                                                                  BMP receptor binding
#> 4                                                                   co-receptor binding
#> 5                                  histone acetyltransferase activity (H4-K12 specific)
#> 6 oxidoreductase activity, acting on the CH-CH group of donors, NAD or NADP as acceptor
#>   Binom_Genome_Fraction Binom_Expected Binom_Observed_Region_Hits
#> 1          3.455733e-03    3.455733000                         11
#> 2          3.596413e-03    3.596413000                         11
#> 3          2.138904e-03    2.138904000                          8
#> 4          2.267472e-03    2.267472000                          8
#> 5          3.014312e-06    0.003014312                          1
#> 6          2.076881e-03    2.076881000                          7
#>   Binom_Fold_Enrichment Binom_Region_Set_Coverage Binom_Raw_PValue
#> 1              3.183116                     0.011     0.0008981541
#> 2              3.058603                     0.011     0.0012301660
#> 3              3.740233                     0.008     0.0016393810
#> 4              3.528158                     0.008     0.0023418690
#> 5            331.750600                     0.001     0.0030097780
#> 6              3.370439                     0.007     0.0054752450
#>   Binom_Adjp_BH Hyper_Total_Genes Hyper_Expected Hyper_Observed_Gene_Hits
#> 1             1                13     0.99380020                        5
#> 2             1                15     1.14669300                        5
#> 3             1                 9     0.68801550                        4
#> 4             1                10     0.76446170                        4
#> 5             1                 1     0.07644617                        1
#> 6             1                24     1.83470800                        5
#>   Hyper_Fold_Enrichment Hyper_Gene_Set_Coverage Hyper_Term_Gene_Coverage
#> 1              5.031192            0.0035260930                0.3846154
#> 2              4.360367            0.0035260930                0.3333333
#> 3              5.813822            0.0028208740                0.4444444
#> 4              5.232440            0.0028208740                0.4000000
#> 5             13.081100            0.0007052186                1.0000000
#> 6              2.725229            0.0035260930                0.2083333
#>   Hyper_Raw_PValue Hyper_Adjp_BH
#> 1      0.001982437     0.4919942
#> 2      0.004066220     0.6862153
#> 3      0.003134656     0.6297673
#> 4      0.004910112     0.7832638
#> 5      0.076446170     1.0000000
#> 6      0.032459480     1.0000000
job
#> Submit time: 2023-04-01 09:44:07 
#>   Note the results may only be avaiable on GREAT server for 24 hours.
#> Version: 4.0.4 
#> Genome: 
#> Inputs: 1000 regions
#> Mode: Basal plus extension 
#>   Proximal: 5 kb upstream, 1 kb downstream,
#>   plus Distal: up to 1000 kb
#> Include curated regulatory domains
#> 
#> Enrichment tables for following ontologies have been downloaded:
#>   GO Biological Process
#>   GO Cellular Component
#>   GO Molecular Function
#> 

tbl = getEnrichmentTables(job, ontology = "GO Molecular Function")
#> The default enrichment table does not contain informatin of associated
#> genes for each input region. You can set `download_by = 'tsv'` to
#> download the complete table, but note only the top 500 regions can be
#> retreived. See the following link:
#> 
#> https://great-help.atlassian.net/wiki/spaces/GREAT/pages/655401/Export#Export-GlobalExport
#> 
#> Except the additional gene-region association column if taking 'tsv' as
#> the source of result, all other columns are the same if you choose
#> 'json' (the default) as the source. Or you can try the local GREAT
#> analysis with the function `great()`.
tbl = getEnrichmentTables(job, category = "GO")
#> The default enrichment table does not contain informatin of associated
#> genes for each input region. You can set `download_by = 'tsv'` to
#> download the complete table, but note only the top 500 regions can be
#> retreived. See the following link:
#> 
#> https://great-help.atlassian.net/wiki/spaces/GREAT/pages/655401/Export#Export-GlobalExport
#> 
#> Except the additional gene-region association column if taking 'tsv' as
#> the source of result, all other columns are the same if you choose
#> 'json' (the default) as the source. Or you can try the local GREAT
#> analysis with the function `great()`.