submitGreatJob.Rd
Perform online GREAT analysis
submitGreatJob(gr, bg = NULL,
gr_is_zero_based = FALSE,
species = "hg19",
genome = species,
includeCuratedRegDoms = TRUE,
rule = c("basalPlusExt", "twoClosest", "oneClosest"),
adv_upstream = 5.0,
adv_downstream = 1.0,
adv_span = 1000.0,
adv_twoDistance = 1000.0,
adv_oneDistance = 1000.0,
request_interval = 60,
max_tries = 10,
version = DEFAULT_VERSION,
base_url = "http://great.stanford.edu/public/cgi-bin",
use_name_column = FALSE,
verbose = help, help = great_opt$verbose)
A GRanges object or a data frame which contains at least three columns (chr, start and end).
No longer supported. See the explanation in the section "When background regions are set".
Are start positions in gr zero-based?
Genome. "hg38", "hg19", "mm10", "mm9" are supported in GREAT version 4.x.x, "hg19", "mm10", "mm9", "danRer7" are supported in GREAT version 3.x.x and "hg19", "hg18", "mm9", "danRer7" are supported in GREAT version 2.x.x.
The same as genome, but it will be deprecated soon.
Whether to include curated regulatory domains. See https://great-help.atlassian.net/wiki/spaces/GREAT/pages/655443/Association+Rules#AssociationRules-CuratedRegulatoryDomains
How to associate genomic regions to genes. See 'Details' section.
Unit: kb, only used when rule is basalPlusExt.
Unit: kb, only used when rule is basalPlusExt.
Unit: kb, only used when rule is basalPlusExt.
Unit: kb, only used when rule is twoClosest.
Unit: kb, only used when rule is oneClosest.
Time interval between two requests, in seconds. Default is 60 seconds.
Maximum number of attempts to automatically reconnect to the GREAT web server.
Version of GREAT. The value should be "4.0.4", "3.0.0" or "2.0.2". Shortened version numbers can also be used: "4" or "4.0" is the same as "4.0.4".
The URL of the cgi-bin path, only used when it is explicitly specified.
If the input is a data frame, whether to use the fourth column as the "names" of regions?
Whether to print help messages.
Whether to print help messages. This argument will be replaced by verbose in future versions.
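As an illustration of the data frame input described above, here is a minimal sketch of a BED-like table whose first three columns are chr, start and end, with an optional fourth column of region names. The coordinates and names are made up for the example, and the submission itself is guarded because it needs the rGREAT package and a network connection.

```r
# Hypothetical regions: the first three columns are chr, start and end;
# the fourth column holds region names (values invented for illustration).
df = data.frame(
    chr   = c("chr1", "chr1", "chr2"),
    start = c(1000, 5000, 20000),
    end   = c(2000, 6000, 21000),
    name  = c("peak_a", "peak_b", "peak_c"),
    stringsAsFactors = FALSE
)

# Basic sanity checks before submitting
stopifnot(ncol(df) >= 3, all(df$start <= df$end))

if (FALSE) {  # requires rGREAT and access to the GREAT web server
    library(rGREAT)
    # use_name_column = TRUE asks for the fourth column to be used as names;
    # set gr_is_zero_based = TRUE if the start positions are zero-based
    job = submitGreatJob(df, genome = "hg19",
        use_name_column = TRUE, gr_is_zero_based = FALSE)
}
```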
Note: On Aug 19 2019 GREAT released version 4 (https://great-help.atlassian.net/wiki/spaces/GREAT/pages/655442/Version+History), which supports the hg38 genome and removes some ontologies such as pathways. submitGreatJob still takes hg19 as default. hg38 can be specified with the genome = "hg38" argument.
To use an older version such as 3.0.0, specify it as submitGreatJob(..., version = "3.0.0").
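Because each GREAT version supports a different set of genomes (see the description of the genome argument), it can help to check the combination before submitting. The sketch below encodes the list from that description in a lookup table; `supported` and `is_supported` are hypothetical helpers written for this example, not part of rGREAT.

```r
# Genomes per GREAT major version, as listed for the `genome` argument
supported = list(
    "4" = c("hg38", "hg19", "mm10", "mm9"),
    "3" = c("hg19", "mm10", "mm9", "danRer7"),
    "2" = c("hg19", "hg18", "mm9", "danRer7")
)

# Hypothetical helper: accepts full ("4.0.4") or shortened ("4") versions
is_supported = function(genome, version) {
    major = strsplit(as.character(version), ".", fixed = TRUE)[[1]][1]
    genome %in% supported[[major]]
}

is_supported("hg38", "4.0.4")
is_supported("hg38", "3.0.0")
```

The first call returns TRUE and the second FALSE, since hg38 is only listed for version 4.x.x.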
Note that it does not use the standard GREAT API. This function sends data directly to the GREAT web server by HTTP POST.
The following text is copied from the GREAT web site (http://great.stanford.edu/public/html/).
Explanation of rule and settings with names starting with 'adv_' (advanced settings):
Mode 'Basal plus extension'. Gene regulatory domain definition: Each gene is assigned a basal regulatory domain of a minimum distance upstream and downstream of the TSS (regardless of other nearby genes, controlled by the adv_upstream and adv_downstream arguments). The gene regulatory domain is extended in both directions to the nearest gene's basal domain but no more than the maximum extension in one direction (controlled by adv_span).
Mode 'Two nearest genes'. Gene regulatory domain definition: Each gene is assigned a regulatory domain that extends in both directions to the nearest gene's TSS but no more than the maximum extension in one direction (controlled by adv_twoDistance).
Mode 'Single nearest gene'. Gene regulatory domain definition: Each gene is assigned a regulatory domain that extends in both directions to the midpoint between the gene's TSS and the nearest gene's TSS but no more than the maximum extension in one direction (controlled by adv_oneDistance).
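The 'Basal plus extension' rule can be sketched in a few lines of base R. This is a simplified illustration for genes on a single chromosome, not the code used by GREAT or rGREAT: it assumes the maximum extension is measured from the TSS, ignores chromosome ends other than position 0, and ignores curated regulatory domains. The function name `basal_plus_ext` and the toy TSS positions are made up for the example.

```r
# Simplified 'basal plus extension' domains for genes on ONE chromosome.
# tss: TSS positions in bp; strand: "+" or "-". The *_kb arguments mirror
# adv_upstream, adv_downstream and adv_span (all in kb).
basal_plus_ext = function(tss, strand, upstream_kb = 5, downstream_kb = 1,
    span_kb = 1000) {

    ord = order(tss)
    tss = tss[ord]
    strand = strand[ord]
    up = upstream_kb * 1000
    down = downstream_kb * 1000
    span = span_kb * 1000

    # Basal domain: upstream/downstream are relative to the gene's strand
    basal_start = ifelse(strand == "+", tss - up, tss - down)
    basal_end   = ifelse(strand == "+", tss + down, tss + up)

    n = length(tss)
    # Closest basal-domain boundary among genes to the left / to the right
    left_limit  = c(-Inf, cummax(basal_end)[-n])
    right_limit = c(rev(cummin(rev(basal_start)))[-1], Inf)

    # Extend to the neighbour's basal domain, capped at `span` from the TSS
    # (an assumption), and never shrink below the basal domain itself
    start = pmin(basal_start, pmax(left_limit, tss - span))
    end   = pmax(basal_end, pmin(right_limit, tss + span))

    data.frame(tss = tss, strand = strand, start = pmax(start, 0), end = end)
}

dom = basal_plus_ext(tss = c(10000, 50000, 3000000), strand = c("+", "-", "+"))
dom
```

In this toy case the first two genes are close, so each stops at the other's basal domain, while the third gene is far from its neighbours and is limited by the 1000 kb cap.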
Note that when the bg argument is set to a list of background regions, GREAT uses a completely different test!
When bg is set, gr should be an exact subset of bg. For example, suppose a background region list contains five regions: [1, 10], [15, 23], [34, 38], [40, 49], [54, 63]. gr can only be a subset of these five regions, which means gr can take [15, 23], [40, 49], but it cannot take [16, 20], [39, 51]. In this setting, regions are taken as single units and Fisher's exact test is applied for calculating the enrichment (by testing the number of regions in the 2x2 contingency table).
Check https://great-help.atlassian.net/wiki/spaces/GREAT/pages/655452/File+Formats#FileFormats-Whatshouldmybackgroundregionsfilecontain? for more explanations.
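The subset requirement and the region-level test can be illustrated with base R, using the five background regions from the example above. This is only a sketch: the helpers `key` and `is_subset` and the term annotation `in_term` are hypothetical, and the 2x2 table is one plausible way to set up a Fisher's exact test on region counts, not necessarily the exact computation GREAT performs.

```r
# Background regions from the example above (single chromosome assumed)
bg = data.frame(start = c(1, 15, 34, 40, 54), end = c(10, 23, 38, 49, 63))

# Hypothetical helpers: a region matches only if start AND end are identical
key = function(df) paste(df$start, df$end)
is_subset = function(gr, bg) all(key(gr) %in% key(bg))

gr_ok  = data.frame(start = c(15, 40), end = c(23, 49))
gr_bad = data.frame(start = c(16, 39), end = c(20, 51))
is_subset(gr_ok, bg)   # every region is exactly one of the background regions
is_subset(gr_bad, bg)  # these only overlap background regions

# Invented annotation: which background regions are associated with a term
in_term = c(FALSE, TRUE, FALSE, TRUE, FALSE)
in_gr   = key(bg) %in% key(gr_ok)

# 2x2 table of region counts (input vs. not input, in term vs. not in term)
tab = table(
    input = factor(in_gr, levels = c(TRUE, FALSE)),
    term  = factor(in_term, levels = c(TRUE, FALSE))
)
p = fisher.test(tab, alternative = "greater")$p.value
p
```

Each region counts once regardless of its width, which is what distinguishes this test from the binomial test over genomic coverage used without a background.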
Please note that from rGREAT 1.99.0, setting bg is no longer supported and this argument will be removed in the future. You can either use the GREAT website directly or use other Bioconductor packages such as "LOLA" to perform the Fisher's exact test-based analysis.
If you want to restrict the input regions to background regions (by intersection) and still apply the binomial test, please consider using local GREAT via great.
A GreatJob-class object which can be used to get results from the GREAT server. The following methods can be applied to it:
getEnrichmentTables,GreatObject-method to retrieve the result tables.
getRegionGeneAssociations,GreatObject-method to get the associations between input regions and genes.
plotRegionGeneAssociations,GreatObject-method to plot the associations between input regions and genes.
shinyReport,GreatObject-method to view the results in a shiny application.
great for the local implementation of the GREAT algorithm.
set.seed(123)
gr = randomRegions(nr = 1000, genome = "hg19")
job = submitGreatJob(gr)
#> Note: On Aug 19 2019 GREAT released version 4 which supports hg38
#> genome and removes some ontologies such pathways. submitGreatJob()
#> still takes hg19 as default. hg38 can be specified by argument `genome
#> = "hg38"`. To use the older versions such as 3.0.0, specify as
#> submitGreatJob(..., version = "3"). Set argument `help` to `FALSE` to
#> turn off this message.
job
#> Submit time: 2024-02-27 14:19:03
#> Note the results may only be avaiable on GREAT server for 24 hours.
#> Version: 4.0.4
#> Genome: hg19
#> Inputs: 1000 regions
#> Mode: Basal plus extension
#> Proximal: 5 kb upstream, 1 kb downstream,
#> plus Distal: up to 1000 kb
#> Include curated regulatory domains
#>
#> Enrichment tables for following ontologies have been downloaded:
#> None
#>
# more parameters can be set for the job
if(FALSE) { # suppress running it when building the package
# current GREAT version is 4.0.4
job = submitGreatJob(gr, genome = "hg19")
job = submitGreatJob(gr, adv_upstream = 10, adv_downstream = 2, adv_span = 2000)
job = submitGreatJob(gr, rule = "twoClosest", adv_twoDistance = 2000)
job = submitGreatJob(gr, rule = "oneClosest", adv_oneDistance = 2000)
}