Enrichment analysis on offspring terms — dag_enrich_on

The analysis evaluates how significantly the offspring of a term are enriched for a given set of terms.

dag_enrich_on_offsprings(dag, terms, min_hits = 3, min_offspring = 10)

Arguments

dag: An ontology_DAG object.
terms: A vector of term names.
min_hits: Minimal number of input terms found in an offspring set.
min_offspring: Minimal size of the offspring set.

Value

A data frame with the following columns:

term: Term names.
n_hits: Number of input terms (from terms) that are also among the offspring terms of t.
n_offspring: Number of offspring terms of t (including t itself).
n_terms: Number of input terms (from terms) that are present in the DAG.
n_all: Number of all terms in the DAG.
log2_fold_enrichment: Defined as log2(observed/expected).
z_score: Defined as (observed-expected)/sd.
p_value: P-values from hypergeometric test.
p_adjust: Adjusted p-values from the BH method.

The number of rows in the data frame is the same as the number of terms in the DAG.
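
As a rough illustration of how p_adjust relates to p_value, and of how such a result might be filtered, the base R sketch below builds a toy data frame with two of the columns listed above. The term names and p-values are invented for illustration; they are not output of dag_enrich_on_offsprings().

```r
# Toy stand-in for the returned data frame; all values are invented
df = data.frame(
    term    = c("t1", "t2", "t3", "t4"),
    p_value = c(0.001, 0.02, 0.03, 0.6)
)

# Benjamini-Hochberg adjustment, the method named for the p_adjust column
df$p_adjust = p.adjust(df$p_value, method = "BH")

# Keep terms below an (arbitrary) 0.05 cutoff, most significant first
sig = df[df$p_adjust < 0.05, ]
sig = sig[order(sig$p_adjust), ]
```

The 0.05 cutoff is only a common convention, not something the function applies itself.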

Details

Given a list of terms in terms, the function tests whether they are enriched in a term's offspring terms. The test is based on the hypergeometric distribution. In the following 2x2 contingency table, S is the set of input terms and, for a term t in the DAG, T is the set of its offspring plus t itself. The aim is to test whether S is over-represented in T.

A significant p-value indicates that the offspring of term t preferentially include the input terms.

+----------+------+----------+-----+
|          | in S | not in S | all |
+----------+------+----------+-----+
| in T     |  x11 |    x12   | x10 |
| not in T |  x21 |    x22   | x20 |
+----------+------+----------+-----+
| all      |  x01 |    x02   |  x  |
+----------+------+----------+-----+
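
The test on this table can be sketched in base R. The snippet below is a minimal illustration, assuming a one-sided (upper-tail) hypergeometric test and the standard hypergeometric mean and variance for the expected count and sd. The counts are made up, the variable names simply mirror the cells of the table, and the code is not taken from the package source, so the package's exact computation may differ in detail.

```r
# Hypothetical counts, named after the cells of the 2x2 table
x   = 1000  # all terms in the DAG
x10 = 50    # size of T: offspring of t plus t itself
x01 = 100   # size of S: input terms present in the DAG
x11 = 12    # terms that are in both S and T

# Expected overlap under random sampling, and its standard deviation
# (mean and variance of the hypergeometric distribution)
expected = x10 * x01 / x
sd_hits  = sqrt(x10 * (x01 / x) * (1 - x01 / x) * (x - x10) / (x - 1))

log2_fold_enrichment = log2(x11 / expected)
z_score = (x11 - expected) / sd_hits

# Upper-tail probability P(X >= x11): over-representation of S in T.
# phyper(q, m, n, k): m = terms in S, n = terms not in S, k = size of T
p_value = phyper(x11 - 1, x01, x - x01, x10, lower.tail = FALSE)
```

Subtracting 1 from x11 is needed because phyper() with lower.tail = FALSE returns P(X > q), while the over-representation test asks for P(X >= x11).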

Examples

# \dontrun{
dag = create_ontology_DAG_from_GO_db() 
#> relations: is_a, part_of
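# Draw 100 random terms from the DAG to serve as the input set
terms = random_terms(dag, 100)
# Test every DAG term for enrichment of the input set among its offspring
df = dag_enrich_on_offsprings(dag, terms)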
# }