Number of annotated items

n_annotations(
  dag,
  terms = NULL,
  uniquify = simona_opt$anno_uniquify,
  use_cache = simona_opt$use_cache
)

has_annotation(dag)

Arguments

dag: An ontology_DAG object.
terms: A vector of term names. If set, the returned vector is restricted to these terms.
uniquify: Whether each item annotated to a term is counted only once. See Details. It is suggested to always set it to TRUE.
use_cache: Internally used.

Value

n_annotations() returns an integer vector.

has_annotation() returns a logical scalar.

Details

Due to the nature of the DAG, a parent term includes all annotated items of its child terms, and an ancestor term includes all annotated items from its offspring recursively. Existing tools use two different approaches to this recursive merging.
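The recursive merging can be sketched in base R without the package. This is only an illustration, not how n_annotations() is implemented: the child lists and annotations are copied by hand from the Examples section, and merged_items() is a hypothetical helper introduced here.

```r
# Child lists and direct annotations, copied from the Examples section.
children = list(a = c("b", "c"), b = c("c", "d"), c = "e", d = "f",
                e = character(0), f = character(0))
annotation = list(a = c("t1", "t2", "t3"), b = c("t3", "t4"), c = "t5",
                  d = "t7", e = c("t4", "t5", "t6", "t7"), f = "t8")

# Hypothetical helper: items of a term merged recursively with the items
# of all its offspring terms, each item kept once.
merged_items = function(term) {
    items = annotation[[term]]
    for (child in children[[term]]) {
        items = union(items, merged_items(child))
    }
    items
}

# Number of unique items per term.
n = sapply(names(annotation), function(term) length(merged_items(term)))
n
#> a b c d e f 
#> 8 6 4 2 4 1 
```

The counts agree with the n_annotations(dag) output shown in the Examples section.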

For a term t, denote S_1, S_2, ... as the sets of annotated items of its child terms 1, 2, ..., and denote S_t as the set of items that are directly annotated to t. The first method takes the union of annotated items on t and all its child terms:

n = length(union(S_t, S_1, S_2, ...))

The second method takes the sum of the numbers of items on t and on all its child terms:

n = length(S_t) + length(S_1) + length(S_2) + ...

In n_annotations(), the first method is used when uniquify = TRUE, and the second method is used when uniquify = FALSE.

For some annotation sources, an item can be annotated to multiple terms. The second method, which simply adds up the numbers from all child terms, may then be inappropriate because the same item can be counted more than once, which over-estimates n. The two methods give identical results only if every item is annotated to a single term in the DAG.
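As a small worked case, the two formulas can be evaluated by hand for term b of the Examples section. The sets below are written out manually as an assumption of what S_t, S_1 and S_2 contain for that term; this is base R only and does not call the package. Because base R's union() accepts exactly two arguments, Reduce() is used to chain it over the list.

```r
# Item sets for term "b", written out by hand from the Examples section.
S_t = c("t3", "t4")               # items directly annotated to b
S_1 = c("t4", "t5", "t6", "t7")   # items of child c, including its child e
S_2 = c("t7", "t8")               # items of child d, including its child f
sets = list(S_t, S_1, S_2)

# First method: size of the union, each item counted once.
n_union = length(Reduce(union, sets))

# Second method: sum of the set sizes, shared items counted repeatedly.
n_sum = sum(lengths(sets))

n_union
#> [1] 6
n_sum
#> [1] 8
```

Items t4 and t7 each appear in two of the sets, so the sum exceeds the union by two.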

We suggest always setting uniquify = TRUE (the default). The setting uniquify = FALSE is intended only for testing or benchmarking.

Examples

parents  = c("a", "a", "b", "b", "c", "d")
children = c("b", "c", "c", "d", "e", "f")
annotation = list(
    "a" = c("t1", "t2", "t3"),
    "b" = c("t3", "t4"),
    "c" = "t5",
    "d" = "t7",
    "e" = c("t4", "t5", "t6", "t7"),
    "f" = "t8"
)
dag = create_ontology_DAG(parents, children, annotation = annotation)
n_annotations(dag)
#> a b c d e f 
#> 8 6 4 2 4 1 
#> attr(,"N")
#> [1] 8