IC_Wang_2007
IC_Wang_2007(
  dag,
  contribution_factor = c(is_a = 0.8, part_of = 0.6),
  use_cache = simona_opt$use_cache,
  verbose = simona_opt$verbose
)
Each relation is weighted by a value less than 1 that depends on the type of semantic relation, by default 0.8 for "is_a" and 0.6 for "part_of".
For a term t and one of its ancestor terms a, the method first calculates an "S-value", which corresponds to the path from a to t
along which the product of the weights is maximal:
S(a->t) = max_{path}(prod_{edge on the path}(w))
Here max goes over all possible paths from a to t, and prod() multiplies the edge weights along a given path.
The formula can be transformed as follows (we simply write S(a->t) as S):
1/S = min(prod(1/w))
log(1/S) = log(min(prod(1/w)))
         = min(sum(log(1/w)))
The last step holds because log is monotonically increasing, so it can be moved inside min. Since w < 1, log(1/w) is positive.
According to the equation, the path (a->...->t) that maximizes S is the shortest path from a to t when log(1/w) is taken
as the edge weight, and log(1/S) is the weighted shortest distance.
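The equivalence above can be checked numerically on a small example. The following is a minimal base-R sketch, not the package's implementation: the toy DAG, its weights, and the helper names max_product() and shortest_dist() are all hypothetical and chosen only for illustration.

```r
# Toy DAG (hypothetical terms and weights); each edge goes parent -> child.
edges <- data.frame(
  parent = c("a", "a", "a", "b", "c"),
  child  = c("b", "c", "t", "t", "t"),
  w      = c(0.8, 0.6, 0.6, 0.8, 0.8),
  stringsAsFactors = FALSE
)
topo <- c("a", "b", "c", "t")  # a topological order of the toy DAG

# Maximal product of edge weights from `from` to every node,
# propagated from parents to children in topological order.
max_product <- function(edges, topo, from) {
  S <- setNames(rep(0, length(topo)), topo)
  S[from] <- 1
  for (node in topo) {
    out <- edges[edges$parent == node, ]
    for (i in seq_len(nrow(out))) {
      child <- out$child[i]
      S[child] <- max(S[child], S[node] * out$w[i])
    }
  }
  S
}

# Shortest distance from `from` to every node, using log(1/w) as edge length.
shortest_dist <- function(edges, topo, from) {
  d <- setNames(rep(Inf, length(topo)), topo)
  d[from] <- 0
  for (node in topo) {
    out <- edges[edges$parent == node, ]
    for (i in seq_len(nrow(out))) {
      child <- out$child[i]
      d[child] <- min(d[child], d[node] + log(1 / out$w[i]))
    }
  }
  d
}

S <- max_product(edges, topo, "a")
d <- shortest_dist(edges, topo, "a")

# S(a->t) is reached via a->b->t (0.8 * 0.8), not via the direct edge (0.6),
# and it agrees with exp(-shortest distance) up to floating-point error.
all.equal(S, exp(-d))
```

In this toy graph the two-step path a->b->t gives a larger product than the direct edge a->t, which shows that the "shortest" path here is measured by log(1/w), not by the number of edges.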
S(a->t) can be thought of as the maximal semantic contribution of a to t. The information content of t is then calculated
as the sum of the S-values over all of t's ancestors (including t itself):
IC = sum_{a in t's ancestors + t}(S(a->t))
Paper link: doi:10.1093/bioinformatics/btm087 .
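As an illustration of the IC formula, the sum of S-values can be sketched in base R on a toy DAG. This is only a sketch under stated assumptions, not the code used by the package: the terms, the edge list, and the helper wang_ic() are hypothetical, and only the weights 0.8 and 0.6 are taken from the default contribution_factor.

```r
# Default weights from the usage above.
contribution_factor <- c(is_a = 0.8, part_of = 0.6)

# Toy DAG (hypothetical terms); each edge goes parent -> child.
edges <- data.frame(
  parent   = c("a", "a", "b", "c"),
  child    = c("b", "c", "t", "t"),
  relation = c("is_a", "part_of", "is_a", "is_a"),
  stringsAsFactors = FALSE
)
edges$w <- unname(contribution_factor[edges$relation])
topo <- c("a", "b", "c", "t")  # a topological order of the toy DAG

# Hypothetical helper: S(x->term) for every node x, then their sum.
# Nodes are visited in reverse topological order, so the S-values of
# all children are final before a parent is processed.
wang_ic <- function(edges, topo, term) {
  S <- setNames(rep(0, length(topo)), topo)
  S[term] <- 1  # S(t->t) = 1
  for (node in rev(topo)) {
    out <- edges[edges$parent == node, ]
    if (nrow(out) > 0) {
      S[node] <- max(S[node], out$w * S[out$child])
    }
  }
  sum(S)  # nodes that are not ancestors of `term` contribute 0
}

wang_ic(edges, topo, "t")  # 1 + 0.8 + 0.8 + 0.64, about 3.24
wang_ic(edges, topo, "a")  # the root only counts itself: 1
```

For term t, the ancestor a contributes 0.64 through the "is_a" chain a->b->t rather than 0.48 through a->c->t, because only the maximal path counts.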
The contribution of each semantic relation can be set with the contribution_factor parameter. The value should be a named numeric
vector whose names cover all relations defined in the relations argument of create_ontology_DAG(). For example, if two relations,
"relation_a" and "relation_b", are set in the DAG, the value for contribution_factor can be set as:
term_IC(dag, method = "IC_Wang_2007",
    control = list(contribution_factor = c("relation_a" = 0.8, "relation_b" = 0.6)))
Note that the IC_Wang_2007 method is normally used within the Sim_Wang_2007 semantic similarity method.