Sim_HRSS_2013
Sim_HRSS_2013(dag, terms, verbose = simona_opt$verbose)
It is similar to the Sim_RSS_2013 method, but it uses information content (IC) instead of distance to adjust the similarity.
It first defines the semantic distance between terms a and b as the sum of their distances to their MICA (most informative common ancestor) term c:
D(a, b) = D(c, a) + D(c, b)
The distance from an ancestor to a term is:
D(c, a) = IC(a) - IC(c) # if c is an ancestor of a
D(a, b) = D(c, a) + D(c, b) = IC(a) + IC(b) - 2*IC(c) # if c is the MICA of a and b
As in Sim_RSS_2013, the similarity is also corrected by the positions of the MICA term, a, and b in the DAG:
1/(1 + D(a, b)) * alpha/(alpha + beta)
Here alpha is the IC of the MICA term:
alpha = IC(c)
And beta is the average of the maximal semantic distances from a and b to the leaves:
beta = 0.5*(IC(l_a) - IC(a) + IC(l_b) - IC(b))
where l_a is the leaf reachable from a with the highest IC (i.e. the most informative leaf), and l_b is defined likewise for b.
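The formulas above can be traced by hand on a small example. The following is a minimal standalone sketch in Python, not the package's implementation: the toy DAG, the term names, the IC values, and the helper functions (ancestors, descendants, most_informative_leaf_ic, sim_hrss) are all hypothetical and chosen only for illustration. The guard for alpha + beta = 0 is also an assumption of the sketch; the description above does not say how that case is handled.

```python
# Toy DAG stored as child -> list of parents.
# Term names and IC values are hypothetical, for illustration only.
parents = {
    "root": [],
    "x": ["root"],
    "a": ["x"],
    "b": ["x"],
    "la": ["a"],
    "lb": ["b"],
}
ic = {"root": 0.0, "x": 1.0, "a": 2.0, "b": 3.0, "la": 4.0, "lb": 3.5}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def descendants(term):
    """Return the term itself plus all of its descendants."""
    return {t for t in parents if term in ancestors(t)}


def most_informative_leaf_ic(term):
    """IC of the leaf reachable from the term with the highest IC."""
    has_child = {p for ps in parents.values() for p in ps}
    leaves = [t for t in descendants(term) if t not in has_child]
    return max(ic[t] for t in leaves)


def sim_hrss(a, b):
    # c is the MICA: the common ancestor with the highest IC.
    common = ancestors(a) & ancestors(b)
    c = max(common, key=lambda t: ic[t])
    # D(a, b) = IC(a) + IC(b) - 2*IC(c)
    dist = ic[a] + ic[b] - 2 * ic[c]
    alpha = ic[c]
    beta = 0.5 * (
        most_informative_leaf_ic(a) - ic[a]
        + most_informative_leaf_ic(b) - ic[b]
    )
    if alpha + beta == 0:
        # Assumption of this sketch: return 0 when the correction is undefined.
        return 0.0
    return 1 / (1 + dist) * alpha / (alpha + beta)


print(sim_hrss("a", "b"))    # about 0.1111 (1/9 for these toy values)
print(sim_hrss("la", "la"))  # 1.0
```

For terms a and b in this toy DAG, the MICA is x, so D(a, b) = 2 + 3 - 2*1 = 3, alpha = 1, and beta = 0.5*((4 - 2) + (3.5 - 3)) = 1.25, giving 1/4 * 1/2.25 = 1/9. A leaf compared with itself has D = 0 and beta = 0, so its similarity is 1.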
Paper link: doi:10.1371/journal.pone.0066745.