Sim_SSDD_2013
Sim_SSDD_2013(
dag,
terms,
distance = "shortest_distances_via_NCA",
verbose = simona_opt$verbose
)
It is similar to Sim_Shen_2010, which also sums a per-term quantity along the path passing through the LCA term. Instead of summing the information content, Sim_SSDD_2013 sums a so-called "T-value":
sim = 1 - atan(sum_{x in the path}(T(x)))/(pi/2)
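The mapping from a summed T-value to a similarity can be sketched in base R as follows. The helper name `ssdd_sim` is hypothetical and is not part of the package; which terms lie on the path is left to the caller, so the input here is simply a numeric vector of T-values.

```r
# Hypothetical helper (not a package function): turn the T-values
# collected along a path into a similarity score in (0, 1].
ssdd_sim <- function(t_values_on_path) {
  1 - atan(sum(t_values_on_path)) / (pi / 2)
}

ssdd_sim(numeric(0))       # empty sum, so the similarity is 1
ssdd_sim(1)                # atan(1) = pi/4, so the similarity is 0.5
ssdd_sim(c(1, 0.5, 0.25))  # a larger sum gives a smaller similarity
```

Because atan() maps non-negative sums into [0, pi/2), the score stays bounded however long the path is.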
Each term has a T-value, which measures the semantic content that the term inherits on average from its parents and distributes to its offspring. The T-value of the root is 1. Assume a term t has two parents p1 and p2. The T-value for term t is the weighted average over its parents:
T(t) = (w1*T(p1) + w2*T(p2))/2
Since a parent may have other child terms, a factor w1 or w2 is multiplied to T(p1) or T(p2), respectively. Taking p1 as an example, it has n_p offspring (including itself) and t has n_t offspring (including itself). This means that a fraction n_t/n_p of the information is transmitted from p1 downstream via t, thus w1 is defined as n_t/n_p.
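The T-value recursion can be traced by hand on a small DAG. The base R sketch below uses a made-up four-term DAG and does not call the package's internals. It assumes that the two-parent average generalizes to a plain mean over however many parents a term has, and that `terms` is listed in topological order (root first).

```r
# Toy DAG as a parent -> child edge list (hypothetical example data).
edges <- data.frame(
  parent = c("a", "a", "b", "c"),
  child  = c("b", "c", "d", "d"),
  stringsAsFactors = FALSE
)
terms <- c("a", "b", "c", "d")  # topological order, root first

# All offspring of a term, including the term itself.
offspring <- function(term) {
  found <- term
  queue <- term
  while (length(queue) > 0) {
    kids  <- edges$child[edges$parent %in% queue]
    queue <- setdiff(kids, found)
    found <- c(found, queue)
  }
  found
}
n_offspring <- sapply(terms, function(x) length(offspring(x)))

# T-values: the root gets 1; every other term averages w * T(parent)
# over its parents, with w = n_t / n_p (assumed to hold for any
# number of parents).
t_value <- setNames(numeric(length(terms)), terms)
for (term in terms) {
  parents <- edges$parent[edges$child == term]
  if (length(parents) == 0) {
    t_value[term] <- 1
  } else {
    w <- n_offspring[term] / n_offspring[parents]
    t_value[term] <- mean(w * t_value[parents])
  }
}
t_value  # a = 1, b = 0.5, c = 0.5, d = 0.25
```

Here d has parents b and c, each with two offspring (itself and d), so w1 = w2 = 1/2 and T(d) = (0.5*0.5 + 0.5*0.5)/2 = 0.25.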
Paper link: doi:10.1016/j.ygeno.2013.04.010.