Calculate Gene Ontology (GO) semantic similarity matrix

Usage

GO_similarity(
  go_id,
  ont = NULL,
  db = "org.Hs.eg.db",
  measure = "Sim_XGraSM_2013"
)

guess_ont(go_id, db = "org.Hs.eg.db")

random_GO(n, ont = c("BP", "CC", "MF"), db = "org.Hs.eg.db")

Arguments

go_id

A vector of GO IDs.

ont

Sub-ontology of GO. The value should be one of "BP", "CC" or "MF". If it is not specified, the function identifies it automatically by randomly sampling 10 IDs from go_id (see guess_ont()).

db

Annotation database. It should be an OrgDb package name from https://bioconductor.org/packages/release/BiocViews.html#___OrgDb. The value can also be an OrgDb object.

measure

Semantic measure for the GO similarity, passed to simona::term_sim(). All valid values are listed by simona::all_term_sim_methods().

n

Number of GO IDs.

Value

GO_similarity() returns a symmetric matrix.

guess_ont() returns a single character scalar: "BP", "CC" or "MF". If more than one ontology is detected, it returns NULL.

random_GO() returns a vector of GO IDs.

Details

The default similarity method is "Sim_XGraSM_2013". Since the semantic similarities are calculated from gene annotations to GO terms, I suggest users also try the following methods:

  • "Sim_Lin_1998"

  • "Sim_Resnik_1999"

  • "Sim_Relevance_2006"

  • "Sim_SimIC_2010"

  • "Sim_XGraSM_2013"

  • "Sim_EISI_2015"

  • "Sim_AIC_2014"

  • "Sim_Wang_2007"

  • "Sim_GOGO_2018"

In guess_ont(), only 10 random GO IDs are checked.

In random_GO(), only GO terms with gene annotations are sampled.

Examples

# \donttest{
go_id = random_GO(100)
#> relations: is_a, part_of, regulates, negatively_regulates, positively_regulates
#> 
#> IC_method: IC_annotation
mat = GO_similarity(go_id)
#> You haven't provided value for `ont`, guess it as `BP`.
#> term_sim_method: Sim_XGraSM_2013
#> IC_method: IC_annotation
# }
# \donttest{
go_id = random_GO(100)
guess_ont(go_id)
#> [1] "BP"
# }
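
The following is a minimal sketch, not part of the original examples, of setting `ont` and `measure` explicitly instead of relying on the defaults. It assumes that simplifyEnrichment, simona and org.Hs.eg.db are installed. The seed value and the choice of "Sim_Lin_1998" are arbitrary illustrations.

```r
# Sketch only: assumes simplifyEnrichment, simona and org.Hs.eg.db are installed.
library(simplifyEnrichment)

# Arbitrary seed, used only to make the random sample repeatable
set.seed(123)
go_id = random_GO(100, ont = "BP")

# List the measure names accepted by simona::term_sim()
simona::all_term_sim_methods()

# Setting `ont` explicitly skips the guessing step described above;
# `measure` selects one of the alternative methods listed in Details.
mat = GO_similarity(go_id, ont = "BP", measure = "Sim_Lin_1998")
dim(mat)
```

Passing `ont` directly avoids the random 10-ID check performed by guess_ont(), which may be preferable when the sub-ontology of the input is already known.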