Keyword enrichment for GO terms

keyword_enrichment_from_GO(go_id, min_bg = 5, min_term = 2)

Arguments

go_id

A vector of GO IDs.

min_bg

Minimal number of GO terms in the background (i.e. all GO terms in the GO database) that contain a specific keyword.

min_term

Minimal number of GO terms in go_id that contain a specific keyword.

Details

Enrichment is tested with Fisher's exact test. For each keyword, the test is based on the following 2x2 contingency table:


                      | contains the keyword | does not contain the keyword
    In the GO set     |          s11         |          s12
    Not in the GO set |          s21         |          s22  

where s11, s12, s21 and s22 are the numbers of GO terms in each cell.
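
As a rough illustration of the table above, the following sketch builds the four cells for a single keyword and passes them to `fisher.test()` from base R. All counts are hypothetical, the background size of 40000 is an assumed figure, and the one-sided alternative is an assumption made for this sketch; none of this is taken from the function's source.

```r
# Hypothetical counts, chosen only for illustration.
n_set   <- 100    # GO terms in go_id
n_total <- 40000  # GO terms in the background (assumed figure)
n_term  <- 21     # terms in go_id that contain the keyword
n_bg    <- 3338   # background terms that contain the keyword
                  # (assumed here to include the terms in go_id)

# Cells of the 2x2 table.
s11 <- n_term                  # in the set, contains the keyword
s12 <- n_set - n_term          # in the set, does not contain it
s21 <- n_bg - n_term           # not in the set, contains it
s22 <- n_total - n_set - s21   # not in the set, does not contain it

# matrix() fills column by column, so the first column is (s11, s21).
tab <- matrix(
  c(s11, s21, s12, s22), nrow = 2,
  dimnames = list(
    c("In the GO set", "Not in the GO set"),
    c("contains", "does not contain")
  )
)

# One-sided test for over-representation (an assumption of this sketch).
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

With a one-sided alternative, the p-value is the upper tail of a hypergeometric distribution, so `phyper(s11 - 1, n_bg, n_total - n_bg, n_set, lower.tail = FALSE)` gives the same number.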

Value

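A data frame with one row per keyword. As shown in the example output, its columns are keyword, n_term (number of GO terms in go_id that contain the keyword), n_bg (number of background GO terms that contain the keyword), p (p-value) and padj (adjusted p-value).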

Examples

# \dontrun{
go_id = random_GO(100)
keyword_enrichment_from_GO(go_id)
#>            keyword n_term  n_bg            p         padj
#> 1         positive     21  3338 1.610552e-05 0.0008858037
#> 2       regulation     42 10445 3.913273e-05 0.0021131673
#> 3      acetylation      3   102 1.652342e-03 0.0875741136
#> 4         synaptic      4   233 2.005367e-03 0.1042790660
#> 5         vascular      3   116 2.383949e-03 0.1215813808
#> 6        telomeric      2    48 5.368536e-03 0.2684267755
#> 7          meiosis      2    63 9.092313e-03 0.4455233527
#> 8          glucose      2    83 1.538534e-02 0.7384961011
#> 9     transmission      2    96 2.022839e-02 0.9507342165
#> 10  proteincoupled      2   116 2.873885e-02 1.0000000000
#> 11        negative     13  3274 3.387229e-02 1.0000000000
#> 12        endosome      2   127 3.392720e-02 1.0000000000
#> 13     disassembly      2   133 3.689944e-02 1.0000000000
#> 14 phosphorylation      2   137 3.893466e-02 1.0000000000
#> 15         histone      3   344 4.377305e-02 1.0000000000
#> 16      initiation      2   147 4.420400e-02 1.0000000000
#> 17          stress      2   162 5.256851e-02 1.0000000000
#> 18          muscle      4   623 5.372262e-02 1.0000000000
#> 19          smooth      2   185 6.637452e-02 1.0000000000
#> 20        assembly      4   746 9.013424e-02 1.0000000000
#> 21           viral      2   227 9.420181e-02 1.0000000000
#> 22             via      3   513 1.114850e-01 1.0000000000
#> 23           cycle      2   259 1.172488e-01 1.0000000000
#> 24      processing      2   279 1.323016e-01 1.0000000000
#> 25            cell     12  3738 1.397014e-01 1.0000000000
#> 26     interleukin      2   294 1.438659e-01 1.0000000000
#> 27      activation      2   298 1.469853e-01 1.0000000000
#> 28         protein      6  1623 1.635079e-01 1.0000000000
#> 29       transport      5  1291 1.700122e-01 1.0000000000
#> 30         vesicle      2   359 1.960134e-01 1.0000000000
#> 31        response      7  2112 2.034871e-01 1.0000000000
#> 32      polymerase      2   372 2.067285e-01 1.0000000000
#> 33              ii      2   373 2.075555e-01 1.0000000000
#> 34       metabolic      6  1787 2.199560e-01 1.0000000000
#> 35       migration      2   389 2.208330e-01 1.0000000000
#> 36       secretion      2   410 2.383628e-01 1.0000000000
#> 37    organization      2   419 2.459004e-01 1.0000000000
#> 38       catabolic      5  1547 2.744703e-01 1.0000000000
#> 39   transcription      2   475 2.928899e-01 1.0000000000
#> 40             ion      2   526 3.353872e-01 1.0000000000
#> 41          growth      2   545 3.510498e-01 1.0000000000
#> 42        cellular      3   932 3.545236e-01 1.0000000000
#> 43          factor      2   554 3.584270e-01 1.0000000000
#> 44             rna      2   614 4.067680e-01 1.0000000000
#> 45             dna      2   687 4.631534e-01 1.0000000000
#> 46        receptor      4  1561 4.743773e-01 1.0000000000
#> 47        membrane      3  1219 5.249771e-01 1.0000000000
#> 48   morphogenesis      2   798 5.426343e-01 1.0000000000
#> 49         pathway      3  1287 5.618735e-01 1.0000000000
#> 50         process     13  5825 5.696686e-01 1.0000000000
#> 51        involved      4  1850 6.089469e-01 1.0000000000
#> 52 differentiation      2   933 6.279394e-01 1.0000000000
#> 53       signaling      3  1429 6.331736e-01 1.0000000000
#> 54     development      3  1455 6.453391e-01 1.0000000000
#> 55        activity      2 10285 1.000000e+00 1.0000000000
# }
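
The padj values printed above appear to be consistent with a Holm correction over the 55 keywords in the table. This is an observation from the printed numbers only, not a statement about how the function is implemented. A minimal check using `p.adjust()` from base R and the first three rows of the output:

```r
# First three rows of the example output, copied verbatim.
p    <- c(1.610552e-05, 3.913273e-05, 1.652342e-03)
padj <- c(0.0008858037, 0.0021131673, 0.0875741136)

# The full table has 55 keywords, so pass n = 55 to p.adjust().
# This shortcut is only valid because these are the three smallest p-values.
holm <- p.adjust(p, method = "holm", n = 55)

# Compare with the printed values; the tolerance covers print rounding.
isTRUE(all.equal(holm, padj, tolerance = 1e-6))
```

If that reading is right, keywords can be selected in the usual way, for example by keeping only the rows of the returned data frame with `padj < 0.05`.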