Keyword enrichment for GO terms
Arguments
- go_id
A vector of GO IDs.
- min_bg
Minimal number of GO terms in the background (i.e. all GO terms in the GO database) that contain a specific keyword.
- min_term
Minimal number of GO terms (GO terms in go_id) that contain a specific keyword.
Details
The enrichment is tested with Fisher's exact test. For each keyword, there is the following 2x2 contingency table:

| | contains the keyword | does not contain the keyword |
|---|---|---|
| In the GO set | s11 | s12 |
| Not in the GO set | s21 | s22 |

where s11, s12, s21 and s22 are the counts of GO terms in the four categories.
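As a rough illustration of the test described above, the following sketch builds the 2x2 table for a single keyword and passes it to base R's `fisher.test()`. All counts are made up for illustration, and the variable names (`n_set`, `n_all`, `k_set`, `k_all`) are hypothetical. The sketch also assumes a one-sided test for over-representation and that the background count includes the terms in the GO set; the function's actual internals may differ.

```r
# Hypothetical counts for one keyword (not taken from the GO database)
n_set <- 100    # number of GO terms in go_id
n_all <- 40000  # number of GO terms in the background (assumed)
k_set <- 2      # GO terms in go_id that contain the keyword
k_all <- 5      # background GO terms that contain the keyword

# The four cells of the contingency table
s11 <- k_set                  # in the GO set, contains the keyword
s12 <- n_set - k_set          # in the GO set, does not contain it
s21 <- k_all - k_set          # not in the GO set, contains the keyword
s22 <- n_all - n_set - s21    # not in the GO set, does not contain it

tab <- matrix(
  c(s11, s12,
    s21, s22),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("in GO set", "not in GO set"),
    c("keyword", "no keyword")
  )
)

# One-sided test: is the keyword over-represented in the GO set?
# (the choice of alternative is an assumption of this sketch)
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

With many keywords tested at once, the raw p-values would normally be adjusted for multiple testing, for example with `p.adjust()`.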
Examples
# \donttest{
go_id = random_GO(100)
keyword_enrichment_from_GO(go_id)
#> keyword n_term n_bg p padj
#> 1 built 2 5 0.0000507144 0.003245722
#> 2 domains 2 5 0.0000507144 0.003245722
#> 3 based 2 8 0.0001413703 0.008764960
#> 4 superfamily 2 14 0.0004553889 0.027778722
#> 5 regulation 39 10445 0.0004587624 0.027778722
#> 6 osteoclast 2 15 0.0005246716 0.030955622
#> 7 respiration 2 16 0.0005987379 0.034726799
#> 8 glia 2 17 0.0006775663 0.038621281
#> 9 receptors 2 18 0.0007611353 0.042623576
#> 10 guided 2 19 0.0008494233 0.046718279
#> 11 adaptive 2 24 0.0013609020 0.073488709
#> 12 migration 5 389 0.0019775906 0.104812301
#> 13 calciumdependent 2 29 0.0019871758 0.104812301
#> 14 adenosine 2 37 0.0032215001 0.164296505
#> 15 cerebral 2 39 0.0035737292 0.178686462
#> 16 radial 2 39 0.0035737292 0.178686462
#> 17 monophosphate 2 42 0.0041341544 0.198439411
#> 18 cardiac 4 295 0.0046578742 0.218920087
#> 19 motility 2 52 0.0062738567 0.288597407
#> 20 somatic 2 60 0.0082767213 0.372452460
#> 21 immunoglobulin 2 66 0.0099423153 0.437461872
#> 22 inflammatory 2 83 0.0153853354 0.661569424
#> 23 cortex 2 86 0.0164523217 0.690997512
#> 24 neurotransmitter 2 88 0.0171807518 0.704410824
#> 25 nucleoside 2 90 0.0179227079 0.716908317
#> 26 recombination 2 90 0.0179227079 0.716908317
#> 27 tissue 2 121 0.0310544603 1.000000000
#> 28 negative 13 3274 0.0338722862 1.000000000
#> 29 metabolic 8 1787 0.0504289202 1.000000000
#> 30 involved 8 1850 0.0593864344 1.000000000
#> 31 blood 2 177 0.0614455024 1.000000000
#> 32 detection 2 182 0.0645108931 1.000000000
#> 33 positive 12 3338 0.0751441243 1.000000000
#> 34 establishment 2 209 0.0819010201 1.000000000
#> 35 production 3 467 0.0903408024 1.000000000
#> 36 immune 2 238 0.1019652751 1.000000000
#> 37 response 8 2112 0.1069953416 1.000000000
#> 38 morphogenesis 4 798 0.1083531078 1.000000000
#> 39 activation 2 298 0.1469853367 1.000000000
#> 40 stimulus 2 314 0.1595983149 1.000000000
#> 41 cellular 4 932 0.1620160107 1.000000000
#> 42 protein 6 1623 0.1635079279 1.000000000
#> 43 process 17 5825 0.1646747271 1.000000000
#> 44 mitochondrial 2 321 0.1651789873 1.000000000
#> 45 muscle 3 623 0.1685057945 1.000000000
#> 46 pathway 5 1287 0.1685219263 1.000000000
#> 47 transport 5 1291 0.1700121702 1.000000000
#> 48 vesicle 2 359 0.1960133950 1.000000000
#> 49 apoptotic 2 375 0.2092105723 1.000000000
#> 50 signaling 5 1429 0.2243614616 1.000000000
#> 51 proliferation 2 454 0.2752746754 1.000000000
#> 52 transmembrane 3 868 0.3148038518 1.000000000
#> 53 kinase 2 504 0.3171227576 1.000000000
#> 54 localization 2 517 0.3279304662 1.000000000
#> 55 ion 2 526 0.3353872079 1.000000000
#> 56 factor 2 554 0.3584269711 1.000000000
#> 57 development 4 1455 0.4209315010 1.000000000
#> 58 dna 2 687 0.4631534118 1.000000000
#> 59 cell 9 3738 0.4769800615 1.000000000
#> 60 catabolic 3 1547 0.6861147181 1.000000000
#> 61 receptor 3 1561 0.6920083901 1.000000000
#> 62 membrane 2 1219 0.7675549006 1.000000000
#> 63 biosynthetic 3 2035 0.8458255727 1.000000000
#> 64 activity 4 10285 0.9999999857 1.000000000
# }