GSEA using fgsea
Usage
fgsea_wrapper(s, gs, min_size = 5, max_size = 2000, ...)
fgsea_go(s, org_db = org.Hs.eg.db::org.Hs.eg.db, ontology = "BP", ...)
fgsea_kegg(s, organism = "hsa", db = "pathway", ...)
fgsea_msigdb(s, collection = "h.all", version = "2024.1.Hs", ...)
fgsea_reactome(s, organism = "HSA", ...)
fgsea_keywords(s, organism = "human", ...)
fgsea_phenotype(s, organism = "human", ...)
fgsea_disease(s, organism = "human", ...)
Arguments
- s
A numeric vector of gene scores. It should be named with gene IDs.
- gs
A list of gene sets. In fgsea_wrapper(), genes should have the same ID type as s.
- min_size
Minimal size of gene sets for analysis.
- max_size
Maximal size of gene sets for analysis.
- ...
Other arguments passed to fgsea_wrapper() or further to fgsea::fgsea().
- org_db
An OrgDb object for the organism. It can be from org.*.db packages or downloaded by the AnnotationHub package.
- ontology
Namespace of GO. Value should be one of "BP", "CC" or "MF".
- organism
See Details.
- db
A KEGG database. The value can be one of "pathway", "module", "ko", "network", "disease" and "drug".
- collection
Collection of the MSigDB gene sets. All possible values can be found via list_msigdb_collections().
- version
Version of the MSigDB database. All possible values can be found via list_msigdb_versions().
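The expected shapes of s and gs, and the effect of the size limits, can be sketched in base R. This is only an illustration: the gene names are made up, and it assumes that min_size and max_size are applied to the number of genes in each set that also appear in s, which is how fgsea::fgsea() treats its minSize and maxSize arguments. Whether fgsea_wrapper() forwards the limits in exactly this way is an assumption.

```r
# Hypothetical toy inputs; the gene names are made up for illustration.
set.seed(1)
genes = paste0("gene", 1:100)
s = setNames(rnorm(100), genes)   # named numeric vector of gene scores

# A list of gene sets using the same ID type as names(s)
gs = list(
    set_small = genes[1:3],
    set_ok    = genes[10:40],
    set_other = c(genes[50:60], "not_in_s")
)

# Size of each gene set after restricting to genes present in s
n_in_s = vapply(gs, function(g) length(intersect(g, names(s))), integer(1))

# Assumed filtering rule (mirrors minSize/maxSize in fgsea::fgsea())
min_size = 5
max_size = 2000
kept = names(gs)[n_in_s >= min_size & n_in_s <= max_size]
kept
```

Under this assumption, set_small would be dropped because only three of its genes are scored, while the other two sets would be kept.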
Details
Except for fgsea_wrapper(), all fgsea_*() functions require the gene IDs in s to be Entrez IDs.
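A quick sanity check on the ID type can catch a mismatch before running the analysis. The helper below is hypothetical and not part of the package. It relies only on the fact that Entrez gene IDs are integers, so it is a heuristic rather than a validation against a database.

```r
# Hypothetical helper: returns TRUE when every name of s is a string of digits,
# which is what Entrez gene IDs look like.
looks_like_entrez = function(s) {
    ids = names(s)
    !is.null(ids) && length(ids) > 0 && all(grepl("^[0-9]+$", ids))
}

s_symbol = c(TP53 = 1.2, CDKN1A = -0.8)       # gene symbols
s_entrez = c("7157" = 1.2, "1026" = -0.8)     # the same genes as Entrez IDs

looks_like_entrez(s_symbol)
looks_like_entrez(s_entrez)
```

If the check fails, the scores can be converted first, for example with convert_to_entrez() as in the Examples section.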
The value of organism should be set differently for specific fgsea_*() functions:
- For fgsea_kegg(), the value should be a KEGG organism code, such as "hsa" or "mmu".
- For fgsea_reactome(), the value should be the prefix of the Reactome pathway ID that represents the organism, e.g. "HSA" for human.
- For fgsea_keywords(), the value can be an organism name (e.g. "human"), the Latin name or the taxon ID. Please check UniProtKeywords::load_keyword_genesets().
- For fgsea_phenotype() and fgsea_disease(), the value can only be one of "human", "mouse" and "rat".
All valid values for fgsea_reactome() are:
Examples
data(p53_dataset)
s = p53_dataset$s2n
gs = p53_dataset$gs
fgsea_wrapper(s, gs) |> head()
#> gene_set p_value p_adjust log2_p_err es
#> 1 P53_UP 4.227646e-06 0.001910979 0.6105269 -0.5986774
#> 2 hsp27Pathway 7.768208e-06 0.001910979 0.5933255 -0.7758925
#> 3 HTERT_UP 4.256855e-05 0.004785001 0.5573322 0.3709129
#> 4 GPCRs_Class_A_Rhodopsin-like 5.045670e-05 0.004785001 0.5573322 -0.4299558
#> 5 p53hypoxiaPathway 5.403876e-05 0.004785001 0.5573322 -0.6816006
#> 6 p53Pathway 5.835367e-05 0.004785001 0.5573322 -0.7438596
#> nes n_gs leading_edge
#> 1 -2.130912 40 CDKN1A, ....
#> 2 -2.170911 15 TNFRSF6,....
#> 3 1.859947 109 AHR, DR1....
#> 4 -1.843462 111 NTSR2, A....
#> 5 -2.045472 20 CDKN1A, ....
#> 6 -2.128121 16 CDKN1A, ....
s2 = convert_to_entrez(s)
#> gene id might be SYMBOL (p = 0.670 )
#> 'select()' returned 1:many mapping between keys and columns
fgsea_go(s2) |> head()
#> Warning: There were 11 pathways for which P-values were not calculated properly due to unbalanced (positive and negative) gene-level statistic values. For such pathways pval, padj, NES, log2err are set to NA. You can try to increase the value of the argument nPermSimple (for example set it nPermSimple = 10000)
#> Warning: For some of the pathways the P-values were likely overestimated. For such pathways log2err is set to NA.
#> gene_set p_value p_adjust log2_p_err es nes n_gs
#> 1 GO:0007186 8.597994e-11 6.348759e-07 0.8390889 -0.3600615 -1.849246 512
#> 2 GO:0002682 1.823802e-08 6.733478e-05 0.7337620 -0.2997243 -1.584662 912
#> 3 GO:0006955 2.851258e-08 7.017896e-05 0.7337620 -0.2937466 -1.558659 1011
#> 4 GO:0022402 4.700567e-07 7.684013e-04 0.6749629 0.2522013 1.554392 577
#> 5 GO:0002376 5.203151e-07 7.684013e-04 0.6594444 -0.2643578 -1.407153 1450
#> 6 GO:0002684 2.574958e-06 3.168915e-03 0.6272567 -0.2981775 -1.552579 645
#> leading_edge description
#> 1 23620, 2.... G protein-coupled receptor signaling pathway
#> 2 1026, 58.... regulation of immune system process
#> 3 581, 677.... immune response
#> 4 29899, 1.... cell cycle process
#> 5 1026, 58.... immune system process
#> 6 1026, 58.... positive regulation of immune system process
fgsea_kegg(s2) |> head()
#> gene_set p_value p_adjust log2_p_err es nes n_gs
#> 1 hsa04080 2.636467e-08 9.306729e-06 0.7337620 -0.4052099 -1.952654 234
#> 2 hsa03010 2.524439e-07 4.455634e-05 0.6749629 -0.5122149 -2.136079 85
#> 3 hsa04060 3.510965e-06 4.131236e-04 0.6272567 -0.3936669 -1.858073 193
#> 4 hsa04940 1.211598e-05 1.069235e-03 0.5933255 -0.5971851 -2.117767 36
#> 5 hsa04081 5.355755e-05 3.781163e-03 0.5573322 -0.3766489 -1.750193 180
#> 6 hsa05330 8.941644e-05 4.953516e-03 0.5384341 -0.5990447 -2.008731 28
#> leading_edge description
#> 1 23620, 2.... Neuroactive ligand-receptor interaction - Homo sapiens (human)
#> 2 6223, 61.... Ribosome - Homo sapiens (human)
#> 3 355, 343.... Cytokine-cytokine receptor interaction - Homo sapiens (human)
#> 4 355, 355.... Type I diabetes mellitus - Homo sapiens (human)
#> 5 2100, 15.... Hormone signaling - Homo sapiens (human)
#> 6 355, 356.... Allograft rejection - Homo sapiens (human)
fgsea_msigdb(s2) |> head()
#> gene_set p_value p_adjust log2_p_err es
#> 1 HALLMARK_G2M_CHECKPOINT 2.159796e-08 1.079898e-06 0.7337620 0.4116411
#> 2 HALLMARK_ALLOGRAFT_REJECTION 2.351987e-05 3.426884e-04 0.5756103 -0.3856414
#> 3 HALLMARK_E2F_TARGETS 2.669948e-05 3.426884e-04 0.5756103 0.3555237
#> 4 HALLMARK_MITOTIC_SPINDLE 2.741507e-05 3.426884e-04 0.5756103 0.3662740
#> 5 HALLMARK_UV_RESPONSE_DN 5.844880e-05 5.844880e-04 0.5573322 0.3538389
#> 6 HALLMARK_PROTEIN_SECRETION 5.860902e-04 4.884085e-03 0.4772708 0.3610550
#> nes n_gs leading_edge
#> 1 2.150537 141 7290, 18....
#> 2 -1.782324 168 355, 677....
#> 3 1.821589 129 5810, 11....
#> 4 1.839499 116 9371, 83....
#> 5 1.805681 122 5195, 21....
#> 6 1.691100 80 3998, 93....