GSEA using fgsea — fgsea_wrapper • GSEAtopics

GSEA using fgsea

Usage

fgsea_wrapper(s, gs, min_size = 5, max_size = 2000, ...)

fgsea_go(s, org_db = org.Hs.eg.db::org.Hs.eg.db, ontology = "BP", ...)

fgsea_kegg(s, organism = "hsa", db = "pathway", ...)

fgsea_msigdb(s, collection = "h.all", version = "2024.1.Hs", ...)

fgsea_reactome(s, organism = "HSA", ...)

fgsea_keywords(s, organism = "human", ...)

fgsea_phenotype(s, organism = "human", ...)

fgsea_disease(s, organism = "human", ...)

Arguments

s: A numeric vector of gene scores. It should be named with gene IDs.
gs: A list of gene sets. In fgsea_wrapper(), genes should use the same ID type as the names of s.
min_size: Minimum size of a gene set to be included in the analysis.
max_size: Maximum size of a gene set to be included in the analysis.
...: Other arguments passed to fgsea_wrapper() or further to fgsea::fgsea().
org_db: An OrgDb object for the organism. It can be from org.*.db packages or downloaded by the AnnotationHub package.
ontology: Namespace of GO. Value should be one of "BP", "CC" or "MF".
organism: See Details.
db: A KEGG database. The value can be one of "pathway", "module", "ko", "network", "disease" and "drug".
collection: Collection of the MSigDB gene sets. All possible values can be found via list_msigdb_collections().
version: Version of the MSigDB database. All possible values can be found via list_msigdb_versions().
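
The shapes of s and gs, and the effect of min_size and max_size, can be illustrated with a small base R sketch. The toy scores and gene sets below are made up for illustration. The rule that set size is counted after restricting each set to the genes present in names(s) is an assumption here, not something this page states.

```r
# Toy inputs: a named numeric score vector and a named list of gene sets.
s <- c(TP53 = 2.1, CDKN1A = 1.7, MDM2 = 0.4, BAX = -0.3, MYC = -1.2, EGFR = -1.9)
gs <- list(
  set_a = c("TP53", "CDKN1A", "MDM2", "BAX", "MYC"),
  set_b = c("MYC", "EGFR"),
  set_c = c("TP53", "NOT_MEASURED")
)

# Assumption: size is counted on the genes that actually have a score in s.
gs_measured <- lapply(gs, intersect, names(s))
sizes <- lengths(gs_measured)

# Illustrative bounds, much smaller than the documented defaults (5 and 2000).
min_size <- 2
max_size <- 4
keep <- sizes >= min_size & sizes <= max_size

# Names of the gene sets that would remain under these bounds.
names(gs_measured)[keep]
```

With the documented defaults, very small and very large sets are excluded in the same way, only with wider bounds.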

Details

Except for fgsea_wrapper(), all fgsea_*() functions require the gene IDs in s (its names) to be Entrez IDs.
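
A quick sanity check before calling the database-specific functions might look like the sketch below. The helper looks_like_entrez() is hypothetical and not part of GSEAtopics. It relies only on the fact that Entrez gene IDs are purely numeric, so it can flag symbols but cannot prove that a numeric name is a valid ID. For the actual conversion, the Examples section uses convert_to_entrez().

```r
# Hypothetical helper (not part of GSEAtopics): TRUE if s has names and
# every name consists only of digits, as Entrez gene IDs do.
looks_like_entrez <- function(s) {
  ids <- names(s)
  !is.null(ids) && all(grepl("^[0-9]+$", ids))
}

s_symbol <- c(TP53 = 1.2, CDKN1A = 0.8)      # gene symbols
s_entrez <- c("7157" = 1.2, "1026" = 0.8)    # the same genes as Entrez IDs

looks_like_entrez(s_symbol)
looks_like_entrez(s_entrez)
```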

The value of organism should be set differently for specific fgsea_*() functions.

For fgsea_kegg(), the value should be a KEGG organism code, such as "hsa" or "mmu".
For fgsea_reactome(), the value should be the prefix of the Reactome pathway ID that represents the organism, e.g. "HSA" for human.
For fgsea_keywords(), the value can be an organism name (e.g. "human"), the Latin name or the taxon ID. See UniProtKeywords::load_keyword_genesets() for details.
For fgsea_phenotype() and fgsea_disease(), the value can only be one of "human", "mouse" and "rat".

All valid values for fgsea_reactome() are:

c("BTA", "CEL", "CFA", "DRE", "DDI", "DME", "GGA", "HSA", "MMU",
  "MTU", "PFA", "RNO", "SCE", "SPO", "SSC", "XTR")

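Because the accepted organism values differ between functions, it can help to validate the value before a long-running call. The sketch below checks a Reactome prefix against the codes listed in this section. check_reactome_organism() is a hypothetical helper, not part of GSEAtopics, and its strict, case-sensitive matching is an assumption: this page does not say whether the package itself accepts lowercase prefixes.

```r
# Reactome organism prefixes as listed in this section.
reactome_codes <- c("BTA", "CEL", "CFA", "DRE", "DDI", "DME", "GGA", "HSA", "MMU",
                    "MTU", "PFA", "RNO", "SCE", "SPO", "SSC", "XTR")

# Hypothetical helper (not part of GSEAtopics): return the code unchanged if
# it is a single listed prefix, otherwise stop with an informative message.
check_reactome_organism <- function(organism) {
  valid <- is.character(organism) && length(organism) == 1 &&
    organism %in% reactome_codes
  if (!valid) {
    stop("organism must be one of: ", paste(reactome_codes, collapse = ", "))
  }
  organism
}

check_reactome_organism("HSA")

# A KEGG-style code is rejected by this strict check; capture the message.
msg <- tryCatch(check_reactome_organism("hsa"),
                error = function(e) conditionMessage(e))
msg
```
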
Examples

data(p53_dataset)
s = p53_dataset$s2n
gs = p53_dataset$gs

fgsea_wrapper(s, gs) |> head()
#>                       gene_set      p_value    p_adjust log2_p_err         es
#> 1                       P53_UP 4.227646e-06 0.001910979  0.6105269 -0.5986774
#> 2                 hsp27Pathway 7.768208e-06 0.001910979  0.5933255 -0.7758925
#> 3                     HTERT_UP 4.256855e-05 0.004785001  0.5573322  0.3709129
#> 4 GPCRs_Class_A_Rhodopsin-like 5.045670e-05 0.004785001  0.5573322 -0.4299558
#> 5            p53hypoxiaPathway 5.403876e-05 0.004785001  0.5573322 -0.6816006
#> 6                   p53Pathway 5.835367e-05 0.004785001  0.5573322 -0.7438596
#>         nes n_gs leading_edge
#> 1 -2.130912   40 CDKN1A, ....
#> 2 -2.170911   15 TNFRSF6,....
#> 3  1.859947  109 AHR, DR1....
#> 4 -1.843462  111 NTSR2, A....
#> 5 -2.045472   20 CDKN1A, ....
#> 6 -2.128121   16 CDKN1A, ....

s2 = convert_to_entrez(s)
#>   gene id might be SYMBOL (p =  0.670 )
#> 'select()' returned 1:many mapping between keys and columns

fgsea_go(s2) |> head()
#> Warning: There were 11 pathways for which P-values were not calculated properly due to unbalanced (positive and negative) gene-level statistic values. For such pathways pval, padj, NES, log2err are set to NA. You can try to increase the value of the argument nPermSimple (for example set it nPermSimple = 10000)
#> Warning: For some of the pathways the P-values were likely overestimated. For such pathways log2err is set to NA.
#>     gene_set      p_value     p_adjust log2_p_err         es       nes n_gs
#> 1 GO:0007186 8.597994e-11 6.348759e-07  0.8390889 -0.3600615 -1.849246  512
#> 2 GO:0002682 1.823802e-08 6.733478e-05  0.7337620 -0.2997243 -1.584662  912
#> 3 GO:0006955 2.851258e-08 7.017896e-05  0.7337620 -0.2937466 -1.558659 1011
#> 4 GO:0022402 4.700567e-07 7.684013e-04  0.6749629  0.2522013  1.554392  577
#> 5 GO:0002376 5.203151e-07 7.684013e-04  0.6594444 -0.2643578 -1.407153 1450
#> 6 GO:0002684 2.574958e-06 3.168915e-03  0.6272567 -0.2981775 -1.552579  645
#>   leading_edge                                  description
#> 1 23620, 2.... G protein-coupled receptor signaling pathway
#> 2 1026, 58....          regulation of immune system process
#> 3 581, 677....                              immune response
#> 4 29899, 1....                           cell cycle process
#> 5 1026, 58....                        immune system process
#> 6 1026, 58.... positive regulation of immune system process
fgsea_kegg(s2) |> head()
#>   gene_set      p_value     p_adjust log2_p_err         es       nes n_gs
#> 1 hsa04080 2.636467e-08 9.306729e-06  0.7337620 -0.4052099 -1.952654  234
#> 2 hsa03010 2.524439e-07 4.455634e-05  0.6749629 -0.5122149 -2.136079   85
#> 3 hsa04060 3.510965e-06 4.131236e-04  0.6272567 -0.3936669 -1.858073  193
#> 4 hsa04940 1.211598e-05 1.069235e-03  0.5933255 -0.5971851 -2.117767   36
#> 5 hsa04081 5.355755e-05 3.781163e-03  0.5573322 -0.3766489 -1.750193  180
#> 6 hsa05330 8.941644e-05 4.953516e-03  0.5384341 -0.5990447 -2.008731   28
#>   leading_edge                                                    description
#> 1 23620, 2.... Neuroactive ligand-receptor interaction - Homo sapiens (human)
#> 2 6223, 61....                                Ribosome - Homo sapiens (human)
#> 3 355, 343....  Cytokine-cytokine receptor interaction - Homo sapiens (human)
#> 4 355, 355....                Type I diabetes mellitus - Homo sapiens (human)
#> 5 2100, 15....                       Hormone signaling - Homo sapiens (human)
#> 6 355, 356....                     Allograft rejection - Homo sapiens (human)
fgsea_msigdb(s2) |> head()
#>                       gene_set      p_value     p_adjust log2_p_err         es
#> 1      HALLMARK_G2M_CHECKPOINT 2.159796e-08 1.079898e-06  0.7337620  0.4116411
#> 2 HALLMARK_ALLOGRAFT_REJECTION 2.351987e-05 3.426884e-04  0.5756103 -0.3856414
#> 3         HALLMARK_E2F_TARGETS 2.669948e-05 3.426884e-04  0.5756103  0.3555237
#> 4     HALLMARK_MITOTIC_SPINDLE 2.741507e-05 3.426884e-04  0.5756103  0.3662740
#> 5      HALLMARK_UV_RESPONSE_DN 5.844880e-05 5.844880e-04  0.5573322  0.3538389
#> 6   HALLMARK_PROTEIN_SECRETION 5.860902e-04 4.884085e-03  0.4772708  0.3610550
#>         nes n_gs leading_edge
#> 1  2.150537  141 7290, 18....
#> 2 -1.782324  168 355, 677....
#> 3  1.821589  129 5810, 11....
#> 4  1.839499  116 9371, 83....
#> 5  1.805681  122 5195, 21....
#> 6  1.691100   80 3998, 93....
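
The results are shown above as data frames with columns such as gene_set, p_adjust and nes, so ordinary base R subsetting can be used for post-processing. The sketch below rebuilds a few rows of the fgsea_msigdb() output by hand, so that it runs without the package, and splits the significant gene sets by the sign of nes. The 0.001 cutoff is an arbitrary value chosen for illustration, not a recommendation from the package.

```r
# A few rows copied from the fgsea_msigdb() output above.
res <- data.frame(
  gene_set = c("HALLMARK_G2M_CHECKPOINT", "HALLMARK_ALLOGRAFT_REJECTION",
               "HALLMARK_E2F_TARGETS", "HALLMARK_PROTEIN_SECRETION"),
  p_adjust = c(1.079898e-06, 3.426884e-04, 3.426884e-04, 4.884085e-03),
  nes      = c(2.150537, -1.782324, 1.821589, 1.691100),
  stringsAsFactors = FALSE
)

# Illustrative cutoff on the adjusted p-value.
cutoff <- 0.001
sig <- res[res$p_adjust < cutoff, ]
sig <- sig[order(sig$p_adjust), ]

# Positive NES: enriched at the top of the ranking; negative NES: at the bottom.
up   <- sig$gene_set[sig$nes > 0]
down <- sig$gene_set[sig$nes < 0]

up
down
```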