Load keyword genesets for a specific species

load_keyword_genesets(taxon_id = 9606, category = NULL, as_table = FALSE)



taxon_id: The taxon ID of the species. For convenience, the Latin name or the common name of the species is also accepted.


category: Category of keywords. The available categories are: "Biological process", "Cellular component", "Coding sequence diversity", "Developmental stage", "Disease", "Domain", "Ligand", "Molecular function", "Post-translational modification", "Technical term".


as_table: If TRUE, the returned value is a two-column data frame.


The following species are supported (each has more than 1000 annotated genes):

  • "10090": Mus musculus / house mouse

  • "10116": Rattus norvegicus / Norway rat

  • "208964": Pseudomonas aeruginosa PAO1 / strain, g-proteobacteria

  • "224308": Bacillus subtilis subsp. subtilis str. 168 / strain, firmicutes

  • "237561": Candida albicans SC5314 / strain, budding yeasts

  • "243232": Methanocaldococcus jannaschii DSM 2661 / strain, euryarchaeotes

  • "284812": Schizosaccharomyces pombe 972h- / strain, ascomycete fungi

  • "3702": Arabidopsis thaliana / thale cress

  • "39947": Oryza sativa Japonica Group / (Japanese rice), monocots

  • "44689": Dictyostelium discoideum / species, cellular slime molds

  • "559292": Saccharomyces cerevisiae S288C / strain, budding yeasts

  • "6239": Caenorhabditis elegans / species, nematodes

  • "623": Shigella flexneri / species, enterobacteria

  • "7227": Drosophila melanogaster / (fruit fly), species, flies

  • "7955": Danio rerio / (zebrafish), species, bony fishes

  • "83332": Mycobacterium tuberculosis H37Rv / strain, high G+C Gram-positive bacteria

  • "83333": Escherichia coli K-12 / strain, enterobacteria

  • "83334": Escherichia coli O157:H7 / serotype, enterobacteria

  • "8355": Xenopus laevis / (African clawed frog), species, frogs & toads

  • "8364": Xenopus tropicalis / (tropical clawed frog), species, frogs & toads

  • "9031": Gallus gallus / (chicken), species, birds

  • "9601": Pongo abelii / (Sumatran orangutan), species, primates

  • "9606": Homo sapiens / human

  • "9823": Sus scrofa / (pig), species, even-toed ungulates

  • "9913": Bos taurus / cattle

  • "99287": Salmonella enterica subsp. enterica serovar Typhimurium str. LT2 / strain, enterobacteria
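
Since a species can be given as a taxon ID, a Latin name or a common name, it may help to see how such an input could map onto the IDs listed above. The following is only a minimal sketch in base R: the `species` table holds four entries copied from the list, and `resolve_taxon_id()` is a hypothetical helper written for illustration, not a function of the package. The package's own matching rules are not documented here and may well be more lenient than the exact, case-insensitive comparison assumed below.

```r
# Hypothetical lookup table built from four entries of the list above.
species <- data.frame(
  taxon_id    = c("9606", "10090", "10116", "9913"),
  latin_name  = c("Homo sapiens", "Mus musculus", "Rattus norvegicus", "Bos taurus"),
  common_name = c("human", "house mouse", "Norway rat", "cattle"),
  stringsAsFactors = FALSE
)

# resolve_taxon_id() is an illustrative helper, not part of the package.
# It accepts a taxon ID, a Latin name or a common name (case-insensitive)
# and returns the matching taxon ID as a character string.
resolve_taxon_id <- function(x) {
  x <- tolower(as.character(x))
  hit <- which(
    species$taxon_id == x |
      tolower(species$latin_name) == x |
      tolower(species$common_name) == x
  )
  if (length(hit) != 1) {
    stop("Unsupported or ambiguous species: ", x)
  }
  species$taxon_id[hit]
}

resolve_taxon_id(9606)
resolve_taxon_id("Mus musculus")
resolve_taxon_id("Norway rat")
```

Returning the ID as a character string matches how the IDs are quoted in the list above; an input that matches no entry raises an error instead of returning a guess.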


If as_table is FALSE, a list of gene sets is returned, in which each element is named by a keyword and contains the Entrez IDs of its genes. If as_table is TRUE, a two-column data frame is returned, with one keyword-gene pair per row.
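
The two return forms hold the same keyword-gene pairs, so one can be turned into the other with base R. The sketch below does not call the package: it starts from a small hand-built table whose rows are copied from the example output on this page, and the variable names (`tb`, `lt`, `tb2`) are chosen for illustration only.

```r
# Toy two-column table in the same shape as the as_table = TRUE output.
# The rows are copied from the example output on this page.
tb <- data.frame(
  keyword = c("2Fe-2S", "2Fe-2S", "2Fe-2S", "3Fe-4S"),
  gene    = c("2230", "150209", "316", "6390"),
  stringsAsFactors = FALSE
)

# Table -> list: one character vector of gene IDs per keyword.
lt <- split(tb$gene, tb$keyword)

# List -> table: repeat each keyword once per gene it contains.
tb2 <- data.frame(
  keyword = rep(names(lt), times = lengths(lt)),
  gene    = unlist(lt, use.names = FALSE),
  stringsAsFactors = FALSE
)
```

Note that `split()` orders the list elements by keyword, so a round trip keeps the original row order only when the table is already sorted by keyword, as it is here.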


lt = load_keyword_genesets(9606)
# print two of the gene sets
lt[c("3Fe-4S", "4Fe-4S")]
#> $`3Fe-4S`
#> [1] "6390"
#> $`4Fe-4S`
#>  [1] "6059"   "48"     "50"     "54901"  "64428"  "51654"  "57019"  "1663"  
#>  [9] "1763"   "5424"   "5426"   "1806"   "55140"  "2068"   "2110"   "64789" 
#> [17] "83990"  "3658"   "11019"  "4337"   "4595"   "4719"   "4720"   "374291"
#> [25] "4728"   "4723"   "4682"   "10101"  "5558"   "5471"   "5980"   "55316" 
#> [33] "91543"  "51750"  "6390"   "441250" "55253" 
tb = load_keyword_genesets(9606, as_table = TRUE)
# print the first rows of the table
head(tb)
#>   keyword   gene
#> 1  2Fe-2S   2230
#> 2  2Fe-2S 150209
#> 3  2Fe-2S    316
#> 4  2Fe-2S  55847
#> 5  2Fe-2S 493856
#> 6  2Fe-2S 284106
# load_keyword_genesets("mouse")