getBioMartGenes.Rd
Get genes from BioMart
getBioMartGenes(dataset, add_chr_prefix = FALSE)
A BioMart dataset name or a taxon ID. For valid values, see supportedOrganisms.
Whether to add the "chr" prefix to chromosome names. If it is TRUE, GenomeInfoDb::seqlevelsStyle(gr) = "UCSC" is used to add the prefix.
Note that add_chr_prefix is just a helper argument. You can get the same result with:
gr = getBioMartGenes("hsapiens_gene_ensembl")
seqlevelsStyle(gr) = "UCSC"
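The renaming step itself can be tried without querying BioMart. The following is a minimal sketch on a hand-built GRanges; the object name gr_toy and its coordinates are made up for illustration, and it assumes the GenomicRanges and GenomeInfoDb packages are installed:

```r
library(GenomicRanges)
library(GenomeInfoDb)

# Toy ranges with Ensembl-style chromosome names (coordinates are made up)
gr_toy = GRanges(
    seqnames = c("1", "X", "20"),
    ranges = IRanges(start = c(100, 200, 300), width = 50),
    strand = c("+", "-", "+")
)

# Switch to UCSC-style names, which adds the "chr" prefix
seqlevelsStyle(gr_toy) = "UCSC"

# Chromosome names after the conversion, one per range
as.character(seqnames(gr_toy))
```

Only the sequence names change; the ranges, strands and any metadata columns are left as they were.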
A GRanges object.
gr = getBioMartGenes("hsapiens_gene_ensembl")
gr
#> GRanges object with 69299 ranges and 4 metadata columns:
#> seqnames ranges strand | ensembl_gene_id
#> <Rle> <IRanges> <Rle> | <character>
#> ENSG00000000003 X 100627108-100639991 - | ENSG00000000003
#> ENSG00000000005 X 100584936-100599885 + | ENSG00000000005
#> ENSG00000000419 20 50934867-50959140 - | ENSG00000000419
#> ENSG00000000457 1 169849631-169894267 - | ENSG00000000457
#> ENSG00000000460 1 169662007-169854080 + | ENSG00000000460
#> ... ... ... ... . ...
#> ENSG00000291313 14 103334237-103335932 + | ENSG00000291313
#> ENSG00000291314 X 10566888-10576955 - | ENSG00000291314
#> ENSG00000291315 3 40312086-40312214 + | ENSG00000291315
#> ENSG00000291316 8 144449582-144465430 - | ENSG00000291316
#> ENSG00000291317 8 144463817-144465667 - | ENSG00000291317
#> gene_biotype entrezgene_id external_gene_name
#> <character> <CharacterList> <character>
#> ENSG00000000003 protein_coding 7105 TSPAN6
#> ENSG00000000005 protein_coding 64102 TNMD
#> ENSG00000000419 protein_coding 8813 DPM1
#> ENSG00000000457 protein_coding 57147 SCYL3
#> ENSG00000000460 protein_coding 55732 C1orf112
#> ... ... ... ...
#> ENSG00000291313 protein_coding <NA> <NA>
#> ENSG00000291314 protein_coding <NA> <NA>
#> ENSG00000291315 protein_coding <NA> <NA>
#> ENSG00000291316 protein_coding 157542 <NA>
#> ENSG00000291317 protein_coding 84773 TMEM276
#> -------
#> seqinfo: 445 sequences from an unspecified genome; no seqlengths
gr = getBioMartGenes("hsapiens_gene_ensembl", add_chr_prefix = TRUE)
gr
#> GRanges object with 69299 ranges and 4 metadata columns:
#> seqnames ranges strand | ensembl_gene_id
#> <Rle> <IRanges> <Rle> | <character>
#> ENSG00000000003 chrX 100627108-100639991 - | ENSG00000000003
#> ENSG00000000005 chrX 100584936-100599885 + | ENSG00000000005
#> ENSG00000000419 chr20 50934867-50959140 - | ENSG00000000419
#> ENSG00000000457 chr1 169849631-169894267 - | ENSG00000000457
#> ENSG00000000460 chr1 169662007-169854080 + | ENSG00000000460
#> ... ... ... ... . ...
#> ENSG00000291313 chr14 103334237-103335932 + | ENSG00000291313
#> ENSG00000291314 chrX 10566888-10576955 - | ENSG00000291314
#> ENSG00000291315 chr3 40312086-40312214 + | ENSG00000291315
#> ENSG00000291316 chr8 144449582-144465430 - | ENSG00000291316
#> ENSG00000291317 chr8 144463817-144465667 - | ENSG00000291317
#> gene_biotype entrezgene_id external_gene_name
#> <character> <CharacterList> <character>
#> ENSG00000000003 protein_coding 7105 TSPAN6
#> ENSG00000000005 protein_coding 64102 TNMD
#> ENSG00000000419 protein_coding 8813 DPM1
#> ENSG00000000457 protein_coding 57147 SCYL3
#> ENSG00000000460 protein_coding 55732 C1orf112
#> ... ... ... ...
#> ENSG00000291313 protein_coding <NA> <NA>
#> ENSG00000291314 protein_coding <NA> <NA>
#> ENSG00000291315 protein_coding <NA> <NA>
#> ENSG00000291316 protein_coding 157542 <NA>
#> ENSG00000291317 protein_coding 84773 TMEM276
#> -------
#> seqinfo: 445 sequences from an unspecified genome; no seqlengths