Get genome data from NCBI

getGenomeDataFromNCBI(refseq_assembly_accession, return_granges = FALSE)

Arguments

refseq_assembly_accession

The RefSeq accession number for the assembly, such as "GCF_000001405.40" for human.

return_granges

Whether to return a GRanges object. If set to TRUE and the assembly is already at the chromosome level, the function directly constructs a GRanges object in which only "chromosomes" are used and the chromosome lengths are correctly set in its seqlengths.

Details

Only protein-coding genes are used.

Value

If return_granges is set to FALSE, it returns a list of two data frames:

genome

A data frame of several columns.

gene

A data frame of genes. The first column contains the RefSeq accession numbers of the contigs on which the genes are located. If the genome is assembled at the chromosome level, the first column corresponds to chromosomes. The contig names can be converted to other names using the information in the genome data frame.
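
The name conversion mentioned above could look roughly like the following. This is a minimal sketch on toy data, not the package's own method: it assumes the genome data frame has one column of RefSeq accessions and one column of alternative names. The column names `refseq_accn` and `ucsc_name`, and the gene coordinates, are hypothetical stand-ins, so check the actual columns of the returned genome data frame before adapting it.

```r
# Toy stand-ins for the two returned data frames.
# Column names and gene coordinates are hypothetical; the real ones may differ.
genome <- data.frame(
  refseq_accn = c("NC_000001.11", "NC_000002.12"),  # hypothetical column name
  ucsc_name   = c("chr1", "chr2"),                  # hypothetical column name
  stringsAsFactors = FALSE
)
gene <- data.frame(
  contig = c("NC_000002.12", "NC_000001.11", "NC_000001.11"),
  start  = c(100L, 500L, 900L),
  end    = c(200L, 800L, 1200L),
  stringsAsFactors = FALSE
)

# Look up each contig accession (first column of gene) in the genome table,
# then replace it with the corresponding alternative name.
idx <- match(gene[[1]], genome$refseq_accn)
gene_renamed <- gene
gene_renamed[[1]] <- genome$ucsc_name[idx]

print(gene_renamed)
```

With this approach, contigs that are absent from the lookup table become NA, so it may be worth checking for them with anyNA() before further use.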

Examples

if(FALSE) {
# return a GRanges object
getGenomeDataFromNCBI("GCF_000001405.40", return_granges = TRUE)
# return a list of two data frames (the default)
getGenomeDataFromNCBI("GCF_000001405.40")
}