Find Probe IDs by Gene Name — find.probe.by.gene • BioUtils

Searches the gene annotation data frame to find the probe IDs corresponding to one or more gene names or descriptions.

Usage

find.probe.by.gene(genes, gene.names)

Arguments

genes: Data frame of gene annotations as returned by extract.expression()$gene. The second column is expected to contain gene titles or descriptions.
gene.names: Character vector of one or more gene names or descriptions to search for. Matching is exact and case-sensitive.

Value

Integer vector of probe IDs corresponding to the matched genes. The length of the vector equals the number of matches found. Returns an empty integer vector if no matches are found.
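
Because an unmatched search yields an empty integer vector rather than an error, it can help to check the length of the result before using it. The following is a minimal sketch in base R; the helper name describe.probes is hypothetical and not part of the package, and the two vectors stand in for values that find.probe.by.gene() might return.

```r
# Hypothetical helper (not part of the package): summarise a lookup result.
describe.probes <- function(probes) {
  if (length(probes) == 0) {
    # An empty integer vector means no gene name matched exactly.
    "no probes matched; check spelling and capitalisation"
  } else {
    paste0(length(probes), " probe(s) matched: ", paste(probes, collapse = ", "))
  }
}

# Stand-ins for return values of find.probe.by.gene(); the IDs are made up.
empty.msg <- describe.probes(integer(0))
found.msg <- describe.probes(c(101L, 340L))
```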

Details

Probe IDs are the integer row names of the genes annotation data frame. These IDs are used directly to index rows of the expression matrix in get.gene.expression(). If multiple gene names are supplied, all matching probe IDs are returned in a single vector, which can be passed as-is to get.gene.expression() to retrieve a multi-row expression matrix.
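
The lookup described above can be sketched with a toy annotation data frame. This is an illustration of the documented behaviour (exact, case-sensitive matching on the second column, with integer row names returned as probe IDs), not the package's actual implementation. The function name lookup.sketch, the column names, and the probe IDs 101, 205 and 340 are all made up for the example.

```r
# Toy annotation data frame; column names and row names are invented.
genes <- data.frame(
  symbol = c("MUC20", "ADH1A", "MUC20"),
  title = c(
    "mucin 20, cell surface associated",
    "alcohol dehydrogenase 1A (class I), alpha polypeptide",
    "mucin 20, cell surface associated"
  ),
  row.names = c("101", "205", "340"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the lookup: exact, case-sensitive comparison
# of the second column against the requested names.
lookup.sketch <- function(genes, gene.names) {
  hits <- genes[[2]] %in% gene.names
  as.integer(rownames(genes)[hits])
}

# One name can correspond to several rows in this toy data.
single <- lookup.sketch(genes, "mucin 20, cell surface associated")

# Several names return all matching probe IDs in row order.
multiple <- lookup.sketch(genes, c(
  "mucin 20, cell surface associated",
  "alcohol dehydrogenase 1A (class I), alpha polypeptide"
))

# A difference in capitalisation yields an empty integer vector.
none <- lookup.sketch(genes, "Mucin 20, cell surface associated")
```

In this sketch, single is c(101L, 340L), multiple is c(101L, 205L, 340L), and none is integer(0).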

Examples

# \donttest{
geo <- extract.expression(load.geo.soft(accession = "GDS3268", log.transform = TRUE))
#> GDS3268 not found locally, downloading from NCBI GEO...
#> Using locally cached version of GDS3268 found here:
#> /tmp/RtmpxRZSjV/GDS3268.soft.gz 
#> Warning: NaNs produced
#> Using locally cached version of GPL1708 found here:
#> /tmp/RtmpxRZSjV/GPL1708.annot.gz 

# Single gene
probe <- find.probe.by.gene(geo$gene, "mucin 20, cell surface associated")

# Multiple genes
probes <- find.probe.by.gene(geo$gene, c(
  "mucin 20, cell surface associated",
  "alcohol dehydrogenase 1A (class I), alpha polypeptide"
))
# }