Skip to contents

get_cai calculates the Codon Adaptation Index (CAI) for each input coding sequence. CAI measures how similar the codon usage of a gene is to that of highly expressed genes, serving as an indicator of translational efficiency. Higher CAI values suggest better adaptation to the translational machinery.

Usage

get_cai(cf, rscu, level = "subfam")

Arguments

cf

A matrix of codon frequencies as calculated by count_codons(). Rows represent sequences and columns represent codons.

rscu

An RSCU table containing CAI weights for each codon. This table should be generated using est_rscu() based on highly expressed genes, or prepared manually with appropriate weight values.

level

Character string specifying the analysis level: "subfam" (default, analyzes codon subfamilies) or "amino_acid" (analyzes at amino acid level).

Value

A named numeric vector of CAI values ranging from 0 to 1. Names correspond to sequence identifiers from the input matrix. Values closer to 1 indicate higher similarity to highly expressed genes.

References

Sharp PM, Li WH. 1987. The codon Adaptation Index–a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15:1281-1295.

Examples

# Calculate CAI for yeast genes based on RSCU of highly expressed genes
heg <- head(yeast_exp[order(-yeast_exp$fpkm), ], n = 500)
cf_all <- count_codons(yeast_cds)
cf_heg <- cf_all[heg$gene_id, ]
rscu_heg <- est_rscu(cf_heg)
cai <- get_cai(cf_all, rscu_heg)
head(cai)
#>   YPL071C   YLL050C   YMR172W   YOR185C   YLL032C   YBR225W 
#> 0.5590442 0.8212905 0.5112301 0.6534497 0.5670395 0.5485641 
hist(cai, main = "Distribution of CAI values")