Calculate effective number of codons (ENC)

get_enc computes the effective number of codons (ENC) for each coding sequence, a measure of codon usage bias. Lower ENC values indicate stronger bias (the sequence relies on fewer synonymous codons), while higher values indicate more uniform codon usage.
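To make "fewer codons are used" concrete, the following minimal sketch computes the homozygosity estimator of Wright (1990) for a single synonymous codon family and inverts it to obtain an effective number of codons for that family. The helper family_homozygosity and the count vectors are hypothetical illustrations, not part of the package, and get_enc itself may use a different estimator.

```r
# Homozygosity of one synonymous codon family (Wright 1990):
#   F = (n * sum(p^2) - 1) / (n - 1)
# `counts` is a hypothetical vector of codon counts for that family.
family_homozygosity <- function(counts) {
  n <- sum(counts)        # total codons observed for this amino acid
  p <- counts / n         # relative frequency of each synonymous codon
  (n * sum(p^2) - 1) / (n - 1)
}

biased  <- c(38, 1, 1, 0)     # one codon dominates
uniform <- c(10, 10, 10, 10)  # all four codons used equally

# The reciprocal of F is the effective number of codons for the family
1 / family_homozygosity(biased)
1 / family_homozygosity(uniform)
```

The biased family yields an effective number close to 1, while the uniform family yields a value close to 4 (slightly above it, because of the finite-sample correction in the estimator).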

Usage

get_enc(cf, codon_table = get_codon_table(), level = "subfam")

Arguments

cf: A matrix of codon frequencies as calculated by count_codons(). Rows represent sequences and columns represent codons.
codon_table: A codon table defining the genetic code, derived from get_codon_table() or create_codon_table().
level: Character string specifying the analysis level: "subfam" (default, analyzes codon subfamilies) or "amino_acid" (analyzes at the amino acid level).

Value

A named numeric vector of ENC values. Names correspond to the sequence identifiers (row names) of the input matrix. For the standard genetic code, ENC values typically range from 20 (maximum bias, one codon per amino acid) to 61 (uniform usage of all sense codons).
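The endpoints of this range can be checked with Wright's (1990) original formula for the standard genetic code, Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where Fk is the average homozygosity of the amino acids that have k synonymous codons. The sketch below is only an illustration: wright_enc is a hypothetical helper, not a package function, and get_enc may follow the revised estimator of Sun et al. (2013) rather than this exact expression.

```r
# Wright's (1990) ENC formula for the standard genetic code.
# The constant 2 accounts for Met and Trp (one codon each); the
# coefficients 9, 1, 5 and 3 are the numbers of amino acids with
# 2, 3, 4 and 6 synonymous codons, respectively.
wright_enc <- function(F2, F3, F4, F6) {
  2 + 9 / F2 + 1 / F3 + 5 / F4 + 3 / F6
}

# Maximum bias: every amino acid uses a single codon, so F = 1
enc_min <- wright_enc(F2 = 1, F3 = 1, F4 = 1, F6 = 1)

# Uniform usage: F = 1/k for a family with k synonymous codons
enc_max <- wright_enc(F2 = 1 / 2, F3 = 1 / 3, F4 = 1 / 4, F6 = 1 / 6)

c(min = enc_min, max = enc_max)
```

The two calls evaluate to 20 and 61, matching the bounds stated above.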

References

Wright F. 1990. The 'effective number of codons' used in a gene. Gene 87:23-29.

Sun X, Yang Q, Xia X. 2013. An improved implementation of effective number of codons (Nc). Mol Biol Evol 30:191-196.

Examples

# Calculate ENC for yeast genes
cf_all <- count_codons(yeast_cds)
enc <- get_enc(cf_all)
head(enc)
#>  YPL071C  YLL050C  YMR172W  YOR185C  YLL032C  YBR225W 
#> 52.93616 44.57694 56.03833 50.82037 53.34254 53.85807 
hist(enc, main = "Distribution of ENC values")
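The level argument described under Arguments can be explored by computing ENC at both levels and comparing the results. This is a sketch under assumptions: the package name cubar is inferred and does not appear in this page, so the code is guarded with requireNamespace(); no output is shown because it depends on the installed version.

```r
# Assumption: these functions and the yeast_cds dataset come from a
# package named "cubar"; adjust the name if your installation differs.
if (requireNamespace("cubar", quietly = TRUE)) {
  library(cubar)

  cf_all <- count_codons(yeast_cds)

  # ENC computed on codon subfamilies (the default) and per amino acid
  enc_subfam <- get_enc(cf_all, level = "subfam")
  enc_aa     <- get_enc(cf_all, level = "amino_acid")

  # Summarize how much the two levels differ across genes
  print(summary(enc_subfam - enc_aa))
}
```

Both calls return one value per row of cf_all, so the two vectors can be compared element by element.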