est_rscu
Estimates the relative synonymous codon usage (RSCU) of codons.
Usage
est_rscu(
cf,
weight = 1,
pseudo_cnt = 1,
codon_table = get_codon_table(),
level = "subfam"
)
Arguments
- cf
matrix of codon frequencies as calculated by count_codons().
- weight
a vector of the same length as seqs that gives different weights to CDSs when counting codons. For example, it could be gene expression levels.
- pseudo_cnt
pseudo count to avoid dividing by zero. This may occur when only a few sequences are available for RSCU calculation.
- codon_table
a table of genetic code derived from get_codon_table or create_codon_table.
- level
"subfam" (default) or "amino_acid". The level at which RSCU is determined.
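To illustrate what pseudo_cnt and level control, here is a minimal base R sketch. It is not the package's implementation: rscu_sketch is a hypothetical helper, and it assumes that the pseudo count is added to every codon count before normalizing and that RSCU is the within-group proportion multiplied by the number of codons in the group. The leucine counts are copied from the all-genes example output on this page.

```r
# Hypothetical helper, not part of the package: RSCU from raw codon counts.
rscu_sketch <- function(cts, group, pseudo_cnt = 1) {
  adj <- cts + pseudo_cnt                    # assumed: pseudo count added to every codon
  prop <- adj / ave(adj, group, FUN = sum)   # proportion within each group
  n_syn <- ave(adj, group, FUN = length)     # number of codons in each group
  prop * n_syn                               # RSCU = proportion x group size
}

# Leucine codon counts taken from the all-genes example output
leu <- data.frame(
  codon      = c("TTA", "TTG", "CTT", "CTC", "CTA", "CTG"),
  subfam     = c("Leu_TT", "Leu_TT", "Leu_CT", "Leu_CT", "Leu_CT", "Leu_CT"),
  amino_acid = "Leu",
  cts        = c(77584, 78560, 36969, 16801, 39393, 31703)
)

# Grouping by codon subfamily (two-codon and four-codon blocks)
leu$rscu_subfam <- rscu_sketch(leu$cts, leu$subfam)
# Grouping by amino acid (all six leucine codons together)
leu$rscu_aa <- rscu_sketch(leu$cts, leu$amino_acid)
print(leu)
```

Under these assumptions the subfamily-level values agree with the rscu column of the example output for the same codons, while the amino-acid-level values are normalized across all six leucine codons and therefore differ.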
References
Sharp PM, Tuohy TM, Mosurski KR. 1986. Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res 14:5125-5143.
Examples
# compute RSCU of all yeast genes
cf_all <- count_codons(yeast_cds)
est_rscu(cf_all)
#> aa_code amino_acid codon subfam cts prop w_cai rscu
#> <char> <char> <char> <char> <num> <num> <num> <num>
#> 1: F Phe TTT Phe_TT 79149 0.5946835 1.0000000 1.1893671
#> 2: F Phe TTC Phe_TT 53945 0.4053165 0.6815666 0.8106329
#> 3: L Leu TTA Leu_TT 77584 0.4968747 0.9875765 0.9937494
#> 4: L Leu TTG Leu_TT 78560 0.5031253 1.0000000 1.0062506
#> 5: S Ser TCT Ser_TC 68480 0.3590299 1.0000000 1.4361195
#> 6: S Ser TCC Ser_TC 41295 0.2165053 0.6030286 0.8660211
#> 7: S Ser TCA Ser_TC 55198 0.2893955 0.8060484 1.1575818
#> 8: S Ser TCG Ser_TC 25762 0.1350694 0.3762065 0.5402776
#> 9: Y Tyr TAT Tyr_TA 55654 0.5696637 1.0000000 1.1393273
#> 10: Y Tyr TAC Tyr_TA 42042 0.4303363 0.7554218 0.8606727
#> 11: C Cys TGT Cys_TG 24113 0.6208388 1.0000000 1.2416776
#> 12: C Cys TGC Cys_TG 14726 0.3791612 0.6107241 0.7583224
#> 13: W Trp TGG Trp_TG 30566 1.0000000 1.0000000 1.0000000
#> 14: L Leu CTT Leu_CT 36969 0.2960679 0.9384678 1.1842716
#> 15: L Leu CTC Leu_CT 16801 0.1345559 0.4265117 0.5382238
#> 16: L Leu CTA Leu_CT 39393 0.3154801 1.0000000 1.2619204
#> 17: L Leu CTG Leu_CT 31703 0.2538961 0.8047926 1.0155842
#> 18: P Pro CCT Pro_CC 38941 0.3095868 0.7628656 1.2383474
#> 19: P Pro CCC Pro_CC 20258 0.1610580 0.3968696 0.6442319
#> 20: P Pro CCA Pro_CC 51046 0.4058210 1.0000000 1.6232838
#> 21: P Pro CCG Pro_CC 15538 0.1235342 0.3044057 0.4941369
#> 22: H His CAT His_CA 40077 0.6428010 1.0000000 1.2856020
#> 23: H His CAC His_CA 22270 0.3571990 0.5556914 0.7143980
#> 24: Q Gln CAA Gln_CA 77278 0.6826948 1.0000000 1.3653895
#> 25: Q Gln CAG Gln_CA 35917 0.3173052 0.4647834 0.6346105
#> 26: R Arg CGT Arg_CG 18306 0.4462945 1.0000000 1.7851780
#> 27: R Arg CGC Arg_CG 7918 0.1930522 0.4325668 0.7722087
#> 28: R Arg CGA Arg_CG 9151 0.2231107 0.4999181 0.8924427
#> 29: R Arg CGG Arg_CG 5641 0.1375427 0.3081881 0.5501706
#> 30: I Ile ATT Ile_AT 88446 0.4621442 1.0000000 1.3864325
#> 31: I Ile ATC Ile_AT 49094 0.2565261 0.5550782 0.7695784
#> 32: I Ile ATA Ile_AT 53841 0.2813297 0.6087487 0.8439890
#> 33: M Met ATG Met_AT 61057 1.0000000 1.0000000 1.0000000
#> 34: T Thr ACT Thr_AC 58292 0.3424508 1.0000000 1.3698031
#> 35: T Thr ACC Thr_AC 36147 0.2123567 0.6201088 0.8494269
#> 36: T Thr ACA Thr_AC 51798 0.3043008 0.8885973 1.2172033
#> 37: T Thr ACG Thr_AC 23982 0.1408917 0.4114216 0.5635666
#> 38: N Asn AAT Asn_AA 105623 0.5979721 1.0000000 1.1959442
#> 39: N Asn AAC Asn_AA 71012 0.4020279 0.6723188 0.8040558
#> 40: K Lys AAA Lys_AA 123449 0.5820258 1.0000000 1.1640516
#> 41: K Lys AAG Lys_AA 88653 0.4179742 0.7181369 0.8359484
#> 42: S Ser AGT Ser_AG 42680 0.5901361 1.0000000 1.1802721
#> 43: S Ser AGC Ser_AG 29642 0.4098639 0.6945245 0.8197279
#> 44: R Arg AGA Arg_AG 61208 0.6859688 1.0000000 1.3719377
#> 45: R Arg AGG Arg_AG 28020 0.3140312 0.4577922 0.6280623
#> 46: V Val GTT Val_GT 63153 0.3851488 1.0000000 1.5405951
#> 47: V Val GTC Val_GT 32925 0.2008014 0.5213605 0.8032054
#> 48: V Val GTA Val_GT 35748 0.2180176 0.5660607 0.8720704
#> 49: V Val GTG Val_GT 32143 0.1960323 0.5089781 0.7841291
#> 50: A Ala GCT Ala_GC 58801 0.3666234 1.0000000 1.4664938
#> 51: A Ala GCC Ala_GC 35734 0.2228035 0.6077174 0.8912138
#> 52: A Ala GCA Ala_GC 47400 0.2955396 0.8061120 1.1821583
#> 53: A Ala GCG Ala_GC 18449 0.1150335 0.3137648 0.4601342
#> 54: D Asp GAT Asp_GA 109757 0.6539559 1.0000000 1.3079118
#> 55: D Asp GAC Asp_GA 58078 0.3460441 0.5291551 0.6920882
#> 56: E Glu GAA Glu_GA 132048 0.6987459 1.0000000 1.3974918
#> 57: E Glu GAG Glu_GA 56930 0.3012541 0.4311354 0.6025082
#> 58: G Gly GGT Gly_GG 65720 0.4515325 1.0000000 1.8061298
#> 59: G Gly GGC Gly_GG 28880 0.1984253 0.4394486 0.7937012
#> 60: G Gly GGA Gly_GG 32779 0.2252132 0.4987751 0.9008526
#> 61: G Gly GGG Gly_GG 18168 0.1248291 0.2764565 0.4993164
#> aa_code amino_acid codon subfam cts prop w_cai rscu
# compute RSCU of highly expressed (top 500) yeast genes
heg <- head(yeast_exp[order(-yeast_exp$fpkm), ], n = 500)
cf_heg <- count_codons(yeast_cds[heg$gene_id])
est_rscu(cf_heg)
#> aa_code amino_acid codon subfam cts prop w_cai rscu
#> <char> <char> <char> <char> <num> <num> <num> <num>
#> 1: F Phe TTT Phe_TT 2681 0.40005967 0.66683242 0.80011933
#> 2: F Phe TTC Phe_TT 4021 0.59994033 1.00000000 1.19988067
#> 3: L Leu TTA Leu_TT 3178 0.32133832 0.47348823 0.64267664
#> 4: L Leu TTG Leu_TT 6713 0.67866168 1.00000000 1.35732336
#> 5: S Ser TCT Ser_TC 4602 0.48916047 1.00000000 1.95664187
#> 6: S Ser TCC Ser_TC 2885 0.30669501 0.62698240 1.22678002
#> 7: S Ser TCA Ser_TC 1380 0.14675877 0.30002172 0.58703507
#> 8: S Ser TCG Ser_TC 539 0.05738576 0.11731479 0.22954304
#> 9: Y Tyr TAT Tyr_TA 1856 0.36648905 0.57850467 0.73297809
#> 10: Y Tyr TAC Tyr_TA 3209 0.63351095 1.00000000 1.26702191
#> 11: C Cys TGT Cys_TG 1285 0.80425266 1.00000000 1.60850532
#> 12: C Cys TGC Cys_TG 312 0.19574734 0.24339036 0.39149468
#> 13: W Trp TGG Trp_TG 1621 1.00000000 1.00000000 1.00000000
#> 14: L Leu CTT Leu_CT 1048 0.29309863 0.68651832 1.17239452
#> 15: L Leu CTC Leu_CT 279 0.07823414 0.18324607 0.31293657
#> 16: L Leu CTA Leu_CT 1527 0.42693490 1.00000000 1.70773959
#> 17: L Leu CTG Leu_CT 721 0.20173233 0.47251309 0.80692931
#> 18: P Pro CCT Pro_CC 1648 0.25330261 0.38955823 1.01321045
#> 19: P Pro CCC Pro_CC 423 0.06513057 0.10016537 0.26052227
#> 20: P Pro CCA Pro_CC 4232 0.65023041 1.00000000 2.60092166
#> 21: P Pro CCG Pro_CC 203 0.03133641 0.04819277 0.12534562
#> 22: H His CAT His_CA 1509 0.49185668 0.96794872 0.98371336
#> 23: H His CAC His_CA 1559 0.50814332 1.00000000 1.01628664
#> 24: Q Gln CAA Gln_CA 4792 0.85650465 1.00000000 1.71300929
#> 25: Q Gln CAG Gln_CA 802 0.14349535 0.16753599 0.28699071
#> 26: R Arg CGT Arg_CG 1409 0.86450031 1.00000000 3.45800123
#> 27: R Arg CGC Arg_CG 142 0.08767627 0.10141844 0.35070509
#> 28: R Arg CGA Arg_CG 38 0.02391171 0.02765957 0.09564684
#> 29: R Arg CGG Arg_CG 38 0.02391171 0.02765957 0.09564684
#> 30: I Ile ATT Ile_AT 4991 0.52409449 1.00000000 1.57228346
#> 31: I Ile ATC Ile_AT 3872 0.40661417 0.77584135 1.21984252
#> 32: I Ile ATA Ile_AT 659 0.06929134 0.13221154 0.20787402
#> 33: M Met ATG Met_AT 3093 1.00000000 1.00000000 1.00000000
#> 34: T Thr ACT Thr_AC 4102 0.45266990 1.00000000 1.81067961
#> 35: T Thr ACC Thr_AC 3032 0.33462048 0.73921521 1.33848191
#> 36: T Thr ACA Thr_AC 1425 0.15732568 0.34755057 0.62930274
#> 37: T Thr ACG Thr_AC 501 0.05538394 0.12234950 0.22153575
#> 38: N Asn AAT Asn_AA 2651 0.36930790 0.58555973 0.73861579
#> 39: N Asn AAC Asn_AA 4528 0.63069210 1.00000000 1.26138421
#> 40: K Lys AAA Lys_AA 4395 0.36551093 0.57607129 0.73102187
#> 41: K Lys AAG Lys_AA 7630 0.63448907 1.00000000 1.26897813
#> 42: S Ser AGT Ser_AG 987 0.57375145 1.00000000 1.14750290
#> 43: S Ser AGC Ser_AG 733 0.42624855 0.74291498 0.85249710
#> 44: R Arg AGA Arg_AG 4804 0.90268646 1.00000000 1.80537291
#> 45: R Arg AGG Arg_AG 517 0.09731354 0.10780437 0.19462709
#> 46: V Val GTT Val_GT 5528 0.49689943 1.00000000 1.98759774
#> 47: V Val GTC Val_GT 3497 0.31437045 0.63266413 1.25748180
#> 48: V Val GTA Val_GT 935 0.08411971 0.16928920 0.33647884
#> 49: V Val GTG Val_GT 1163 0.10461041 0.21052632 0.41844163
#> 50: A Ala GCT Ala_GC 7045 0.55401793 1.00000000 2.21607171
#> 51: A Ala GCC Ala_GC 3479 0.27362793 0.49389725 1.09451172
#> 52: A Ala GCA Ala_GC 1725 0.13571316 0.24496168 0.54285265
#> 53: A Ala GCG Ala_GC 465 0.03664098 0.06613682 0.14656393
#> 54: D Asp GAT Asp_GA 4820 0.54678462 1.00000000 1.09356924
#> 55: D Asp GAC Asp_GA 3995 0.45321538 0.82887368 0.90643076
#> 56: E Glu GAA Glu_GA 8649 0.81673119 1.00000000 1.63346237
#> 57: E Glu GAG Glu_GA 1940 0.18326881 0.22439306 0.36653763
#> 58: G Gly GGT Gly_GG 8092 0.76718172 1.00000000 3.06872689
#> 59: G Gly GGC Gly_GG 1194 0.11328088 0.14765847 0.45312352
#> 60: G Gly GGA Gly_GG 778 0.07384586 0.09625602 0.29538345
#> 61: G Gly GGG Gly_GG 481 0.04569153 0.05955764 0.18276614
#> aa_code amino_acid codon subfam cts prop w_cai rscu
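The relationship between the cts, prop, w_cai and rscu columns can be checked by hand. The base R sketch below uses the Arg_CG counts from the highly expressed genes output above. It assumes the default pseudo_cnt = 1 is added to each count, that w_cai is the proportion divided by the largest proportion in the subfamily, and that rscu is the proportion multiplied by the number of codons in the subfamily. These are inferences from the printed table, not a description of the package internals.

```r
# Arg_CG counts copied from the highly expressed genes output
cts <- c(CGT = 1409, CGC = 142, CGA = 38, CGG = 38)
pseudo_cnt <- 1  # assumed to be added to each codon before normalizing

adj   <- cts + pseudo_cnt
prop  <- adj / sum(adj)      # share of each codon within the subfamily
w_cai <- prop / max(prop)    # relative to the most frequent codon
rscu  <- prop * length(adj)  # proportion scaled by the number of codons

round(cbind(cts, prop, w_cai, rscu), 8)
```

With small counts such as the 38 observations of CGA and CGG, the pseudo count shifts the proportions noticeably, which is worth keeping in mind when only a few sequences are used.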