K-MERS FROM REDUCED AMINO ACID ALPHABETS RETAIN EVOLUTIONARILY CONSERVED BIOCHEMICAL PROPERTIES

Amino acid      Property                Dayhoff
C               Sulfur polymerization   a
A, G, P, S, T   Small                   b
D, E, N, Q      Acid and amide          c
H, K, R         Basic                   d
I, L, M, V      Hydrophobic             e
F, W, Y         Aromatic                f

Protein: FLAWLESS
Dayhoff: febfecbb
Dayhoff k-mers (k = 3): feb ebf bfe fec ecb cbb

Dayhoff MO (1965). An Atlas of Protein Sequence.
Phillips R, Kondev J, & Theriot J. (2012) Physical Biology of the Cell
Peris, P., López, D., & Campos, M. (2008). IgTM: An algorithm to predict transmembrane domains and topology in proteins. BMC Bioinformatics, 9(1), 1029–11. http://doi.org/10.1186/1471-2105-9-367

Reduced alphabet k-mers are resilient to amino acid changes

A recent paper used reduced amino acid encodings to identify orthologous genes.

Single amino acid change: Leucine (L) → Valine (V), k = 3

                   Original                  L → V
Protein:           FLAWLESS                  FLAWVESS
Dayhoff:           febfecbb                  febfecbb
Protein k-mers:    FLA LAW AWL WLE LES ESS   FLA LAW AWV WVE VES ESS

Protein alphabet: k k-mers affected (AWL, WLE, LES become AWV, WVE, VES)
Dayhoff alphabet: no change in k-mers!
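
The FLAWLESS example can be reproduced in a few lines of Python. This is a minimal sketch, not code from the slides: the names `DAYHOFF_GROUPS`, `to_dayhoff`, and `kmers` are hypothetical, and the grouping is copied from the Dayhoff table above. It assumes sequences contain only the 20 standard amino acids, so ambiguous residues such as X would raise a `KeyError`.

```python
# Dayhoff groups as listed in the table above (letter -> member amino acids).
# The dictionary layout and all names here are illustrative assumptions.
DAYHOFF_GROUPS = {
    "a": "C",      # sulfur polymerization
    "b": "AGPST",  # small
    "c": "DENQ",   # acid and amide
    "d": "HKR",    # basic
    "e": "ILMV",   # hydrophobic
    "f": "FWY",    # aromatic
}

# Invert the groups into a per-residue lookup: amino acid -> Dayhoff letter.
DAYHOFF = {aa: code for code, group in DAYHOFF_GROUPS.items() for aa in group}


def to_dayhoff(protein):
    """Re-encode a protein sequence in the six-letter Dayhoff alphabet."""
    return "".join(DAYHOFF[aa] for aa in protein.upper())


def kmers(sequence, k):
    """Return the set of all overlapping substrings of length k."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


k = 3
original, mutated = "FLAWLESS", "FLAWVESS"  # single change: L -> V

# k-mers present in the original but lost after the substitution.
protein_changed = kmers(original, k) - kmers(mutated, k)
dayhoff_changed = kmers(to_dayhoff(original), k) - kmers(to_dayhoff(mutated), k)

print(to_dayhoff(original))     # febfecbb
print(sorted(protein_changed))  # ['AWL', 'LES', 'WLE']
print(sorted(dayhoff_changed))  # []
```

Because L and V both map to the hydrophobic group `e`, the two Dayhoff strings are identical and no k-mer is lost, whereas the protein alphabet loses the k = 3 k-mers that overlap the substituted position.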