– 👨👩👧👧
• It may look like 3, but the correct count is 1.
– All the code points are joined with ZWJ.
– The cursor moves across it in a single step.
• Because there is no corresponding glyph in the font.
– Unicode publishes the list of allowed sequences in emoji-zwj-sequences.txt.
• https://unicode.org/Public/17.0.0/emoji/emoji-zwj-sequences.txt
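To make the ZWJ point concrete, here is a minimal Python sketch that counts the code points inside one family emoji. The sequence is written with escapes so that the invisible joiners are explicit. The names `family`, `describe` and `parts` are only illustrative.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Man, woman, girl, girl, each pair joined by a ZWJ.
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F467"

def describe(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

code_points = describe(family)
parts = family.split(ZWJ)  # the individual emoji between the joiners

print(len(family))   # number of code points, joiners included
print(code_points)
print(len(parts))    # number of emoji that were joined
```

Python's `len()` counts code points, so a length check written this way sees seven units where the user sees one.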
grapheme cluster in 👨👦👦
– Let's call it an Emoji Bomb, not to be confused with the Bomb emoji 💣️.
• In addition, it cannot be shown here, because merely displaying it on the screen causes a crash.
ä̷͔̟͓̬̯̟͍̭͉͈̮͙̣̯̬͚̞̭̍̀̾͠m̴̡̧̛̝̯̹̗̹̤̲̺̟̥̈̏͊̔̑̍͆̌̀̚͝͝b̴̢̢̫̝̠̗̼̬̻̮̺̭͔̘͑̆̎̚ r̷̡̡̲̼̖͎̫̮̜͇̬͌͘g̷̹͍͎̬͕͓͕̐̃̈́̓̆̚͝ẻ̵̡̼̬̥̹͇̭͔̯̉͛̈́̕r̸̮̖̻̮̣̗͚͖̝̂͌̾̓̀̿̔̀͋̈́͌̈́̋͜
• There is no limit.
– People on social media play with Zalgo text.
– As with emoji, a single grapheme cluster can carry a large number of code points.
• In addition, Symfony changed grapheme_strlen to mb_strlen.
– https://github.com/symfony/symfony/pull/13527/files
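A rough way to see how many code points hide behind each visible character is to count the combining marks that follow every base character. The sketch below is an approximation that treats any code point in a Unicode "M" (mark) category as part of the preceding character. It is not the full grapheme cluster algorithm of UAX #29, and `mark_runs` is a hypothetical helper name.

```python
import unicodedata

def mark_runs(text):
    """Return (base character, number of following marks) pairs.

    Approximation only: every code point whose general category starts
    with "M" is attached to the previous base character.
    """
    runs = []
    for ch in text:
        if unicodedata.category(ch).startswith("M") and runs:
            base, count = runs[-1]
            runs[-1] = (base, count + 1)
        else:
            runs.append((ch, 0))
    return runs

# "a" with five acute accents, then "b" with a cedilla and a diaeresis.
sample = "a" + "\u0301" * 5 + "b" + "\u0327\u0308"
runs = mark_runs(sample)

print(len(sample))  # code points
print(len(runs))    # approximate visible characters
print(runs)
```

Nothing in the encoding stops the mark count from growing, which is why the Zalgo text above can pile marks up without limit.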
I think a limit of 32 code points is reasonable.
– https://unicode.org/reports/tr51/#valid-emoji-tag-sequences
• Emoji
– https://unicode.org/reports/tr15/#Stream_Safe_Text_Format
• NFKD
• The longest sequence in a human language is “Hakṣhmalawarayaṁ (ཧྐྵྨླྺྼྻྂ)”: 9 code points (1 base character + 8 combining characters).
– https://stackoverflow.com/questions/11978912/how-to-protect-against-diacritics-such-as-zalgo-text
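One possible way to apply such a limit is to drop combining marks once a base character plus its marks exceeds the cap. The Python sketch below is only an illustration under that assumption. It is not the Stream-Safe Text Format algorithm from UAX #15, it does not normalize to NFKD first, and `cap_marks` and `MAX_CLUSTER_CODE_POINTS` are hypothetical names. The Tibetan escapes spell out what I believe are the nine code points of the example above.

```python
import unicodedata

MAX_CLUSTER_CODE_POINTS = 32  # the figure proposed in the text

def cap_marks(text, limit=MAX_CLUSTER_CODE_POINTS):
    """Drop combining marks beyond `limit` code points per base+marks run."""
    out = []
    run = 0  # code points seen in the current base+marks run
    for ch in text:
        if unicodedata.category(ch).startswith("M"):
            run += 1
            if run > limit:
                continue  # excess mark: skip it
        else:
            run = 1  # a new base character starts a new run
        out.append(ch)
    return "".join(out)

# Assumed code points of the Tibetan example (1 base + 8 marks).
tibetan = "\u0f67\u0f90\u0fb5\u0fa8\u0fb3\u0fba\u0fbc\u0fbb\u0f82"
zalgo = "a" + "\u0301" * 100 + "b"

print(len(cap_marks(tibetan)))  # legitimate text passes through unchanged
print(len(cap_marks(zalgo)))    # the run of 100 marks is cut at the cap
```

With a cap of 32 the longest known legitimate sequence still fits comfortably, while Zalgo-style input is cut down.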
or grapheme cluster?
• It seems best to match the requirements of the application.
– In CJK, if you have to be careful with kanji, you need grapheme clusters.
• Identity becomes important, as in people's names and place names.
• However, counting in code point units is acceptable.
– Grapheme cluster processing is very slow.