Remove AS::Mb::Unicode::UnicodeDatabase
Fumiaki MATSUSHIMA
August 05, 2017
Programming
Slides presented at Ginza Ruby Kaigi 01 (ぎんざRuby会議01)
https://ginzarb.github.io/kaigi01/
Transcript
@mtsmfm
I wanted to remove ActiveSupport::Multibyte::Unicode::UnicodeDatabase
Fumiaki MATSUSHIMA GitHub, Twitter @mtsmfm Web Developer
https://www.quipper.com/
https://ninirb.github.io
https://www.meetup.com/ja-JP/GraphQL-Tokyo/
http://rubykaigi.org/2017/speakers
http://contributors.rubyonrails.org/
Do you know which file is the largest in Rails?
$ find vendor/bundle/gems/acti* -type f -exec du -h -a {} + | sort -h -r | head -n 10
1.1M vendor/bundle/gems/activesupport-5.1.2/lib/active_support/values/unicode_tables.dat
104K vendor/bundle/gems/actionview-5.1.2/lib/action_view/helpers/form_helper.rb
100K vendor/bundle/gems/activerecord-5.1.2/lib/active_record/associations.rb
76K vendor/bundle/gems/actionpack-5.1.2/lib/action_dispatch/routing/mapper.rb
60K vendor/bundle/gems/actionview-5.1.2/lib/action_view/helpers/date_helper.rb
52K vendor/bundle/gems/activerecord-5.1.2/lib/active_record/connection_adapters/abstract/schema_statements.rb
44K vendor/bundle/gems/activerecord-5.1.2/lib/active_record/migration.rb
44K vendor/bundle/gems/actionview-5.1.2/lib/action_view/helpers/form_tag_helper.rb
44K vendor/bundle/gems/actionview-5.1.2/lib/action_view/helpers/form_options_helper.rb
40K vendor/bundle/gems/activerecord-5.1.2/lib/active_record/relation/query_methods.rb
1.1M!
active_support/values/unicode_tables.dat
https://github.com/rails/rails/blob/16f2b2044eaaa54b7bc205ef9af1689a152b2fdf/activesupport/lib/active_support/multibyte/unicode.rb
The largest file in Rails
↓
The .dat file of ActiveSupport::Multibyte::Unicode::UnicodeDatabase
https://github.com/rails/rails/pull/26743
@mtsmfm
I wanted to remove ActiveSupport::Multibyte::Unicode::UnicodeDatabase
http://agile.esm.co.jp/news/2016-04-08-rails-study-session.html
In-house Rails study session
↓
OSS patch meetup
https://speakerdeck.com/a_matsuda/3x-rails
https://speakerdeck.com/a_matsuda/3x-rails?slide=156
What can AS::Mb::Unicode actually do?
This talk is based on the Rails v5.0.0.1 codebase, which was current when I opened the PR (not much has changed since). At the time, Ruby 2.4 was just about to be released.
- Normalize
- Case mapping
- Pack/unpack grapheme
- Tidy bytes (does not use AS::Mb::Unicode::UnicodeDatabase)

- Normalize
- Case mapping
- Pack/unpack grapheme
What is Unicode normalization?
Decompose: ‘が’ → [‘か’, ‘゛’]
Compose: [‘か’, ‘゛’] → ‘が’
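The decompose and compose steps above can be reproduced with Ruby's built-in String#unicode_normalize (Ruby 2.2 or later). This is only a minimal sketch: it assumes the ‘゛’ on the slide stands for the combining voiced sound mark U+3099, and the variable names are illustrative.

```ruby
# Decompose and compose 'が' with String#unicode_normalize (Ruby 2.2+).
composed = "\u304C" # 'が' as a single code point

# NFD splits it into 'か' (U+304B) plus the combining mark U+3099.
decomposed = composed.unicode_normalize(:nfd)
p decomposed.codepoints.map { |cp| format("U+%04X", cp) }
# => ["U+304B", "U+3099"]

# NFC puts the two code points back together.
recomposed = decomposed.unicode_normalize(:nfc)
p recomposed == composed
# => true
```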
Normalize-related methods
- AS::Mb::Unicode#normalize
- AS::Mb::Unicode#decompose
- AS::Mb::Unicode#compose
- AS::Mb::Unicode#reorder_characters
Unicode normalization
- NFD
- NFC
- NFKD
- NFKC
NF: Normalization Form, D: Decompose, C: Compose
“In NFKC and NFKD, a K is used to stand for compatibility to avoid confusion with the C standing for composition.”
http://unicode.org/reports/tr15/
Unicode normalization
- NFD
- NFC
- NFKD
- NFKC
NF: Normalization Form, D: Decompose, C: Compose, K: (C)ompatibility (compatibility equivalence)
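Equivalence in Unicode normalization
- Canonical equivalence (the forms without K)
  - Reversible
- Compatibility equivalence (the forms with K)
  - Looser; not reversible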
㈱
Canonical equivalence: ‘㈱’ != [‘(’, ‘株’, ‘)’]
Compatibility equivalence: ‘㈱’ == [‘(’, ‘株’, ‘)’]
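As a rough illustration of the two kinds of equivalence, two strings can be compared after normalizing both to the same form. The helper names below are hypothetical (they are not part of Ruby or Rails); only String#unicode_normalize is a real API.

```ruby
# Hypothetical helpers: compare two strings under each kind of equivalence.
def canonically_equivalent?(a, b)
  a.unicode_normalize(:nfd) == b.unicode_normalize(:nfd)
end

def compatibility_equivalent?(a, b)
  a.unicode_normalize(:nfkd) == b.unicode_normalize(:nfkd)
end

kabu        = "\u3231"   # '㈱'
spelled_out = "(\u682A)" # '(', '株', ')'

p canonically_equivalent?(kabu, spelled_out)   # => false
p compatibility_equivalent?(kabu, spelled_out) # => true

# Compatibility normalization loses information: NFC cannot restore '㈱'.
p kabu.unicode_normalize(:nfkc).unicode_normalize(:nfc) == kabu # => false
```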
Normalize-related methods
- AS::Mb::Unicode#normalize
- AS::Mb::Unicode#decompose
- AS::Mb::Unicode#compose
- AS::Mb::Unicode#reorder_characters
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L285-L301
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L159-L177
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L180-L236
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L143-L136
Normalize-related methods
- AS::Mb::Unicode#normalize
- AS::Mb::Unicode#decompose
- AS::Mb::Unicode#compose
- AS::Mb::Unicode#reorder_characters
Helper methods used by #normalize (why are they public...?)
What about Ruby itself?
https://docs.ruby-lang.org/ja/search/
https://docs.ruby-lang.org/ja/search/query:unicode/query:normalize/
Found it!
String#unicode_normalize
[1] pry(main)> '株'.codepoints
=> [26666]
[2] pry(main)> '㈱'.codepoints
=> [12849]
[3] pry(main)> '㈱'.unicode_normalize(:nfc).codepoints
=> [12849]
[4] pry(main)> '㈱'.unicode_normalize(:nfd).codepoints
=> [12849]
[5] pry(main)> '㈱'.unicode_normalize(:nfkc).codepoints
=> [40, 26666, 41]
[6] pry(main)> '㈱'.unicode_normalize(:nfkd).codepoints
=> [40, 26666, 41]
https://github.com/rails/rails/pull/26743/files?diff=split
Ruby is handy!
- Normalize ✔
- Case mapping
- Pack/unpack grapheme
‘A’ ↔ ‘a’
‘Ä’ ↔ ‘ä’
Case mapping-related methods
- AS::Mb::Unicode#downcase
- AS::Mb::Unicode#upcase
- AS::Mb::Unicode#swapcase
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L303-L313
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L392-L402
What about Ruby itself?
http://rubykaigi.org/2016/presentations/duerst.html
https://www.ruby-lang.org/en/news/2016/09/08/ruby-2-4-0-preview2-released/
$ docker run -e LANG=C.UTF-8 --rm ruby:2.3 \
  ruby -e "p 'Ä'.downcase == 'ä'"
false
$ docker run -e LANG=C.UTF-8 --rm ruby:2.4 \
  ruby -e "p 'Ä'.downcase == 'ä'"
true
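On Ruby 2.4 or later, the same built-in case mapping also covers the upcase and swapcase counterparts of the AS::Mb::Unicode methods listed earlier. The following is a small sketch, not taken from the PR; the characters are written as escapes so the result does not depend on the source file encoding.

```ruby
# Requires Ruby 2.4 or later (full Unicode case mapping in String).
upper = "\u00C4" # 'Ä'
lower = "\u00E4" # 'ä'

p upper.downcase == lower                 # => true
p lower.upcase == upper                   # => true
p "#{upper}b".swapcase == "#{lower}B"     # => true

# The :ascii option limits the mapping to A-Z / a-z, as in Ruby 2.3 and earlier.
p upper.downcase(:ascii) == upper         # => true
```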
https://github.com/rails/rails/pull/26743/files?diff=split
Ruby is handy!!
https://www.sw.it.aoyama.ac.jp/2016/pub/RubyKaigi/
https://bugs.ruby-lang.org/issues/10084
- Normalize ✔
- Case mapping ✔
- Pack/unpack grapheme
What is a grapheme?
Grapheme (書記素) ≒ a unit of a written character
あ が ゛
ぎんざ
[‘き’, ‘゛’, ‘ん’, ‘ざ’]
Split by character: [[‘き’], [‘゛’], [‘ん’], [‘ざ’]]
Split by grapheme: [[‘き’, ‘゛’], [‘ん’], [‘ざ’]]
Pack/unpack grapheme-related methods
- AS::Mb::Unicode#pack_graphemes
- AS::Mb::Unicode#unpack_graphemes
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L138-L140
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L80-L133
What about Ruby itself?
https://docs.ruby-lang.org/ja/search/
https://docs.ruby-lang.org/ja/search/query:grapheme
/\X/
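A minimal sketch of splitting the ‘ぎんざ’ example with the `\X` regexp, assuming the ‘゛’ in the example is the combining mark U+3099. The variable names are illustrative, and this is not the code from the PR.

```ruby
# 'き' + combining voiced sound mark (U+3099), 'ん', 'ざ'
ginza = "\u304D\u3099\u3093\u3056"

# Splitting by character (code point) yields four elements.
p ginza.chars.length # => 4

# Splitting by grapheme with \X keeps 'き' and its combining mark together.
graphemes = ginza.scan(/\X/)
p graphemes.length # => 3
p graphemes.map(&:codepoints)
# => [[12365, 12441], [12435], [12374]]
```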
https://github.com/rails/rails/pull/26743/files
Ruby's built-in features are handy!!!
...or so I thought, but the tests do not pass
https://github.com/k-takata/Onigmo/issues/46
https://bugs.ruby-lang.org/issues/12831
Landed in Ruby 2.4
https://github.com/rails/rails/pull/26743/files
- Normalize ✔
- Case mapping ✔
- Pack/unpack grapheme ✔
https://github.com/rails/rails/pull/26743
Why can't it be merged?
Rails 5 supports Ruby 2.2.2 and later
- Normalize
  - Available since Ruby 2.2
- Case mapping
  - Available since Ruby 2.4
- Pack/unpack grapheme
  - Available since Ruby 2.0
  - However, the Unicode tests only pass from 2.4 onward
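The version constraints above could be checked along these lines. This is only a sketch: the table and the method name are hypothetical and simply encode the versions listed on the slide, with graphemes set to 2.4 because that is where the Unicode tests pass.

```ruby
require "rubygems" # Gem::Version; normally loaded already

# Hypothetical table: minimum Ruby version at which each feature can
# replace the ActiveSupport implementation, per the list above.
MINIMUM_RUBY = {
  normalize:    "2.2",
  case_mapping: "2.4",
  graphemes:    "2.4" # /\X/ exists since 2.0, but the tests pass only from 2.4
}.freeze

# Returns the features the given Ruby version provides natively.
def natively_available(ruby_version = RUBY_VERSION)
  current = Gem::Version.new(ruby_version)
  MINIMUM_RUBY.select { |_, min| current >= Gem::Version.new(min) }.keys
end

p natively_available("2.2.2") # => [:normalize]
p natively_available("2.4.0") # => [:normalize, :case_mapping, :graphemes]
```

With Ruby 2.2.2 as the minimum, only normalization is covered, which is why the whole change has to wait for a higher minimum version.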
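If it gets merged, it will be when the minimum Ruby version is raised ≒ Rails 6?
Even without waiting for Rails, you can use these features in your own development
Maybe Ruby itself can already do that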
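Summary
- With Rails 6, UnicodeDatabase may become removable, which might bring us closer to 3x Rails
- Thanks to the efforts of many people, more and more of what used to be done in gems can now be done in Ruby itself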
Credits
Background pattern from subtlepatterns.com
Emoji artwork provided by Emoji One