Looking into Ruby's regular expressions (Rubyの正規表現を調べてみた)
Yasuhiroki
September 26, 2018
This is a talk in the "I looked into it" genre.
Transcript
Looking into Ruby's regular expressions
@yasuhiroki (tw: @duck_yasuhiroki)
About me
• Yasuhiroki (Twitter: @duck_yasuhiroki)
• A10 Lab Inc. (エーテンラボ株式会社)
• Server-side engineer
• AWS
• Ruby on Rails
• Occasionally Android as well
What prompted this talk
I want to extract hashtags
    $ cat text
    タグ #タグ ##タグ# ## #グ#たぐ##tag g
    タグ #た# #たぐ # #タグ　#たぐ #
I want to extract hashtags
    $ cat text
    #たぐ # #タグ　#たぐ #
    $ cat text | ruby -ne 'p $_.scan(/????/)'
    ["#たぐ", "#タグ", "#たぐ", "#"]
I want to extract hashtags
    $ cat text
    #たぐ # #タグ　#たぐ #
    $ cat text | ruby -ne 'p $_.scan(/#[^#\s]+/)'
    ["#たぐ", "#タグ　", "#たぐ", "#"]
The full-width space cannot be stripped!
I want to extract hashtags
    $ cat text | ruby -ne 'p $_.scan(/#[^#\s]+/)'
    ["#たぐ", "#タグ　", "#たぐ", "#"]
The full-width space cannot be stripped
    $ cat text | \
      ruby -ne 'p $_.scan(/#[^#[:space:]]+/)'
    ["#たぐ", "#タグ", "#たぐ", "#"]
This one works
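The same comparison can be reproduced without the shell pipeline. This is a minimal sketch, assuming a UTF-8 source file; the helper name `extract_hashtags` and the sample line are illustrative and do not come from the talk.

```ruby
# Sketch: extract hashtags from a line that mixes ASCII and full-width spaces.
# `extract_hashtags` is a hypothetical helper used only for illustration.
def extract_hashtags(line, pattern)
  line.scan(pattern)
end

# \u3000 is the full-width (ideographic) space between the 2nd and 3rd tags.
line = "#たぐ #タグ\u3000#tag"

# By default \s only covers ASCII whitespace, so U+3000 stays glued to the tag.
with_backslash_s = extract_hashtags(line, /#[^#\s]+/)

# The POSIX bracket [[:space:]] also covers non-ASCII whitespace.
with_posix_space = extract_hashtags(line, /#[^#[:space:]]+/)

p with_backslash_s
p with_posix_space
```

Note the doubled brackets: `[:space:]` is only valid inside a character class, so the negated class has to be written `[^#[:space:]]`.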
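Why do \s and [:space:] behave differently?
Looking into Ruby's regular expressions
Assumptions for this talk
• You know Ruby
• You know a little about regular expressions
• You can read something like /Ebisu\.rb#\d+/
• (Incidentally) it matches "Ebisu.rb#18"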
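Ruby's regular expression engine
• Onigmo (鬼雲) https://github.com/k-takata/Onigmo/
• Ruby's regular expression engine
• Adopted since Ruby 2.0
• Is it also used in Perl?
• Is it not used anywhere else?
Ruby's regular expressions
• Picking out the things that made me go "oh!" while researching
  • Character Property
  • Subexpression calls
  • Absence operator
  • Lookahead and lookbehind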
Character Property
• \p{} lets you specify a Unicode Character Property
• It can match hiragana, katakana, kanji, emoji, special characters, and more
Matching hiragana and katakana
    "みんチャレ".match(/\p{Hiragana}+/)
    => #<MatchData "みん">
    "みんチャレ".match(/\p{Katakana}+/)
    => #<MatchData "チャレ">
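The script properties can also be used to split a mixed string into runs. The following is a small sketch; the sample text is made up for illustration, and `\p{Han}` (the property for kanji) is an addition that the slide does not show.

```ruby
# Sketch: split a mixed Japanese string into runs by Unicode script property.
text = "みんチャレで習慣化"

hiragana = text.scan(/\p{Hiragana}+/)  # runs of hiragana
katakana = text.scan(/\p{Katakana}+/)  # runs of katakana
kanji    = text.scan(/\p{Han}+/)       # runs of Han characters (kanji)

p hiragana  # => ["みん", "で"]
p katakana  # => ["チャレ"]
p kanji     # => ["習慣化"]
```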
Matching emoji
    "ぐーぱんち".match(/\p{Emoji}/)
    => #<MatchData "">
Matching emoji with a skin tone
    "ぐー$ぱんち".match(/\p{Emoji}/)
    => #<MatchData "">
The color gets stripped off
    "ぐー$ぱんち".match(/\p{Emoji}\p{Emoji_Modifier}/)
    #=> #<MatchData "$">
Specifying the color (skin) as well makes it work
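The skin-tone behavior can be sketched with explicit code points. The two code points below (U+1F44A, a fist, and U+1F3FD, a medium skin tone modifier) are assumptions chosen for illustration; the emoji actually used on the slides did not survive in the transcript. The sketch also assumes a Ruby version whose regex engine knows the emoji properties (roughly Ruby 2.5 or later).

```ruby
# Sketch: match an emoji together with an optional skin-tone modifier.
# The code points are illustrative choices, not taken from the talk.
fist      = "\u{1F44A}"  # base emoji
skin_tone = "\u{1F3FD}"  # Emoji_Modifier (skin tone)
text      = "ぐー#{fist}#{skin_tone}ぱんち"

# \p{Emoji} alone consumes a single code point, so the modifier is left behind.
base_only = text[/\p{Emoji}/]

# Allowing an optional \p{Emoji_Modifier} keeps the skin tone attached.
with_modifier = text[/\p{Emoji}\p{Emoji_Modifier}?/]

puts base_only == fist
puts with_modifier == fist + skin_tone
```

One thing to keep in mind: ASCII digits, `#`, and `*` also carry the Unicode Emoji property, so `\p{Emoji}` on its own is a loose filter for "real" emoji.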
Don't cry… (´°̥̥̥ω°̥̥̥｀)
    "(´°̥̥̥ω°̥̥̥｀)".gsub(/\p{Combining_Mark}/, '')
    => "(´°ω°｀)"
Replacing the combining characters with empty strings wipes the tears away!
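A minimal sketch of the same idea with the code points spelled out. It uses `\p{M}`, the short name of the Mark general category, in place of the `\p{Combining_Mark}` spelling on the slide; the accent-stripping line is an extra illustration that is not from the talk.

```ruby
# Sketch: strip combining marks (general category M) from a string.
# U+0325 is COMBINING RING BELOW, the "tear" drawn under each eye.
crying = "(´°\u0325\u0325\u0325ω°\u0325\u0325\u0325｀)"
calm   = crying.gsub(/\p{M}/, "")

# The same trick removes accents once the text is decomposed (NFD).
plain = "caf\u00E9".unicode_normalize(:nfd).gsub(/\p{M}/, "")

puts calm   # (´°ω°｀)
puts plain  # cafe
```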
Subexpression calls
• \g<name> calls the group's expression itself
• This is different from backreferences such as \1 and \2
Melos being ordered around
    "走れメロス怒れメロス伻れメロス".match(/(.れメロス)\g<1>\g<1>/)
    => #<MatchData "走れメロス怒れメロス伻れメロス" 1:"伻れメロス">
    (.れメロス)\g<1>\g<1> is the same as (.れメロス)(.れメロス)(.れメロス)
    "走れメロス寝るメロス伻れメロス".match(/(.れメロス)\g<1>\g<1>/)
    => nil
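The contrast with backreferences is easiest to see side by side. This is a sketch with made-up sample strings (the verb in the third phrase of `varied` is an assumption, not the one on the slide).

```ruby
# Sketch: subexpression call (\g<1>) versus backreference (\1).
call_pattern    = /(.れメロス)\g<1>\g<1>/  # re-runs the sub-pattern each time
backref_pattern = /(.れメロス)\1\1/        # demands the same text each time

varied   = "走れメロス怒れメロス眠れメロス"  # three different commands
repeated = "走れメロス走れメロス走れメロス"  # the same command three times

p call_pattern.match?(varied)       # => true
p backref_pattern.match?(varied)    # => false
p call_pattern.match?(repeated)     # => true
p backref_pattern.match?(repeated)  # => true
```

A call re-evaluates the pattern `.れメロス`, so each occurrence may start with a different character, while a backreference only accepts the exact text the group captured.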
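Absence operator
• (?~) expresses "does not contain the string"
• (?~abc) means "does not contain the string abc"
  • ab and ac are allowed
• For reference, [^abc] means "contains none of the characters a, b, or c"
  • ab and ac are not allowed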
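The difference between the two forms can be checked on a few short strings. This is a sketch, assuming a Ruby version that supports the absence operator (it arrived around Ruby 2.4.1); the anchors `\A` and `\z` are added here so that the whole string is tested.

```ruby
# Sketch: (?~abc) rejects the substring "abc"; [^abc] rejects each character.
absent_abc = /\A(?~abc)\z/  # whole string must not contain "abc"
none_of    = /\A[^abc]*\z/  # whole string must not contain a, b, or c

results = %w[ab ac xyz abc xabcx].map do |s|
  [s, absent_abc.match?(s), none_of.match?(s)]
end

results.each do |s, absent, none|
  puts "#{s}: (?~abc)=#{absent} [^abc]*=#{none}"
end
```

Only `xyz` passes both checks; `ab` and `ac` pass the absence operator but fail the character class.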
Extracting a commented-out block
    "/* ここの実装は最悪です(*^o^*)/ */".match(%r{/\*(?~\*/)\*/})
    => #<MatchData "/* ここの実装は最悪です(*^o^*)/ */">
    "/* ここの実装は最悪です(*^o^*)/ */".match(%r{/\*[^\*]*\*+(([^\*/][^\*]*)\*+)*/})
    => #<MatchData "/* ここの実装は最悪です(*^o^*)/ */" 1:")/ *" 2:")/ ">
Without (?~) it gets a bit painful
※ https://qiita.com/k-takata/items/4e45121081c83d3d5bfd
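The benefit shows up once a string holds more than one comment. The sketch below uses a made-up source line and compares the absence operator against a naive greedy pattern; it assumes a Ruby version with `(?~...)` support.

```ruby
# Sketch: pull every C-style comment out of a line of source text.
source = "int a; /* first */ int b; /* second (*^o^*)/ */"

# Greedy .* runs to the last "*/", swallowing the code between the comments.
greedy = source.scan(%r{/\*.*\*/})

# (?~\*/) stops before the first "*/", so each comment is matched separately.
absence = source.scan(%r{/\*(?~\*/)\*/})

p greedy   # => ["/* first */ int b; /* second (*^o^*)/ */"]
p absence  # => ["/* first */", "/* second (*^o^*)/ */"]
```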
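Lookahead and lookbehind
• (?=), (?<=), and so on
• Use them when something should be a condition of the match but should not be included in the matched result
• Depending on how you use them, they can work like AND
Getting only the street number in Ebisu
    "東京都渋谷区恵比寿1-8-5 東洋ビル 3階".match(/(?<=恵比寿)\S+/)
    => #<MatchData "1-8-5">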
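Lookbehind and lookahead can be shown on the same address. A short sketch follows; the floor extraction with `(?=階)` is an extra example that does not appear in the talk.

```ruby
# Sketch: use lookaround as a condition without including it in the result.
address = "東京都渋谷区恵比寿1-8-5 東洋ビル 3階"

# Lookbehind: digits and hyphens that come right after "恵比寿".
street_number = address[/(?<=恵比寿)[\d\-]+/]

# Lookahead: digits that come right before "階" (floor).
floor = address[/\d+(?=階)/]

p street_number  # => "1-8-5"
p floor          # => "3"
```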
Match only when it is Ebisu, in Shibuya, in Tokyo
    "東京都渋谷区恵比寿1-8-5 東洋ビル 3階".match(/(?=.*東京)(?=.*渋谷)(?=.*恵比寿).*/)
    => #<MatchData "東京都渋谷区恵比寿1-8-5 東洋ビル 3階">
    "京都渋谷区恵比寿1-8-5 東洋ビル 3階".match(/(?=.*東京)(?=.*渋谷)(?=.*恵比寿).*/)
    => nil
It does not match for Kyoto (京都)
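The AND-like pattern can be generated from a list of keywords. This is a sketch; `all_of` is a hypothetical helper name introduced here, not something from the talk or from Ruby's standard library.

```ruby
# Sketch: build an "all keywords must appear" regex from stacked lookaheads.
# `all_of` is a hypothetical helper used only for illustration.
def all_of(*words)
  lookaheads = words.map { |w| "(?=.*#{Regexp.escape(w)})" }.join
  Regexp.new("\\A#{lookaheads}")
end

tokyo_ebisu = all_of("東京", "渋谷", "恵比寿")

p tokyo_ebisu.match?("東京都渋谷区恵比寿1-8-5")  # => true
p tokyo_ebisu.match?("京都渋谷区恵比寿1-8-5")    # => false
p tokyo_ebisu.match?("恵比寿 渋谷 東京")          # => true
```

Each lookahead starts from the same position, so the keywords may appear in any order. Since `.` does not cross newlines by default, this form is meant for single-line input.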
Bonus
• Looking into why \s and [:space:] differ
What does the Onigmo documentation say?
• \s
  • 0009, 000A, 000B, 000C, 000D, 0085(NEL)
  • Line_Separator, Paragraph_Separator, Space_Separator
• [:space:]
  • 0009, 000A, 000B, 000C, 000D, 0085(NEL)
  • Line_Separator, Paragraph_Separator, Space_Separator
What does the Onigmo documentation say?
• \s
  • 0009, 000A, 000B, 000C, 000D, 0085(NEL)
  • Line_Separator, Paragraph_Separator, Space_Separator
  • Whether it includes non-ASCII characters depends on the ONIG_OPTION_ASCII_RANGE option.
• [:space:]
  • 0009, 000A, 000B, 000C, 000D, 0085(NEL)
  • Line_Separator, Paragraph_Separator, Space_Separator
  • Whether it matches non-ASCII characters depends on the ONIG_OPTION_ASCII_RANGE option and the ONIG_OPTION_POSIX_BRACKET_ALL_RANGE option.
The character set options give a hint
• d: Default (compatible with Ruby 1.9.3)
  \w, \d, and \s do not match non-ASCII characters. POSIX brackets follow each encoding's rules.
• u: Unicode
  The ONIG_OPTION_ASCII_RANGE option is turned off. \w (\W), \d (\D), \s (\S), \b (\B), and POSIX brackets follow each encoding's rules.
Behavior of \s and [:space:]
• By default
  • \s: targets ASCII characters only
  • [:space:]: targets non-ASCII characters as well
What happens when a character set option is specified?
    "#タグ　".scan(/(?u)#[^#\s]+/)
    => ["#タグ"]
Now \s can handle the full-width space as well
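The three variants can be compared directly against the full-width space. This is a sketch that follows the behavior described on the slides, assuming UTF-8 strings and a Ruby version from 2.4 on (for `match?`); the variable names are illustrative.

```ruby
# Sketch: how \s, [[:space:]], and (?u)\s treat U+3000 (full-width space).
full_width_space = "\u3000"

default_s = /\s/.match?(full_width_space)          # default (d): ASCII only
posix     = /[[:space:]]/.match?(full_width_space) # POSIX bracket
unicode_s = /(?u)\s/.match?(full_width_space)      # Unicode option (u)

# With (?u), the original hashtag pattern drops the trailing full-width space.
tags = "#タグ\u3000#たぐ".scan(/(?u)#[^#\s]+/)

puts "\\s: #{default_s}, [[:space:]]: #{posix}, (?u)\\s: #{unicode_s}"
p tags
```

Choosing between `[[:space:]]` and `(?u)` is mostly a matter of scope: the POSIX bracket changes one character class, while `(?u)` changes how `\w`, `\d`, `\s`, and `\b` behave for the rest of the group or pattern.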
References
• [Regular expressions (正規表現)](https://docs.ruby-lang.org/ja/latest/doc/spec=2fregexp.html)
• [Regexp class (Regexpクラス)](https://docs.ruby-lang.org/ja/latest/class/Regexp.html)
• [Emoji Properties](http://unicode.org/reports/tr51/#Emoji_Properties)
• [Onigmo (鬼雲)](https://github.com/k-takata/Onigmo/)
• [Implementing the absence operator in Onigmo (鬼雲に非包含オペレータを実装した)](https://qiita.com/k-takata/items/4e45121081c83d3d5bfd)
• [Regular expression memo (正規表現メモ)](http://www.kt.rim.or.jp/~kbk/regex/regex.html)
Thank you very much