Writing a Parser in Ruby (By Hand or With a Library, That Is the Question)
やきとりい
November 25, 2017
Slides from a session at Fukuoka RubyKaigi 02 (福岡Ruby会議02), November 2017. A talk about how much fun it is to write a parser.
Transcript
Writing a Parser in Ruby (By Hand or With a Library, That Is the Question)
2017.Nov, Fukuoka RubyKaigi 02
Yuki Torii (鳥井雪)
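Yuki Torii: self-introduction
• Works at 株式会社万葉 (Everyleaf)
• Rails application engineer
• Organizer of Rails Girls Tokyo 2nd
• Translated books:
• "Programming Elixir" (Dave Thomas; Ohmsha), co-translated with Koichi Sasada
• The "Hello Ruby" (ルビィのぼうけん) series by Linda Liukas (Shoeisha)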
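Yuki Torii: self-introduction
Born in Fukuoka
Hakozaki Elementary School, Higashi Ward
Chikushi Jogakuen Junior High School
Chikushi Jogakuen High School
=> Moved to Tokyo for university
Became a programmer before I knew it
2012: Fukuoka RubyKaigi 01, LT
2015: Rails Girls Fukuoka, coach
(Timeline label: period in Fukuoka)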
Writing a Parser in Ruby: Contents
• Why I ended up writing a parser
• Design
• Parsers are hard!
• A gem called Treetop
• Writing your own parser is still fun (though you probably shouldn't)
Why I ended up writing a parser
It started as a passing whim
I want Elixir-style [1] pattern matching [2] in Ruby too…
[1] Elixir: a programming language that runs on the Erlang VM
[2] Pattern matching: a cool feature that Elixir has
The pattern match I want to write in Ruby
I want to match with =~ and use the matched parts as pattern variables
The pattern match I want to write in Ruby
I also want to match with === (so that it can be written in a case statement)
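If this [1] actually ran, wouldn't it be persuasive…? [2]
[1] At first I was only dreaming about the spec
[2] Whether it turned out to be persuasive is unknown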
I tried it
(RubyKaigi talk on YouTube: HTTPS://WWW.YOUTUBE.COM/WATCH? V=1M4IPJH0K0E&INDEX=19&T=6S&LIST=PL
I tried it
This is the entire diff to the C code of the Ruby interpreter:
compile.c
parse.y
Design (if it runs, that is good enough; performance and the like are out of scope)
Ruby script → Parse → Compile → Ruby byte code → Evaluator
Design (if it runs, that is good enough; performance and the like are out of scope)
Ruby script → Parse → Compile → Ruby byte code → Evaluator → PatternMatching (a Ruby class)
%p([a, 'bc']) =~ [3, 'bc']
Evaluator: the pattern source "[a, 'bc']"; the variable list ["a"]; defining the variables; some fiddling to get hold of the Binding
PatternMatching: Parse pattern (AST construction); pattern_match obj (check whether it matches; assign the variables)
Today's topic is here!! (the Parse pattern step)
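What the parser (of the PatternMatching class) does
For example, when the pattern `%p([a, 'bc'])` is given, it receives the string "[a, 'bc']" and works out that:
• the kind is an array
• the first element is the variable a
• the second element is the string 'bc'
• the list of required pattern variables is [a]
It analyzes all of this and turns it into a structure.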
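The structure described above could be sketched as follows. This is only an illustration: the node names `ArrayNode`, `VariableNode`, and `StringNode` and the helper `pattern_variables` are hypothetical and are not taken from the talk's code.

```ruby
# Hypothetical node types; the talk's actual class names are not in the transcript.
ArrayNode    = Struct.new(:elements)
VariableNode = Struct.new(:name)
StringNode   = Struct.new(:value)

# Hand-built structure for the pattern source "[a, 'bc']".
ast = ArrayNode.new([VariableNode.new("a"), StringNode.new("bc")])

# Collect the names of every pattern variable found in the tree.
def pattern_variables(node)
  case node
  when ArrayNode    then node.elements.flat_map { |e| pattern_variables(e) }
  when VariableNode then [node.name]
  else []
  end
end

p pattern_variables(ast) # prints ["a"]
```

Keeping the variable list derivable from the tree means the kind, the elements, and the required variables all come from one structure.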
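The parse flow
%p([a, 'bc']) =~ [3, 'bc']
String: "[a, 'bc']"
→ Tokenize →
Tokens: [ and a and , and 'bc' and ]
→ AST construction →
AST: Array Node containing Variable Node (a) and String Node ('bc')
Note 1: in this implementation the pattern-variable list is also built while the AST is built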
The parse flow
%p({status: 200, users: [a, b] }) =~ {status: 200, users: [1, 3] }
String: "{status: 200, users: [a, b] }"
→ Tokenize →
Tokens: { and status: and 200 and , and users: and [ and a and , and b and ] and }
→ AST construction →
AST: Hash Node
key: Symbol Node (:status), val: Integer Node (200)
key: Symbol Node (:users), val: Array Node containing Variable Node (a) and Variable Node (b)
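What I ultimately want is the AST
%p({status: 200, users: [a, b] }) =~ {status: 200, users: [1, 3] }
Pattern: {status: 200, users: [a, b] }
AST: Hash Node
key: Symbol Node (:status), val: Integer Node (200)
key: Symbol Node (:users), val: Array Node containing Variable Node (a) and Variable Node (b)
Walk the AST and check whether the value matches the pattern. For the match target:
• Is it a hash in the first place?
• Is the val for the key status: 200?
• Is the val for the key users: an array?
• Does the array have 2 elements?
• Store the 1st element of the array in variable a
• Store the 2nd element of the array in variable b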
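The walk described above can be sketched as a recursive function. This is a minimal sketch under assumptions: the node types, the `match_node` name, and the use of a plain Hash for the bindings are hypothetical, not the talk's implementation.

```ruby
# Hypothetical node types for illustration only.
ArrayNode    = Struct.new(:elements)
HashNode     = Struct.new(:pairs) # pairs: { Symbol key => value node }
VariableNode = Struct.new(:name)
IntegerNode  = Struct.new(:value)

# Returns true when the value fits the pattern node.
# Variables met along the way are recorded in bindings.
def match_node(node, value, bindings)
  case node
  when VariableNode
    bindings[node.name] = value
    true
  when IntegerNode
    node.value == value
  when ArrayNode
    value.is_a?(Array) &&
      value.size == node.elements.size &&
      node.elements.zip(value).all? { |n, v| match_node(n, v, bindings) }
  when HashNode
    value.is_a?(Hash) &&
      node.pairs.all? { |k, n| value.key?(k) && match_node(n, value[k], bindings) }
  else
    false
  end
end

# AST for the pattern {status: 200, users: [a, b] }.
ast = HashNode.new({
  status: IntegerNode.new(200),
  users:  ArrayNode.new([VariableNode.new("a"), VariableNode.new("b")])
})

bindings = {}
matched = match_node(ast, { status: 200, users: [1, 3] }, bindings)
p matched  # prints true
p bindings # prints {"a"=>1, "b"=>3} (newer Rubies print {"a" => 1, "b" => 3})
```

In this sketch a failed match can leave partial entries in `bindings`, so a caller would only use them when the result is true.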
It was hard (especially tokenizing)
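Tokenize
String "[a, 'bc']" → Tokenize → Tokens: [ and a and , and 'bc' and ]
Token: type
[ : array-open symbol
a : variable
, : comma
'bc' : string
] : array-close symbol
Looking at each token's type, the AST is built up along the lines of "Oh, an array-open symbol has arrived, so from here until the array closes we are inside the array."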
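The "we are inside the array until it closes" reasoning above could look like the following recursive builder. This is a sketch under assumptions: the token representation (`[type, text]` pairs), the type names, and `build_node` are all hypothetical.

```ruby
# Hypothetical node types for illustration only.
ArrayNode    = Struct.new(:elements)
VariableNode = Struct.new(:name)
StringNode   = Struct.new(:value)

# Consumes tokens from the front of the array and returns one node.
# Each token is a [type, text] pair.
def build_node(tokens)
  type, text = tokens.shift
  case type
  when :array_open
    elements = []
    # Until the closing symbol arrives, everything belongs to this array.
    until tokens.first&.first == :array_close
      raise ArgumentError, "unclosed array" if tokens.empty?
      elements << build_node(tokens)
      tokens.shift if tokens.first&.first == :comma
    end
    tokens.shift # drop the closing symbol
    ArrayNode.new(elements)
  when :variable then VariableNode.new(text)
  when :string   then StringNode.new(text[1..-2]) # strip the surrounding quotes
  else raise ArgumentError, "unexpected token: #{text.inspect}"
  end
end

tokens = [
  [:array_open, "["], [:variable, "a"], [:comma, ","],
  [:string, "'bc'"], [:array_close, "]"]
]
ast = build_node(tokens)
p ast
```

Because an element is itself built by `build_node`, nested arrays fall out of the same code path.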
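What I used: StringScanner#scan
Tokenize
• StringScanner#scan
• Scans the string from the head; when the regular expression matches, it returns the matched part and advances the index to just past it
Regular expression: type
/\[/ : array-open symbol
/[a-z_][a-z0-9_]*/ : variable
/,/ : comma
/'.*?'/ : string
/\]/ : array-close symbol
String "[a, 'bc']" → Scan → Tokens: [ and a and , and 'bc' and ]
Remaining string as the scan proceeds: "[a, 'bc']" → "a, 'bc']" → "'bc']" → "]"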
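A tokenizer along these lines might be written as below. The five regular expressions come from the slide; everything else is an assumption made for illustration, including the `TOKEN_RULES` table, the `tokenize` name, and skipping whitespace with `/\s+/` (the transcript does not show how whitespace was handled).

```ruby
require "strscan"

# Regular expressions from the slide, paired with hypothetical type names.
TOKEN_RULES = [
  [:array_open,  /\[/],
  [:variable,    /[a-z_][a-z0-9_]*/],
  [:comma,       /,/],
  [:string,      /'.*?'/],
  [:array_close, /\]/]
].freeze

def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    # Assumption: any run of whitespace between tokens is ignored.
    next if scanner.skip(/\s+/)
    # scan returns the matched text and advances, or returns nil and stays put.
    type, _regex = TOKEN_RULES.find { |_, regex| scanner.scan(regex) }
    raise ArgumentError, "cannot tokenize at position #{scanner.pos}" unless type
    tokens << [type, scanner.matched]
  end
  tokens
end

p tokenize("[a, 'bc']")
```

Since `scan` only matches at the current position, trying the rules in order is enough; no rule here can match the empty string, so the loop always makes progress or raises.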
At commit time
Example: a bug where parsing fails when there are two or more spaces
"[a, 'bc']"
"[a,  'bc']"
Not handled yet (I'll keep at it)
Example: a hash with the {} omitted, which feels perfectly natural, does not work
%p({ user: 1, from: 'Fukuoka'})
%p( user: 1, from: 'Fukuoka' )
Only one set of regular expressions can be applied when tokenizing
Example: tokenizing rules that are specific to one token type cannot be handled
"Name is #{user.name}"
• It can now parse up to about %p( [x, :y, { "array" => [5, v] }] )
• Writing a parser from scratch by hand is quite laborious
• Its extensibility has limits (at least in my hands)
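When you try writing a parser…
• Things you have always read and written without a second thought, such as `[1, 2, 3]` or `{status: 200, users: [1, 2] }`, suddenly appear before your eyes as "a string that is about to be interpreted (and has no meaning yet)"
• Spaces, commas: everything has meaning
• The parser in Ruby itself is amazing
• The human mind is amazing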
Truly an experience of "meeting Ruby all over again"
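By the way, there is a gem called Treetop
https://github.com/cjheath/treetop
• You create a .treetop file that defines grammar rules in Treetop's own PEG-based notation, using regular expressions and the like
• When you pass that file to the tt command, it generates a Ruby parser file from it
• By requiring the generated Ruby file, you can use a parser that builds syntax nodes, in other words an AST
• Nested rules are also easy to write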
Shouldn't I have just used Treetop from the start…?
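Comparison table: hand-written parser vs. Treetop
Columns: Hand-written, Treetop
Rows:
• Refinement of the notation
• Nesting of rules
• Resistance to bugs
• Lets you meet Ruby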
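Writing your own parser is still fun (and this is not just sour grapes)
• When I started, I had only a hazy understanding of what a "parser" even was
• Had I used Treetop at that stage, I probably would have been defeated by its level of abstraction and unable to use it well
• Now, even when I don't know how to use it, I can read the Ruby parser that Treetop generates and understand what it is trying to do
• The experience of seeing all of your own code as nothing but strings is priceless
• Reinventing the wheel is fine; what matters is that the wheel gets assembled inside your own mind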
* Once you try to do anything beyond a certain level of complexity you hit a wall, so it is about time to switch over *
Let's keep meeting Ruby, again and again.
Thank you very much.