Let’s write a parser! [SoundCloud HQ edition]
Denis Defreyne
May 17, 2016
Programming
Transcript
Let’s write a parser!
Denis Defreyne / SoundCloud, Berlin / May 17th, 2016
1. Language
I am Denis.
But how do you know that I am Denis?
I told you. I wrote it down. You’ve probably seen me before. Etc.
You understand English.
Computers are stupid.
    $ git commit --message="Fix bugs"

    def greet(name)
      puts "Hello, #{name}"
    end

    def greet(name: String): Unit = {
      println(s"Hello, $name!")
    }

Text forms a language, but computers don’t know that.
2. Parsing
Basic idea: parser objects that are small, composable, and purely functional.
    def read(input, pos)
      Success.new(pos + 1)
    end

    def read(input, pos)
      Failure.new(pos)
    end
char("H")
Succeeds if the next character is the given one.

    char("H").apply("Hello")

    H e l l o
    0 1 2 3 4

    Success(pos = 1)
    char("H").apply("Adiós")

    A d i ó s
    0 1 2 3 4

    Failure(pos = 0)
    if input[pos] == @char
      Success.new(pos + 1)
    else
      Failure.new(pos)
    end
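The slides show only fragments of the char parser. Below is a minimal runnable sketch of how the pieces could fit together. The `Parser` wrapper class and the `Struct`-based `Success` and `Failure` are assumptions made for illustration, not the talk's or d-parse's actual implementation.

```ruby
# Result types (assumed shape): both carry only a position for now.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  # Start reading at the beginning of the input.
  def apply(input)
    read(input, 0)
  end
end

# Succeeds if the character at pos is the expected one.
def char(expected)
  Parser.new do |input, pos|
    if input[pos] == expected
      Success.new(pos + 1)
    else
      Failure.new(pos)
    end
  end
end

p char("H").apply("Hello") # succeeds with pos = 1
p char("H").apply("Adiós") # fails with pos = 0
```

At the end of the input, `input[pos]` is `nil`, so the comparison is false and the parser fails without a special case.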
seq(a, b)
Succeeds if both given parsers succeed in sequence.

    seq(char("H"), char("e")).apply("Hello")

    H e l l o
    0 1 2 3 4

    Success(pos = 2)
    seq(
      char("H"),
      char("e"),
      char("l"),
      char("l"),
      char("o"),
    )
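One way seq could be implemented is to thread the position through each parser and stop at the first failure. This is a sketch under assumptions: the `Parser` wrapper and the `Struct` results are hypothetical scaffolding, and accepting any number of parsers is inferred from the five-argument call on the slide.

```ruby
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new do |input, pos|
    input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos)
  end
end

# Runs each parser where the previous one stopped; the first failure wins.
def seq(*parsers)
  Parser.new do |input, pos|
    result = Success.new(pos)
    parsers.each do |parser|
      result = parser.read(input, result.pos)
      break if result.is_a?(Failure)
    end
    result
  end
end

p seq(char("H"), char("e")).apply("Hello") # succeeds with pos = 2
p seq(char("H"), char("a")).apply("Hello") # fails with pos = 1
```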
string(s)
Succeeds if all characters in the given string can be read in sequence.

    string("Hello").apply("Hello")

    H e l l o
    0 1 2 3 4

    Success(pos = 5)
eof()
Succeeds at the end of input; fails otherwise.

    seq(string("Hello"), eof).apply("Hello")

    H e l l o
    0 1 2 3 4

    Success(pos = 5)

    seq(string("Hello"), eof).apply("Hello!")

    H e l l o !
    0 1 2 3 4 5

    Failure(pos = 5)
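Both string and eof can be sketched in a few lines on top of char and seq. As before, the `Parser` wrapper and `Struct` results are assumed scaffolding; building string out of seq follows the five-char seq shown earlier, though a real library may well implement it differently.

```ruby
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new do |input, pos|
    input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos)
  end
end

def seq(*parsers)
  Parser.new do |input, pos|
    result = Success.new(pos)
    parsers.each do |parser|
      result = parser.read(input, result.pos)
      break if result.is_a?(Failure)
    end
    result
  end
end

# A string is a sequence of its characters.
def string(s)
  seq(*s.chars.map { |c| char(c) })
end

# Succeeds without advancing, but only when nothing is left to read.
def eof
  Parser.new do |input, pos|
    pos >= input.length ? Success.new(pos) : Failure.new(pos)
  end
end

p seq(string("Hello"), eof).apply("Hello")  # succeeds with pos = 5
p seq(string("Hello"), eof).apply("Hello!") # fails with pos = 5
```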
alt(a, b)
Succeeds if either of the given parsers succeeds.

    alt(char("H"), char("A")).apply("Adiós")

    A d i ó s
    0 1 2 3 4

    Success(pos = 1)
    whitespace_char = alt(
      char(" "),
      char("\t"),
      char("\r"),
      char("\n"),
    )
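A possible alt tries each alternative from the same starting position and returns the first success. The sketch below assumes the same hypothetical `Parser` wrapper and `Struct` results; returning the last alternative's failure when nothing matches is a choice made here, not something the slides specify.

```ruby
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new do |input, pos|
    input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos)
  end
end

# Every alternative starts at the same position; the first success wins.
def alt(*parsers)
  Parser.new do |input, pos|
    result = Failure.new(pos)
    parsers.each do |parser|
      result = parser.read(input, pos)
      break if result.is_a?(Success)
    end
    result
  end
end

whitespace_char = alt(
  char(" "),
  char("\t"),
  char("\r"),
  char("\n"),
)

p alt(char("H"), char("A")).apply("Adiós") # succeeds with pos = 1
p whitespace_char.apply("\tx")             # succeeds with pos = 1
p whitespace_char.apply("x")               # fails with pos = 0
```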
opt(p)
Succeeds always, but only advances if p succeeds.

repeat(p)
Succeeds always, and attempts to apply p as often as possible.

    repeat(whitespace_char)
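Neither opt nor repeat can fail, which makes them short to sketch. The `Parser` wrapper and `Struct` results below are assumed scaffolding, and the guard against a parser that succeeds without advancing is an addition of this sketch (without it, something like `repeat(opt(p))` would loop forever).

```ruby
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new do |input, pos|
    input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos)
  end
end

# Turns a failure of the wrapped parser into a success that stays in place.
def opt(parser)
  Parser.new do |input, pos|
    result = parser.read(input, pos)
    result.is_a?(Success) ? result : Success.new(pos)
  end
end

# Applies the parser until it fails, keeping the last good position.
def repeat(parser)
  Parser.new do |input, pos|
    loop do
      result = parser.read(input, pos)
      # Stop on failure, or when no progress is made (avoids looping forever).
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    Success.new(pos)
  end
end

p opt(char("H")).apply("Adiós")   # succeeds with pos = 0
p repeat(char(" ")).apply("   x") # succeeds with pos = 3
```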
intersperse(a, b)
Alternates between a and b, always ending with a.

    intersperse(char("a"), char(",")).apply("a,a,b")

    a , a , b
    0 1 2 3 4

    Success(pos = 3)
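The result above (pos = 3 rather than 4) follows if intersperse is read as "one a, then any number of (b, a) pairs": the trailing `,b` pair fails as a whole, so the comma is not consumed. A sketch of that reading, assuming the same hypothetical `Parser` wrapper and `Struct` results:

```ruby
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new do |input, pos|
    input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos)
  end
end

def seq(*parsers)
  Parser.new do |input, pos|
    result = Success.new(pos)
    parsers.each do |parser|
      result = parser.read(input, result.pos)
      break if result.is_a?(Failure)
    end
    result
  end
end

def repeat(parser)
  Parser.new do |input, pos|
    loop do
      result = parser.read(input, pos)
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    Success.new(pos)
  end
end

# One a, followed by zero or more (b, a) pairs.
def intersperse(a, b)
  seq(a, repeat(seq(b, a)))
end

p intersperse(char("a"), char(",")).apply("a,a,b") # succeeds with pos = 3
```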
Etc.
3. Examples
    720 6 29530
    digit = alt(
      *('0'..'9').map { |c| char(c) }
    )

    digit = char_in('0'..'9')

    nat_number = seq(digit, repeat(digit))

    nat_number = repeat1(digit)

    nat_number = repeat1(digit).capture

    Success(pos = 3, data = "720")
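To get from positions to data, the success result needs a second field, and capture can fill it with the slice of input that was consumed. The sketch below assumes a hypothetical `Parser` wrapper and a `Struct`-based `Success` with `pos` and `data`; the bodies of `char_in`, `repeat1` and `capture` are guesses at reasonable behaviour, not d-parse's code.

```ruby
Success = Struct.new(:pos, :data)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end

  # Use the consumed slice of the input as the result's data.
  def capture
    inner = self
    Parser.new do |input, pos|
      result = inner.read(input, pos)
      next result if result.is_a?(Failure)
      Success.new(result.pos, input[pos...result.pos])
    end
  end
end

# Succeeds if the character at pos falls within the given range.
def char_in(range)
  Parser.new do |input, pos|
    c = input[pos]
    c && range.include?(c) ? Success.new(pos + 1) : Failure.new(pos)
  end
end

# Like repeat, but requires at least one match.
def repeat1(parser)
  Parser.new do |input, pos|
    start = pos
    loop do
      result = parser.read(input, pos)
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    pos > start ? Success.new(pos) : Failure.new(pos)
  end
end

digit = char_in('0'..'9')
nat_number = repeat1(digit).capture

p nat_number.apply("720") # succeeds with pos = 3 and data = "720"
```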
    def read(input, pos)
      Success.new(pos + 1)
    end

    def read(input, pos)
      Success.new(pos + 1, "blahblah")
    end

    dec_number = seq(
      nat_number,
      char('.'),
      nat_number,
    )
    Horan,Niall,93
    Payne,Liam,93
    Tomlinson,Louis,91
    Styles,Harry,94
    Malik,Zayn,93

    field = repeat(char_not_in(',', "\n"))
    line = intersperse(field, char(','))
    file = seq(
      line.intersperse(char("\n")),
      eof(),
    )

    [
      ["Horan", "Niall", 93],
      ["Payne", "Liam", 93],
      ["Tomlinson", "Louis", 91],
      ["Styles", "Harry", 94],
      ["Malik", "Zayn", 93],
    ]
    add(1, mul(2, 3))
    sub(5, 4)

    lparen = char('(')
    rparen = char(')')
    comma = char(',')

    expr = alt(lazy { funcall }, nat_number)
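The lazy wrapper is what makes a recursive grammar possible: expr mentions funcall before funcall has been assigned, so the lookup has to be postponed until parse time. Here is a sketch of the idea on a smaller recursive grammar (an "x" wrapped in any number of balanced parentheses). The `Parser` wrapper, the `Struct` results, and the `nested` grammar are illustrative assumptions.

```ruby
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new do |input, pos|
    input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos)
  end
end

def seq(*parsers)
  Parser.new do |input, pos|
    result = Success.new(pos)
    parsers.each do |parser|
      result = parser.read(input, result.pos)
      break if result.is_a?(Failure)
    end
    result
  end
end

def alt(*parsers)
  Parser.new do |input, pos|
    result = Failure.new(pos)
    parsers.each do |parser|
      result = parser.read(input, pos)
      break if result.is_a?(Success)
    end
    result
  end
end

# Calls the block only when reading, so it may refer to a parser
# that is assigned later (or to the parser being defined).
def lazy(&thunk)
  Parser.new { |input, pos| thunk.call.read(input, pos) }
end

# Without lazy, `nested` would still be nil while the right-hand side is built.
nested = alt(
  seq(char("("), lazy { nested }, char(")")),
  char("x"),
)

p nested.apply("((x))") # succeeds with pos = 5
```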
    funcall = seq(
      identifier,
      lparen,
      arg_list,
      rparen,
    )

    letter = char_in('a'..'z')
    identifier = repeat1(letter)

    arg_list = intersperse(
      expr,
      seq(comma, whitespace),
    )

    arg_list = opt(
      intersperse(
        expr,
        seq(comma, whitespace),
      )
    )

    expr_list = intersperse(expr, char("\n"))
    program = seq(expr_list, eof)

    add(1, mul(2, 3))
    sub(5, 4)

    Success(pos = 27)
Where’s the data!!!

    funcall = seq(
      identifier.capture,
      lparen,
      arg_list,
      rparen,
    )

    funcall = seq(
      identifier.capture,
      lparen,
      arg_list,
      rparen,
    ).map do |data|
      FunCall.new(data[0], data[2])
    end
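The `data[0]` and `data[2]` indices suggest that seq collects the data of each sub-parser into an array, and that map transforms that array. The sketch below works under that assumption, with a hypothetical `Parser` wrapper, `Struct` results, and a `FunCall` struct. To stay short it accepts a single numeric argument instead of a full arg_list.

```ruby
Success = Struct.new(:pos, :data)
Failure = Struct.new(:pos)
FunCall = Struct.new(:name, :args)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&reader)
    @reader = reader
  end

  def read(input, pos)
    @reader.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end

  # Use the consumed slice of the input as the result's data.
  def capture
    inner = self
    Parser.new do |input, pos|
      result = inner.read(input, pos)
      next result if result.is_a?(Failure)
      Success.new(result.pos, input[pos...result.pos])
    end
  end

  # Transform the data of a successful result; failures pass through.
  def map(&transform)
    inner = self
    Parser.new do |input, pos|
      result = inner.read(input, pos)
      next result if result.is_a?(Failure)
      Success.new(result.pos, transform.call(result.data))
    end
  end
end

def char(expected)
  Parser.new do |input, pos|
    input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos)
  end
end

def char_in(range)
  Parser.new do |input, pos|
    c = input[pos]
    c && range.include?(c) ? Success.new(pos + 1) : Failure.new(pos)
  end
end

# Assumption: seq gathers each sub-parser's data into an array.
def seq(*parsers)
  Parser.new do |input, pos|
    data = []
    failure = nil
    parsers.each do |parser|
      result = parser.read(input, pos)
      if result.is_a?(Failure)
        failure = result
        break
      end
      pos = result.pos
      data << result.data
    end
    failure || Success.new(pos, data)
  end
end

# One or more matches; carries no data of its own.
def repeat1(parser)
  Parser.new do |input, pos|
    start = pos
    loop do
      result = parser.read(input, pos)
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    pos > start ? Success.new(pos) : Failure.new(pos)
  end
end

identifier = repeat1(char_in('a'..'z'))
nat_number = repeat1(char_in('0'..'9')).capture.map { |s| Integer(s, 10) }

# Simplified funcall: exactly one numeric argument.
funcall = seq(identifier.capture, char('('), nat_number, char(')')).map do |data|
  FunCall.new(data[0], [data[2]])
end

p funcall.apply("add(12)") # data is FunCall.new("add", [12]), pos = 7
```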
    add(1, mul(2, 3))
    sub(5, 4)

    [
      FunCall.new("add", [
        1,
        FunCall.new("mul", [2, 3]),
      ]),
      FunCall.new("sub", [5, 4]),
    ]

And that is how you can write a parser using parser combinators.
ḌPARSE
A good parser library for Ruby
github.com/ddfreyne/d-parse
    require 'd-parse'

    module JSONGrammar
      extend DParse::DSL

      DIGIT = char_in('0'..'9')
      NUMBER = repeat1(DIGIT)
    end

    res = JSONGrammar::NUMBER.apply('8700')

    case res
    when DParse::Success
      puts(res.data.inspect)
    when DParse::Failure
      $stderr.puts res.pretty_message
      exit(1)
    end
    expected identifier at line 1, column 36

    def reticulate(splines, threshold, ) {
                                       ↑
github.com/ddfreyne/d-parse
Pre-alpha! Be an early adopter!
My name is Denis. Ready to parse your questions.
Find me at [email protected], or @denis on Slack.