Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Write an interpreter in Ruby

Write an interpreter in Ruby

Talk for the RedDotRuby Conference 2024

Mario Arias

July 26, 2024
Tweet

More Decks by Mario Arias

Other Decks in Technology

Transcript

  1. About me • Not a Ruby expert • Ruby-curious? •

    A Kotlin nerd • Almost 20 years of experience • OSS Contributor 2
  2. Why do you want to write an interpreter? … is

    everything ok at home? • Is fun • To refresh or to a cquire new knowledge a bout computer science. • Useful c a ses: • Expression l a ngu a ges • Business rules • Extern a l DSLs 3
  3. The Monkey Language … it doesn’t exists, sort of, in

    a way • Cre a ted by Thorsten B a ll • It exists when you write it • C-like-JS-like synt a x • First-cl a ss a nd high-order functions • Integers, boole a ns, a rr a ys a nd h a shes • It h a s a reference implement a tion in Go, with unit tests 4
  4. A f ibonacci function Recursive let fibonacci = fn(x) {

    if (x < 2) { return x; } else { fibonacci(x - 1) + fibonacci(x - 2); } }; fibonacci(35); 5
  5. Pepa … like the character from the movie Encanto •

    A Ruby implement a tion of the Monkey L a ngu a ge • Ruby 3.1 a nd onw a rds • Fully tested • RBS • https://github.com/M a rioAri a sC/pep a 6
  6. The interpreter f low …a simpli fi ed version of

    it Source code Lexer Tokens P a rser AST Ev a lu a tor Execution 7
  7. Tokens … not the LLM ones https://github.com/M a rioAri a

    sC/pep a /blob/m a in/lib/tokens.rb let fibonacci = fn(x) { if (x < 2) { return x; } else { fibonacci(x - 1) + fibonacci(x - 2); } }; fibonacci(35); LET ^ “let” FUNCTION ^ “fn” RETURN ^ “return” ELSE ^ “else” IF ^ “if” IDENT ^ “ f ibon a cci” IDENT ^ “ f ibon a cci” IDENT ^ “ f ibon a cci” IDENT ^ “ f ibon a cci” IDENT ^ “x” IDENT ^ “x” ASSIGN ^ “=” LPAREN ^ “(” RPAREN ^ “)” LBRACE ^ “(” 8
  8. Lexer One character at the time https://github.com/M a rioAri a

    sC/pep a /blob/m a in/lib/lexers.rb We need to m a n a ge sever a l scen a rios • Single ch a r a cter tokens: =, (, ), {, } a nd others • Double ch a r a cter tokens: !=, == • Strings • Integers • One a lph a betic ch a r a cter + one or more a lph a numeric ch a r a cters • Reserved word • Identi f ier 9
  9. AST let fibonacci = fn(x) { if (x < 2)

    { return x; } else { fibonacci(x - 1) + fibonacci(x - 2); } }; fibonacci(35); https://github.com/M a rioAri a sC/pep a /blob/m a in/lib/ a st.rb Progr a m • st a tements • let st a tement • function liter a l • p a r a meters • block st a tement • if expression • condition • in f ix expression • identi f ier • oper a tor • integer liter a l … 10
  10. AST https://github.com/M a rioAri a sC/pep a /blob/m a in/lib/

    a st.rb Progr a m: List of St a tements St a tement LetSt a tement: let foo = 1; ExpressionSt a tement: A st a tement th a t cont a ins a n expression ReturnSt a tement: return foo; BlockSt a tement: List of St a tements inside a block Expression Pre f ixExpression: !true; In f ixExpression: 1 + 1; C a llExpression: myFunction(1); IndexExpression: myHash[“foo”] IfExpression: if (x > 2) {“x”} else {“y”} Identi f ier IntegerLiter a l: 1; Boole a nLiter a l: true; Arr a yLiter a l: [1, 2, 3]; FunctionLiter a l fn(x) {x + 2} StringLiter a l: “Hello” H a shLiter a l: {“foo”: 1, true:2, 3:3} 11
  11. Parser One token at the time https://github.com/M a rioAri a

    sC/pep a /blob/m a in/lib/p a rsers.rb P a rsers Recursive descent p a rsers Top-down oper a tor precedence p a rser Top-down oper a tor precedence p a rser a .k. a “Pr a tt P a rser”, cre a ted by V a ugh a n Pr a tt • E a sy to implement a nd underst a nd • Slowish […] is very simple to understand, trivial to implement, easy to use, extremely ef f icient in practice if not in theory, yet f lexible enough to meet most reasonable syntactic needs of users […] 12
  12. Parser One token at the time P a rse a

    progr a m, i.e., a list of st a tements Every method uses parse_expression 13
  13. Parser One token at the time • Is there a

    method to p a rse the current token? • INT, TRUE, FALSE, STRING, FUNCTION • IDENT, • BANG, MINUS, • LPAREN, LBRACKET, LBRACE • IF • Is there a nother token on the right, a nd does it h a ve the correct precedence? • Is there a method to p a rse the next token? • PLUS, MINUS, SLASH, ASTERISK, LT, GT, EQ, NOT_EQ • LPAREN e.g., myFuction ( … • LBRACKET e.g., myArray [ … • 1 + 2 + 3 => ((1 + 2) + 3) 14
  14. Evaluator …now we’re talking To ev a lu a te,

    we need three things: • A w a y to communic a te with our host l a ngu a ge. We will c a ll it the Object System • https://github.com/M a rioAri a sC/pep a /blob/m a in/lib/objects.rb • Where to store v a lues a nd their scopes, e.g. Identi f iers, p a r a meters. We will c a ll it the Environment • The ev a lu a tor, in our c a se, is a Tree-W a lking ev a lu a tor • https://github.com/M a rioAri a sC/pep a /blob/m a in/lib/ev a lu a tor.rb 15
  15. Evaluator …now we’re talking Ev a lu a te a

    ll st a tements, return the l a st one Recursively w a lks the AST 16
  16. Evaluator …now we’re talking Re a d from the environment

    or the BUILTINS Re a d from itself or its outer Environment 17
  17. Executing …Gotta go fast!! hyperfine - w 3 './benchmark' Benchmark

    1: ./benchmark Time (mean ± σ): 149.411 s ± 1.317 s [User: 149.215 s, System: 0.193 s] Range (min … max): 146.906 s … 150.827 s 10 runs Ruby 3.3.3 - 20.5 MB 18
  18. Executing …Gotta go fast!! Python 3.12 - 11.37 MB hyperfine

    - w 1 - M 3 'python benchmarks.py' Benchmark 1: python benchmarks.py Time (mean ± σ): 682.299 s ± 3.465 s [User: 468.764 s, System: 204.651 s] Range (min … max): 678.526 s … 685.340 s 3 runs 0 s 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 682.299 149.411 https://github.com/M a rioAri a sC/bruno 19
  19. Executing …Gotta go fast!! Lu a - 2.75 MB hyperfine

    - w 3 'lua benchmarks.lua' Benchmark 1: lua benchmarks.lua Time (mean ± σ): 140.583 s ± 2.741 s [User: 140.575 s, System: 0.004 s] Range (min … max): 138.264 s … 147.545 s 10 runs 0 s 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a 140.583 682.299 149.411 https://github.com/M a rioAri a sC/juliet a 20
  20. The Lua code in this post is c**p. Get rid

    of the silly OO emulation and it would be tons faster. - Some Reddit user. 21
  21. Executing …Gotta go fast!! Lu a JIT - 2.62 MB

    hyperfine - w 3 'luajit benchmarks.lua' Benchmark 1: luajit benchmarks.lua Time (mean ± σ): 82.749 s ± 1.416 s [User: 82.740 s, System: 0.002 s] Range (min … max): 80.846 s … 85.703 s 10 runs 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a Lu a JIT 82.749 140.583 682.299 149.411 22
  22. Executing …Gotta go fast!! Ruby YJIT - 21.25 MB hyperfine

    - w 3 './benchmark - linux - yjit' Benchmark 1: ./benchmark - linux - yjit Time (mean ± σ): 52.484 s ± 0.681 s [User: 52.279 s, System: 0.197 s] Range (min … max): 51.637 s … 54.159 s 10 runs 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a Lu a JIT Ruby YJIT 52.484 82.749 140.583 682.299 149.411 Good Job M a xime! 23
  23. Executing …Gotta go fast!! JRuby - 751.07 MB hyperfine -

    w 3 './benchmark - linux - jruby' -- export - json jruby.json Benchmark 1: ./benchmark - linux - jruby Time (mean ± σ): 186.716 s ± 23.751 s [User: 195.723 s, System: 1.971 s] Range (min … max): 119.831 s … 199.868 s 10 runs Warning: Statistical outliers were detected. 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a Lu a JIT Ruby YJIT JRuby 186.716 52.484 82.749 140.583 682.299 149.411 - Xcompile.invokedynamic 24
  24. Executing …Gotta go fast!! Tru ff leRuby - 1946.62 MB

    hyperfine - w 3 './benchmark - truffleruby' Benchmark 1: ./benchmark - truffleruby Time (mean ± σ): 8.173 s ± 0.177 s [User: 11.883 s, System: 0.716 s] Range (min … max): 7.898 s … 8.460 s 10 runs 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a Lu a JIT Ruby YJIT JRuby Tru ff leRuby 8.173 186.716 52.484 82.749 140.583 682.299 149.411 25
  25. Executing …Gotta go fast!! Go 1.22.4 - 8 MB hyperfine

    - w 3 './fibonacci -- engine=eval' Benchmark 1: ./fibonacci -- engine=eval Time (mean ± σ): 10.546 s ± 0.042 s [User: 11.254 s, System: 0.353 s] Range (min … max): 10.503 s … 10.650 s 10 runs 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a Lu a JIT Ruby YJIT JRuby Tru ff leRuby G o 1.22.4 10.546 8.173 186.716 52.484 82.749 140.583 682.299 149.411 https://github.com/M a rioAri a sC/monkey 26
  26. Executing …Gotta go fast!! C - 1.75 MB hyperfine -

    w 3 './bin/benchmark eval' Benchmark 1: ./bin/benchmark eval Time (mean ± σ): 10.583 s ± 0.211 s [User: 10.581 s, System: 0.001 s] Range (min … max): 10.407 s … 11.075 s 10 runs 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a Lu a JIT Ruby YJIT JRuby Tru ff leRuby G o 1.22.4 C 10.583 10.546 8.173 186.716 52.484 82.749 140.583 682.299 149.411 https://github.com/ a bhin a v-up a dhy a y/cmonkey 27
  27. Executing …Gotta go fast!! Cryst a l 1.12.2 - 4.25

    MB hyperfine - w 3 './benchmarks.sh -- eval - fast' Benchmark 1: ./benchmarks.sh -- eval - fast Time (mean ± σ): 4.942 s ± 0.102 s [User: 4.935 s, System: 0.006 s] Range (min … max): 4.829 s … 5.189 s 10 runs 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a Lu a JIT Ruby YJIT JRuby Tru ff leRuby G o 1.22.4 C ryst a l 1.12.2 4.942 10.583 10.546 8.173 186.716 52.484 82.749 140.583 682.299 149.411 https://github.com/M a rioAri a sC/monyet 28
  28. Even faster?! Is it possible? We need a di ff

    erent a ppro a ch. A new kind of interpreter: • Production-re a dy • VM a nd m a ybe a JIT • Written by sm a rt people. 30
  29. Of course, Ruby!! D’oh! But Ruby c a nnot interpret

    Monkey code, therefore we need … 31
  30. Our new f low We have 80% of it Source

    code Lexer Tokens P a rser AST Ev a lu a tor Execution Tr a nspil a tion Kernel.eval 33
  31. Transpilation Everyone knows what to do Invoke to_rb in a

    ll the st a tements M a n a ge the synt a ctic a l di ff erences 34
  32. Executing …Gotta go fast!! Ruby Tr a nspiled - 20.38

    MB hyperfine - w 3 './exe/benchmark - linux - tr - yjit' Benchmark 1: ./exe/benchmark - linux - tr - yjit Time (mean ± σ): 150.3 ms ± 1.2 ms [User: 133.1 ms, System: 16.9 ms] Range (min … max): 148.3 ms … 152.6 ms 19 runs 175 s 350 s 525 s 700 s Ruby 3.3.3 Python 3.12 Lu a Lu a JIT Ruby YJIT JRuby Tru ff leRuby G o 1.22.4 C ryst a l 1.12.2 Tr a nspiled 0.15 4.942 10.583 10.546 8.173 186.716 52.484 82.749 140.583 682.299 149.411 35
  33. The best way to write an interpreter is not to

    write an interpreter - Me 36