Rearchitect Ripper

yui-knk

May 16, 2024

Transcript

  1. May 15th - 17th, 2024 NAHA CULTURAL ARTS THEATER NAHArt,

    Okinawa, Japan RubyKaigi 2023 was “Great Parser Era”
  2. RubyKaigi 2024: “Parser Renaissance”
  3. RubyKaigi 2023 LT: Introduced “parse.y” with 3 courses & 15

    themes. You must have completely understood “parse.y”
  4. EXTRA STAGE “Ripper” CHALLENGE 19
  5. About me: Yuichiro Kaneko, yui-knk (GitHub) / spikeolaf (Twitter)

    Treasure Data, Engineering Manager of Applications Backend. CRuby committer, mainly developing the parser generator and the parser. Lrama: LALR(1) parser generator (2023, Ruby 3.3). Love LR parsers.
  6. The Bison Slayer / The parser monster

    “The world is now in the great age of parsers. People are setting sail into the vast sea of parsers.” (RubyKaigi 2023 LT, Yuichiro Kaneko) https://twitter.com/kakutani/status/1657762294431105025/ NEW !!!
  7. “parse.y” Who's Who in 2024 3rd contributor for parse.y But

    I’m still on the light side. Labels: The Creator of Ruby / The patch monster / Me / The Organizer of TRICK
  8. S-expression Ripper provides an easy interface for parsing your program

    into a symbolic expression tree (or S-expression) https://github.com/ruby/ruby/blob/v3_3_0/ext/ripper/lib/ripper.rb
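As a minimal sketch of the S-expression interface quoted on this slide, `Ripper.sexp` from Ruby's standard library can be run on the `1 + 2` snippet that the talk uses later. The helper name `sexp_for` is hypothetical and exists only for this illustration.

```ruby
require "ripper"
require "pp"

# Hypothetical helper (not part of Ripper): parse a snippet into an S-expression.
def sexp_for(source)
  Ripper.sexp(source)
end

# The result is a nested Array: a :program node wrapping a list of statements.
pp sexp_for("1 + 2")
```

The first element of the returned Array is `:program`, and the single statement inside it is a `:binary` node holding the two integer tokens and the operator.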
  9. Ripper is … A Ruby script parser You can get

    information from the parser with event-based style: abstract syntax trees, simple lexical analysis https://github.com/ruby/ruby/blob/v3_3_0/ext/ripper/lib/ripper.rb
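The "simple lexical analysis" side can be sketched with `Ripper.tokenize` and `Ripper.lex`, both part of the standard library. The input string is just an example chosen to match the rest of the talk.

```ruby
require "ripper"

source = "1 + 2"

# Ripper.tokenize returns the token strings, whitespace included.
p Ripper.tokenize(source)

# Ripper.lex also reports position, event name, and lexer state for each token.
Ripper.lex(source).each do |(line, column), event, token, _state|
  puts "#{line}:#{column} #{event} #{token.inspect}"
end
```

The event names printed by `Ripper.lex` (such as `on_int` and `on_op`) are the same `on_XXX` names used by the low level interface.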
  10. Low level interface Ripper provides “on_XXX” methods on_int, on_op, on_binary,

    on_stmts_new, … Example: simply count the number of method calls
  11. How low level interface works: For example, “1 + 2”

    is provided. on_int("1") is called when “1” is scanned, and “count 1” is returned. Stack: 1 (“count 1”); remaining input: + 2
  12. How low level interface works: on_int("2") is called when “2”

    is scanned, and “count 5” is returned. Stack: 1 (“count 1”), +, 2 (“count 5”)
  13. How low level interface works: on_binary is called when 1,

    +, 2 are reduced to arg. Arguments are “count 1”, :+ and “count 5”. Returned values are passed to another method call. on_binary("count 1", :+, "count 5") is called and “count 6” is returned. Stack: arg (“count 6”)
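The counting example from these slides can be approximated as follows. This is a sketch, not the speaker's code: the class name `CountingRipper` and the `log` accessor are hypothetical, and the exact numbers may depend on the Ruby version, since whitespace and operator tokens also fire events.

```ruby
require "ripper"

# Hypothetical subclass for illustration: every on_XXX handler bumps a shared
# counter and returns "count N", mirroring the labels shown on the slides.
class CountingRipper < Ripper
  attr_reader :log

  def initialize(source)
    super(source)
    @count = 0
    @log = []
  end

  # Ripper::EVENTS lists every scanner and parser event name.
  Ripper::EVENTS.each do |event|
    define_method(:"on_#{event}") do |*args|
      @count += 1
      value = "count #{@count}"
      @log << [event, args, value]
      value # the return value is what later handlers receive as an argument
    end
  end
end

parser = CountingRipper.new("1 + 2")
parser.parse
parser.log.each do |event, args, value|
  puts "on_#{event}(#{args.map(&:inspect).join(', ')}) => #{value.inspect}"
end
```

In the printed log, the arguments of `on_binary` are the strings returned earlier by the two `on_int` calls, with the operator passed as the Symbol `:+`.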
  14. How is Ripper implemented?
  15. How parse.y is used parse.y is a source of ripper.y

    Diagram: parse.y → Lrama → parse.c, parse.h; parse.y → tool/id2token.rb, tools/preproc.rb → ripper.y → Lrama → ripper.c
  16. Comments in parse.y are transformed to C code in ripper.y

    Comments are not comments!! parse.y is two-faced: parse.y / ripper.y
  17. Jeremy’s challenge “This isn't a very clean way to fix

    it, but I was not able to figure out a way to fix it by modifying parse.y.”
  18. Nobu’s challenge: Make the semantic value stack manage the callback value

    and the parser value by using a union. But it’s not a finisher.
  19. Parser and Stack: CRuby’s parser is an LR parser. An LR parser

    is implemented as a pushdown automaton. An LR parser manages semantic values with a stack.
  20. Parser semantic value stack Parser manages Nodes on the stack

    Diagram (stack to manage Nodes): 1 and 2 are held as NODE_INTEGER and NODE_INTEGER; arg is held as NODE_OPCALL with the two NODE_INTEGER children
  21. Ripper semantic value stack Ripper manages Ruby Objects on the

    stack. Diagram (stack to manage Ruby Objects): 1 and 2 are held as “count 1” and “count 5”; the #on_binary method is called and its return value “count 6” is held for arg
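The two stack slides can be compared with a toy model. This is not CRuby's implementation: `ValueStack` is a hypothetical class, and plain Arrays stand in for the C node structs.

```ruby
# Toy model, not CRuby's parser: one semantic value stack for the input `1 + 2`.
# `shift` pushes a value; `reduce` pops the right-hand side and pushes the result.
class ValueStack
  def initialize
    @values = []
  end

  def shift(value)
    @values.push(value)
  end

  # Pop `length` values, hand them to the block (the semantic action),
  # then push whatever the block returns.
  def reduce(length)
    rhs = @values.pop(length)
    @values.push(yield(*rhs))
  end

  def to_a
    @values.dup
  end
end

# Parser flavour: the stack holds nodes.
nodes = ValueStack.new
nodes.shift([:NODE_INTEGER, 1])
nodes.shift(:+)
nodes.shift([:NODE_INTEGER, 2])
nodes.reduce(3) { |left, op, right| [:NODE_OPCALL, left, op, right] }
p nodes.to_a

# Ripper flavour: the same slots hold whatever the callbacks returned.
objects = ValueStack.new
objects.shift("count 1")
objects.shift(:+)
objects.shift("count 5")
objects.reduce(3) { |_left, _op, _right| "count 6" }
p objects.to_a
```

With a single stack, each slot holds either a node or a callback value, never both, which is the limitation the following slides address.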
  22. How to do semantic analysis: the block_dup_check function checks the existence

    of NODE_BLOCK_PASS and NODE_ITER. Ripper can’t do the check because it doesn’t have nodes. Diagram (stack to manage Nodes): primary: method_call brace_block, where method_call is NODE_FCALL with NODE_BLOCK_PASS and brace_block is NODE_ITER, both passed to block_dup_check
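The kind of source this check targets is a call that has both a block argument and a literal block. The sketch below only asks Ruby's own compiler whether a snippet is rejected; it does not call Ripper, and the helper name `syntax_error?` is hypothetical.

```ruby
# Hypothetical helper: report whether Ruby's own parser rejects a snippet.
def syntax_error?(source)
  RubyVM::InstructionSequence.compile(source)
  false
rescue SyntaxError
  true
end

p syntax_error?("foo(&blk) { }") # block argument and literal block together
p syntax_error?("foo(&blk)")     # block argument only
p syntax_error?("foo { }")       # literal block only
```

Only the first snippet combines both forms, so it is the only one expected to be rejected.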
  23. If a single stack is not enough, then let's use two stacks.
  24. If two stacks exist, Ripper can manage Nodes and Ruby

    Objects. Diagram (stack to manage Nodes): 1 and 2 are held as NODE_INTEGER and NODE_INTEGER; arg is held as NODE_OPCALL with the two NODE_INTEGER children. Diagram (stack to manage Ruby Objects): “count 1” and “count 5”; the #on_binary method is called and its return value “count 6” is held for arg
  25. Bison provides only one stack
  26. How to use callbacks: New Ripper uses Ruby’s Array as

    its stack. In %after_shift, “rb_ary_push” the object to the stack. Diagram: Node stack holds NODE_INTEGER, NODE_INTEGER; Array stack holds “count 1”, “count 2”, added with rb_ary_push
  27. How to use callbacks: In %after_reduce, “rb_ary_pop” objects then “rb_ary_push”

    the object to the stack. Diagram: Node stack holds NODE_INTEGER, NODE_INTEGER; Array stack holds “count 1”, “count 2”, replaced by the result with rb_ary_push
  28. How to use the $:n variable: Need to access the Object

    on the stack, like $1. $:n is expanded to a negative integer index. For 1 + 2 ($1 is 1, $3 is 2), with “count 1” and “count 2” on the Array stack: ary[$:1] => ary[-3], ary[$:3] => ary[-1]
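The push, pop, and negative-index steps from the last three slides can be sketched in plain Ruby. This is a toy model under stated assumptions: `CallbackStack`, `after_shift`, `after_reduce`, and `index_for` are hypothetical names, and the formula `n - rhs_length - 1` is inferred from the two index examples on the slide rather than taken from Lrama's source.

```ruby
# Toy model of the callback-value stack; all names here are hypothetical.
class CallbackStack
  def initialize
    @ary = [] # plays the role of the Ruby Array used as the stack
  end

  # Mirrors the %after_shift step: push the object for the shifted token.
  def after_shift(value)
    @ary.push(value)
  end

  # Mirrors the %after_reduce step: pop the right-hand-side objects,
  # then push the result of the action.
  def after_reduce(rhs_length, result)
    @ary.pop(rhs_length)
    @ary.push(result)
  end

  # For a rule with `rhs_length` symbols, map $:n to a negative index
  # (assumed formula, consistent with ary[$:1] => ary[-3] for length 3).
  def self.index_for(n, rhs_length)
    n - rhs_length - 1
  end

  def [](index)
    @ary[index]
  end

  def to_a
    @ary.dup
  end
end

stack = CallbackStack.new
["count 1", :+, "count 2"].each { |value| stack.after_shift(value) }

first = stack[CallbackStack.index_for(1, 3)] # ary[$:1] => ary[-3]
third = stack[CallbackStack.index_for(3, 3)] # ary[$:3] => ary[-1]
p [first, third]

stack.after_reduce(3, "count 3")
p stack.to_a
```

Because the indexes are counted from the top of the Array, the action does not need to know how deep the stack currently is.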
  29. Day 0: Night Cruise at RubyKaigi 2024 by ESM. “What

    do you think of recent parse.y?” “Best parse.y in the last 10 years”
  30. Fix other bugs: Bug #18988, Bug #20055, and unreported bugs, e.g.

    omitted warning for “if 1 then end”
  31. Fix other problems: Different functions were defined for parser and

    ripper, because the type of semantic value was different. It was too difficult… https://github.com/ruby/ruby/blob/v3_3_0/parse.y (diagram labels: Parser: NODE, NODE; Ripper: VALUE, VALUE)
  32. Only one call_bin_op is needed: the type of $1 is the same

    in both parser and ripper. VALUE, a Ruby Object, is managed by $:1 (diagram labels: Parser / Ripper: NODE, NODE; Ripper: VALUE, VALUE)
  33. Benefits of the re-architecture: Current ripper is a superset

    of the parser. It’s easy to follow up parser changes, like new syntax. We can maintain Ripper.
  34. See you next time at Lrama