Ruby on Railroad: The Power of Visualizing CFG

ydah
April 17, 2025

Slides for the RubyKaigi 2025 talk "Ruby on Railroad: The Power of Visualizing CFG"
https://rubykaigi.org/2025/ #rubykaigi

Transcript

  1. Ruby on Railroad: The Power of Visualizing CFG

    @ydah / Yudai Takada
    RubyKaigi 2025, Matsuyama, Ehime (Ehime Prefectural Convention Hall), 17 April 2025
  2. Yudai Takada / ydah

    Software engineer @
    ‣ Ruby Committer
    ‣ Node Location, Refactor parse.y
    ‣ A committer of Lrama
    ‣ Parameterizing Rules, Inlining
  3. Unsolicited Ads

    KansaiRubyKaigi08: CFP is now live!!
    ▶︎ Kyoto Pontocho Kaburenjo Theater
    2025-06-28 (Sat)
    @shimbaco / @pocke
  4. Residents of parse.y

    ‣ I'm in "Five Chariot Stars"
    ‣ A castle inhabited by the creator of Ruby, two monsters, and the organizer of TRICK

    $ git shortlog -ns parse.y
      1354  Nobuyoshi Nakada            (The Patch Monster)
       362  Yukihiro "Matz" Matsumoto   (The Creator of Ruby)
       252  yui-knk                     (The Parser Monster)
       174  Yusuke Endoh                (The Organizer of TRICK)
        75  ydah                        (Me)
  5. parse.y is

    ‣ A file that defines the rules and symbols needed to parse Ruby programs and generate abstract syntax trees (ASTs)
    ‣ Up to Ruby 3.2, it was compiled with the GNU Bison parser generator; from Ruby 3.3 onward, it uses a parser generator called Lrama
  6. parse.y is written in BNF

    brace_block : '{' brace_body '}'
                    {
                        $$ = $2;
                        set_embraced_location($$, &@1, &@3);
                        /*% ripper: $:2 %*/
                    }
                | k_do do_body k_end
                    {
                        $$ = $2;
                        set_embraced_location($$, &@1, &@3);
                        /*% ripper: $:2 %*/
                    }
                ;
  7. Define the structure of grammar

    brace_block : '{' brace_body '}'
                    {
                        $$ = $2;
                        set_embraced_location($$, &@1, &@3);
                        /*% ripper: $:2 %*/
                    }
                | k_do do_body k_end
                    {
                        $$ = $2;
                        set_embraced_location($$, &@1, &@3);
                        /*% ripper: $:2 %*/
                    }
                ;
  8. Define the action

    brace_block : '{' brace_body '}'
                    {
                        $$ = $2;
                        set_embraced_location($$, &@1, &@3);
                        /*% ripper: $:2 %*/
                    }
                | k_do do_body k_end
                    {
                        $$ = $2;
                        set_embraced_location($$, &@1, &@3);
                        /*% ripper: $:2 %*/
                    }
                ;
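As a rough illustration of the two parts highlighted on the slides above (structure and action), the following Python sketch models the `brace_block` rule as data. `Alternative`, `second_value`, and `apply_reduction` are hypothetical names invented here; this is not how Lrama or Bison represent rules internally, and the `set_embraced_location` call is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Alternative:
    """One alternative of a rule: its structure plus its action."""
    symbols: List[str]                            # right-hand side symbols
    action: Callable[[Sequence[object]], object]  # runs when the rule is reduced


def second_value(values: Sequence[object]) -> object:
    # Mirrors `$$ = $2`: the result is the value of the second RHS symbol.
    return values[1]


# brace_block : '{' brace_body '}' | k_do do_body k_end ;
BRACE_BLOCK = [
    Alternative(["'{'", "brace_body", "'}'"], second_value),
    Alternative(["k_do", "do_body", "k_end"], second_value),
]


def apply_reduction(alternative: Alternative, values: Sequence[object]) -> object:
    """Run the action with one semantic value per right-hand side symbol."""
    if len(values) != len(alternative.symbols):
        raise ValueError("expected one value per right-hand side symbol")
    return alternative.action(values)
```

Calling `apply_reduction(BRACE_BLOCK[0], ["{", body, "}"])` returns `body`, which is the part of the action that `$$ = $2` expresses in the grammar file.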
  9. Lrama has tracing

    $ lrama sample.y --trace=rules
    Grammar rules:
    $accept -> program YYEOF
    program -> expr
    expr -> term '+' expr
    expr -> term
    term -> factor '*' term
    term -> factor
    factor -> number
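The trace output above is regular enough to process with a small script. The sketch below assumes each rule is printed on its own line as `lhs -> rhs symbols` after a `Grammar rules:` header, as the slide suggests; the exact format of real Lrama output may differ, and `parse_rules` is a hypothetical helper, not part of Lrama.

```python
import re
from typing import List, Tuple

# Sample text copied from the slide; assumed to match the tool's layout.
TRACE_OUTPUT = """\
Grammar rules:
$accept -> program YYEOF
program -> expr
expr -> term '+' expr
expr -> term
term -> factor '*' term
term -> factor
factor -> number
"""

# One rule per line: a left-hand side, an arrow, then the right-hand side.
RULE_LINE = re.compile(r"^\s*(\S+)\s*->\s*(.*)$")


def parse_rules(text: str) -> List[Tuple[str, List[str]]]:
    """Turn trace text into (lhs, [rhs symbols]) pairs."""
    rules = []
    for line in text.splitlines():
        match = RULE_LINE.match(line)
        if match is None:
            continue  # skips the "Grammar rules:" header and blank lines
        lhs, rhs = match.groups()
        # An empty production is shown as "ε"; store it as an empty list.
        symbols = [symbol for symbol in rhs.split() if symbol != "ε"]
        rules.append((lhs, symbols))
    return rules
```

With the sample text, `parse_rules(TRACE_OUTPUT)` yields seven pairs, one per rule on the slide.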
  10. ❯ exe/lrama tmp/parse.tmp.y --trace=rules

    Grammar rules:
    $accept -> program END_OF_INPUT
    $@1 -> ε
    option_terms -> ε
    option_terms -> terms
    compstmt_top_stmts -> top_stmts option_terms
    program -> $@1 compstmt_top_stmts
    top_stmts -> none
    top_stmts -> top_stmt
    top_stmts -> top_stmts terms top_stmt
    top_stmt -> stmt
    top_stmt -> keyword_BEGIN begin_block
    ...
    [The slide continues with the rest of the parse.y rule list, packed into several overlapping columns with many rules cut off mid-symbol.]
  11. Harsh reality...

    ❯ exe/lrama tmp/parse.tmp.y --trace=rules
    [The same parse.y rule list as on slide 10, filling the whole slide in overlapping columns.]
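Part of what makes the flat list hard to read is that every alternative of a nonterminal appears as a separate rule. As a minimal sketch, using a few of the parse.y rules from the trace output, the alternatives can be grouped per left-hand side, which is also the shape a diagram needs. `group_by_lhs` and `format_grouped` are hypothetical helpers, and the `::=` notation is just one possible way to print the result.

```python
from typing import Dict, List, Tuple

# A handful of rules from the parse.y trace, as (lhs, rhs symbols) pairs.
RULES: List[Tuple[str, List[str]]] = [
    ("top_stmts", ["none"]),
    ("top_stmts", ["top_stmt"]),
    ("top_stmts", ["top_stmts", "terms", "top_stmt"]),
    ("top_stmt", ["stmt"]),
    ("top_stmt", ["keyword_BEGIN", "begin_block"]),
]


def group_by_lhs(rules: List[Tuple[str, List[str]]]) -> Dict[str, List[List[str]]]:
    """Collect all alternatives of each nonterminal, keeping first-seen order."""
    grouped: Dict[str, List[List[str]]] = {}
    for lhs, rhs in rules:
        grouped.setdefault(lhs, []).append(rhs)
    return grouped


def format_grouped(grouped: Dict[str, List[List[str]]]) -> str:
    """Print one line per nonterminal, alternatives separated by '|'."""
    lines = []
    for lhs, alternatives in grouped.items():
        bodies = [" ".join(rhs) if rhs else "ε" for rhs in alternatives]
        lines.append(f"{lhs} ::= " + " | ".join(bodies))
    return "\n".join(lines)
```

Five separate rules collapse into two lines here, one for `top_stmts` and one for `top_stmt`.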
  12. Railroad diagram

    ‣ A way to represent a context-free grammar
    ‣ A graphical alternative to BNF and other text-based metalanguages
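To give a feel for the idea without any drawing library, here is a minimal sketch that renders the alternatives of one rule as a text-only "railroad". It handles only a choice between plain sequences, with no loops or optional parts, and `render_railroad` is a hypothetical function written for this illustration, not the API of any of the libraries mentioned in the talk.

```python
from typing import List


def render_railroad(name: str, alternatives: List[List[str]]) -> str:
    """Draw each alternative as one track; tracks split and rejoin at the ends."""
    tracks = [" ── ".join(symbols) if symbols else "ε" for symbols in alternatives]
    width = max(len(track) for track in tracks)
    last = len(tracks) - 1
    lines = [f"{name}:"]
    for index, track in enumerate(tracks):
        fill = "─" * (width - len(track))  # pad shorter tracks to a common width
        if last == 0:
            left, right = "───", "───"     # single track: a straight line
        elif index == 0:
            left, right = "──┬", "┬──"     # entry and exit of the diagram
        elif index == last:
            left, right = "  └", "┘"       # bottom branch
        else:
            left, right = "  ├", "┤"       # middle branch
        lines.append(f"{left}─ {track} {fill}─{right}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_railroad("brace_block", [
        ["'{'", "brace_body", "'}'"],
        ["k_do", "do_body", "k_end"],
    ]))
```

For `brace_block` this prints two parallel tracks, one per alternative, which is the same information as the BNF rule but readable at a glance as "either this path or that path".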
  13. Chevrotain

    ‣ Parser Building Toolkit for JavaScript
    ‣ https://github.com/Chevrotain/chevrotain
    ‣ Primarily builds parsers using LL(k) grammar techniques, specifically the top-down parsing method
    ‣ Grammars are defined directly within JavaScript or TypeScript code using a fluent API
  14. Syntax Diagrams

    ‣ Visually examining grammar diagrams is often useful during development and for documentation purposes
    ‣ Generate railroad diagrams using the railroad-diagrams library
  15. railroad-diagrams

    ‣ A small JS+SVG library for drawing railroad syntax diagrams, like on JSON.org
    ‣ The library is available as both JS and Python modules
    ‣ Ruby is not supported ...
  16. If it doesn't exist, just create it
  17. ydah/railroad_diagrams

    ‣ A library for drawing railroad diagrams in Ruby, with a design inspired by tabatkins/railroad-diagrams
  18. Feedback

    "What do you think of this feature?"
    "I don't want it. Because I can read."
  19. I forgot that he's a Parser Monster...
  20. Try it!

    ydah.github.io/railroad-diagram-collection
  21. Conclusion

    ‣ We have the power to look at context-free grammar
    ‣ Railroad diagrams make the structure of a grammar easier to read
    ‣ If it doesn't exist, just create it
    ‣ Only on Lrama