YARP

Kevin Newton
May 12, 2023

Good Ruby tooling depends on a good parser. Today, it is difficult to build tooling on the CRuby parser because it lacks a public interface and documentation. As a result, people rely on external gems, which fragments the community.

The Yet Another Ruby Parser project is building a universal Ruby parser that can be used by all Ruby implementations and tools. It is documented, error tolerant, and performant. It can be used without linking against CRuby, which means it can be easily used by other projects.

This talk is about the YARP project's motivation, design, implementation, and results. Come to learn about the future of parsing Ruby.

Transcript

  1. Yet Another Ruby Parser. Can parse 100% of: Shopify/shopify, github/github, ruby/ruby, the top 100 gems by download on rubygems.org
  2. Yet Another Ruby Parser. Can parse 100% of: Shopify/shopify, github/github, ruby/ruby, the top 100 gems by download on rubygems.org. Experimental fork of CRuby
  3. Yet Another Ruby Parser. Can parse 100% of: Shopify/shopify, github/github, ruby/ruby, the top 100 gems by download on rubygems.org. Experimental fork of CRuby; Experimental feature in TruffleRuby
  4. Yet Another Ruby Parser. Can parse 100% of: Shopify/shopify, github/github, ruby/ruby, the top 100 gems by download on rubygems.org. Experimental fork of CRuby; Experimental feature in TruffleRuby; Experimental branch in JRuby
  5. Error tolerance

     $ irb
     irb(main):001:0> foo + bar + <> baz
     .../irb/workspace.rb:119:in `eval': (irb):1: syntax error, unexpected '<' (SyntaxError)
     foo + bar + <> baz
                 ^
             from .../exe/irb:11:in `<top (required)>'
             from .../bin/irb:25:in `load'
             from .../bin/irb:25:in `<main>'
     irb(main):002:0>

     Error tolerance · Portability · Maintainability · Challenges · Adoption · Future work
  6. Error tolerance: Automatically insert tokens
  7. Error tolerance: Automatically insert tokens; Automatically insert nodes
  8. Error tolerance: Automatically insert tokens; Automatically insert nodes; Context-based recovery
  9. class Foo
       def bar(baz)
         self.then { |value| value + baz *
       end
     end
  10. Same code as slide 9. Nodes: ClassNode
  11. Same code as slide 9. Nodes: ClassNode, StatementsNode
  12. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode
  13. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode
  14. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode
  15. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode
  16. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode, CallNode
  17. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode, CallNode, ArgumentsNode
  18. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode, CallNode, CallNode, ArgumentsNode
  19. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode, CallNode, CallNode, ArgumentsNode, ArgumentsNode
  20. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode, CallNode, CallNode, ArgumentsNode, ArgumentsNode. Contexts: CLASS
  21. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode, CallNode, CallNode, ArgumentsNode, ArgumentsNode. Contexts: CLASS; CLASS,DEF
  22. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode, CallNode, CallNode, ArgumentsNode, ArgumentsNode. Contexts: CLASS; CLASS,DEF; CLASS,DEF,BLOCK
  23. Same as slide 22.
  24. Same code as slide 9. Nodes: ClassNode, StatementsNode, DefNode, CallNode, SelfNode, BlockNode, CallNode, CallNode, MissingNode, ArgumentsNode, ArgumentsNode. Contexts: CLASS; CLASS,DEF; CLASS,DEF,BLOCK
  25. Portability. Runtimes: CRuby (parse.y, ripper/parse.y)
  26. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby
  27. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby
  28. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby
  29. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby
  30. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby, natalie
  31. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby, natalie. Tools: none listed yet
  32. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby, natalie. Tools: parser
  33. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby, natalie. Tools: parser, ruby_parser
  34. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby, natalie. Tools: parser, ruby_parser, tree-sitter-ruby
  35. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby, natalie. Tools: parser, ruby_parser, tree-sitter-ruby, sorbet
  36. Portability. Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby, natalie. Tools: lib-ruby-parser, parser, ruby_parser, tree-sitter-ruby, sorbet
  37. Portability (kddnewton.com/parsing-ruby/). Runtimes: CRuby (parse.y, ripper/parse.y), mruby, JRuby, TruffleRuby, ruruby, natalie. Tools: lib-ruby-parser, parser, ruby_parser, tree-sitter-ruby, sorbet
  38. Portability: syntax_tree, rufo, rubyfmt, syntax_suggest
  39. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, syntax_suggest
  40. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, rubocop, syntax_suggest
  41. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, rubocop, standard, syntax_suggest
  42. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, unparser, rubocop, standard, syntax_suggest
  43. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, unparser, ruby-next, rubocop, standard, syntax_suggest
  44. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, solargraph, unparser, ruby-next, rubocop, standard, syntax_suggest
  45. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, solargraph, unparser, ruby-next, rubocop, standard, steep, syntax_suggest
  46. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, solargraph, unparser, ruby-next, debride, rubocop, standard, steep, syntax_suggest
  47. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, solargraph, unparser, ruby-next, flay, debride, rubocop, standard, steep, syntax_suggest
  48. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, solargraph, unparser, ruby-next, flog, flay, debride, rubocop, standard, steep, syntax_suggest
  49. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, solargraph, unparser, ruby-next, flog, flay, debride, fasterer, rubocop, standard, steep, syntax_suggest
  50. Portability: syntax_tree, rufo, rubyfmt, ruby-lsp, sorbet, solargraph, unparser, ruby-next, flog, flay, debride, fasterer, rubocop, standard, steep, syntax_suggest
  51. Portability: No reliance on CRuby internals (VALUE, ID, etc.)
  52. Portability: No reliance on CRuby internals (VALUE, ID, etc.); No reliance on any external packages or generators
  53. Portability: No reliance on CRuby internals (VALUE, ID, etc.); No reliance on any external packages or generators; Shared logic: unescaping strings, pack API, etc.
  54. Portability: No reliance on CRuby internals (VALUE, ID, etc.); No reliance on any external packages or generators; Shared logic: unescaping strings, pack API, etc.; Serialization API
  55. #include "yarp.h"
      #include "org_yarp_Parser.h"

      JNIEXPORT jbyteArray JNICALL
      Java_org_yarp_Parser_parseAndSerialize(JNIEnv *env, jclass clazz, jbyteArray source) {
        /* Get the length of the Java byte array and a pointer to its bytes. */
        jsize size = (*env)->GetArrayLength(env, source);
        jbyte* bytes = (*env)->GetByteArrayElements(env, source, NULL);

        /* Parse the source and serialize the result into a buffer. */
        yp_buffer_t buffer;
        yp_buffer_init(&buffer);
        yp_parse_serialize((char*) bytes, size, &buffer);

        /* Release the source bytes; JNI_ABORT skips copying them back. */
        (*env)->ReleaseByteArrayElements(env, source, bytes, JNI_ABORT);

        /* Copy the serialized bytes into a new Java byte array. */
        jbyteArray serialized = (*env)->NewByteArray(env, buffer.length);
        jbyte *value = buffer.value;
        (*env)->SetByteArrayRegion(env, serialized, 0, buffer.length, value);

        yp_buffer_free(&buffer);
        return serialized;
      }
  56. Same code as slide 55.
  57. Maintainability: Readability; Understandability; Ease of contribution; Ease of change
  58. Maintainability: Readability; Understandability; Ease of contribution; Ease of change; Documentation
  59. Maintainability: Readability; Understandability; Ease of contribution; Ease of change; Documentation; Test suites
  60. Maintainability: Is YARP more maintainable?
  61. Maintainability: Is YARP more maintainable? Can we make YARP more maintainable?
  62. Challenges: Grammar ambiguities; Unary/binary operators, spaces
  63. Challenges: Grammar ambiguities; Unary/binary operators, spaces; The `do` keyword
  64. Challenges: Grammar ambiguities; Unary/binary operators, spaces; The `do` keyword; Newlines, comments, semicolons
  65. Challenges: Grammar ambiguities; Unary/binary operators, spaces; The `do` keyword; Newlines, comments, semicolons; Local variables and method calls
  66. Challenges: Grammar ambiguities; Unary/binary operators, spaces; The `do` keyword; Newlines, comments, semicolons; Local variables and method calls; Encoding
  67. Adoption: Gem release; CRuby fork/compilation work
  68. Adoption: Gem release; CRuby fork/compilation work; JRuby exploring
  69. Adoption: Gem release; CRuby fork/compilation work; JRuby exploring; TruffleRuby merged, actively working
  70. Adoption: Gem release; CRuby fork/compilation work; JRuby exploring; TruffleRuby merged, actively working; Other runtimes/tools
  71. Adoption: Gem release; CRuby fork/compilation work; JRuby exploring; TruffleRuby merged, actively working; Other runtimes/tools; Compatibility with ripper
  72. Future work: Error tolerance - forward scanning
  73. class Foo
        def bar(baz)
          baz.
        end
      end
  74. Same code as slide 73. `baz.end`: Missing `end` keyword
  75. Same code as slide 73. `baz.end`: Missing `end` keyword. `baz.`: Missing call message
  76. Future work: Error tolerance - scanning forward
  77. Future work: Error tolerance - scanning forward; Memory
  78. Future work: Error tolerance - scanning forward; Memory; Arena allocation, improve locality
  79. Future work: Error tolerance - scanning forward; Memory; Arena allocation, improve locality; Reduce size of the tree
  80. Future work: Error tolerance - scanning forward; Memory; Arena allocation, improve locality; Reduce size of the tree; More nodes, more specialization
  81. Future work: Error tolerance - scanning forward; Memory; Arena allocation, improve locality; Reduce size of the tree; More nodes, more specialization; Serialization - reduce size, variable length integers?
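
Slides 6 to 24 name three recovery techniques (inserting tokens, inserting nodes, context-based recovery) and walk through an example that ends with a MissingNode. The following is a minimal sketch of those ideas in Ruby, not YARP's implementation. `TinyParser`, its pre-split token list, and its array-based nodes are all hypothetical, and the grammar covers only `def`, `{ }` blocks, names, `+`, and `*`.

```ruby
# Hypothetical toy parser: placeholder nodes plus a context stack.
class TinyParser
  CLOSERS = { def: "end", block: "}" }.freeze

  attr_reader :errors

  def initialize(tokens)
    @tokens = tokens
    @pos = 0
    @contexts = [] # e.g. [:def, :block] while inside a block inside a def
    @errors = []
  end

  # def := "def" NAME expr "end"
  def parse_def
    expect("def")
    name = advance
    [:def, name, within(:def) { parse_expr }]
  end

  private

  def peek
    @tokens[@pos]
  end

  def advance
    token = @tokens[@pos]
    @pos += 1
    token
  end

  # Consume `token`, or record an error and carry on as if it were present.
  def expect(token)
    return advance if peek == token
    @errors << "expected #{token.inspect}"
    nil
  end

  # Parse inside a context, then require that context's closing token.
  def within(context)
    @contexts.push(context)
    node = yield
    @contexts.pop
    expect(CLOSERS.fetch(context))
    node
  end

  # True at end of input or when the next token closes a context we are in.
  def closes_enclosing_context?
    peek.nil? || @contexts.any? { |context| CLOSERS[context] == peek }
  end

  # expr := term ("+" term)*, term := atom ("*" atom)*
  def parse_expr
    parse_binary("+") { parse_binary("*") { parse_atom } }
  end

  def parse_binary(operator)
    left = yield
    while peek == operator
      advance
      left = [:call, operator, left, yield]
    end
    left
  end

  def parse_atom
    if peek == "{"
      advance
      [:block, within(:block) { parse_expr }]
    elsif closes_enclosing_context? || %w[+ *].include?(peek)
      # No operand here: insert a placeholder node without consuming input.
      @errors << "expected an expression"
      [:missing]
    else
      [:name, advance]
    end
  end
end

parser = TinyParser.new(%w[def bar { value + baz * end])
p parser.parse_def
p parser.errors
```

In this run the `end` token is recognized as closing the enclosing `def`, so the block stops instead of consuming it. The missing operand becomes a `[:missing]` placeholder, the missing `}` is recorded as an error, and a tree is still returned.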
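
Two of the ambiguities listed on slides 62 to 66 (unary/binary operators with spaces, and local variables versus method calls) can be observed directly in Ruby. This is a small illustration; the method name `foo` and the values are arbitrary.

```ruby
def foo(value = 100)
  value * 2
end

# `foo` is only a method here, so spacing decides how `-` is read.
with_space = foo - 1    # binary minus: foo() - 1
without_space = foo -1  # unary minus: foo(-1)

# Once the parser has seen an assignment, `foo` is a local variable.
foo = 10
as_local = foo -1       # binary minus on the local: 10 - 1

p [with_space, without_space, as_local]
```

Ruby may warn about an ambiguous first argument for the `foo -1` call when warnings are enabled. The parser has to track both whitespace and the set of known local variables to choose between these readings.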
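
Slide 81 raises variable-length integers as a way to shrink the serialized tree. The slides do not say which encoding would be used. As an assumption, this sketch uses unsigned LEB128, one common scheme, and the helper names `encode_varint` and `decode_varint` are hypothetical.

```ruby
# Unsigned LEB128: 7 payload bits per byte, with the high bit set on every
# byte except the last.
def encode_varint(value)
  raise ArgumentError, "value must be non-negative" if value.negative?

  bytes = []
  loop do
    byte = value & 0x7F
    value >>= 7
    if value.zero?
      bytes << byte
      return bytes.pack("C*")
    end
    bytes << (byte | 0x80)
  end
end

# Returns [value, number_of_bytes_consumed], reading from `offset`.
def decode_varint(bytes, offset = 0)
  value = 0
  shift = 0
  index = offset
  loop do
    byte = bytes.getbyte(index)
    raise ArgumentError, "truncated varint" if byte.nil?

    index += 1
    value |= (byte & 0x7F) << shift
    return [value, index - offset] if byte < 0x80

    shift += 7
  end
end

encoded = encode_varint(300)
p encoded.bytesize              # size as a varint
p [300].pack("L<").bytesize     # size as a fixed 32-bit integer
p decode_varint(encoded)
```

Small values, such as offsets into short source files, take one or two bytes in this scheme instead of a fixed four, at the cost of a loop on every read.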