JRuby Startup and AOT

headius
February 01, 2020

Rubyists work from a command line, which makes JRuby startup time a critical concern. Traditionally, the JVM has not been optimized for startup, but that's changing. This talk will explore all available options for making a heavy runtime like JRuby start up quickly, from using class data caching services like Hotspot's CDS and OpenJ9's Quickstart to ahead-of-time compilation of JRuby using GraalVM's Native Image. We'll compare approaches and trade-offs.

Delivered on February 1, 2020 at FOSDEM 2020 in Brussels, Belgium

Transcript

  1. JRuby Startup and AOT
    Thomas E. Enebo (@tom_enebo)
    Charles O. Nutter (@headius)

  2. • JRuby co-leads
    • Red Hat
    [Diagram labels: Charles, Thomas, Ruby, Java, Beer]

  3. What is JRuby?
    An implementation of the Ruby language and runtime
    def ruby
      puts "It is pretty cool..."
    end

  4. Why JRuby?
    • Native Threads
    • Access Java Libraries from within Ruby
    • The power of Java (GCs, Profiled Opts, Portability)
    • All the things we know and love about Java….BUT…

  5. …startup time is a problem
    • Java is not known for short-lived command-line executions
    • Rubyists constantly run ruby from the command line
    • They expect it to be short-lived
    • This is our perennial issue to solve

  6. Many Ruby Core Classes
    % jruby -e 1
    415 classes/modules defined
    300+ come from Ruby source

  7. Many Java Classes
    • More than 6400 Java classes loaded for -e 1
    • Nearly 5000 are from JRuby

  8. Dynamic Language
    • Ruby is a dynamic expression-based language
    • No ahead-of-time knowledge of what will load
    require 'normal'
    if something_special(oh_noes)
      require 'mystery_module'
    end

  9. Path Searching Hell
    More Paths More Problems!
    ["/home/enebo/work/jruby/frogger/lib", "/home/enebo/work/jruby/frogger/vendor", "/home/enebo/work/jruby/frogger/app/channels", "/home/enebo/work/jruby/frogger/app/controllers", "/home/enebo/work/jruby/frogger/app/
    controllers/concerns", "/home/enebo/work/jruby/frogger/app/helpers", "/home/enebo/work/jruby/frogger/app/jobs", "/home/enebo/work/jruby/frogger/app/mailers", "/home/enebo/work/jruby/frogger/app/models", "/home/enebo/
    work/jruby/frogger/app/models/concerns", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/turbolinks-5.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/webpacker-4.2.2/lib", "/home/enebo/work/jruby/lib/ruby/
    gems/shared/gems/actiontext-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/actiontext-6.0.2.1/app/helpers", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/actiontext-6.0.2.1/app/models", "/home/enebo/
    work/jruby/lib/ruby/gems/shared/gems/actionmailbox-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/actionmailbox-6.0.2.1/app/controllers", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/
    actionmailbox-6.0.2.1/app/jobs", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/actionmailbox-6.0.2.1/app/models", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/actioncable-6.0.2.1/lib", "/home/enebo/work/jruby/lib/
    ruby/gems/shared/gems/activestorage-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activestorage-6.0.2.1/app/controllers", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activestorage-6.0.2.1/app/
    controllers/concerns", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activestorage-6.0.2.1/app/jobs", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activestorage-6.0.2.1/app/models", "/home/enebo/work/jruby/lib/
    ruby/gems/shared/gems/actionview-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/bundler-2.0.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/webdrivers-4.2.0/lib", "/home/enebo/work/jruby/lib/ruby/
    gems/shared/gems/web-console-4.0.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/tzinfo-data-1.2019.3/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/turbolinks-source-5.2.0/lib", "/home/enebo/work/jruby/
    lib/ruby/gems/shared/gems/selenium-webdriver-3.142.7/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/sass-rails-6.0.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/sassc-rails-2.1.2/lib", "/home/enebo/work/
    jruby/lib/ruby/gems/shared/gems/tilt-2.0.10/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/sassc-2.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/extensions/universal-java-13/2.5.0/sassc-2.2.1", "/home/enebo/
    work/jruby/lib/ruby/gems/shared/gems/rubyzip-2.1.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/rails-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/sprockets-rails-3.2.1/lib", "/home/enebo/work/
    jruby/lib/ruby/gems/shared/gems/sprockets-4.0.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/railties-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/thor-1.0.1/lib", "/home/enebo/work/jruby/lib/ruby/
    gems/shared/gems/rack-proxy-0.6.5/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/puma-4.3.1-java/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/method_source-0.9.2/lib", "/home/enebo/work/jruby/lib/
    ruby/gems/shared/gems/listen-3.1.5/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/ruby_dep-1.5.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/rb-inotify-0.10.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/
    shared/gems/rb-fsevent-0.10.3/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/jbuilder-2.9.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/ffi-1.12.1-java/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/
    gems/childprocess-3.0.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/capybara-3.31.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/xpath-3.2.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/
    regexp_parser-1.6.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/bindex-0.8.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/extensions/universal-java-13/2.5.0/bindex-0.8.1", "/home/enebo/work/jruby/lib/ruby/
    gems/shared/gems/addressable-2.7.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/public_suffix-4.0.3/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activerecord-jdbcsqlite3-adapter-60.1-java/lib", "/home/
    enebo/work/jruby/lib/ruby/gems/shared/gems/jdbc-sqlite3-3.28.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activerecord-jdbc-adapter-60.1-java/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/
    actionmailer-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/mail-2.7.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/mini_mime-1.0.2/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/
    marcel-0.3.3/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/mimemagic-0.3.3/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activerecord-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/
    activemodel-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activejob-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/globalid-0.4.2/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/
    websocket-driver-0.7.1-java/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/websocket-extensions-0.1.4/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/nio4r-2.5.2-java/lib", "/home/enebo/work/jruby/lib/ruby/
    gems/shared/gems/actionpack-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/rack-test-1.1.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/rack-2.1.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/
    shared/gems/rails-html-sanitizer-1.3.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/loofah-2.4.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/crass-1.0.6/lib", "/home/enebo/work/jruby/lib/ruby/gems/
    shared/gems/rails-dom-testing-2.0.3/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/nokogiri-1.10.7-java/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/erubi-1.9.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/
    shared/gems/builder-3.2.4/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/activesupport-6.0.2.1/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/zeitwerk-2.2.2/lib", "/home/enebo/work/jruby/lib/ruby/gems/
    shared/gems/tzinfo-1.2.6/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/thread_safe-0.3.6-java/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/minitest-5.14.0/lib", "/home/enebo/work/jruby/lib/ruby/gems/
    shared/gems/i18n-1.8.2/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/concurrent-ruby-1.1.5/lib", "/home/enebo/work/jruby/lib/ruby/gems/shared/gems/rake-13.0.1/lib", "/home/enebo/work/jruby/lib/ruby/2.5/site_ruby", "/
    home/enebo/work/jruby/lib/ruby/stdlib"]
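To make the cost of a long load path concrete, here is a minimal sketch of a require-style search that counts filesystem probes. It is not JRuby's actual load service: `find_feature`, `demo_probe_count`, and the two-entry `SUFFIXES` list are assumptions made for illustration.

```ruby
require "tmpdir"
require "fileutils"

# Assumed suffix list; a real runtime may probe more (or different) suffixes.
SUFFIXES = [".rb", ".jar"].freeze

# Walk the load path in order, probing every suffix in every directory
# until the feature is found. Returns [path_or_nil, probe_count].
def find_feature(load_path, feature)
  probes = 0
  load_path.each do |dir|
    SUFFIXES.each do |suffix|
      probes += 1
      candidate = File.join(dir, feature + suffix)
      return [candidate, probes] if File.file?(candidate)
    end
  end
  [nil, probes]
end

# Build dir_count temporary lib directories, put the target file in the
# last one, and report how many probes the search needed.
def demo_probe_count(dir_count)
  Dir.mktmpdir do |root|
    dirs = Array.new(dir_count) { |i| File.join(root, "gem#{i}", "lib") }
    dirs.each { |d| FileUtils.mkdir_p(d) }
    File.write(File.join(dirs.last, "target.rb"), "")
    _, probes = find_feature(dirs, "target")
    probes
  end
end

puts demo_probe_count(100)
```

With 100 directories and two suffixes, a file that lives in the last directory costs 199 probes, and every `require` that misses the early entries pays a similar price.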

  10. Everything Starts Cold
    • Critical code (like the parser) takes time to warm up
    [Diagram: inside the Java Virtual Machine, Java instructions (Java bytecode) are first executed by the bytecode interpreter, then C1-compiled to native code, then C2-compiled to better native code; compiled code can deoptimize back to the interpreter]

  11. total execution time (lower is better)
    [Bar chart, -e 1: CRuby, JRuby (JDK8), JRuby (10th iter), JRuby (50th iter)]

  12. total execution time (lower is better)
    [Bar chart, gem --version and gem list (~350 gems): CRuby, JRuby (JDK8), JRuby (10th iter)]

  13. Other Implementations
    • CRuby: bytecode compile and interpret
    • Low peak performance but fastest startup across the board
    • TruffleRuby: AST interpret + Futamura projection via Graal
    • Extremely long startup in JVM mode (10x JRuby)
    • Native compile base startup =~ CRuby
    • Apps still slow to start (2-3x JRuby)

  14. A History of JRuby Startup

  15. In the beginning…
    Circa Java 1.4
    All hail the interpreted AST runtime
    [Diagram: Ruby (.rb) → Lexer → Parser → interpreter (interpret)]
    Ruby source:
      def foo
        puts "hello"
      end
      foo
    Lexer output:
      keyword_def,tIDENTIFIER[foo],NL
      tIDENTIFIER[puts],tSTRING_BEG,
      tSTRING_CONTENT[hello],tSTRING_END,NL
      keyword_end,NL
      tIDENTIFIER[foo],EOF
    Parser output (AST nodes): Root; Method: Foo; Call: Puts “hello”; Call: “foo”

  16. AST Runtime
    • Startup was ok for the time
    • Ruby was much simpler back then
    • Quick tasks were ~2-3x slower than C Ruby but very short runs
    • Peak performance was not good but was better than C Ruby

  17. Lexer Optimization
    • Serialize lexer stream
    • Java 1.4 boost was significant (~25% faster)
    • Java 5 dropped improvement to <5%…death to serialization!
    [Diagram: Ruby (.rb) → Lexer → Parser → interpreter (interpret), with Ruby (.ser) as the serialized lexer stream]

  18. AST JIT Added
    • Helped keep performance edge over C Ruby 1.9
    • Quick tasks startup time unaffected
    • Longer tasks could be affected if they contained hot code
    • This helped startup time of large web applications
    [Diagram: Ruby (.rb) → Lexer → Parser → interpreter (interpret); code called enough to compile goes to the Compiler]

  19. Honorable Mention…Bytecode AOT
    • AOT all the things to class files!
    • Class verification on: insanely slow
    • Class verification off: slower than manually parsing Ruby
    • More on this technique later…

  20. JRuby adds its own IR
    • Startup decreased 15-25% on short runs
    • Even better Steady State performance
    • Longer app startup potentially improves more
    [Diagram: Ruby (.rb) → Lexer → Parser → IR Builder → Startup Instructions → Full Instructions → Bytecode Generation]
    Instructions and passes…performance for the masses

  21. Lazy Building of Methods
    • The AST is smaller than the IR built from that tree
    • Do not build a method until its first call
    • Saves memory (40% of Rails methods are never called)
    • Eliminates build time (15-25%) for methods that are never called
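The build-on-first-call idea can be sketched in a few lines of Ruby. This is an illustration only, not JRuby's implementation: `LazyMethod` and its builder block are hypothetical stand-ins for "keep the AST, build the IR when the method is first invoked".

```ruby
# Hypothetical wrapper: the builder block stands in for the expensive
# AST-to-IR build step, which is deferred until the first call.
class LazyMethod
  attr_reader :build_count

  def initialize(name, &builder)
    @name = name
    @builder = builder
    @body = nil
    @build_count = 0
  end

  # True once the body has been built.
  def built?
    !@body.nil?
  end

  # Build the body on the first call only, then reuse it.
  def call(*args)
    @body ||= begin
      @build_count += 1
      @builder.call
    end
    @body.call(*args)
  end
end

add = LazyMethod.new(:add) { ->(a, b) { a + b } }
unused = LazyMethod.new(:unused) { -> { :never } }

puts add.call(1, 2)     # first call triggers the build
puts add.call(3, 4)     # later calls reuse the built body
puts add.build_count    # the build ran once
puts unused.built?      # a method that is never called is never built
```

A method that is never called pays neither the build time nor the memory for its built form, which is where the savings on the slide come from.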

  22. JRuby adds IR Serialization
    [Diagram: Ruby (.rb) → Lexer → Parser → IR Builder → Startup Instructions → Full Instructions → Bytecode Generation, plus serialized IR, Ruby (.ir)]
    Silver bullet?

  23. IR Serialization
    • SLOWER THAN PARSING FROM SCRATCH…barely

  24. Add --dev mode
    • Only use IR startup instructions in interpreter
    • Disable C2 (-XX:TieredStopAtLevel=1)
    • Disable bytecode verification
    • Most significant startup optimization to date

  25. Dev Mode Flag
    total execution time (lower is better)
    [Bar chart, gem list (~350 gems): JRuby, JRuby --dev]

  26. Native Compilers
    • Generate a JRuby executable
    • Excelsior JET and jaotc
    • Very slightly faster than --dev…but not quite enough
    • There are new developments here...

  27. Startup Experiments
    • Preboot or reuse JVM process
    • Lazy deserialize and IR constant pooling
    • Precompile to JVM bytecode

  28. Pre-booting Options
    • Nailgun - reuse same JVM repeatedly
    • Poor resource cleanup for badly-behaved apps
    • Drip - pre-boot next JVM in background
    • Troublesome TTY issues, stale instances
    • CRIU - checkpoint and restore in userspace
    • Linux only, still in proof-of-concept stage for JVM apps

  29. Lazy Instruction Deserialization
    • All Scopes (in IR) are needed for implementation reasons
    • …but startup instructions are only needed if they are used
    [Diagram: IRScope containing Flags, StaticScope, Closures, … and Instructions; Instructions are marked "Decode on demand"]

  30. [Bar chart: gem list (20 gems), Java 13.0.2+8]
    Series: --dev; old serialize + --dev; lazy serialize --dev
    Values shown on bars: 1.32, 1.61, 1.41

  31. [Bar chart: gem list (2000 gems), Java 13.0.2+8]
    Series: --dev; old serialize + --dev; lazy serialize --dev
    Values shown on bars: 3.87, 4.07, 3.91

  32. [Bar chart: rails console, Java 13.0.2+8]
    Series: --dev; old serialize + --dev; lazy serialize --dev
    Values shown on bars: 13.7, 14, 14.3

  33. Add constant pooling
    • Serializer encodes/decodes a lot of the same values
    • Move the common Symbol operand type into a per-IRScope constant pool
    • Lazily decode the pool when instructions decode
    • Reduces disk space
    • Reduces extra lookups
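Lazy decoding and constant pooling can be combined in a small sketch. The text encoding below is invented for illustration and is not JRuby's IR serialization format; `Instr`, `encode_naive`, `encode_pooled`, and `LazyScope` are hypothetical names.

```ruby
# One instruction: an opcode plus a single Symbol operand.
Instr = Struct.new(:op, :operand)

# Naive encoding: every instruction spells out its operand in full.
def encode_naive(instrs)
  instrs.map { |i| "#{i.op} #{i.operand}" }.join("\n")
end

# Pooled encoding: each distinct operand is written once in a per-scope
# pool, and instructions refer to it by index.
def encode_pooled(instrs)
  pool = instrs.map(&:operand).uniq
  index = pool.each_with_index.to_h
  body = instrs.map { |i| "#{i.op} #{index[i.operand]}" }.join("\n")
  "#{pool.join(',')}\n\n#{body}"
end

# A scope that keeps the encoded form and decodes pool and instructions
# only when the instructions are first requested.
class LazyScope
  def initialize(encoded)
    @encoded = encoded
    @instrs = nil
  end

  def decoded?
    !@instrs.nil?
  end

  def instructions
    @instrs ||= begin
      pool_part, body = @encoded.split("\n\n", 2)
      pool = pool_part.split(",").map(&:to_sym)
      body.split("\n").map do |line|
        op, idx = line.split(" ")
        Instr.new(op.to_sym, pool[Integer(idx)])
      end
    end
  end
end

instrs = [
  Instr.new(:call, :specification),
  Instr.new(:call, :specification),
  Instr.new(:call, :requirement),
  Instr.new(:call, :specification)
]

pooled = encode_pooled(instrs)
scope = LazyScope.new(pooled)

puts pooled.bytesize < encode_naive(instrs).bytesize  # pooling saves space
puts scope.decoded?                                   # nothing decoded yet
puts scope.instructions == instrs                     # decode on demand
puts scope.decoded?
```

The more often a symbol repeats within a scope, the more the pool saves, and a scope whose instructions are never requested never pays for decoding either the pool or the instructions.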

  34. [Bar chart: gem list (20 gems), Java 13.0.2+8]
    Series: --dev; old serialize + --dev; lazy serialize --dev; lazy + pool
    Values shown on bars: 1.27, 1.32, 1.61, 1.41

  35. [Bar chart: gem list (2000 gems), Java 13.0.2+8]
    Series: --dev; old serialize + --dev; lazy serialize --dev; lazy + pool
    Values shown on bars: 3.64, 3.87, 4.07, 3.91

  36. [Bar chart: rails console, Java 13.0.2+8]
    Series: --dev; old serialize + --dev; lazy serialize --dev; lazy + pool
    Values shown on bars: 13.6, 13.7, 14, 14.3

  37. JVM Bytecode Compiler
    • Used for JIT at runtime
    • 50 call compile threshold
    • Background compiler threads
    • Also supports compiling entire scripts
    • Used for "main" script by default
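A call-count threshold like the one above can be sketched as follows. Only the threshold of 50 comes from the slide; `CountingMethod` is a hypothetical name, the "compiled" body is a stand-in lambda rather than generated bytecode, and compilation here is synchronous, whereas JRuby hands it to background threads.

```ruby
# Hypothetical sketch of threshold-triggered compilation.
class CountingMethod
  JIT_THRESHOLD = 50

  attr_reader :calls

  def initialize(&interpreted)
    @interpreted = interpreted
    @compiled = nil
    @calls = 0
  end

  def compiled?
    !@compiled.nil?
  end

  # Count each call; once the count reaches the threshold, switch from
  # the interpreted body to the "compiled" one.
  def call(*args)
    @calls += 1
    @compiled ||= compile if @calls >= JIT_THRESHOLD
    (@compiled || @interpreted).call(*args)
  end

  private

  # Stand-in for bytecode generation: wraps the interpreted body.
  def compile
    body = @interpreted
    ->(*args) { body.call(*args) }
  end
end

double = CountingMethod.new { |x| x * 2 }
49.times { double.call(1) }
puts double.compiled?   # still below the threshold
double.call(1)
puts double.compiled?   # the 50th call crossed it
```

Short command-line runs rarely call anything 50 times, which is why most code in such runs never leaves the interpreter.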

  38. Precompiling Goals
    • JVM, JRuby initialization largely unchanged
    • Read, parse, IR compile, IR interpret eliminated
    • Reduce JVM classes and IR state on heap
    • Reduce startup (maybe) and warmup (probably) by skipping JIT

  39. Normal JIT mode:
    $ jruby -Xjit.logging -S gem -v
    2020-01-31T13:44:10.835+01:00 [main] INFO Ruby : done compiling target script: /Users/headius/projec
    2020-01-31T13:44:10.981+01:00 [Ruby-0-JIT-1] INFO JITCompiler : block done jitting: Gem::Specificati
    2020-01-31T13:44:10.989+01:00 [Ruby-0-JIT-1] INFO JITCompiler : method done jitting: Gem::StubSpecif
    2020-01-31T13:44:10.993+01:00 [Ruby-0-JIT-1] INFO JITCompiler : method done jitting: Gem::BasicSpeci
    2020-01-31T13:44:10.996+01:00 [Ruby-0-JIT-1] INFO JITCompiler : method done jitting: Gem::BasicSpeci
    2020-01-31T13:44:10.998+01:00 [Ruby-0-JIT-1] INFO JITCompiler : block done jitting: Gem::Specificati
    2020-01-31T13:44:11.000+01:00 [Ruby-0-JIT-1] INFO JITCompiler : block done jitting: Gem::Specificati
    Force compile mode:
    $ jruby -Xjit.logging -X+C -S gem -v
    2020-01-31T13:44:20.839+01:00 [main] INFO Ruby : done compiling target script: uri:classloader:/jrub
    2020-01-31T13:44:20.861+01:00 [main] INFO Ruby : done compiling target script: uri:classloader:/jrub
    2020-01-31T13:44:20.919+01:00 [main] INFO Ruby : done compiling target script: uri:classloader:/jrub
    2020-01-31T13:44:20.940+01:00 [main] ERROR Ruby : failed to compile target script: uri:classloader:/
    2020-01-31T13:44:20.947+01:00 [main] INFO Ruby : done compiling target script: uri:classloader:/jrub
    2020-01-31T13:44:21.008+01:00 [main] INFO Ruby : done compiling target script: file:/Users/headius/p

  40. $ jruby -Xcompile.cache.classes -Xcompile.cache.classes.logging -X+C -S gem -v
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/java.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/java/core_ext.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/java/core_ext/module.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/java/java_ext.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/jruby_dot_jar/jruby/jruby.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/kernel.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/process.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/io.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/gc.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/range.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/jruby_dot_jar/jruby/preludes.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/prelude.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/enc_prelude.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/unicode_normalize.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/gem_prelude.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rbconfig.class
    saved compiled script as /Users/headius/.ir/uri_3a_classloader_3a_/jruby/kernel/rbconfig.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/compatibility.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/defaults.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/errors.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/version.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/requirement.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/platform.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/basic_specification.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/stub_specification.class
    saved compiled script as /Users/headius/.ir/Users/headius/projects/jruby/lib/ruby/stdlib/rubygems/util/list.class

  41. Does it Work?
    • rails generate scaffold produces over 1200 class files
    • Cached size exceeds 80MB
    • At command line most of this never JITs
    • JVM interpreter usually slower than our interpreter
    • Trace bytecode execution to measure improvement

  42. TraceBytecodes
    • Debug option -XX:+TraceBytecodes
    • Print out all bytecodes as executed
    • Use -Xint to also capture bytecodes that JIT
    • Claes Redestad's bytestacks: https://github.com/cl4es/bytestacks
    • Looking only at JVM in JIT mode
    • Need to quantify cold execution

  43. Cold Bytecodes for -e 1
    [Bar chart: number of bytecodes executed (cold, in millions)]
    Bars: -e 1; -e 1 cached
    Legend: JVM, JRuby base, JRuby libs, Other

  44. Cold bytecodes for gem list
    [Bar chart: number of bytecodes executed (cold, in millions)]
    Bars: gem list; gem list cached
    Legend: JVM, JRuby base, RubyGems, Other, gem list

  45. InvokeDynamic Usage
    • Normal mode: Ruby literals, constant lookups, global vars
    • Indy mode: method, block invocations
    • InvokeDynamic + MethodHandle are expensive in JVM interpreter
    • LambdaForms never get a chance to optimize
    • More bytecode than manual lookup + virtual call varargs
    • Enabling full indy runs 48M cold bytecodes vs 33M for "-e 1"

  46. AOT Mode
    • AOT mode: No indy at all
    • A bit more bytecode generated
    • Only direct method handles (no LambdaForms)
    • (Constant lookup still using indy, will fix later)
    • Cold bytecodes reduced vs normal precompile

  47. Cold Bytecodes for -e 1
    [Bar chart: number of bytecodes executed (cold, in millions)]
    Bars: -e 1; -e 1 cached; -e 1 cached2
    Legend: JVM, JRuby base, JRuby libs, Other

  48. Cold bytecodes for gem list
    [Bar chart: number of bytecodes executed (cold, in millions)]
    Bars: gem list; gem list cached; gem list cached2
    Legend: JVM, JRuby base, RubyGems, Other, gem list

  49. Mix and Match
    • Experimenting with combinations
    • Interp, JIT, AOT modes
    • Serialized IR, precompiled bytecode caches
    • JVM options
    • Disabled verification and AppCDS on Hotspot
    • -Xquickstart and -Xshareclasses on OpenJ9

  50. [Bar chart: gem list (20 gems), Java 13.0.2+8]
    Series: --dev; appcds --dev; lazy serialize appcds --dev; classcache --dev; classcache appcds --dev
    Values shown on bars: 1.42, 1.71, 1.13, 1.12, 1.41

  51. [Bar chart: gem list (2000 gems), Java 13.0.2+8]
    Series: --dev; appcds --dev; lazy serialize appcds --dev; classcache --dev; classcache appcds --dev; j9 quickstart+shareclasses --dev
    Values shown on bars: 3.84, 3.53, 3.98, 3.42, 3.45, 3.91

  52. [Bar chart: rails console, Java 13.0.2+8]
    Series: --dev; appcds --dev; lazy serialize appcds --dev; classcache --dev; classcache appcds --dev; partial classcache appcds --dev
    Values shown on bars: 11.2, 12.5, 15.1, 12.7, 13.5, 14.3

  53. Futures

  54. Native Compilation
    • Native AOT is cool again
    • Time to revive GCJ!
    • We are experimenting with GraalVM native image

  55. total execution time (lower is better)
    [Bar chart, -e 1: JRuby --dev, TruffleRuby JVM, TruffleRuby SVM]
    Values shown on bars: 0.054, 2.472, 1.268

  56. total execution time (lower is better)
    [Bar chart, gem --version and gem list (~350 gems): JRuby --dev, TruffleRuby JVM, TruffleRuby SVM]

  57. GraalVM Native Image
    • Compile all of JRuby to native (working POC)
    • Too many limitations?
    • No invokedynamic, limited reflection, no dynamic classloading, ...
    • Ultimate goal: fully native Ruby apps (no startup or warmup)
    • Good for tools, microservices
    • Still need JVM for peak performance

  58. total execution time (lower is better)
    [Bar chart, -e 1: CRuby, JRuby (JDK8), JRuby (10th iter), JRuby native]
    Values shown on bars: 0.117s, 0.124s, 1.63s, 0.1s

  59. Native Futures
    • Compile Ruby app + library sources to native
    • Needed bytecode AOT to be working
    • Static optimizations
    • Remove unneeded parts of JRuby
    • Probably limited to small services, command line tools

  60. Miniature Ruby subset interpreter
    • When a file uses a restricted subset of Ruby, encode it as a "special miniature language"
    • fcalls with literals (e.g. require "date")
    • module
    • class
    • def (bodies are deserialized lazily, as normal)

  61. Miniature Ruby subset interpreter
    • More complicated Ruby uses ordinary deserialization path
    • Implemented a small static method that warms up quickly
    • Will bypass creating IR directly
    • But IR can be lazily reconstructed

  62. Miniature Ruby subset interpreter
    Ruby source:
      require 'date'
      module Package
        class FooDriver
          def run
          end
        end
        class BarDriver
          def run
          end
        end
      end
    Miniature language:
      fcall :require, 'date'
      create_module :Package
      create_class :FooDriver
      create_method :run, @scopeXXX
      return # goes back to Package
      create_class :BarDriver
      create_method :run, @scopeYYY
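To show how small an interpreter for such a miniature language can be, here is a sketch in Ruby. The operation names follow the slide, but everything else is an assumption: the array-of-arrays op format, the `run_mini` helper, the sandbox module used as the root, and the lambdas that stand in for the lazily deserialized method scopes (`@scopeXXX`, `@scopeYYY`).

```ruby
# Hypothetical interpreter for the miniature language sketched above.
# `root` is the namespace that top-level modules and classes land in.
def run_mini(ops, root)
  stack = [root]  # current lexical nesting of modules and classes
  ops.each do |op, *args|
    case op
    when :fcall
      name, *call_args = args
      root.send(name, *call_args)          # e.g. require "date"
    when :create_module
      mod = Module.new
      stack.last.const_set(args[0], mod)
      stack.push(mod)
    when :create_class
      klass = Class.new
      stack.last.const_set(args[0], klass)
      stack.push(klass)
    when :create_method
      name, body = args
      stack.last.send(:define_method, name, &body)
    when :return
      stack.pop                            # leave the current module or class
    else
      raise ArgumentError, "unknown op: #{op}"
    end
  end
  root
end

sandbox = Module.new
ops = [
  [:fcall, :require, "date"],
  [:create_module, :Package],
  [:create_class, :FooDriver],
  [:create_method, :run, -> { :foo_ran }],
  [:return],
  [:create_class, :BarDriver],
  [:create_method, :run, -> { :bar_ran }]
]
run_mini(ops, sandbox)

package = sandbox.const_get(:Package)
puts package.const_get(:FooDriver).new.run
puts package.const_get(:BarDriver).new.run
```

Because the dispatch loop is a single small method over a handful of operations, there is little code to warm up, which fits the goal stated on the previous slide.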

  63. Summary
    • Precompiling to bytecode works but sometimes hurts startup
    • Class sharing features competitive with --dev without JIT limits
    • Lazier, lighter IR or subset languages show promise
    • Native compile shows promise but difficult to apply to Ruby

  64. Thank You!
    • Charles Nutter @headius, Tom Enebo @tom_enebo
    • https://www.jruby.org
    • https://github.com/jruby/jruby
    • Chat with JRuby devs, users
    • jruby on Matrix
    • #jruby on Freenode IRC
    • Mailing list: https://lists.ruby-lang.org
