Hermes: Better Performance with Bytecode Translation (React Universe 2024)

Introducing the Hermes JIT (bytecode translation) and diving deeper into how it works.

Tzvetan Mikov

September 12, 2024

Transcript

  1. HOW DO YOU THINK HE IS DOING NOW, A YEAR LATER?
    • Probably still not very happy, since we haven’t released Static Hermes yet
  2. HERMES
    • JavaScript engine for React Native
    • Optimized for mobile
    • Low runtime resource consumption
    • Extremely fast startup
  3. KEY FEATURES
    • AOT compilation to bytecode
    • Optimization happens once, before execution
    • Lightweight runtime
    • Low memory footprint
  4. STATIC HERMES BUILDS UPON HERMES
    • Understands type annotations
    • Great performance by compiling typed JS code ahead of time
    • Emits typed bytecode or native machine code
    • State of the art compiler pipeline
    • Leverages the best production native compiler: LLVM
  6. STATIC HERMES: LOGISTICAL CHALLENGES
    • Existing JS build pipelines do not preserve type annotations
    • Shipping native code makes OTA updates harder
    • A lot of untyped code still exists. Amdahl’s law:
      • The performance of a system is limited by untyped JavaScript.
      • The performance of a system is limited by its slowest part.
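The Amdahl's law point on this slide can be made concrete with a little arithmetic. The sketch below is not from the talk: the function name `amdahlSpeedup` and the 40% figure are illustrative assumptions. It computes the overall speedup when only the typed fraction of a program's run time gets faster.

```javascript
// Amdahl's law: overall speedup when a fraction `p` of the run time
// is accelerated by a factor `s` and the remaining (1 - p) is unchanged.
function amdahlSpeedup(p, s) {
  if (p < 0 || p > 1) throw new RangeError("p must be in [0, 1]");
  if (s <= 0) throw new RangeError("s must be positive");
  return 1 / ((1 - p) + p / s);
}

// Hypothetical workload (an assumption, not a measurement):
// 40% of the time is spent in typed code, 60% in untyped code.
const typedFraction = 0.4;

// Typed code becomes 10x faster.
console.log(amdahlSpeedup(typedFraction, 10).toFixed(2)); // 1.56

// Typed code becomes infinitely fast: the untyped 60% still caps the gain.
console.log(amdahlSpeedup(typedFraction, Infinity).toFixed(2)); // 1.67
```

Under these assumed numbers, no amount of typed-code optimization gets past roughly 1.67x, which is why improving untyped performance matters as well.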
  7. BYTECODE TRANSLATION ON DEVICE
    • Bytecode is translated to machine instructions at runtime
    • Ship bytecode like we do today; OTA updates work
    • Improved untyped performance
    • Excellent typed performance
  8. WAIT, ISN’T THAT … A JIT?
    • Technically, yes
    • In JavaScript, “JIT” tends to mean a very complex speculative runtime compiler
    • Bytecode translation is very lightweight by comparison
    • Designed for the Hermes AOT pipeline
  9. [Chart: Untyped JS Benchmarks. Series: Hermes 2023, Hermes 2024. Benchmarks: Box2D, Crypto, Gameboy, Navier-stokes, Richards, N-body, TS Raytracer. Axis ticks: 0 to 1.5.]
  10. [Chart: Typed Benchmarks. Benchmarks: Raytracer, nbody. Legend text as extracted: Hermes Untyped Typed Native. Axis ticks: 0 to 14.]
  11. RENDERING A MANDELBROT SET
    SOMETIMES THE NEW PERFORMANCE IS JUST FUN
    https://github.com/tmikov/mandelbrot-demo
  12. DEEP DIVE: HOW DOES IT WORK?
    A brief tutorial on building a Hermes JIT (but seriously, do not do this at home!)
  13. IN THE BEGINNING: HERMES BYTECODE
    Function<getprop>(2 params, 2 registers):
        LoadParam r0, 1
        GetByIdShort r1, r0, 1, "prop"
        LoadConstUInt8 r0, 100
        Mul r1, r1, r0
        LoadConstUInt8 r0, 1
        SubN r0, r1, r0
        Ret r0

    function getprop(o) { return (o.prop * 100) - 1; }
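To make the listing easier to follow, here is a toy register interpreter for exactly these seven instructions. It is a sketch under assumptions, not Hermes code: the opcode semantics are simplified to plain JavaScript operators, parameter 0 is assumed to be `this`, and the third operand of `GetByIdShort` is assumed to be a cache index and is ignored. The names `getpropBytecode` and `interpret` are made up for this example.

```javascript
// The getprop bytecode from the slide, written as [opcode, ...operands].
const getpropBytecode = [
  ["LoadParam", 0, 1],               // r0 = parameter 1 (the argument `o`)
  ["GetByIdShort", 1, 0, 1, "prop"], // r1 = r0.prop
  ["LoadConstUInt8", 0, 100],        // r0 = 100
  ["Mul", 1, 1, 0],                  // r1 = r1 * r0
  ["LoadConstUInt8", 0, 1],          // r0 = 1
  ["SubN", 0, 1, 0],                 // r0 = r1 - r0
  ["Ret", 0],                        // return r0
];

// Executes one function's bytecode over a small register file.
function interpret(code, registerCount, thisArg, args) {
  const r = new Array(registerCount).fill(undefined);
  const params = [thisArg, ...args]; // assumption: parameter 0 is `this`
  for (const [op, a, b, c, d] of code) {
    switch (op) {
      case "LoadParam": r[a] = params[b]; break;
      case "GetByIdShort": r[a] = r[b][d]; break; // operand c is ignored here
      case "LoadConstUInt8": r[a] = b; break;
      case "Mul": r[a] = r[b] * r[c]; break;
      case "SubN": r[a] = r[b] - r[c]; break;
      case "Ret": return r[a];
      default: throw new Error(`unknown opcode: ${op}`);
    }
  }
  return undefined;
}

console.log(interpret(getpropBytecode, 2, undefined, [{ prop: 5 }])); // 499
```

Note how two registers are enough: `r0` is reused for the parameter and then for both constants, as in the slide.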
  17. Function<getprop>
        LoadParam r0, 1
        GetByIdShort r1, r0, 1, "prop"
        LoadConstUInt8 r0, 100
        Mul r1, r1, r0
        LoadConstUInt8 r0, 1
        SubN r0, r1, r0

    Translated machine code:
        mov x0, x19
        mov w1, 1
        bl _sh_ljs_param             ; function call
        str x0, [x20, 16]
        mov x0, x19
        add x1, x20, 16
        mov w2, 658
        ldr x3, [RO_DATA]
        add x3, x3, 16
        bl _sh_ljs_get_by_id_rjs     ; function call
        str x0, [x20, 24]
        mov x1, 100
        str x1, [x20, 16]
        mov x0, x19
        add x1, x20, 24
        add x2, x20, 16
        bl _sh_ljs_mul_rjs           ; function call
        fmov d1, x0
        fmov d0, 1
        fsub d0, d1, d0
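The translation shown here works one bytecode instruction at a time, with each instruction expanding to a short fixed sequence that mostly calls into the runtime. The sketch below imitates that idea in JavaScript by emitting JavaScript source instead of ARM64 instructions. It is an analogy only: the `runtimeHelpers` object, the `templates` table, and `translate` are hypothetical names, and none of this reflects how Hermes is implemented internally.

```javascript
// Hypothetical runtime helpers, standing in for the functions reached via `bl`.
const runtimeHelpers = {
  calls: 0,
  param(params, index) { this.calls++; return params[index]; },
  getById(obj, name) { this.calls++; return obj[name]; },
  mul(a, b) { this.calls++; return a * b; },
};

// One fixed code template per opcode. SubN is emitted inline, like `fsub`.
const templates = {
  LoadParam: (d, i) => `r[${d}] = rt.param(params, ${i});`,
  GetByIdShort: (d, o, _cache, name) =>
    `r[${d}] = rt.getById(r[${o}], ${JSON.stringify(name)});`,
  LoadConstUInt8: (d, v) => `r[${d}] = ${v};`,
  Mul: (d, x, y) => `r[${d}] = rt.mul(r[${x}], r[${y}]);`,
  SubN: (d, x, y) => `r[${d}] = r[${x}] - r[${y}];`,
  Ret: (s) => `return r[${s}];`,
};

// Translate a bytecode array into a callable function, instruction by instruction.
function translate(code, registerCount) {
  const body = code
    .map(([op, ...operands]) => templates[op](...operands))
    .join("\n");
  const fn = new Function(
    "rt", "params",
    `const r = new Array(${registerCount});\n${body}`
  );
  // Assumption: parameter 0 is `this`, parameter 1 is the first argument.
  return (thisArg, ...args) => fn(runtimeHelpers, [thisArg, ...args]);
}

const translatedGetprop = translate([
  ["LoadParam", 0, 1],
  ["GetByIdShort", 1, 0, 1, "prop"],
  ["LoadConstUInt8", 0, 100],
  ["Mul", 1, 1, 0],
  ["LoadConstUInt8", 0, 1],
  ["SubN", 0, 1, 0],
  ["Ret", 0],
], 2);

console.log(translatedGetprop(undefined, { prop: 5 })); // 499
console.log(runtimeHelpers.calls); // 3
```

The three helper calls per invocation mirror the three "function call" annotations on the slide: parameter load, property lookup, and generic multiply.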
  23. WE CAN DO EVEN BETTER
    • There were lots of function calls
    • Function calls can be relatively expensive
    • JS is a funny language
    • Almost everything is valid. But:
      • Frequent patterns are cheap: 123 * 456
      • Uncommon ones are expensive: "Joe Native" * [2,3]
  24. FAST AND SLOW PATHS
    • Split an expensive operation into two parts:
      • A slow path function call for all complicated and weird cases (“string” * [1,2,3])
      • A fast path for the simple and fast cases (123 * 456)
  25. FAST AND SLOW PATHS
    Mul r1, r1, r0

        str x0, [x20, 24]
        str x1, [x20, 16]
        cmp x0, x21                  ; CHECK
        b.hs SLOW_1
        fmov d0, x1                  ; FAST PATH
        fmov d1, x0
        fmul d1, d1, d0
    CONT_1:
        ...
    SLOW_1:                          ; SLOW PATH
        mov x0, x19
        add x1, x20, 24
        add x2, x20, 16
        bl _sh_ljs_mul_rjs
        fmov d1, x0
        b CONT_1
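The same check / fast path / slow path split can be sketched in JavaScript. This is an analogy, not the code Hermes emits: the `typeof` test stands in for the tag comparison (`cmp` / `b.hs`), and `mulSlow` is a hypothetical stand-in for the runtime helper that handles every other case.

```javascript
let slowPathCalls = 0;

// Slow path: one out-of-line call that handles every operand combination
// with full JavaScript semantics (string-to-number conversion, arrays, ...).
function mulSlow(a, b) {
  slowPathCalls++;
  return a * b;
}

// Check, then take the inline fast path or fall back to the slow path.
function mul(a, b) {
  if (typeof a === "number" && typeof b === "number") {
    return a * b; // fast path: both operands are already numbers
  }
  return mulSlow(a, b); // slow path: anything else
}

console.log(mul(123, 456));             // 56088 (fast path)
console.log(mul("Joe Native", [2, 3])); // NaN   (slow path)
console.log(mul("6", [7]));             // 42    (slow path)
console.log(slowPathCalls);             // 2
```

Only the two unusual multiplications pay for the extra call; the common numeric case never leaves the inline code.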
  26. COMMON EXECUTION TRACE
        LoadConstUInt8 r0, 100
        Mul r1, r1, r0
        LoadConstUInt8 r0, 1
        SubN r0, r1, r0

        mov x1, 100
        str x0, [x20, 24]
        str x1, [x20, 16]
        cmp x0, x21
        b.hs SLOW_1
        fmov d0, x1
        fmov d1, x0
        fmul d1, d1, d0
        fmov d0, 1.0
        fsub d0, d1, d0

    No calls!
  27. STATIC HERMES
    • Static Hermes understands type annotations
    • Emits bytecode instructions that know the types of their operands
    • Typed bytecode results in much faster machine instruction sequences
  28. HERMES TYPED BYTECODE
    Function<getprop>:
        LoadParam r0, 1
        GetOwnBySlotIdx r1, r0, 0
        LoadConstUInt8 r0, 100
        MulN r1, r1, r0
        LoadConstUInt8 r0, 1
        SubN r0, r1, r0
        Ret r0

    type Obj = {prop: number};
    function getprop(o: Obj): number { return (o.prop * 100) - 1; }
  29. TYPED VS UNTYPED BYTECODE
    Untyped:
        LoadParam r0, 1
        GetByIdShort r1, r0, 1, "prop"
        LoadConstUInt8 r0, 100
        Mul r1, r1, r0
        LoadConstUInt8 r0, 1
        SubN r0, r1, r0
        Ret r0

    Typed:
        LoadParam r0, 1
        GetOwnBySlotIdx r1, r0, 0
        LoadConstUInt8 r0, 100
        MulN r1, r1, r0
        LoadConstUInt8 r0, 1
        SubN r0, r1, r0
        Ret r0
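The two listings differ in only two instructions: the property read (`GetByIdShort` versus `GetOwnBySlotIdx`) and the multiply (`Mul` versus `MulN`). The sketch below illustrates why the property read gets cheaper once the layout is known. The object model here (a shared name-to-slot map plus a slot array) and the names `makeObject`, `getById`, and `getOwnBySlotIdx` are assumptions made for illustration, not Hermes's actual object representation.

```javascript
// Hypothetical object model: a shared shape maps property names to slot
// indices, and each object stores its values in a slot array.
function makeObject(shape, values) {
  return { shape, slots: values };
}

const objShape = new Map([["prop", 0]]);

// Untyped access: the layout is unknown, so the name is looked up at run time.
function getById(obj, name) {
  const index = obj.shape.get(name);
  return index === undefined ? undefined : obj.slots[index];
}

// Typed access: the compiler already knows `prop` lives in slot 0,
// so the read is a direct indexed load with no name lookup.
function getOwnBySlotIdx(obj, slotIndex) {
  return obj.slots[slotIndex];
}

const sample = makeObject(objShape, [5]);
console.log(getById(sample, "prop") * 100 - 1);    // 499
console.log(getOwnBySlotIdx(sample, 0) * 100 - 1); // 499
```

Both reads produce the same value; the typed one simply skips the work of finding where `prop` is stored.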
  30. Function<getprop>
        LoadParam r0, 1
        GetOwnBySlotIdx r1, r0, 0
        LoadConstUInt8 r0, 100
        MulN r1, r1, r0
        LoadConstUInt8 r0, 1
        SubN r0, r1, r0

    Translated machine code:
        ldur x0, [x20, -72]
        mov x1, x0
        and x1, x1, 0x0000ffffffffffff
        ldr x1, [x1, 48]
        mov x0, 100.0
        fmov d0, x1
        fmov d1, x0
        fmul d0, d0, d1
        fmov d1, 1.0
        fsub d1, d0, d1
  31. TAKEAWAYS
    • Moderate speedups for untyped code
    • Can be used for existing code and npm modules
    • Great speedups when using static types
    • React Native will use it for framework hot code
    • Developers could optionally use types to speed up non-framework code
  32. HERMES V2: BUT WHEN?
    • When will all of this be enabled in RN by default?
    • As usual, everything is available on our GitHub
    • We follow a process where we release to RN after we have tested things internally
  33. HERMES V2: BUT WHEN?
    • In order, from sooner to later:
      • Better language support (classes, etc.)
      • Bytecode Translation
      • Static types