

The Call of C-Tooling: The Secrets Behind Native Image Building (JBCNConf 2022)

(JBCNConf 2022 Version)

You have learned about the "Closed World Assumption". You live by the rule "Thou Shalt Sparingly Use Reflection". You know that "From The Powerful defineClass Comes Great Responsibility". And yet you are still left to wonder: what is still eluding me? What is the secret ingredient that I am still missing? Join us for a short but deeper dive into the dark magic behind GraalVM's native image builder: heap snapshotting and build-time initialization. And learn more about other obscure projects investigating the craft of static Java compilation.

Edoardo Vacchi

July 22, 2022


Transcript

  1. @evacchi About Me
     • Edoardo Vacchi @evacchi
     • Research @ UniMi / MaTe
     • Research @ UniCredit R&D
     • Kogito / Drools / jBPM @ Red Hat
     • evacchi.github.io
  2. @evacchi Java Applications
     [Diagram: Build Time | Run Time]
     • static void Main: 3 Classloaders, ~500 Classes, ~160 Static Init
     • Framework Initialization: 100+ Classloaders, 1000+ Classes, 1000+ Static Init
     • Application Initialization: 100++ Classloaders, 1000++ Classes, 1000++ Static Init
     Source: Dan Heidinga - “Starting Fast” (QCon Plus 2021)
  3. @evacchi Native Java Compilers
     • Compilation into machine code is not innovative per se
     • Prior art: native Java compilers from the early 2000s
       • GNU Compiler for Java (GCJ)
       • Excelsior JET
       • ...
     • More recently: RoboVM (~2013)
  4. @evacchi Pros
     • Native code, possibly faster to start up
     • Smaller memory footprint
       • by avoiding JIT + scratch memory in address space
       • possibly aggressive dead code elimination
     • Self-contained
       • avoid full JDK class library bundle
  5. @evacchi Cons
     Limitations
     • Not a JDK: different runtime environment, not cross-platform
     • May get out of sync with the spec
     • Trade-offs with dynamicity
       • Difference in run-time behavior (dynamic vs static)
       • Possibly need compromises with peak performance (PGO?)
     Moreover
     • The benefits of native compilation are not compelling enough
       • Startup time is negligible
       • "You boot up your application once, you keep it running for a long time"
       • "Disk is cheap"
       • Dynamic Linking vs Static Linking
     • You can still achieve faster startup time through laziness
  6. @evacchi Laziness
     • Defer initialization to a later stage of execution
     • Benefits: shorter startup time
     • Downsides: less predictable performance profile
     [Diagram: Build Time | Run Time: static void Main, Framework Initialization, Application Initialization, Delayed Inits...]
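
One common way to defer initialization in plain Java is the initialization-on-demand holder idiom, which relies on the JVM initializing a class only on first use. The following is a minimal sketch, not code from the talk: the names `LazyConfig`, `Holder` and `EVENTS` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: all names are illustrative, not from the talk.
class LazyConfig {
    // Records the order of events so that the laziness is observable.
    static final List<String> EVENTS = new ArrayList<>();

    // The nested class is not initialized until get() first touches it.
    private static class Holder {
        static final LazyConfig INSTANCE = new LazyConfig();
    }

    private LazyConfig() {
        EVENTS.add("config loaded"); // stands in for expensive start-up work
    }

    static LazyConfig get() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        EVENTS.add("main started");
        LazyConfig.get(); // first use triggers Holder's class initializer
        LazyConfig.get(); // later calls reuse the same instance
        System.out.println(EVENTS); // [main started, config loaded]
    }
}
```

The expensive step moves out of startup and into the first call, which is exactly the trade-off listed on the slide: startup gets shorter, but the cost shows up at a less predictable point later on.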
  7. @evacchi Getting Closer to Today
     • Shared Managed Infrastructure
     • Serverless
     • More interest in “Stateless” Apps
     • Suddenly attractive:
       • Fast Startup
       • Smaller Disk Footprint
       • Smaller Memory Footprint
     • Time to revisit?
  8. @evacchi Run-Time vs Build-Time
     • Generate code at build time
     • Pre-initialize for boot time
       • e.g. read config files, turn them into configuration commands
       • e.g. read annotations, produce code for dependency injection
     • At startup, just execute that code
     • Benefits: faster startup time
     • Downsides
       • you have to write the code that generates code
       • possibly non-trivial, certainly time-consuming
     [Diagram: Build Time: Codegen | Run Time: static void Main, Framework Initialization, Application Initialization]
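
To make the contrast concrete, here is a small sketch of the same wiring done reflectively at run time and done by code that a build-time generator might emit. It is an illustration under stated assumptions: `WiringSketch`, `Greeter`, `EnglishGreeter` and `GENERATED` are hypothetical names, and no real framework's generated output is being reproduced.

```java
import java.util.function.Supplier;

// Hypothetical example: all names are illustrative, not from the talk.
class WiringSketch {
    interface Greeter { String greet(); }

    static class EnglishGreeter implements Greeter {
        public String greet() { return "hello"; }
    }

    // Run-time wiring: look the class up by name and instantiate it reflectively.
    static Greeter wireAtRunTime(String className) {
        try {
            return (Greeter) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot wire " + className, e);
        }
    }

    // What a build-time generator might emit instead: a direct constructor call.
    static final Supplier<Greeter> GENERATED = EnglishGreeter::new;

    public static void main(String[] args) {
        Greeter reflective = wireAtRunTime(EnglishGreeter.class.getName());
        Greeter generated = GENERATED.get();
        System.out.println(reflective.greet() + " " + generated.greet()); // hello hello
    }
}
```

Both paths produce an equivalent object. The generated variant needs no lookup by name at startup, and the compiler can see the constructor call directly.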
  9. @evacchi Smalltalk Environment
     • Concept of image
     • At run time you do not just write code, you manipulate the state of the machine
       • contributing to the environment itself
       • possibly altering it or even turning it upside-down
     • When it is shut down, you do not just save the code you wrote: you persist the state of the machine to the image
     • When you start it, you do not only run a program: the state is restored, and execution resumes from the last saved state
     [Diagram: Run Time: Load State ... Shutdown, Save State]
  10. @evacchi CRIU + Java
      [Diagram: Build Time | Run Time: static void Main, Framework Initialization, Application Initialization, Checkpoint]
      • CRIU: Checkpoint and Restore in Userspace
        • https://www.criu.org
      • Jigawatts:
        • https://github.com/chflood/jigawatts
      • OpenJ9 Snapshot+Restore
        • https://danheidinga.github.io/Everyone_wants_fast_startup
      • CRaC: Coordinated Restore at Checkpoint
        • https://github.com/CRaC/docs#crac
        • https://openjdk.java.net/projects/crac/
  11. @evacchi GraalVM
      • GraalVM is an umbrella of technologies
      • A just-in-time compiler
      • The Truffle framework to implement dynamic languages
        • they can be seamlessly JITted across language boundaries
      • SubstrateVM: the native image builder
        • reuses the compilation backend for Ahead-of-Time compilation
        • static init
        • image heap
  12. @evacchi Native Image Restrictions
      • Native binary compilation
      • Restriction: “closed-world assumption”
        • Limitations on reflection
        • No dynamic code loading: forbidden ClassLoader#defineClass(...byte[]...)
        • Allows more aggressive optimization (e.g., dead code elimination)
      • Static initializers may be eager*!
        • Evaluated at build time!
      * originally opt-out, now opt-in. In some cases default on (e.g. Quarkus)
      [Diagram: Build Time: static void Main, Framework Initialization, Application Initialization | Run Time]
  13. @evacchi
      “
      • We run parts of an application at build time and snapshot the objects allocated by this initialization code, using an iterative approach that is intertwined with points-to analysis.
      • We use points-to analysis results to only AOT-compile the parts of an application that are reachable at run time.
      ”
      Source: Initialize Once, Start Fast: Application Initialization at Build Time (Wimmer et al. OOPSLA 2019)
  14. @evacchi Static initializers

      public class Example {
          static {
              System.out.println("hello");
          }

          public static void main(String... args) {
              System.out.println("world");
          }
      }
  15. @evacchi Static initializers

      $ native-image --initialize-at-build-time Example
      [example:23074] classlist: 1,032.11 ms, 1.18 GB
      [example:23074] (cap): 2,301.26 ms, 1.18 GB
      [example:23074] setup: 3,609.57 ms, 1.69 GB
      hello
      [example:23074] (clinit): 82.45 ms, 1.73 GB
      [example:23074] (typeflow): 3,032.00 ms, 1.73 GB
      [example:23074] (objects): 2,923.76 ms, 1.73 GB
      [example:23074] (features): 129.59 ms, 1.73 GB
      [example:23074] analysis: 6,307.81 ms, 1.73 GB
      [example:23074] universe: 277.17 ms, 1.73 GB
      [example:23074] (parse): 525.88 ms, 1.73 GB
      [example:23074] (inline): 877.57 ms, 1.78 GB
      [example:23074] (compile): 3,842.94 ms, 1.87 GB
      [example:23074] compile: 5,504.45 ms, 1.87 GB
      [example:23074] image: 463.22 ms, 1.87 GB
      [example:23074] write: 176.80 ms, 1.87 GB
      [example:23074] [total]: 17,528.27 ms, 1.87 GB
  18. @evacchi Static initializers

      public class Example {
          static {
              System.out.println("hello");
          }

          public static void main(String... args) {
              System.out.println("world");
          }
      }

      Callouts:
      • A string constant
  19. @evacchi Static initializers

      public class Example {
          static {
              System.out.println("hello");
          }

          public static void main(String... args) {
              System.out.println("world");
          }
      }

      Callouts:
      • A string constant
      • A method invocation
      • Over a PrintStream
  20. @evacchi Static initializers

      public class Example {
          static {
              System.out.println("hello");
          }

          public static void main(String... args) {
              System.out.println("world");
          }
      }

      Callouts:
      • A string constant
      • A method invocation
      • A field resolution
      • Over a subtype of OutputStream
  21. @evacchi Static initializers

      public class Example {
          static {
              System.out.println("hello");
          }

          public static void main(String... args) {
              System.out.println("world");
          }
      }

      Callouts:
      • A string constant
      • A method invocation
      • A field resolution
      • A static class initializer
      • Over a subtype of OutputStream
  22. @evacchi Initialization Code
      First, class initializers are executed.
      • In Java, every class can have a class initializer ("static initializer")
        • represented as a method named <clinit> in the class file
      • It computes the initial value of static fields
      • The developer decides which classes are initialized at image build time
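
As a small illustration of what a class initializer computes, consider the sketch below. The class name `Startup` and its fields are hypothetical. On a regular JVM the initializer runs once, when the class is first used. If such a class were initialized at image build time (for example via `--initialize-at-build-time=Startup`), the expectation is that the value stored in the static field would be the one computed on the build machine, which is why the choice is left to the developer.

```java
// Hypothetical example: the class and field names are illustrative.
class Startup {
    // Assigned by the class initializer (<clinit>), exactly once.
    static final long INITIALIZED_AT;
    static int initCount = 0;

    static {
        INITIALIZED_AT = System.nanoTime(); // value depends on *when* this runs
        initCount++;
    }

    public static void main(String[] args) {
        long first = Startup.INITIALIZED_AT;
        long second = Startup.INITIALIZED_AT;
        System.out.println(first == second); // true: the initializer is not re-run
        System.out.println(initCount);       // 1
    }
}
```

A time stamp is a deliberately awkward choice here: it is the kind of state that is usually better left to run-time initialization (`--initialize-at-run-time=Startup`).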
  23. @evacchi Heap Snapshotting
      • Builds an object graph, i.e., the transitive closure of reachable objects
        • starts with root pointers, e.g. static fields
      • This object graph is written into the native image as the image heap
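
The transitive closure described above can be sketched with a worklist over a toy object graph. This is a simplified model, not how the native image builder is implemented: `Node` stands in for an arbitrary heap object with explicit references, and `snapshot` is a hypothetical helper name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model: Node stands in for an arbitrary heap object.
class HeapSnapshotSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns every object reachable from the roots (the transitive closure).
    static Set<Node> snapshot(List<Node> roots) {
        // Identity-based set: objects are distinguished by reference, not equals().
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            if (seen.add(n)) {           // first visit: follow outgoing references
                worklist.addAll(n.refs);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        Node garbage = new Node("garbage");
        a.refs.add(b);
        b.refs.add(c);
        c.refs.add(a);        // cycles are handled by the seen set
        garbage.refs.add(a);  // points into the graph, but no root reaches it
        System.out.println(snapshot(List.of(a)).size()); // 3
    }
}
```

Only `a`, `b` and `c` end up in the snapshot. An object that nothing reachable points to is left out, which is how the image heap stays limited to what the roots can reach.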
  25. @evacchi Points-To Analysis
      • determines which classes, methods, and fields are reachable at run time
      • starts with all entry points, e.g., the main method of the application
      • iteratively processes all transitively reachable methods until a fixed point is reached
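
The "iterate until a fixed point" part can be illustrated with a deliberately simplified sketch. A real points-to analysis also tracks which types can flow into each variable in order to resolve virtual calls; the version below only computes reachability over a hand-written call graph. The `CALLS` map and its method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified reachability sketch, not a full points-to analysis.
class ReachabilitySketch {
    // Hypothetical call graph: method -> methods it may call.
    static final Map<String, List<String>> CALLS = Map.of(
            "Example.main", List.of("PrintStream.println"),
            "PrintStream.println", List.of("OutputStream.write"),
            "OutputStream.write", List.of(),
            "Unused.helper", List.of("OutputStream.write"));

    static Set<String> reachable(Set<String> entryPoints) {
        Set<String> result = new LinkedHashSet<>(entryPoints);
        boolean changed = true;
        while (changed) {                // repeat until a pass adds nothing new
            changed = false;
            for (String method : List.copyOf(result)) {
                for (String callee : CALLS.getOrDefault(method, List.of())) {
                    changed |= result.add(callee);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // [Example.main, PrintStream.println, OutputStream.write]
        System.out.println(reachable(Set.of("Example.main")));
    }
}
```

`Unused.helper` never appears in the result because no entry point leads to it, so an ahead-of-time compiler working from this set would have no reason to compile it.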
  26. @evacchi Points-To Analysis (Example)
      • System.out.println("hello")
      • java.lang.String
      • System.out
      • java.io.PrintStream
      • java.io.FilterOutputStream
      • java.io.OutputStream
      • System
  27. @evacchi Ahead-of-Time Compilation
      • methods marked as reachable by the points-to analysis
      • placed in the text section of the executable
  28. @evacchi Image Heap at Run-Time
      • Execution at run time starts with an already pre-populated Java heap
      • Relocatable: references relative to the start of the image heap
      • Objects of the image heap and objects allocated at run time
        • i.e., also objects allocated at run time use relative references
        • (use of a fixed register r14 on x64 architectures)
      [Diagram: Build Time: static void Main, Framework Initialization, Application Initialization | Run Time]
  29. @evacchi Project Leyden
      • Goals
        • Address Java’s slow startup time
        • Reduce time to peak performance
        • Reduce memory footprint
      • Introduce static images at spec level (TCK)
        • stand-alone
        • closed-world
  30. @evacchi Project Leyden
      • Goals
        • Address Java’s slow startup time
        • Reduce time to peak performance
        • Reduce memory footprint
      • Introduce static images at spec level (TCK)
        • stand-alone
        • closed-world
      “a spectrum of constraints”
  31. @evacchi Qbicc
      • Experimental sandbox project for Leyden
      • Intended for compiler developers and experts
      • Goal: prototype approaches to native Java
      • New self-contained codebase
      • Allows experimenting with different trade-offs
        • GraalVM’s choices are known
        • possible to explore different trade-offs of the solution space
      • Currently: Java-based compiler to LLVM IR
      • Future: different backend? (e.g. C2)
      https://github.com/qbicc/qbicc
      https://github.com/qbicc/qbicc/discussions
      https://qbicc.zulipchat.com
  32. @evacchi Qbicc: Architecture
      • Points-to analysis (static entry points)
      • Flow graph copied between phases, dropping unreachable nodes
      • Approaches to static init being investigated
      [Diagram: phases ADD, ANALYZE, LOWER, GENERATE; each phase runs the steps TRANSFORM, CORRECT, OPTIMIZE, INTEGRITY]
  33. @evacchi Qbicc: Static Initialization Trade-Offs
      • mmap + offset
      • Qbicc: build-time serialization + fast deserialization routines
      • initially static first + opt-out; now runtime first + opt-in
      • Qbicc: as close to “all build-time” as possible
        • investigating explicit opt-in (build-time, run-time, reinit)
        • (code hints? annotations? language changes?)
  34. @evacchi References
      • David Lloyd (J4K 2021): qbicc: Exploring the possibilities of Java native images
      • Andrew Dinn (2021): Leyden: Lessons from Graal Native Static Java, GraalVM Native and OpenJDK
      • C. Wimmer et al. (OOPSLA 2019): Initialize Once, Start Fast: Application Initialization at Build Time
      • Dan Heidinga (QCon Plus 2021): Starting Fast and Recent Blog Posts
      • Cover Art by François Baranger
      • Duke Art at OpenJDK Wiki