PWMI#4: Partial Evaluation of Programs (Futamura, 1983)

Edoardo Vacchi
February 19, 2020

Papers We Love is back! Let's learn about partial evaluation: a technique used by GraalVM, PyPy and other Just-in-Time compilers!

- credit for the Futurama/Futamura pic to Lars Hupel https://twitter.com/larsr_h/status/1227956746104266753
- the last slide, on GraalVM, is still (c) Oracle


Transcript

  1-2. The performance of many dynamic language implementations suffers from high allocation rates and runtime type checks. This makes dynamic languages less applicable to purely algorithmic problems, despite their growing popularity. In this paper we present a simple compiler optimization based on online partial evaluation to remove object allocations and runtime type checks in the context of a tracing JIT. We evaluate the optimization using a Python VM and find that it gives good results for all our (real-life) benchmarks.
  3. GraalVM (slide © 2018 Oracle): standalone • automatic transformation of interpreters to compilers • engine integration, native and managed
      https://gotober.com/2018/sessions/650/graalvm-run-programs-faster-anywhere
  4. [image slide, © 2018 Oracle] https://gotober.com/2018/sessions/650/graalvm-run-programs-faster-anywhere
  5-6. Most high-performance dynamic language virtual machines duplicate language semantics in the interpreter, compiler, and runtime system. This violates the principle to not repeat yourself. In contrast, we define languages solely by writing an interpreter. The interpreter performs specializations, e.g., augments the interpreted program with type information and profiling information. Compiled code is derived automatically using partial evaluation while incorporating these specializations. This makes partial evaluation practical in the context of dynamic languages: it reduces the size of the compiled code while still compiling all parts of an operation that are relevant for a particular program. When a speculation fails, execution transfers back to the interpreter, the program re-specializes in the interpreter, and later partial evaluation again transforms the new state of the interpreter to compiled code.
  7-9. We implement the language semantics only once in a simple form: as a language interpreter written in a managed high-level host language. Optimized compiled code is derived from the interpreter using partial evaluation. This approach and its obvious benefits were described in 1971 by Y. Futamura, and is known as the first Futamura projection. To the best of our knowledge no prior high-performance language implementation used this approach.
  10. Programs
      • We call a program a sequence of instructions that can be executed by a machine.
      • The machine may be a virtual machine or a physical machine.
      • In the following, when we say that a program is evaluated, we assume that there exists some machine that is able to execute these instructions.
  11. Computational Models
      • "A sort of programming language"
      • Mechanical evaluation
      • Turing machines
      • Partial recursive functions
      • Church's lambda expressions
  12. Computational Models
      1. Conditional  2. Read/Write Memory  3. Jump (loop)
      1. Condition  2. Expression  3. Function Definition
  13. Program Evaluation
      • Consider a program P, with input data D;
      • when we evaluate P over D it produces some output result R.
      [diagram: D → P → R]
  14. Interpreters
      • An interpreter I is a program:
      • it evaluates some other given program P over some given data D, and it produces the output result R.
      [diagram: (P, D) → I → R]
      • We denote this with I(P, D).
  15. f(k, u) = k + u
      Instructions: add x y | sub x y | mul x y | ...

      write(D)
      while has-more-instructions(P):
          instr ← fetch-next-instruction(P)
          switch op(instr):
              case 'add':
                  x ← read()
                  y ← read()
                  result ← x + y
                  write(result)
              case ...
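      A minimal runnable version of this interpreter loop, sketched in JavaScript under the assumption that read()/write() operate on an explicit stack (the slide leaves the I/O model implicit):

      function interpret(P, D) {
        const stack = [...D];                      // write(D): load the input
        while (P.length > 0) {                     // has-more-instructions(P)
          const instr = P.shift();                 // fetch-next-instruction(P)
          const y = stack.pop(), x = stack.pop();  // read(), read()
          switch (instr) {                         // op(instr)
            case "add": stack.push(x + y); break;
            case "sub": stack.push(x - y); break;
            case "mul": stack.push(x * y); break;
          }
        }
        return stack.pop();
      }

      interpret(["add"], [5, 2]); // f(5, 2) = 7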
  16. Compilers
      • Let P be a program that evaluates to R when given D;
      • a compiler C translates a source program P into an object program C(P) that, evaluated over an input D, still produces R.
      [diagram: P → C → C(P);  D → C(P) → R]
      • We denote this with C(P)(D).
  17. $ cat example.ml
      print_string "Hello world!\n"
      $ ocaml example.ml
      Hello world!
      $ ocamlc example.ml
      $ ./a.out
      Hello world!
  18. Partial Evaluation (intuition)
      • Let us have a computation f of two parameters k, u: f(k, u)
      • Now suppose that f is often called with k = 5;
      • f5(u) := "f by substituting 5 for k and doing all possible computation based upon value 5"
      • Partial evaluation is the process of transforming f(5, u) into f5(u).
  19. This is Currying! I Know This!
      • Not exactly! In functional programming, currying or partial application* is
            f5(u) := f(5, u)
        let f = (k, u) => k * (k * (k+1) + u+1) + u*u;
        let f5 = (u) => f(5, u);
      • In a functional programming language this usually does not change the program that implements f.
      * Although, strictly speaking, they are not synonyms; see https://en.wikipedia.org/wiki/Currying
  20. Simplification
      let f = (k, u) => k * (k * (k+1) + u + 1) + u * u;
      by fixing k = 5 and simplifying:
      let f5 = (u) => 5 * (31 + u) + u * u;
  21. Rewriting
      function pow(n, k) {
        if (k <= 0) { return 1; }
        else { return n * pow(n, k-1); }
      }
      function pow5(n) { return pow(n, 5); }
  22. Rewriting (pow as above)
      function pow5(n) { return n * pow(n, 4); }
  23. Rewriting (pow as above)
      function pow5(n) { return n * n * pow(n, 3); }
  24. Rewriting (pow as above)
      function pow5(n) { return n * n * n * n * n; }
  25. Rewriting (pow as above)
      function pow5(n) { return n * n * n * n * n; }
      In compilers this is sometimes called inlining.
  26. Rewriting and Simplification
      • Rewriting is similar to macro expansion and procedure integration (β-reduction, inlining) in the optimization techniques of a compiler.
      • Often combined with simplification (constant folding); a worked example follows.
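      As a worked instance of rewriting plus constant folding, here is the f from the earlier slides with k = 5 inlined and every k-only subexpression folded (a sketch of the idea, not the paper's algorithm):

      // let f = (k, u) => k * (k * (k + 1) + u + 1) + u * u;
      // inline k = 5:    5 * (5 * (5 + 1) + u + 1) + u * u
      // fold constants:  5 * (30 + u + 1) + u * u
      //                  5 * (31 + u) + u * u
      let f5 = (u) => 5 * (31 + u) + u * u;  // matches the Simplification slide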
  27. Projection
      The following equation holds for fk and f:
          fk(u) = f(k, u)    (1)
      We call fk a projection of f at k.
  28. Partial Evaluator
      A partial computation procedure may be a computer program α, called a projection machine, partial computer, or partial evaluator:
          α(f, k) = fk    (2)
  29. Partial Evaluator
      function pow(n, k) {
        if (k <= 0) { return 1; }
        else { return n * pow(n, k-1); }
      }
      let pow5 = alpha(pow, {k: 5}); // (n) => n * n * n * n * n;
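      alpha is only named on this slide; as an illustration, here is a toy specializer for this one function (an assumption of mine, not a general partial evaluator, which would transform program text). It unrolls pow's recursion at specialization time, because the branch on k is decided by static data:

      function alphaPow(k) {
        if (k <= 0) return (n) => 1;   // static branch, decided now
        const rest = alphaPow(k - 1);  // specialize the recursive call
        return (n) => n * rest(n);     // residual: one multiplication
      }

      const pow5 = alphaPow(5);
      pow5(2); // 32, with no test on k left at run time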
  30-31. Examples
      The paper presents:
      • Automatic theorem proving
      • Pattern matching
      • Syntax analyzer
      • Automatically generating a compiler
  32. Interpreters and Compilers (reprise)
      • An interpreter is a program.
      • It takes another program and its data as input.
      • It evaluates the program on the input and returns the result: I(P, D).
      • A compiler is a program.
      • It takes a source program and returns an object program.
      • The object program processes the input and returns the result: C(P)(D).
  33. First Equation of Partial Computation (First Projection)
          I(P, D) = C(P)(D)
          α(I, P) = IP
          IP = C(P)    (4)
      [diagram: D → IP → R]
      • That is, by feeding D into IP, you get R;
      • in other words, IP is an object program (sketched below).
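      A hedged sketch of equation (4) for a toy instruction language (the language and all names here are my assumptions, not the paper's): specializing the interpreter I to a fixed program P decides the instruction dispatch at specialization time, leaving a residual program IP over D only.

      // I: an interpreter for programs that fold an accumulator over D
      const I = (P, D) =>
        P.reduce((acc, s) => s.op === "add" ? acc + s.arg : acc * s.arg, D);

      // alphaI plays the role of P ↦ α(I, P): the switch on s.op happens
      // once, here, and never again at run time
      const alphaI = (P) => {
        const steps = P.map((s) =>
          s.op === "add" ? (acc) => acc + s.arg : (acc) => acc * s.arg);
        return (D) => steps.reduce((acc, step) => step(acc), D); // this is IP
      };

      const P = [{ op: "add", arg: 1 }, { op: "mul", arg: 3 }];
      const I_P = alphaI(P);  // IP = C(P): an object program
      I(P, 4) === I_P(4);     // true: both give 15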
  34-35. f(k, u) = k + u    (add x y)
      write(D)
      while has-more-instructions(P):
          instr ← fetch-next(P)
          switch op(instr):
              case 'add':
                  x ← read()
                  y ← read()
                  result ← x + y
                  write(result)
              case ...
      ...but this interpreter executes on a machine!
  36-39. Second Equation of Partial Computation (Second Projection)
          αI(P) = IP    (5)
      [diagram: P → αI → IP = C(P)]
      • but IP, evaluated on D, gives R;
      • then IP is an object program (IP = C(P));
      • αI transforms a source program P to IP (i.e., C(P));
      • then αI is a compiler (see the sketch after this list).
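      To make the shape of equation (5) executable, here is a sketch in which plain partial application stands in for partial evaluation: it satisfies the correctness equation α(f, k)(u) = f(k, u) but rewrites nothing, so it shows the types of the projections, not their speed-up. The toy interpreter I is the same assumption as in the previous sketch.

      const alpha = (f, k) => (u) => f(k, u);
      const I = (P, D) =>
        P.reduce((acc, s) => s.op === "add" ? acc + s.arg : acc * s.arg, D);

      const alpha_I = alpha(alpha, I);  // second projection: a compiler
      const P = [{ op: "add", arg: 1 }, { op: "mul", arg: 3 }];
      alpha_I(P)(4);                    // IP(4) = 15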
  40. Third Equation of Partial Computation (Third Projection)
          αα(I) = αI    (6)
      [diagram: I → αα → αI = C]
      • αα is a program that, given I, returns αI = C;
      • αI transforms a source program to an object program;
      • αI is a compiler;
      • αα is a compiler-compiler (a compiler generator) which generates a compiler αI from an interpreter I (a sketch follows).
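      Continuing the same stand-in (partial application in place of true partial evaluation), the third projection is alpha specialized to itself; applied to any interpreter it yields a compiler:

      const alpha = (f, k) => (u) => f(k, u);
      const cogen = alpha(alpha, alpha);  // αα = α(α, α)
      const I = (P, D) =>
        P.reduce((acc, s) => s.op === "add" ? acc + s.arg : acc * s.arg, D);
      cogen(I)([{ op: "add", arg: 1 }, { op: "mul", arg: 3 }])(4); // 15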
  41. Partial Evaluation of a Partially-Evaluated Evaluator
      • Let us call I-language a language implemented by interpreter I; αα(I) = αI
      • αI is then an I-language compiler;
      • let us now substitute α for I in αα(I) = αI,
      • which means considering α an interpreter for the α-language: αα(α) = αα
      • αα is an α-language compiler.
  42. Fourth Equation of Partial Computation
          αα(α) = αα
      [diagram: α → αα → αα]
      • αα is an α-language compiler.
      • αα(I) = αI is an object program of I; thus:
            αα(I)(P) = IP    (7)
      • What is the α-language?
  43. What is the α-language?
          αα(I)(P) = IP
          αα(f)(k) = fk
      • In other words, by finding αα we can generate the partial computation of f at k, fk.
      • That is, αα is a partial evaluation compiler (or generator).
      • However, the author notes, at the time of writing there was no way to produce αα from α(α, α) for practical α's.
  44. Conditions for a Projection Machine
      1. Correctness: program α must satisfy α(f, k)(u) = f(k, u) (checked in the sketch below).
      2. Efficiency improvement: program α should perform as much computation as possible for the given data k.
      3. Termination: program α should terminate on partial computation of as many programs as possible. Termination at α(α, α) is most desirable.
      However, the author notes, condition (2) is not mathematically clear.
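      Condition (1) can be checked directly against the toy specializer sketched at slide 29 (alphaPow is that sketch's name, not the paper's):

      function pow(n, k) { return k <= 0 ? 1 : n * pow(n, k - 1); }
      function alphaPow(k) {
        if (k <= 0) return (n) => 1;
        const rest = alphaPow(k - 1);
        return (n) => n * rest(n);
      }
      const pow5 = alphaPow(5);
      for (let n = 0; n < 10; n++) {
        console.assert(pow5(n) === pow(n, 5)); // α(f, k)(u) === f(k, u)
      }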
  45. Computation Rule for Recursive Program Schema
      Partial computation of f at k:
      1. Rewriting (when semi-bound; e.g. f(5, u))
      2. Simplification
      3. Tabulation
      The discriminating characteristics of partial computation are the semi-bound call and tabulation (a tabulation sketch follows).
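      Tabulation can be pictured as a table of already-specialized calls, keyed by the static data, so that a semi-bound call met twice reuses its residual function instead of being unfolded again. This is a loose illustration under my own assumptions, not the paper's algorithm:

      const table = new Map();                  // tabulation: k -> residual fn
      function specialize(k) {
        if (table.has(k)) return table.get(k);  // already met: reuse, stop unfolding
        const f = k <= 0 ? (n) => 1
                         : (n) => n * specialize(k - 1)(n);
        table.set(k, f);
        return f;
      }
      specialize(5)(2); // 32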
  46. Rewriting and Simplification
      Rewriting is similar to macro expansion and procedure integration in the optimization techniques of a compiler. It is often combined with simplification.
  47. Termination
      • The paper does not go into the details;
      • it shows that for "practical" use cases partial evaluation should terminate;
      • it cites theoretical works (e.g. Ershov).
  48. Theory of Partial Computation
      • In the 1930s Turing, Church, and Kleene proposed several computational models and clarified the mathematical meaning of mechanical procedure,
      • e.g. Turing machines, lambda expressions, and partial recursive functions.
      • The research concerned computability, i.e., the computational power of the models, not complexity or efficiency.
  49. The s-m-n theorem
      • Appears in Kleene's s-m-n theorem (parameterization theorem, iteration theorem).
      • Let ϕ_x^(k) be the recursive function of k variables with Gödel number x;
      • then for every m ≥ 1 and n ≥ 1 there exists a primitive recursive function s such that, for all x, y1, ..., ym:
            λz1, ..., zn . ϕ_x^(m+n)(y1, ..., ym, z1, ..., zn) = ϕ_{s(x, y1, ..., ym)}^(n)
      The third equation of partial computation (αα) is also used in the proof of Kleene's recursion theorem.
  50. Programming Models
      • Turing machines and partial recursive functions were formulated to describe total computation;
      • Church's lambda expressions were based upon partial computation: f(5, u) with u undefined yields f5(u).
  51. Usage in LISP
      "Implementation of a projection machine and its application to real world problems started in the 1960's after the programming language LISP began to be widely used"
  52-53. We implement the language semantics only once in a simple form: as a language interpreter written in a managed high-level host language. Optimized compiled code is derived from the interpreter using partial evaluation. This approach and its obvious benefits were described in 1971 by Y. Futamura, and is known as the first Futamura projection. To the best of our knowledge no prior high-performance language implementation used this approach.
  54-55. We believe that a simple partial evaluation of a dynamic language interpreter cannot lead to high-performance compiled code: if the complete semantics for a language operation are included during partial evaluation, the size of the compiled code explodes; if language operations are not included during partial evaluation and remain runtime calls, performance is mediocre. To overcome these inherent problems, we write the interpreter in a style that anticipates and embraces partial evaluation. The interpreter specializes the executed instructions, e.g., collects type information and profiling information. The compiler speculates that the interpreter state is stable and creates highly optimized and compact machine code. If a speculation turns out to be wrong, i.e., was too optimistic, execution transfers back to the interpreter. The interpreter updates the information, so that the next partial evaluation is less speculative.
  56. References
      • Würthinger et al. 2017, Practical Partial Evaluation for High-Performance Dynamic Languages, PLDI'17
      • Šelajev 2018, GraalVM: Run Programs Faster Anywhere, GOTO Berlin 2018
      • Bolz et al. 2011, Allocation Removal by Partial Evaluation in a Tracing JIT, PEPM'11
      • Stuart 2013, Compilers for Free, RubyConf 2013
      • Cook and Lämmel 2011, Tutorial on Online Partial Evaluation, EPTCS'11