Truffle: a language-implementation framework • Written in Java • Optimization primitives • Debugging and profiling • Language interoperability!
= CArray.new
puts s.arraySum([1,2,3])

// The C extension: array.c
#include "ruby.h"

VALUE c_arraySum(VALUE self, VALUE array) {
  int sum = 0;
  for (int i = 0; i < RARRAY_LEN(array); i++) {
    sum += FIX2INT(rb_ary_entry(array, i));
  }
  return INT2FIX(sum);
}

Slide modified from Matthias Grimmer, with permission
// ruby.h
typedef void *VALUE;
typedef void *ID;
VALUE rb_ary_entry(VALUE ary, long idx);

Programmers write their native extensions using the API provided by MRI.
// ruby.c
#include "ruby.h"
#include "truffle.h"

VALUE rb_ary_entry(VALUE ary, long idx) {
  return truffle_read_idx(ary, (int) idx);
}

int FIX2INT(VALUE value) {
  return truffle_invoke_i(RUBY_CEXT, "rb_fix2int", value);
}

truffle_read_idx and truffle_invoke_i are Sulong intrinsics that send messages.
# ruby.rb
def rb_fix2int(value)
  if value.nil?
    raise TypeError
  else
    int = value.to_int
    raise RangeError if int >= 2**32
    int
  end
end
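The chain array.c → ruby.c → ruby.rb can be mimicked in plain C to show the idea of "sending messages" instead of touching raw memory. This is only a sketch under assumptions: ManagedArray, send_read, send_invoke_i, fix2int_handler and array_sum are hypothetical names, not Sulong or Truffle APIs, and the range check uses INT_MIN/INT_MAX rather than the bound shown in ruby.rb.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a managed (Ruby-side) array: callers may
   only talk to it through the message functions below. */
typedef struct ManagedArray {
    const long *items;
    size_t length;
} ManagedArray;

typedef enum { MSG_OK, MSG_OUT_OF_RANGE, MSG_UNKNOWN } MsgStatus;

/* "READ" message: ask the object for element idx (bounds-checked). */
static MsgStatus send_read(const ManagedArray *obj, size_t idx, long *out) {
    if (idx >= obj->length) return MSG_OUT_OF_RANGE;
    *out = obj->items[idx];
    return MSG_OK;
}

/* Plays the role of rb_fix2int: range-check, then convert to int. */
static MsgStatus fix2int_handler(long value, int *out) {
    if (value > INT_MAX || value < INT_MIN) return MSG_OUT_OF_RANGE;
    *out = (int)value;
    return MSG_OK;
}

/* "INVOKE" message: look the target up by name, then call it. */
static MsgStatus send_invoke_i(const char *name, long value, int *out) {
    if (strcmp(name, "rb_fix2int") == 0) return fix2int_handler(value, out);
    return MSG_UNKNOWN;
}

/* The summing loop from array.c, written against the message layer. */
MsgStatus array_sum(const ManagedArray *array, int *sum_out) {
    assert(array != NULL && sum_out != NULL);
    int sum = 0;
    for (size_t i = 0; i < array->length; i++) {
        long element;
        int unboxed;
        MsgStatus st = send_read(array, i, &element);
        if (st != MSG_OK) return st;
        st = send_invoke_i("rb_fix2int", element, &unboxed);
        if (st != MSG_OK) return st;
        /* Refuse to overflow the signed accumulator. */
        if ((unboxed > 0 && sum > INT_MAX - unboxed) ||
            (unboxed < 0 && sum < INT_MIN - unboxed)) {
            return MSG_OUT_OF_RANGE;
        }
        sum += unboxed;
    }
    *sum_out = sum;
    return MSG_OK;
}
```

Calling array_sum on a ManagedArray that wraps {1, 2, 3} yields MSG_OK with a sum of 6; an element that does not fit into an int is reported as MSG_OUT_OF_RANGE instead of being silently truncated.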
[Chart: peak performance relative to MRI running pure Ruby, for MRI with C extensions and GraalVM with C extensions] Truffle can inline the function call from Ruby to C!
Undefined behavior: “behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements” (C99 standard)
* sizeof(long));
long dest[4];
memcpy(dest, arr, sizeof(dest));

[Diagram labels: arr, dest, secret]

Heartbleed and Cloudbleed were such vulnerabilities.
Writes can allow attackers to change a program’s control flow.
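How an over-long copy leaks neighbouring data can be shown without actually invoking Undefined Behavior. The sketch below assumes a hypothetical layout (the Memory struct and copy_longs are illustrative names) in which a 3-element array is directly followed by a secret; because both live in one object, copying 4 longs is well defined here, whereas the same copy from a separate 3-element heap block would not be.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical memory layout: a 3-element array directly followed
   by a value that the caller was never meant to see. */
typedef struct Memory {
    long arr[3];
    long secret;
} Memory;

/* Copies `count` longs, starting at the beginning of m, into dest.
   With count == 4 the copy runs past arr and picks up the secret. */
void copy_longs(long *dest, const Memory *m, size_t count) {
    assert(count <= sizeof(Memory) / sizeof(long));
    memcpy(dest, m, count * sizeof(long));
}
```

With arr = {1, 2, 3} and a 4-element destination, dest[3] ends up holding the secret, which is the read-side pattern behind this class of vulnerability.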
// run until overflow while (a < a + 1) { a++; } } What’s the compilation output of Clang/GCC? 1. The function works as expected by the programmer 2. The function body is optimized away 3. The function results in an endless loop 4. It depends on the optimization level
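Signed integer overflow is Undefined Behavior in C, so a compiler may assume that a < a + 1 always holds. A well-defined way to express "run until the maximum" compares against INT_MAX instead of relying on wraparound. This is a minimal sketch; can_increment and run_until_max are illustrative names that do not appear on the slide.

```c
#include <assert.h>
#include <limits.h>

/* Well-defined overflow test: a + 1 is never evaluated. */
int can_increment(int a) {
    return a < INT_MAX;
}

/* Counts up from a and stops at INT_MAX instead of overflowing. */
int run_until_max(int a) {
    while (can_increment(a)) {
        a++;
    }
    assert(a == INT_MAX); /* postcondition of the loop above */
    return a;
}
```

Because the loop condition no longer depends on overflow, its behavior is the same at every optimization level.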
Hello world! Static compilers: optimize code based on Undefined Behavior. Bug-finding tools: find bugs assuming that violations are visible side effects (Wang et al. 2012, D'Silva 2015).
C code:
long *arr = malloc(3 * sizeof(long));
arr[4] = …

Map to Java Code:
= new long[3];
arr[4] = …
→ ArrayIndexOutOfBoundsException

The semantics of an out-of-bounds access are well specified.
The JVM’s compiler optimizes the program, but without optimizing Undefined Behavior away.
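The effect of mapping a C allocation to a managed array can be approximated in C by letting the length travel with the storage and checking every store. This is a sketch under assumptions: LongArray, long_array_new, long_array_store and long_array_free are hypothetical names, and an error status stands in for the Java exception.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical managed array: the length is stored next to the data. */
typedef struct LongArray {
    size_t length;
    long *data;
} LongArray;

typedef enum { STORE_OK, STORE_OUT_OF_BOUNDS } StoreStatus;

/* Allocates a zero-initialized array of `length` longs. */
LongArray long_array_new(size_t length) {
    LongArray a = { length, calloc(length, sizeof(long)) };
    assert(length == 0 || a.data != NULL);
    return a;
}

/* Checked store: an out-of-bounds index is reported, never executed. */
StoreStatus long_array_store(LongArray *a, size_t index, long value) {
    if (index >= a->length) return STORE_OUT_OF_BOUNDS;
    a->data[index] = value;
    return STORE_OK;
}

/* Releases the storage and resets the array to an empty state. */
void long_array_free(LongArray *a) {
    free(a->data);
    a->data = NULL;
    a->length = 0;
}
```

For an array created with long_array_new(3), a store to index 4 returns STORE_OUT_OF_BOUNDS, so the access has specified semantics instead of corrupting memory.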
Prevent Out-Of-Bounds Accesses: long *arr = malloc(3 * sizeof(long)); [How do we know the type?] [Pointer to an integer?] [Array bounds check elimination] [Strict-aliasing rule]
the errors • 8 errors not found by LLVM’s AddressSanitizer (and Valgrind) • Compiler optimizations (ASan -O3) prevented the detection of 4 additional bugs [Comparison tools]
by LLVM’s AddressSanitizer and Valgrind

int main(int argc, char** argv) {
  printf("%d %s\n", argc, argv[5]);
}

[Comparison tools]
In Safe Sulong, instrumentation cannot be omitted by design.
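The check that a bounds-checking runtime performs implicitly on argv[5] can be written out by hand. The helper below is a hypothetical illustration (checked_arg is not part of any tool named here): it returns an argument only when the index lies inside argv[0..argc-1].

```c
#include <assert.h>
#include <stddef.h>

/* Returns argv[index] if 0 <= index < argc, otherwise NULL. */
const char *checked_arg(int argc, char **argv, int index) {
    assert(argv != NULL);
    if (index < 0 || index >= argc) return NULL;
    return argv[index];
}
```

A caller would test the result for NULL before printing it, so a program started with fewer than six arguments never reads past the end of argv.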
[Architecture diagram labels: Compiler, Truffle Framework, TruffleRuby, Graal.js, Graal.python, FastR, LLVM IR Interpreter, LLVM IR, Clang, Flang, Optimization Boundary] Managed Sulong, derived from Safe Sulong, is available in GraalVM.
Josef Eisl, Christian Häubl, Matthias Grimmer, Thomas Pointhuber, Daniel Pekarek, Chris Seaton, Lukas Stadler, Florian Angerer, David Gnedt, Swapnil Gaikwad. https://github.com/graalvm/sulong/graphs/contributors
Projects Consist of More Than C Code: Compiler builtins [Inline assembly details] [Inline Assembly and GCC Builtins in Sulong] ~1,000 instructions for a single complex ISA like x86-64
How frequently are these used? How are they used? What is the implementation effort to cover most programs? How well do comparable tools support them? Goal: an informed decision on whether and to what extent to implement them in Sulong!
                     GCC builtins          Inline assembly
# studied projects   ~5,000                ~1,300
Considered projects  All C projects        C Client Applications
Identification       grep <builtin name>   grep asm

Different setups, so the comparison should be taken with a grain of salt.
[Chart: % of projects, for popular projects with inline assembly and (popular) projects with GCC builtins] Both GCC builtins and inline assembly are frequently used by projects.
[Chart: density (occurrences per KLOC), for popular projects with inline assembly and (popular) projects with GCC builtins] They are infrequently used within a project.
number of instructions; how many do they typically contain?

uint64 sqlite3Hwtime(void){
  unsigned long val;
  __asm__ ("rdtsc" : "=A" (val));
  return val;
}
[Chart: cumulative percentage vs. number of unique fragments per project; annotation: 36%] A number of projects use only a single inline assembly fragment.
[Chart: cumulative percentage vs. number of instructions per unique fragment; annotations: 100%, 438 …] We also found fragments with several hundred instructions.
();  // Architecture-independent builtin
c = __builtin_ia32_paddb(a, b);  // Architecture-specific builtin

Architecture-specific builtins are similar to inline assembly. Are they used?
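What supporting an architecture-independent builtin involves can be sketched with __builtin_popcount, a GCC/Clang builtin chosen here purely as an illustration (the slide's own example is truncated, so this is not necessarily the builtin it showed). A tool that does not know the builtin needs an equivalent in plain C; popcount_fallback and popcount32 are hypothetical names.

```c
#include <assert.h>

/* Portable fallback: clears the lowest set bit until none remain. */
static int popcount_fallback(unsigned int x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Uses the compiler builtin where available, the fallback otherwise. */
int popcount32(unsigned int x) {
#if defined(__GNUC__) || defined(__clang__)
    int result = __builtin_popcount(x);
    assert(result == popcount_fallback(x)); /* both must agree */
    return result;
#else
    return popcount_fallback(x);
#endif
}
```

An architecture-specific builtin such as __builtin_ia32_paddb has no comparably small portable definition, which is one reason it behaves more like inline assembly from an implementer's point of view.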
unknown node type: 'GCCAsmStmt 0x3a991f8 <line:5:3, col:38>'
goroutine 1 [running]:
github_com_elliotchance_c2go_ast.Parse
        go/src/github.com/elliotchance/c2go/ast/ast.go:211
main.convertLinesToNodes
        go/src/github.com/elliotchance/c2go/main.go:81
main.Start
        go/src/github.com/elliotchance/c2go/main.go:219
main.runCommand
        go/src/github.com/elliotchance/c2go/main.go:350
main.main
        go/src/github.com/elliotchance/c2go/main.go:277
goroutine 6 [finalizer wait]:

Splint 3.1.2 --- 03 May 2009
test.c: (in function rdtsc)
test.c:5:3: Unrecognized identifier: asm
  Identifier used in code has not been declared. (Use -unrecog to inhibit warning)
test.c:5:15: Parse Error. (For help on parse errors, see splint -help parseerrors.)
*** Cannot continue.
function attributes are not widely understood • Testing the correct usage of inline assembly and GCC builtins • Support in formal models and static analysis tools • Automatic approaches?
of important use cases do current language interoperability approaches fail? Which 20% of important use cases cannot be expressed with <name of DSL> and how does it affect users? Which 20% of an approach for connecting heterogeneous code provides bad usability and how can we improve on it?