Sulong: Executing Low-level Languages on Truffle

Manuel Rigger, Advanced Software Technologies Lab (Zhendong Su), ETH Zurich
1 April 2019, Interconnecting Code Workshop @ <Programming> 2019
@RiggerManuel

PhD Topic: Safe and Efficient Execution of Unsafe Languages on the Java Virtual Machine

How is this Relevant for ICW? An improved version of Sulong is used within GraalVM as a native function interface.

GraalVM, its language interoperability mechanism, and Sulong’s role. I have not been working on language interoperability myself.

Unsafe languages: Heartbleed, Cloudbleed, Graalbleed

• GraalVM, its language interoperability mechanism, and Sulong’s role
• Safe Sulong and how it safely executes LLVM-based languages

Sulong Also Interacts with Other Code: compiler builtins, system calls, external libraries, low-level libc/POSIX functions, linkage features, compiler extensions, inline assembly

• The importance of inline assembly and compiler builtins
• GraalVM, its language interoperability mechanism, and Sulong’s role
• Safe Sulong and how it safely executes LLVM-based languages

GraalVM, its language interoperability mechanism, and Sulong’s role

GraalVM: https://www.graalvm.org/

GraalVM (Würthinger et al. 2016): GraalVM supports the execution of various languages, such as TruffleRuby, Graal.js, Graal.python, and FastR.

Truffle is a language-implementation framework
• Written in Java
• Optimization primitives
• Debugging and profiling
• Language interoperability!

Graal is the compiler used by Truffle.

Truffle and Graal can execute on the JVM, be compiled to a standalone executable, …

The languages are implemented as Abstract Syntax Tree (AST) interpreters

AST Interpreters
• Parse input program: a = b + 3 becomes a tree with = at the root, the variable a as its left child, and + (with children b and 3) as its right child
• Set up input: b = 2
• Execute: b + 3 evaluates to 5, which is assigned to a
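
The parse, set-up, and execute steps for a = b + 3 can be sketched as a tiny tree-walking interpreter. This is a minimal illustration in plain Java, not the Truffle API: the class TinyAst, the Node interface, and the use of a Map as the frame are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class TinyAst {
    /** A node evaluates itself against a frame of variable bindings. */
    interface Node {
        int execute(Map<String, Integer> frame);
    }

    /** Leaf: an integer literal such as 3. */
    static Node literal(int value) {
        return frame -> value;
    }

    /** Leaf: reads a variable such as b from the frame. */
    static Node read(String name) {
        return frame -> frame.get(name);
    }

    /** Inner node: '+' executes both children, then adds the results. */
    static Node add(Node left, Node right) {
        return frame -> left.execute(frame) + right.execute(frame);
    }

    /** Inner node: '=' executes the right-hand side and stores the result. */
    static Node assign(String name, Node value) {
        return frame -> {
            int result = value.execute(frame);
            frame.put(name, result);
            return result;
        };
    }

    public static void main(String[] args) {
        // Parse step (done by hand here): a = b + 3
        Node program = assign("a", add(read("b"), literal(3)));

        // Set up input: b = 2
        Map<String, Integer> frame = new HashMap<>();
        frame.put("b", 2);

        // Execute: walks the tree and stores 5 in a
        program.execute(frame);
        System.out.println("a = " + frame.get("a"));
    }
}
```

Each node only knows how to execute itself and its children, which is what makes it possible to replace individual nodes later.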

AST Interpreters Optimization: Truffle AST interpreters specialize for their input. In the example, the variables hold integers (a = 5, b = 2), so the nodes specialize for integers:
if (input is as expected) { execute specialized operation } else { rewrite node }
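
The "execute specialized operation, else rewrite node" pattern can be sketched without the Truffle framework. In this hedged example, AddNode, Op, and the rewrites counter are hypothetical names, and the generic fallback (string concatenation) is an assumption chosen only to give the non-integer case some meaning; real Truffle nodes use the framework's own specialization machinery.

```java
final class SpecializingAdd {
    /** The operation currently installed in a node. */
    interface Op {
        Object apply(AddNode node, Object left, Object right);
    }

    static final class AddNode {
        // The node starts out specialized for integers.
        Op op = SpecializingAdd::addInts;
        int rewrites = 0;

        Object execute(Object left, Object right) {
            return op.apply(this, left, right);
        }
    }

    /** Specialized case: only valid while both operands are Integers. */
    static Object addInts(AddNode node, Object left, Object right) {
        if (left instanceof Integer && right instanceof Integer) {
            return (Integer) left + (Integer) right; // execute specialized operation
        }
        // Input is not as expected: rewrite the node, then re-execute.
        node.op = SpecializingAdd::addGeneric;
        node.rewrites++;
        return node.execute(left, right);
    }

    /** Generic case (assumed semantics): integers add, everything else concatenates. */
    static Object addGeneric(AddNode node, Object left, Object right) {
        if (left instanceof Integer && right instanceof Integer) {
            return (Integer) left + (Integer) right;
        }
        return String.valueOf(left) + right;
    }

    public static void main(String[] args) {
        AddNode plus = new AddNode();
        System.out.println(plus.execute(2, 3));     // integer specialization
        System.out.println(plus.execute("icw", 3)); // triggers a rewrite to the generic case
        System.out.println("rewrites: " + plus.rewrites);
    }
}
```

The check in the specialized operation plays the role of the guard that compiled code keeps; when it fails, a compiled version would deoptimize before the node is rewritten.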

AST Interpreters Optimization
• Partial evaluation: the specialized tree for a = b + 3 is the input to partial evaluation
• Compilation: if (b is an Integer) { a = b + 3 } else { deoptimize and rewrite node }
• Deoptimize: when b holds “icw” instead of an integer, the compiled code deoptimizes back to the AST interpreter
• Respecialize: the node is rewritten from the integer specialization to a generic one

GraalVM (Grimmer et al. 2015): TruffleRuby, Graal.js, Graal.python, FastR
• Language interoperability support for individual language pairs would not scale
• Idea: Implement a language-independent mechanism based on messages

Message-Based Foreign Access: in a = b + 3, b is a foreign object (holding 2), so the tree reads it through a READ node. Foreign objects can be accessed by sending a message to the foreign language implementation.

Message-Based Foreign Accesses: when the READ node is first executed, the message is resolved and the node is replaced by a direct access to b. Subsequent reads do not need to send a message.
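
One way to picture "subsequent reads do not need to send a message" is a READ node that resolves the message once and then reuses the language-specific access. The sketch below is plain Java under stated assumptions: ForeignObject, Accessor, MapObject, and ReadNode are hypothetical names for this illustration and do not correspond to the actual Truffle interoperability API.

```java
import java.util.HashMap;
import java.util.Map;

final class MessageRead {
    /** Language-specific way of reading a member, available once the message is resolved. */
    interface Accessor {
        Object read(Object receiver, String member);
    }

    /** A foreign object answers a READ message by supplying its accessor. */
    interface ForeignObject {
        Accessor resolveRead();
    }

    /** Hypothetical object from another language, backed by a map. */
    static final class MapObject implements ForeignObject {
        final Map<String, Object> members = new HashMap<>();
        int messagesResolved = 0; // counts how often a READ message had to be resolved

        @Override
        public Accessor resolveRead() {
            messagesResolved++;
            return (receiver, member) -> ((MapObject) receiver).members.get(member);
        }
    }

    /** READ node: resolves the message on first execution, then reuses the accessor. */
    static final class ReadNode {
        private final String member;
        private Accessor cached;
        private Class<?> cachedType;

        ReadNode(String member) {
            this.member = member;
        }

        Object execute(ForeignObject receiver) {
            // Re-resolve only if nothing is cached or the receiver's type changed.
            if (cached == null || cachedType != receiver.getClass()) {
                cached = receiver.resolveRead();
                cachedType = receiver.getClass();
            }
            return cached.read(receiver, member);
        }
    }

    public static void main(String[] args) {
        MapObject foreign = new MapObject();
        foreign.members.put("b", 2);

        ReadNode readB = new ReadNode("b");
        int a = (Integer) readB.execute(foreign) + 3; // first read resolves the message
        readB.execute(foreign);                       // second read uses the cached access
        System.out.println("a = " + a + ", messages resolved: " + foreign.messagesResolved);
    }
}
```

Because the resolved access is ordinary code in the host tree, a compiler can optimize across the language boundary instead of stopping at it.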

Sulong as Part of GraalVM (https://www.graalvm.org/)
• Stack: Java Virtual Machine, Graal Compiler, Truffle Framework, with TruffleRuby, Graal.js, Graal.python, and FastR on top
• Native extensions called through the Java Native Interface sit behind an optimization boundary
• With Sulong, native code compiled by Clang or Flang to LLVM IR runs on an LLVM IR interpreter on top of Truffle

How to Deal with C Code Accessing VM Internals? Native extension APIs allow native extensions to access VM internals.

Example: Ruby C Extension (slides modified from Matthias Grimmer, with permission)

# Ruby Code: array.rb
s = CArray.new
puts s.arraySum([1,2,3])

// The C extension: array.c
#include "ruby.h"

VALUE c_arraySum(VALUE self, VALUE array) {
  int sum = 0;
  for (int i = 0; i < RARRAY_LEN(array); i++) {
    sum += FIX2INT(rb_ary_entry(array, i));
  }
  return INT2FIX(sum);
}

Programmers write their native extensions using the API provided by MRI:

// ruby.h
typedef void *VALUE;
typedef void *ID;
VALUE rb_ary_entry(VALUE ary, long idx);

truffle_read_idx and truffle_invoke_i are Sulong intrinsics that send messages:

// ruby.c
#include "ruby.h"
#include "truffle.h"

VALUE rb_ary_entry(VALUE ary, long idx) {
  return truffle_read_idx(ary, (int) idx);
}

int FIX2INT(VALUE value) {
  return truffle_invoke_i(RUBY_CEXT, "rb_fix2int", value);
}

# ruby.rb
def rb_fix2int(value)
  if value.nil?
    raise TypeError
  else
    int = value.to_int
    raise RangeError if int >= 2**32
    int
  end
end

Performance: peak performance relative to MRI running pure Ruby (bar chart; slide modified from Matthias Grimmer, with permission)
• MRI with C Extensions: 11
• GraalVM with C Extensions: 32
Truffle can inline the function call from Ruby to C!

Safe Sulong and how it safely executes LLVM-based Languages

Problem: C/C++ are unsafe languages
Undefined Behavior (UB): “behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements” (C99 standard)

Examples of Undefined Behavior
• Buffer overflow
• Use-after-free error
• Integer overflow

Buffer Overflows: Leaking Sensitive Data
long *arr = malloc(3 * sizeof(long));
long dest[4];
memcpy(dest, arr, sizeof(dest)); // UB: copies 4 longs out of a 3-long allocation
The memory next to arr holds a secret, and the out-of-bounds read copies it into dest.
• Heartbleed and Cloudbleed were such vulnerabilities
• Writes can allow attackers to change a program’s control flow

Use-after-free Error
long *arr = malloc(3 * sizeof(long));
free(arr);
arr[0] = …; // UB
Another object can be overwritten if the memory has been reallocated

Integer Overflow
int a = 1, b = INT_MAX;
int val = a + b; // UB
Can result in inconsistent or surprising behavior if UB is “optimized away”

Integer Overflow
void pause() {
  int a = 0;
  // run until overflow
  while (a < a + 1) {
    a++;
  }
}

What’s the compilation output of Clang/GCC?
1. The function works as expected by the programmer
2. The function body is optimized away
3. The function results in an endless loop
4. It depends on the optimization level

-O0:
  mov dword ptr [rsp - 4], 0
  jmp loop_header
loop_body:
  add dword ptr [rsp - 4], 1
loop_header:
  mov eax, dword ptr [rsp - 4]
  mov ecx, dword ptr [rsp - 4]
  add ecx, 1
  cmp eax, ecx
  jl loop_body
  ret

-O3:
loop:
  jmp loop

Goal of my PhD: Tackle UB by safely and efficiently executing unsafe languages on the JVM, with well-defined semantics even for errors and corner cases

Existing Approaches: instrumentation-based bug-finding tools, symbolic execution, safe languages, hardware security, static analysis, attacker mitigation

State of the Art: Instrumentation-based Tools
Workflow: C source → Clang/GCC → a.out; running ./a.out prints Hello world!
Compile-time instrumentation
• AddressSanitizer
• SoftBound+CETS
Run-time instrumentation
• Memcheck
• Dr. Memory

Conundrum: Finding Bugs vs. Performance (Wang et al. 2012, D'Silva 2015)
• Static compilers: optimize code based on Undefined Behavior
• Bug-finding tools: find bugs assuming that violations are visible side effects
• To find all bugs, developers need to disable compiler optimizations

Map Data Structures and Operations to Java
C code:
long *arr = malloc(3 * sizeof(long));
arr[4] = …
Mapped to Java code:
long[] arr = new long[3];
arr[4] = … // ArrayIndexOutOfBoundsException
• The semantics of an out-of-bounds access are well specified
• The JVM’s compiler optimizes the program, but without optimizing Undefined Behavior away

Sulong: Sulong is a Truffle-based LLVM IR interpreter
• Clang compiles program.c and libc.c to LLVM IR
• The LLVM IR interpreter runs on Truffle, Graal, and the JVM
• We need to disable Clang’s optimizations

Prevent Out-Of-Bounds Accesses
long *arr = malloc(3 * sizeof(long));
The pointer is represented as an Address object (offset = 0, data) whose data field refers to an I64Array with contents {0, 0, 0}.
arr[4] = …
The Address now has offset = 4, and contents[4] → ArrayIndexOutOfBoundsException
[How do we know the type?] [Pointer to an integer?] [Array bounds check elimination] [Strict-aliasing rule]

Prevent Use-after-Free Errors
long *arr = malloc(3 * sizeof(long)); // Address (offset = 0, data) → I64Array with contents {0, 0, 0}
free(arr); // contents = null
arr[0] = … // contents[0] → NullPointerException
[Pointer to an integer?] [Strict-aliasing rule]
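
The Address and I64Array objects on these slides can be modeled in a few lines of Java. This is a simplified sketch, not Safe Sulong's implementation: the names Address, I64Array, offset, data, and contents come from the slides, while plus, load, store, and the element-count versions of malloc and free are assumptions made for the example.

```java
final class ManagedMemory {
    /** Managed stand-in for a heap allocation of longs. */
    static final class I64Array {
        long[] contents;

        I64Array(int length) {
            contents = new long[length];
        }
    }

    /** Managed pointer: a reference to the allocation plus an element offset. */
    static final class Address {
        final I64Array data;
        final int offset;

        Address(I64Array data, int offset) {
            this.data = data;
            this.offset = offset;
        }

        /** Pointer arithmetic only changes the offset; no check happens here. */
        Address plus(int elements) {
            return new Address(data, offset + elements);
        }

        /** The JVM checks the index (and the null reference) on every access. */
        long load() {
            return data.contents[offset];
        }

        void store(long value) {
            data.contents[offset] = value;
        }
    }

    /** Assumed helper: allocates by element count rather than by byte size. */
    static Address malloc(int elements) {
        return new Address(new I64Array(elements), 0);
    }

    /** Freeing drops the backing array, so later accesses fail visibly. */
    static void free(Address address) {
        address.data.contents = null;
    }

    public static void main(String[] args) {
        Address arr = malloc(3);
        try {
            arr.plus(4).store(1); // arr[4] = ...
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out-of-bounds access detected");
        }
        free(arr);
        try {
            arr.store(1); // arr[0] = ... after free
        } catch (NullPointerException e) {
            System.out.println("use-after-free detected");
        }
    }
}
```

Both errors surface as ordinary Java exceptions, so no separate instrumentation pass is needed to detect them.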

Prevent Integer Overflows
int a = 1, b = INT_MAX;
int val = a + b;
Executed as Math.addExact(a, b); → ArithmeticException
[Pointer to an integer?]
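
The difference between Java's default wrapping addition and Math.addExact can be shown directly. Math.addExact is a standard JDK method; the class and helper names below are only for illustration.

```java
final class CheckedArithmetic {
    /** Plain Java addition: well defined, wraps around on overflow. */
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    /** Checked addition: throws ArithmeticException on overflow. */
    static int checkedAdd(int a, int b) {
        return Math.addExact(a, b);
    }

    public static void main(String[] args) {
        int a = 1;
        int b = Integer.MAX_VALUE;

        // Wraps around to Integer.MIN_VALUE instead of being undefined.
        System.out.println(wrappingAdd(a, b) == Integer.MIN_VALUE);

        try {
            checkedAdd(a, b);
        } catch (ArithmeticException e) {
            System.out.println("integer overflow detected");
        }
    }
}
```

Because the exception is an observable effect, a compiler cannot assume the overflow never happens and remove the check.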

Safe Optimizations: ArrayIndexOutOfBoundsException, NullPointerException, and ArithmeticException are exceptions; exceptions are visible side effects and cannot be optimized away

Evaluation Hypotheses
• Effectiveness: Safe Sulong detects bugs that are overlooked by other tools
• Performance: Safe Sulong’s performance overhead is “reasonable”

Effectiveness: Errors in GitHub Projects
68 errors in (small) open-source projects: http://ssw.jku.at/General/Staff/ManuelRigger/ASPLOS18-SafeSulong-Bugs.csv

Effectiveness: Errors in GitHub Projects [Comparison tools]
• Valgrind detected half of the errors
• 8 errors not found by LLVM’s AddressSanitizer (and Valgrind)
• Compiler optimizations (ASan -O3) prevented the detection of 4 additional bugs

int main(int argc, char** argv) {
  printf("%d %s\n", argc, argv[5]);
}

• Out-of-bounds accesses to argv are not instrumented by ASan (https://github.com/google/sanitizers/issues/762)
• In Safe Sulong instrumentation cannot be omitted by design

Peak Performance (chart; lower is better): Safe Sulong’s performance is mostly between Clang -O0 and Clang -O3, and mostly faster than ASan -O0

Managed Sulong, derived from Safe Sulong, is available in GraalVM

Sulong Key Collaborators: Jacob Kreindl, Raphael Mosaner, Roland Schatz, Josef Eisl, Christian Häubl, Matthias Grimmer, Thomas Pointhuber, Daniel Pekarek, Chris Seaton, Lukas Stadler, Florian Angerer, David Gnedt, Swapnil Gaikwad
https://github.com/graalvm/sulong/graphs/contributors

The importance of inline assembly and compiler builtins

C/C++, Fortran
• What about inline assembly?
• What about GCC builtins?
• What about linkage features?

Beyond C/C++ and Fortran: inline assembly, compiler builtins, system calls, external libraries, low-level libc/POSIX functions, linkage features, compiler extensions, non-standard-compliant code

Collaborators: Stefan Marr, Stephen Kell, David Leopoldseder, Hanspeter Mössenböck, Bram Adams

C Projects Consist of More Than C Code
Compiler builtins (over 1,000 GCC builtins):
if (__builtin_expect(x, 0)) foo();
Inline assembly (~1,000 instructions for a single complex ISA like x86-64):
asm("rdtsc":"=a"(tickl),"=d"(tickh));
[Inline assembly details] [Inline Assembly and GCC Builtins in Sulong]

C Projects Consist of More Than C Code
• How frequently are these used?
• How are they used?
• What is the implementation effort to cover most programs?
• How well do comparable tools support them?
The answers allow an informed decision on whether and to what extent to implement them in Sulong!

Mining of C GitHub Projects

                      GCC Builtins          Inline Assembly
# studied projects    ~5,000                ~1,300
Considered projects   All C projects        C Client Applications
Identification        grep <builtin name>   grep asm

Different setups, so the comparison should be taken with a grain of salt
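
The grep-style identification step can be approximated with regular expressions. The sketch below is an assumption about how such a search could look, not the scripts used in the studies: the two patterns and the class UsageCounter are hypothetical, and a real analysis would also need to skip comments and string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UsageCounter {
    // Assumed pattern: any identifier that starts with __builtin_.
    static final Pattern BUILTIN = Pattern.compile("\\b__builtin_\\w+");

    // Assumed pattern: the asm keyword and its common alternate spellings.
    static final Pattern INLINE_ASM = Pattern.compile("\\b(?:asm|__asm|__asm__)\\b");

    /** Counts non-overlapping matches of the pattern in the given source text. */
    static int count(Pattern pattern, String source) {
        Matcher matcher = pattern.matcher(source);
        int occurrences = 0;
        while (matcher.find()) {
            occurrences++;
        }
        return occurrences;
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "if (__builtin_expect(x, 0)) foo();",
                "asm(\"rdtsc\":\"=a\"(tickl),\"=d\"(tickh));");
        System.out.println("builtins: " + count(BUILTIN, source));
        System.out.println("inline assembly: " + count(INLINE_ASM, source));
    }
}
```

Dividing such counts by a project's size in thousands of lines gives a density figure comparable to the "occurrence per KLOC" axis used later in the deck.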

How widespread are GCC builtins and inline assembly fragments?

In How Many Projects are They Used? (% of projects)
• Popular projects with inline assembly: 28%
• (Popular) projects with GCC builtins: 37%
Both GCC builtins and inline assembly are frequently used by projects

How Often are They Used Within a Project? (chart axis: density, occurrence per KLOC)
• Popular projects with inline assembly: 50k
• (Popular) projects with GCC builtins: 6k
They are infrequently used within a project

How are inline assembly and GCC builtins used?

Inline Assembly: Inline assembly fragments can contain an arbitrary number of instructions; how many do they typically contain?

uint64 sqlite3Hwtime(void){
  unsigned long val;
  __asm__ ("rdtsc" : "=A" (val));
  return val;
}

__asm__ __volatile__ (
" leaq %0, %%rax\n"
" movq %%rbp, 8(%%rax)\n" /* save regs rbp and rsp
" movq %%rsp, (%%rax)\n"
" movq %%rax, %%rsp\n" /* make rsp point to &ar
" movq 16(%%rsp), %%rsi\n" /* rsi = in */
" movq 32(%%rsp), %%rdi\n" /* rdi = out */
" movq 24(%%rsp), %%r9\n" /* r9 = last */
" movq 48(%%rsp), %%r10\n" /* r10 = end */
" movq 64(%%rsp), %%rbp\n" /* rbp = lcode */
" movq 72(%%rsp), %%r11\n" /* r11 = dcode */
" movq 80(%%rsp), %%rdx\n" /* rdx = hold */
" movl 88(%%rsp), %%ebx\n" /* ebx = bits */
" movl 100(%%rsp), %%r12d\n" /* r12d = lmask */
" movl 104(%%rsp), %%r13d\n" /* r13d = dmask */
/* r14d = len */
/* r15d = dist */
" cld\n"
" cmpq %%rdi, %%r10\n"
" je .L_one_time\n" /* if only one decode le
" cmpq %%rsi, %%r9\n"
" je .L_one_time\n"
" jmp .L_do_loop\n"
".L_one_time:\n"
" movq %%r12, %%r8\n" /* r8 = lmask */
" cmpb $32, %%bl\n"
" ja .L_get_length_code_one_time\n"
" lodsl\n" /* eax = *(uint *)in++ *
" movb %%bl, %%cl\n" /* cl = bits, needs it f
" addb $32, %%bl\n" /* bits += 32 */
" shlq %%cl, %%rax\n"
" orq %%rax, %%rdx\n" /* hold |= *((uint *)in)
" jmp .L_get_length_code_one_time\n"

How are Inline Assembly Fragments Used?

Chart: cumulative percentage of projects by number of unique fragments per project
• 36%: A number of projects use only a single inline assembly fragment
• 99%: Almost all projects use fewer than 25 inline assembly fragments

Chart: cumulative percentage by number of instructions per unique fragment (axis extends to 438)
• We also found fragments with several hundred instructions
• Inline assembly fragments typically consist of a low number of instructions.

How are GCC Builtins Used?
Architecture-independent builtin: if (__builtin_expect(x, 0)) foo();
Architecture-specific builtin: c = __builtin_ia32_paddb(a, b);
Architecture-specific builtins are similar to inline assembly. Are they used?

Chart (number of projects): used builtins 38%, machine-independent 36%, machine-specific 8%. Mainly machine-independent GCC builtins are used.

Machine-specific vs. Machine-independent Builtins (chart values: 17, 3, 4): A project that uses machine-specific builtins uses them in larger numbers.

How well do tools support them and how much effort needs to be invested to support them?

Tool Support for Inline Assembly

c2go transpile test.c
panic: unknown node type: 'GCCAsmStmt 0x3a991f8 <line:5:3, col:38>'
goroutine 1 [running]:
github_com_elliotchance_c2go_ast.Parse
  go/src/github.com/elliotchance/c2go/ast/ast.go:211
main.convertLinesToNodes
  go/src/github.com/elliotchance/c2go/main.go:81
main.Start
  go/src/github.com/elliotchance/c2go/main.go:219
main.runCommand
  go/src/github.com/elliotchance/c2go/main.go:350
main.main
  go/src/github.com/elliotchance/c2go/main.go:277
goroutine 6 [finalizer wait]:

Splint 3.1.2 --- 03 May 2009
test.c: (in function rdtsc)
test.c:5:3: Unrecognized identifier: asm
  Identifier used in code has not been declared. (Use -unrecog to inhibit warning)
test.c:5:15: Parse Error. (For help on parse errors, see splint -help parseerrors.)
*** Cannot continue.

Tool Support: Test suite for the 100 most commonly used builtins

Bugs in CompCert: https://github.com/AbsInt/CompCert/issues/243 [Details bug]

Tool support is lagging behind

How much effort is needed to implement GCC builtins? [Details]
• 32 builtins to support half of projects
• 1600 builtins to support 99% of projects
• Machine-independent builtins are the “low-hanging fruits”
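
To illustrate why machine-independent builtins are comparatively cheap to support on the JVM, here is a hedged sketch that maps four of them to standard JDK methods. The class and method names are hypothetical and this is not how Sulong implements them; the choice to throw on __builtin_clz(0), whose result GCC leaves undefined, is an assumption in the spirit of giving errors well-defined semantics.

```java
final class BuiltinSketch {
    /** __builtin_expect(value, expected): a branch hint; semantically it returns value. */
    static long builtinExpect(long value, long expected) {
        return value;
    }

    /** __builtin_popcount(x): number of set bits. */
    static int builtinPopcount(int x) {
        return Integer.bitCount(x);
    }

    /** __builtin_bswap32(x): reverses the byte order. */
    static int builtinBswap32(int x) {
        return Integer.reverseBytes(x);
    }

    /** __builtin_clz(x): leading zero bits; GCC leaves the result for 0 undefined. */
    static int builtinClz(int x) {
        if (x == 0) {
            // Assumed design choice: report the undefined case instead of returning a value.
            throw new IllegalArgumentException("__builtin_clz(0) is undefined");
        }
        return Integer.numberOfLeadingZeros(x);
    }

    public static void main(String[] args) {
        System.out.println(builtinExpect(7, 0));
        System.out.println(builtinPopcount(0xFF));
        System.out.println(Integer.toHexString(builtinBswap32(0x11223344)));
        System.out.println(builtinClz(1));
    }
}
```

Machine-specific builtins such as __builtin_ia32_paddb have no such one-line counterpart, which is consistent with the deck grouping them with inline assembly.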

Are they a legacy feature that has survived until today?

GCC Builtin Usage Over Time [Details]: We analyzed the commit history of the GCC builtin projects

Trend          Projects
Increasing     38%
Stagnant       26%
Decreasing     14%
Inconclusive   22%

64% of projects have been mainly adding builtins

Research Opportunities
• Other elements, such as compiler pragmas and function attributes, are not widely understood
• Testing the correct usage of inline assembly and GCC builtins
• Support in formal models and static analysis tools
• Automatic approaches?


Addressing the last 20% of the problem took 80% of the time

Pareto Principle: 80% of the effects come from 20% of the causes

Pareto Principle: It is useful to consider the “seemingly” less-important 20% of a problem
• Avoids oversimplifications
• Helps design holistic solutions
• Leads to new research questions

Discussion: What About Other Overlooked Problems?
• In which 20% of important use cases do current language interoperability approaches fail?
• Which 20% of important use cases cannot be expressed with <name of DSL> and how does it affect users?
• Which 20% of an approach for connecting heterogeneous code provides bad usability and how can we improve on it?

Summary
