those who missed the keynote :-))
- Overview of tracing JITs
- The PyPy JIT generator
- Just In Time talk (last-modified: July, 4th, 12:06)

antocuni, arigo (EuroPython 2012) PyPy JIT under the hood July 4 2012 3 / 29
- ideal for writing VMs
- JIT & GC for free
- Python interpreter written in RPython
- Whatever (dynamic) language you want
  - smalltalk, prolog, javascript, ...
accounts for 80% of the runtime
- hot-spots
- Fast Path principle
  - optimize only what is necessary
  - fall back for uncommon cases
- Most of the runtime is spent in loops
- Always the same code paths (likely)
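The hot-spot idea above can be sketched as a counter attached to each loop header: a loop is only worth optimizing once it has been reached often enough. This is a minimal illustration, not PyPy's actual mechanism; the class name `HotLoopDetector`, the `reached` method and the threshold value are all hypothetical.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical value, chosen only to keep the demo short


class HotLoopDetector:
    """Counts how often each loop header is reached."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()

    def reached(self, loop_id):
        """Record one pass through a loop header; return True once it is hot."""
        self.counts[loop_id] += 1
        return self.counts[loop_id] >= self.threshold


detector = HotLoopDetector()
# The first two passes stay on the slow path; from the third on the loop is hot.
hot_flags = [detector.reached("main_loop") for _ in range(5)]
print(hot_flags)  # [False, False, True, True, True]
```

Everything that never becomes hot keeps running in the interpreter, which is the "fall back for uncommon cases" half of the fast path principle.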
}

class IncrOrDecr implements Operation {
    public int DoSomething(int x) {
        if (x < 0)
            return x-1;
        else
            return x+1;
    }
}

class tracing {
    public static void main(String argv[]) {
        int N = 100;
        int i = 0;
        Operation op = new IncrOrDecr();
        while (i < N) {
            i = op.DoSomething(i);
        }
        System.out.println(i);
    }
}
Legend: some instructions are added to the trace but not executed.

Method       Java code                Trace                      Value
-----------  -----------------------  -------------------------  --------------
Main         while (i < N) {          ILOAD 2                    3
                                      ILOAD 1                    100
                                      IF_ICMPGE LABEL_1          false
                                      GUARD_ICMPLT
             i = op.DoSomething(i);   ALOAD 3                    IncrOrDecr obj
                                      ILOAD 2                    3
                                      INVOKEINTERFACE ...
                                      GUARD_CLASS(IncrOrDecr)
DoSomething  if (x < 0)               ILOAD 1                    3
                                      IFGE LABEL_0               true
                                      GUARD_GE
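The table above can be mimicked with a hand-written recorder for this one loop: it executes a single iteration and, for every branch or dispatch decision actually taken, appends a guard to a linear trace. This is only a sketch of the idea. The function `trace_one_iteration` and the operation names in the strings are made up for illustration, and the body of `DoSomething` is inlined by hand rather than discovered by a real tracer.

```python
class IncrOrDecr:
    """Python counterpart of the Java class on the slide."""

    def do_something(self, x):
        return x - 1 if x < 0 else x + 1


def trace_one_iteration(i, n, op):
    """Run one iteration of `while (i < N) i = op.DoSomething(i)`.

    Returns (trace, new_i). The trace is None if the loop exits instead.
    """
    if not i < n:
        return None, i
    trace = ["guard_lt(i, N)"]
    # The interface call is replaced by a check on the receiver's class.
    trace.append("guard_class(op, %s)" % type(op).__name__)
    # Only the branch that was really taken ends up in the trace.
    if i < 0:
        trace.append("guard_lt(x, 0)")
        trace.append("i = int_sub(x, 1)")
    else:
        trace.append("guard_ge(x, 0)")
        trace.append("i = int_add(x, 1)")
    return trace, op.do_something(i)


trace, new_i = trace_one_iteration(3, 100, IncrOrDecr())
for line in trace:
    print(line)
```

With `i = 3` the recorded trace contains `guard_ge(x, 0)` and the addition only; the `x - 1` branch never appears, because it was never executed while tracing.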
        a = 0;
        int i = 0;
        int N = 100;
        while(i < N) {
            if (i%2 == 0)
                a++;
            else
                a*=2;
            i++;
        }
    }
    i19 = int_add_ovf(i15, 2)
    guard_no_overflow()
    ...

optimized

    ...
    i17 = int_lt(i15, 5000)
    guard_true(i17)
    i19 = int_add(i15, 2)
    ...

- It works often
- array bound checking
- intbound info propagates all over the trace
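The rewrite shown on this slide can be approximated by a small pass over a list of operations. The sketch below assumes a made-up trace encoding, `(name, result, args)` tuples, and handles only the one pattern from the slide: an upper bound learned from `int_lt` plus `guard_true` makes a later `int_add_ovf` with a non-negative constant provably safe. The function name `optimize_intbounds` is hypothetical, and the real optimizer tracks much more than this.

```python
import sys

MAXINT = sys.maxsize  # stand-in for the machine word limit


def optimize_intbounds(trace):
    """Drop overflow checks that known integer bounds make redundant."""
    upper = {}    # variable -> known inclusive upper bound
    pending = {}  # result of int_lt -> (variable, constant)
    out = []
    drop_next_overflow_guard = False
    for name, result, args in trace:
        drop, drop_next_overflow_guard = drop_next_overflow_guard, False
        if name == "guard_no_overflow" and drop:
            continue  # the preceding addition was proven safe
        if name == "int_lt" and isinstance(args[1], int):
            pending[result] = (args[0], args[1])
        elif name == "guard_true" and args[0] in pending:
            var, const = pending[args[0]]
            # Past this guard we know var < const, i.e. var <= const - 1.
            upper[var] = min(upper.get(var, MAXINT), const - 1)
        elif (name == "int_add_ovf" and isinstance(args[1], int)
              and args[1] >= 0
              and upper.get(args[0], MAXINT) <= MAXINT - args[1]):
            out.append(("int_add", result, args))
            drop_next_overflow_guard = True
            continue
        out.append((name, result, args))
    return out


trace = [
    ("int_lt", "i17", ("i15", 5000)),
    ("guard_true", None, ("i17",)),
    ("int_add_ovf", "i19", ("i15", 2)),
    ("guard_no_overflow", None, ()),
]
optimized = optimize_intbounds(trace)
for op in optimized:
    print(op)
```

Without the preceding `int_lt` and `guard_true`, nothing is known about `i15` and the overflow check is left in place.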
    i2 = int_add(i1, 2)
    p3 = new(W_IntObject)
    setfield_gc(p3, i2, 'intval')
    ...

optimized

    ...
    i2 = int_add(i1, 2)
    ...

- The most important optimization (TM)
- It works both inside the trace and across the loop
- It works for tons of cases
  - e.g. function frames
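A rough model of this optimization: an allocation is kept "virtual" (only its field values are remembered) until some operation lets the object escape, at which point the allocation is emitted after all. The sketch below assumes a made-up `(name, result, args)` trace encoding and a hypothetical function `optimize_virtuals`. It ignores nested virtual objects and reads of fields that were never written, and it covers only the "inside the trace" case, not the "across the loop" one.

```python
def optimize_virtuals(trace):
    """Remove allocations of objects that never escape the trace."""
    virtuals = {}  # pointer -> (class name, {field: value})
    alias = {}     # result of a removed getfield -> the value it would read
    out = []
    for name, result, args in trace:
        args = tuple(alias.get(a, a) for a in args)
        if name == "new":
            virtuals[result] = (args[0], {})
        elif name == "setfield_gc" and args[0] in virtuals:
            # Argument order follows the slide: (object, value, field).
            virtuals[args[0]][1][args[2]] = args[1]
        elif name == "getfield_gc" and args[0] in virtuals:
            alias[result] = virtuals[args[0]][1][args[1]]
        else:
            # Any other use makes the object escape: allocate it for real.
            for a in args:
                if a in virtuals:
                    cls, fields = virtuals.pop(a)
                    out.append(("new", a, (cls,)))
                    for field, value in fields.items():
                        out.append(("setfield_gc", None, (a, value, field)))
            out.append((name, result, args))
    return out


trace = [
    ("int_add", "i2", ("i1", 2)),
    ("new", "p3", ("W_IntObject",)),
    ("setfield_gc", None, ("p3", "i2", "intval")),
]
optimized = optimize_virtuals(trace)
print(optimized)  # [('int_add', 'i2', ('i1', 2))]
```

If the same trace continued with an operation that takes `p3` as an argument, the `new` and `setfield_gc` would reappear just before that operation.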
    = getfield_pure(<W_Int(2)>, 'intval')
    i3 = int_add(i1, i2)
    ...

optimized

    ...
    i1 = getfield_pure(p0, 'intval')
    i3 = int_add(i1, 2)
    ...

- It "finishes the job"
- Works well together with other optimizations (e.g. virtuals)
- It also does "normal, boring, static" constant-folding
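As an illustration of the folding above: a pure field read on an object that is already known when the trace is compiled can be replaced by the field's value, which then flows into later operations. The sketch assumes a made-up `(name, result, args)` trace encoding. `ConstPtr` and `constant_fold` are hypothetical names, the variable names in the example trace are chosen to match the optimized listing, and overflow of folded additions is ignored.

```python
class ConstPtr:
    """A pointer to an object whose fields are known at compile time."""

    def __init__(self, **fields):
        self.fields = fields


def constant_fold(trace):
    """Fold pure reads on constant objects and additions of constants."""
    known = {}  # variable name -> constant value
    out = []
    for name, result, args in trace:
        # Substitute every variable whose value is already known.
        args = tuple(known.get(a, a) for a in args)
        if name == "getfield_pure" and isinstance(args[0], ConstPtr):
            known[result] = args[0].fields[args[1]]
        elif name == "int_add" and all(isinstance(a, int) for a in args):
            known[result] = args[0] + args[1]
        else:
            out.append((name, result, args))
    return out


W_INT_2 = ConstPtr(intval=2)  # plays the role of <W_Int(2)>
trace = [
    ("getfield_pure", "i1", ("p0", "intval")),
    ("getfield_pure", "i2", (W_INT_2, "intval")),
    ("int_add", "i3", ("i1", "i2")),
]
folded = constant_fold(trace)
for op in folded:
    print(op)
```

The read on `p0` stays, since `p0` is only known at run time; the read on the constant object disappears and `int_add` receives the literal `2`.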
    p0 = getfield_gc(<Cell>, 'val')
    ...
    i2 = getfield_pure(p0, 'intval')
    i3 = int_add(i1, i2)

optimized

    ...
    guard_not_invalidated()
    ...
    i3 = int_add(i1, 2)
    ...

- Python is too dynamic, but we don't care :-)
- No overhead in assembler code
- Used a bit "everywhere"
- Credits to Mark Shannon for the name :-)
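One way to picture `guard_not_invalidated()`: the compiled loop bakes in the current value of the cell, and the cell keeps a list of loops that depend on it, so the cost moves to the rare write instead of every iteration. The classes `Cell` and `CompiledLoop` below are hypothetical names for this sketch. Note that this Python model has to check a `valid` flag explicitly, whereas the slide states that the real mechanism adds no overhead to the generated assembler.

```python
class Cell:
    """A global-variable cell that knows which compiled loops depend on it."""

    def __init__(self, value):
        self.value = value
        self.dependents = []

    def set(self, value):
        self.value = value
        # The work happens here, on the write, not inside the loops.
        for loop in self.dependents:
            loop.valid = False
        self.dependents.clear()


class CompiledLoop:
    """Stands in for machine code specialized on the cell's current value."""

    def __init__(self, cell):
        self.constant = cell.value  # value baked into the "compiled" code
        self.valid = True
        cell.dependents.append(self)

    def run(self, i1):
        if not self.valid:
            raise RuntimeError("loop invalidated: fall back to the interpreter")
        return i1 + self.constant  # corresponds to i3 = int_add(i1, 2)


cell = Cell(2)
loop = CompiledLoop(cell)
result_before = loop.run(40)
cell.set(3)  # someone rebinds the global: the old loop must not run again
print(result_before, loop.valid)  # 42 False
```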