primitive types (java.lang.Integer, ...)
    class C[T](t: T)
The process is called erasure, and it replaces type parameters by their upper bound (scalac / javac)
You don't see it: scala.Int can be either int or j.l.Integer. Scalac does the work for you!
Yet boxing degrades performance
• heap allocations / GC cycles ...
• indirect reads, broken locality
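A minimal sketch of what erasure means at runtime, using reflection to look at the field behind `t`. The object name `ErasureSketch` and its helper methods are illustrative and not part of the talk; the class is declared with `val t` only so that a field is guaranteed to exist.

```scala
// Hypothetical demo object; names are illustrative only.
object ErasureSketch {
  class C[T](val t: T)

  // Declared JVM type of the field backing `t` (T is erased to its upper bound).
  def fieldType: Class[_] = classOf[C[_]].getDeclaredField("t").getType

  // Runtime class of the value actually stored in that field.
  def storedClass(c: C[_]): Class[_] = {
    val f = classOf[C[_]].getDeclaredField("t")
    f.setAccessible(true)
    f.get(c).getClass
  }

  def main(args: Array[String]): Unit = {
    val c = new C[Int](4)
    println(fieldType)      // the erased field type
    println(storedClass(c)) // the boxed representation of 4
  }
}
```

With an unbounded `T`, the field type is `Object`, so the `Int` passed to the constructor has to be stored as a `java.lang.Integer`.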
    class C[@specialized T](t: T)
Variants: C[T] (t: Object), C$mcI$sp, C$mcJ$sp, C$mcD$sp … and 6 others
C$mcI$sp is adapted to integers (t$mcI$sp: int)
    new C[Int](4)   --spec-->  new C$mcI$sp(4)
    new C("abc")    --spec-->  new C[String]("abc")
* similar transformation for methods
Can speed up certain code patterns by up to 20x
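A small sketch for observing the rewrite, assuming a Scala 2.x compiler (Scala 3 accepts the annotation but does not generate specialized variants). Unlike the slide, the annotation is restricted to `Int` here to keep the generated bytecode small; `SpecializedSketch` is an illustrative name.

```scala
// Hypothetical demo object; names are illustrative only.
object SpecializedSketch {
  // Only the Int variant is requested, instead of all primitive types.
  class C[@specialized(Int) T](val t: T)

  def main(args: Array[String]): Unit = {
    val ci = new C[Int](4)        // type argument statically known to be Int
    val cs = new C[String]("abc") // reference type: stays on the generic class
    // On Scala 2.x the first name is expected to end in C$mcI$sp.
    println(ci.getClass.getName)
    println(cs.getClass.getName)
  }
}
```

The source code at the call site does not change; only the class that gets instantiated differs.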
bytecode bloat?
Unit, Boolean, Byte, Char, Short, Int, Long, Float, Double, Object
• fully specializing Function2 → 10^3 traits (10 variants for each of its 3 type parameters)
• upfront bytecode (not on-demand)
• too much for the Scala library
Still want to distribute it via maven, not via torrents
do better
One day in 2012 Miguel Garcia walked into my office and said: “From a low-level perspective, there are only values and pointers. Maybe you can use that!”
LONG, DOUBLE, INT, FLOAT, SHORT, ... → a long integer
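A minimal sketch of how several primitive types can travel inside one long integer. The helper names are hypothetical and are not the miniboxing plugin's runtime API; only standard JDK bit-conversion methods are used.

```scala
// Hypothetical helpers; not the plugin's actual runtime API.
object LongEncodingSketch {
  // Integral types: widen on the way in, narrow on the way out.
  def intToLong(i: Int): Long = i.toLong
  def longToInt(l: Long): Int = l.toInt
  def shortToLong(s: Short): Long = s.toLong
  def longToShort(l: Long): Short = l.toShort

  // Floating point types: reinterpret the bit pattern.
  def doubleToLong(d: Double): Long = java.lang.Double.doubleToRawLongBits(d)
  def longToDouble(l: Long): Double = java.lang.Double.longBitsToDouble(l)
  def floatToLong(f: Float): Long = java.lang.Float.floatToRawIntBits(f).toLong
  def longToFloat(l: Long): Float = java.lang.Float.intBitsToFloat(l.toInt)
}
```

Each pair round-trips: decoding an encoded value gives back the original value.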
idea was born
it started from the tagged union: TAG + DATA (VALUE)
• somewhat similar to a boxed object
• but not in the heap memory
• direct access to the value
Same benefits as for unboxed values
idea was born
“From a low-level perspective, there are only values and pointers.”
we can reduce the number of variants: C_J[T], C_L[T], C[T]
Let's take an example
This is the naive tagged union

    def choice_J[T](t1: (Tag, Value), t2: (Tag, Value)): (Tag, Value) =
      if (util.Random.nextBoolean()) t1 else t2

• That's wasteful: we carry the tag for T twice
• And we even return it, despite the caller having passed it
Insight: we're in a statically typed language, use that!
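To make the naive version concrete, here is a runnable sketch. The slide leaves `Tag` and `Value` undefined, so the aliases and the `IntTag` constant below are assumptions for illustration.

```scala
// Hypothetical definitions: the slide does not say what Tag and Value are.
object NaiveTaggedUnion {
  type Tag = Byte
  type Value = Long
  val IntTag: Tag = 0.toByte // assumed tag for Int

  // Every argument and the result carry their own copy of the tag.
  def choice_J[T](t1: (Tag, Value), t2: (Tag, Value)): (Tag, Value) =
    if (scala.util.Random.nextBoolean()) t1 else t2

  def main(args: Array[String]): Unit = {
    // The caller already knows T = Int, yet passes the tag twice and gets it back.
    val (tag, value) = choice_J[Int]((IntTag, 3L), (IntTag, 5L))
    println(s"tag=$tag value=$value")
  }
}
```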
    def choice_J[T](T_Tag: Tag, t1: Value, t2: Value): Value =
      if (util.Random.nextBoolean()) t1 else t2

• T_Tag corresponds to the type parameter
• Sort of a class tag
Encoded as Long
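A sketch of how a caller might use this signature, assuming `Tag` is a `Byte` and `Value` is a `Long` (neither alias is defined on the slide). The wrapper `choiceInt` and the constant `IntTag` are hypothetical; they stand in for the code a compiler plugin would generate at a call site where `T` is known to be `Int`.

```scala
// Hypothetical definitions; Tag, Value and IntTag are assumptions.
object TagPassingSketch {
  type Tag = Byte
  type Value = Long
  val IntTag: Tag = 0.toByte

  // One tag per type parameter, no tag attached to arguments or result.
  def choice_J[T](T_Tag: Tag, t1: Value, t2: Value): Value =
    if (scala.util.Random.nextBoolean()) t1 else t2

  // Call site with T = Int: encode the arguments, decode the result statically.
  def choiceInt(a: Int, b: Int): Int =
    choice_J[Int](IntTag, a.toLong, b.toLong).toInt

  def main(args: Array[String]): Unit =
    println(choiceInt(3, 5))
}
```

Because the caller knows the static type, it never needs the tag back to decode the result.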
bytecode issue
    trait Function2[-T1, -T2, +R]
• with specialization this produces 10^3 traits
• with miniboxing only 2^3 (about 100x less bytecode)
• so we expect it will be usable on the library
But before we wrap this up
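The arithmetic behind these counts, as a small sketch: each type parameter multiplies the number of variants by the number of representations available to it. Ten is the number of types listed earlier for specialization (nine primitives plus Object); two corresponds to the long-encoded and reference variants. The names below are illustrative.

```scala
// Illustrative arithmetic only.
object VariantCount {
  // Number of generated variants for a class with `typeParams` type parameters.
  def variants(perParam: Int, typeParams: Int): BigInt =
    BigInt(perParam).pow(typeParams)

  val specialized: BigInt = variants(10, 3) // Function2 under full specialization
  val miniboxed: BigInt = variants(2, 3)    // Function2 under miniboxing

  def main(args: Array[String]): Unit =
    println(s"specialized=$specialized miniboxed=$miniboxed ratio=${specialized / miniboxed}")
}
```

For Function2 this gives 1000 versus 8 variants, a ratio of 125, which matches the rough "100x" figure.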
• conversions
  – between miniboxed and unboxed integer types
    • free on x64
  – between miniboxed and floating point types
    • low overhead (not free*)
  – between miniboxed and boxed values
    • avoided by @miniboxed!
* improved translation (thanks Rex!)
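A sketch of the third kind of conversion, between long-encoded and boxed values, driven by the tag. The tag constants and the names `boxValue` / `unboxValue` are hypothetical and only cover two types; this is not the plugin's actual runtime code.

```scala
// Hypothetical bridge between long-encoded and boxed values.
object BoxingBridgeSketch {
  type Tag = Byte
  val IntTag: Tag = 0.toByte
  val DoubleTag: Tag = 1.toByte

  // Long-encoded -> boxed: the tag says how to interpret the bits.
  def boxValue(tag: Tag, value: Long): Any = tag match {
    case IntTag    => value.toInt
    case DoubleTag => java.lang.Double.longBitsToDouble(value)
    case other     => throw new IllegalArgumentException(s"unknown tag $other")
  }

  // Boxed -> long-encoded: the tag says which primitive to expect.
  def unboxValue(tag: Tag, boxed: Any): Long = tag match {
    case IntTag    => boxed.asInstanceOf[Int].toLong
    case DoubleTag => java.lang.Double.doubleToRawLongBits(boxed.asInstanceOf[Double])
    case other     => throw new IllegalArgumentException(s"unknown tag $other")
  }
}
```

This is the path that allocates, which is why keeping values in the long-encoded form end to end matters.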
before the benchmarks
what is List[T] when T is miniboxed?
– for specialization: [T ← Int] List[T] => List[Int]
– for miniboxing: still List[T]
Variants: List_J[T], List_L[T], List[T]
But List[T] is an interface, we can still have an adapted implementation class (List_J)
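A simplified sketch of the "interface plus adapted implementation" idea. `MyList`, `MyList_L` and `MyList_J` are hypothetical stand-ins for List[T], List_L and List_J; the tag type and `IntTag` are assumptions, and only `Int` is decoded.

```scala
// Hypothetical stand-ins for List[T], List_L[T] and List_J[T].
object ListVariantsSketch {
  type Tag = Byte
  val IntTag: Tag = 0.toByte

  // The interface: code that only knows T keeps using this type.
  trait MyList[T] {
    def head: T
    def tail: Option[MyList[T]]
  }

  // Reference variant: stores the element as an object.
  final class MyList_L[T](val head: T, val tail: Option[MyList[T]]) extends MyList[T]

  // Long-encoded variant: stores the element as a Long plus one tag for T.
  final class MyList_J[T](T_Tag: Tag, val head_J: Long, val tail: Option[MyList[T]])
      extends MyList[T] {
    // Generic accessor: decodes (and boxes) only when a caller asks for a T.
    def head: T = (T_Tag match {
      case IntTag => head_J.toInt
      case other  => throw new IllegalArgumentException(s"unsupported tag $other")
    }).asInstanceOf[T]
  }
}
```

Both variants are usable through `MyList[Int]`, so the static type stays the same while the storage differs.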
more here
• essentially a very general mechanism
• and Eugene (xeno-by) gave it a shot
  – in a week he implemented
  – a value class plugin
  – with multi-param value classes
https://github.com/miniboxing/value-plugin
• initial prototype, as a semester project
• Eugene Burmako - the value class plugin based on the LDL transformation
• Aymeric Genet - developing collection-like benchmarks for the miniboxing plugin
• Martin Odersky, for his patient guidance
• Iulian Dragos, for his work on specialization and many explanations
• Miguel Garcia, for his original insights that spawned the miniboxing idea
• Michel Schinz, for his wonderful comments and enlightening ACC course
• Andrew Myers and Roland Ducournau for the discussions we had and the feedback provided
• Heather Miller for the eye-opening discussions we had
• Vojin Jovanovic, Sandro Stucki, Manohar Jonalagedda and the whole LAMP laboratory in EPFL for the extraordinary atmosphere
• Adriaan Moors, for the miniboxing name which stuck :))
• Thierry Coppey, Vera Salvisberg and George Nithin, who patiently listened to many presentations and provided valuable feedback
• Grzegorz Kossakowski, for the many brainstorming sessions on specialization
• Erik Osheim, Tom Switzer and Rex Kerr for their guidance on the Scala community side
• OOPSLA paper and artifact reviewers, who reshaped the paper with their feedback
• Sandro, Vojin, Nada, Heather, Manohar - reviews and discussions on the LDL paper
• Hubert Plociniczak for the type notation in the LDL paper
• Denys Shabalin, Dmitry Petrashko for their patient reviews of the LDL paper