Don't @ Me! Faster Instance Variables with Object Shapes

This presentation is about the Object Shapes implementation in Ruby 3.2 and its impact on the JIT compiler.

Aaron Patterson

November 30, 2022

Transcript

  1. Instance Variables

    How to store them?

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new

    Instance of Hello, IV Hash Table:

    IV Name   IV Value
    :@foo     1
    :@bar     2
  2. Hash based implementation

    If it were written in Ruby

    class Object
      def initialize
        @ivs = {} # Magic instance variable hash
      end

      def instance_variable_set name, value
        @ivs[name] = value
      end

      def instance_variable_get name
        @ivs[name]
      end

      def instance_variable_defined? name
        # ooohhh, why did he put this method in the example?
        # I bet it's foreshadowing!
        @ivs.key? name
      end
    end
  3. Tree Walking Interpreter

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    [Diagram: a `+` node with children @foo (1) and @bar (2), evaluating to 3. Each IV read is labeled "Hash Lookup!"]
  4. YARV Execution

    Code is compiled to instructions before execution

    Source Code:

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    hi = Hello.new
    hi.foo

    Byte Code for "foo":

    [:getivar, :@foo]
    [:getivar, :@bar]
    [:plus]

    VM Stack: 1, 2, then 3
  5. Instruction Implementation

    Example implementation written in Ruby

    getivar Implementation:

    def getivar name
      get_self.instance_variables[name]
    end

    Byte Code:

    [:getivar, :@foo]
    [:getivar, :@bar]
    [:plus]

    Annotations: "Get self from current stack frame" (get_self), "Get Hash of IVs" (instance_variables)
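
The byte code above can be tried out with a toy stack machine. This is only a sketch: the `run` method is a hypothetical helper, and it borrows Ruby's own `instance_variable_get` in place of the VM's internal lookup, so it shows the stack discipline rather than how YARV is actually written.

```ruby
# Toy interpreter for the three instructions on the slide.
# `run` is a made-up helper, not part of Ruby or YARV.
def run(bytecode, receiver)
  stack = []
  bytecode.each do |op, arg|
    case op
    when :getivar
      # Push the value of the named instance variable of the receiver.
      stack.push(receiver.instance_variable_get(arg))
    when :plus
      right = stack.pop
      left = stack.pop
      stack.push(left + right)
    else
      raise ArgumentError, "unknown instruction: #{op.inspect}"
    end
  end
  stack.pop
end

class Hello
  def initialize
    @foo = 1
    @bar = 2
  end
end

result = run([[:getivar, :@foo], [:getivar, :@bar], [:plus]], Hello.new)
puts result
```

The stack holds 1, then 1 and 2, then their sum, which matches the "VM Stack" column in the earlier slide.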
  6. Instance Variables

    How to store them?

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1

    Instance of Hello, IV Array:

    Index   Value
    0       1
    1       2
  7. Instance Variables (second instance)

    How to store them?

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1

    Instance of Hello, IV Array:

    Index   Value
    0       1
    1       2
  8. Instance Variables (many instances)

    Hash table size is amortized

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1

    Instance of Hello, IV Array:

    Index   Value
    0       1
    1       2

    Instance of Hello, IV Array:

    Index   Value
    0       1
    1       2
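
A minimal sketch of the layout described above, assuming one name-to-index table per class and one plain value array per instance. `IndexedClass` and `IndexedObject` are hypothetical names used only for illustration; the real structures live in C inside the VM.

```ruby
# Hypothetical model: the class owns the name-to-index table,
# so each instance only pays for an array of values.
class IndexedClass
  attr_reader :ivar_index

  def initialize
    @ivar_index = {} # shared by every instance of this class
  end

  # Return the existing index for a name, or assign the next free one.
  def index_for(name)
    @ivar_index[name] ||= @ivar_index.size
  end
end

class IndexedObject
  def initialize(klass)
    @klass = klass
    @ivars = [] # per-instance storage; no names are kept here
  end

  def ivar_set(name, value)
    @ivars[@klass.index_for(name)] = value
  end

  def ivar_get(name)
    index = @klass.ivar_index[name]
    index && @ivars[index]
  end
end

hello_class = IndexedClass.new
a = IndexedObject.new(hello_class)
b = IndexedObject.new(hello_class)
a.ivar_set(:@foo, 1)
a.ivar_set(:@bar, 2)
b.ivar_set(:@foo, 10)

p hello_class.ivar_index
p a.ivar_get(:@bar)
p b.ivar_get(:@foo)
```

Both instances reuse the single table on `hello_class`, which is the sense in which the hash table cost is amortized across many instances.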
  9. Instance Variables Storage Location

    References are stored inside the object (it's in the computer)

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    Hello.new

    Instance of Hello, In-Memory Layout:

    Byte Index   Value
    0            Flags (a 64 bit bitmap)
    8            Pointer to Class
    16           First IV: 1
    24           Second IV: 2
    32           Third IV: Qundef
  10. Storage Location Depends on Type

    Objects store instance variables "in line", others in an external table

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    class PleaseDoNotDoThis < Array
      def initialize
        @foo = 1
        @bar = 2
        super
      end

      def foo
        @foo + @bar
      end
    end
  11. Instruction Implementation

    "foo" method instructions

    getivar Implementation:

    def getivar name
      # get the class
      klass = get_self.class

      # get the index of the ivar
      index = klass.ivar_index[name]

      if get_self.is_a?(Object)
        # get the ivar value
        get_self.instance_variables[index]
      else
        # do something different
      end
    end

    Byte Code:

    [:getivar, :@foo]
    [:getivar, :@bar]
    [:plus]

    Still doing a hash lookup 😆
  12. Instruction Implementation

    "foo" method instructions, with inline caches

    getivar Implementation:

    def getivar name, cache
      # If there is no cached index
      unless cache.index
        # get the class
        klass = get_self.class

        # get the index of the ivar
        index = klass.ivar_index[name]

        # store the index
        cache.index = index
      end

      # get the cached index
      index = cache.index

      if get_self.is_a?(Object)
        # get the ivar value
        get_self.instance_variables[index]
      else
        # do something different
      end
    end

    Byte Code:

    [:getivar, :@foo, cache]
    [:getivar, :@bar, cache]
    [:plus]

    Annotations: "Find and cache the index" (the unless block), "Use the cached index" (the read that follows)
  13. Cache Lookup Problem

    Name to Index mapping is per class

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    class World < Hello
      def initialize
        @oops = "yikes!!"
        super
      end
    end

    Hello.new.foo
    World.new.foo

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1

    Instance of Hello, IV Array:

    Index   Value
    0       1
    1       2

    Cache: Index 0 and 1
  14. Cache Lookup Problem

    Name to Index mapping is per class

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    class World < Hello
      def initialize
        @oops = "yikes!!"
        super
      end
    end

    Hello.new.foo
    World.new.foo

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1

    Cache: Index 0 and 1

    World Class, IV Index Table:

    Name     Index
    :@oops   0
    :@foo    1
    :@bar    2

    Oops was set first!
  15. Compare Class in Cache

    Cache miss if no index or the class doesn't match

    def getivar name, cache
      # get the class
      klass = get_self.class

      # If there is no cached index or the class doesn't match
      if !(cache.index && cache.klass == klass)
        # get the index of the ivar
        index = klass.ivar_index[name]

        # store the index
        cache.index = index

        # store the class
        cache.klass = klass
      end

      # get the cached index
      index = cache.index

      if get_self.is_a?(Object)
        # get the ivar value
        get_self.instance_variables[index]
      else
        # do something different
      end
    end

    Annotations: "Class must match and IV index set" (the cache check), "Return value at index inside list" (the read)
  16. Subclasses Cause Cache Misses

    Since the class is a cache key, subclasses can't share cache with superclass

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    class World < Hello
    end

    hello = Hello.new
    world = World.new

    loop do
      hello.foo
      world.foo
    end

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1

    World Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1
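
The thrashing described above can be reproduced with a small model of a class-keyed inline cache. This is a sketch only: `ClassCache`, `IVAR_INDEX`, and `cached_index` are hypothetical names, and the miss counter is added purely to make the behavior visible.

```ruby
# Hypothetical inline cache keyed on class, following the pre-shapes pseudocode.
ClassCache = Struct.new(:klass, :index, :misses)

# Each class gets its own name-to-index table, even when the contents match.
IVAR_INDEX = Hash.new { |table, klass| table[klass] = { :@foo => 0, :@bar => 1 } }

def cached_index(receiver_class, name, cache)
  unless cache.index && cache.klass == receiver_class
    cache.misses += 1
    cache.klass = receiver_class
    cache.index = IVAR_INDEX[receiver_class][name]
  end
  cache.index
end

class Hello; end
class World < Hello; end

# One cache per call site: both receivers go through the same one.
cache = ClassCache.new(nil, nil, 0)
10.times do
  cached_index(Hello, :@foo, cache)
  cached_index(World, :@foo, cache)
end

puts cache.misses
```

Every lookup misses, because each call overwrites the class the previous call stored, even though both classes map `:@foo` to the same index.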
  17. Handling "Undefined" Instance Variables

    Undefined IVs return `nil`, but how do we know it's undefined?

    class Hello
      def initialize(set_bar)
        @foo = 1
        @bar = 2 if set_bar
        @baz = 3
      end

      def foo
        if !instance_variable_defined?(:@bar)
          puts "oh!"
        end
        @foo + @bar.to_i
      end
    end

    p Hello.new(true).foo  # => 3
    p Hello.new(false).foo # => 1

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1
    :@baz   2

    Hello Instance, In-Memory Layout:

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16     1
    24     2
    32     3
  18. Handling "Undefined" Instance Variables

    Undefined IVs return `nil`, but how do we know it's undefined?

    class Hello
      def initialize(set_bar)
        @foo = 1
        @bar = 2 if set_bar
        @baz = 3
      end

      def foo
        if !instance_variable_defined?(:@bar)
          puts "oh!"
        end
        @foo + @bar.to_i
      end
    end

    p Hello.new(true).foo  # => 3
    p Hello.new(false).foo # => 1

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1
    :@baz   2

    Hello Instance, In-Memory Layout:

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16     1
    24     Qundef (0x24)
    32     3

    Cache: Index 0 and 1
  19. Return `nil` for Undefined IVs

    If the value stored in the array is Qundef, return nil, otherwise return the value

    def getivar name, cache
      # get the class
      klass = get_self.class

      # If there is no cached index or the class doesn't match
      if !(cache.index && cache.klass == klass)
        # get the index of the ivar
        index = klass.ivar_index[name]

        # store the index
        cache.index = index

        # store the class
        cache.klass = klass
      end

      # get the cached index
      index = cache.index

      if get_self.is_a?(Object)
        # get the ivar value
        iv = get_self.instance_variables[index]

        if iv == Qundef
          nil
        else
          iv
        end
      else
        # do something different
      end
    end

    Annotation: "Return nil if Qundef"
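
The sentinel idea can be sketched in plain Ruby. Here `QUNDEF` is a stand-in object chosen for illustration; in the VM, Qundef is a special value defined in C, and `read_slot` and `slot_defined?` are hypothetical helper names.

```ruby
# Stand-in sentinel for "this slot was never written".
# The real Qundef is a special VALUE in C, not a Ruby object.
QUNDEF = Object.new.freeze

# Reading an unset slot yields nil, like reading an unset instance variable.
def read_slot(slots, index)
  value = slots[index]
  value.equal?(QUNDEF) ? nil : value
end

# The sentinel is what separates "never set" from "explicitly set to nil".
def slot_defined?(slots, index)
  !slots[index].equal?(QUNDEF)
end

slots = [1, QUNDEF, 3]     # @foo set, @bar never set, @baz set
nil_slots = [1, nil, 3]    # @bar explicitly set to nil

p read_slot(slots, 1)
p slot_defined?(slots, 1)
p read_slot(nil_slots, 1)
p slot_defined?(nil_slots, 1)
```

Both reads return `nil`, but only the second slot counts as defined, which is the distinction `instance_variable_defined?` has to preserve.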
  20. Conditionals for Reading an IV

    Just a Recap!

    • Is an index set?
    • Do the classes match?
    • Is it an "Object" type?
    • Is the IV value equal to Qundef?
  21. JIT Compilation

    JIT compiler translates byte code to machine code

    Source Code:

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    hi = Hello.new
    hi.foo

    Byte Code for "foo":

    [:getivar, :@foo, cache]
    [:getivar, :@bar, cache]
    [:plus]

    Machine Code:

    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================
      # getinstancevariable
      # guard not immediate
      0x55a658d0a6dd: test qword ptr [r13 + 0x18], 7
      0x55a658d0a6e5: jne 0x55a660d0a0e5
      0x55a658d0a6eb: cmp qword ptr [r13 + 0x18], 8
      0x55a658d0a6f0: jbe 0x55a660d0a0fe
      0x55a658d0a6f6: mov rax, qword ptr [r13 + 0x18]
      # guard known class
      0x55a658d0a6fa: movabs rcx, 0x7fbb2af48f20
      0x55a658d0a704: cmp qword ptr [rax + 8], rcx
      0x55a658d0a708: jne 0x55a660d0a117
      0x55a658d0a70e: mov rax, qword ptr [r13 + 0x18]
      0x55a658d0a712: cmp qword ptr [rax + 0x10], 0
      0x55a658d0a717: jbe 0x55a660d0a0cc
      # guard embedded getivar
      0x55a658d0a71d: test word ptr [rax], 0x2000
      0x55a658d0a722: je 0x55a660d0a130
      0x55a658d0a728: cmp qword ptr [rax + 0x18], 0x34
      0x55a658d0a72d: mov ecx, 8
      0x55a658d0a732: cmovne rcx, qword ptr [rax + 0x18]
      0x55a658d0a737: mov qword ptr [rbx], rcx
    == BLOCK 2/5, ISEQ RANGE [3,6), 0 bytes =======================
    == BLOCK 3/5, ISEQ RANGE [3,6), 69 bytes ======================
      # getinstancevariable
      # regenerate_branch
      # getinstancevariable
      # regenerate_branch
      0x55a658d0a73a: mov rax, qword ptr [r13 + 0x18]
      # guard known class
      0x55a658d0a73e: movabs rcx, 0x7fbb2af48f20
      0x55a658d0a748: cmp qword ptr [rax + 8], rcx
      0x55a658d0a74c: jne 0x55a660d0a183
      0x55a658d0a752: mov rax, qword ptr [r13 + 0x18]
      0x55a658d0a756: cmp qword ptr [rax + 0x10], 1
      0x55a658d0a75b: jbe 0x55a660d0a162
      # guard embedded getivar
      0x55a658d0a761: test word ptr [rax], 0x2000
      0x55a658d0a766: je 0x55a660d0a19c
      0x55a658d0a76c: cmp qword ptr [rax + 0x20], 0x34
      0x55a658d0a771: mov ecx, 8
      0x55a658d0a776: cmovne rcx, qword ptr [rax + 0x20]
      0x55a658d0a77b: mov qword ptr [rbx + 8], rcx
    == BLOCK 4/5, ISEQ RANGE [6,8), 0 bytes =======================
    == BLOCK 5/5, ISEQ RANGE [6,9), 86 bytes ======================
      # opt_plus
      # regenerate_branch
      # opt_plus
      # guard arg0 fixnum
      # regenerate_branch
      0x55a658d0a77f: test byte ptr [rbx], 1
      0x55a658d0a782: je 0x55a660d0a1ef
      # guard arg1 fixnum
      0x55a658d0a788: test byte ptr [rbx + 8], 1
      0x55a658d0a78c: je 0x55a660d0a208
      0x55a658d0a792: mov rax, qword ptr [rbx]
      0x55a658d0a795: sub rax, 1
      0x55a658d0a799: add rax, qword ptr [rbx + 8]
      0x55a658d0a79d: jo 0x55a660d0a1ce
      0x55a658d0a7a3: mov qword ptr [rbx], rax
      # leave
      # RUBY_VM_CHECK_INTS(ec)
      0x55a658d0a7a6: mov eax, dword ptr [r12 + 0x24]
      0x55a658d0a7ab: not eax
      0x55a658d0a7ad: test dword ptr [r12 + 0x20], eax
      0x55a658d0a7b2: jne 0x55a660d0a221
      # pop stack frame
      0x55a658d0a7b8: mov rax, r13
      0x55a658d0a7bb: add rax, 0x40
      0x55a658d0a7bf: mov r13, rax
  22. Machine Code for Reading an IV

    Instruction Implementation:

    def getivar name, cache
      # get the class
      klass = get_self.class

      # If there is no cached index or the class doesn't match
      if !(cache.index && cache.klass == klass)
        # get the index of the ivar
        index = klass.ivar_index[name]

        # store the index
        cache.index = index

        # store the class
        cache.klass = klass
      end

      # get the cached index
      index = cache.index

      if get_self.is_a?(Object)
        # get the ivar value
        iv = get_self.instance_variables[index]

        if iv == Qundef
          nil
        else
          iv
        end
      else
        # do something different
      end
    end

    Generated Machine Code:

    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================
      # getinstancevariable
      # guard not immediate
      0x55a658d0a6dd: test qword ptr [r13 + 0x18], 7
      0x55a658d0a6e5: jne 0x55a660d0a0e5
      0x55a658d0a6eb: cmp qword ptr [r13 + 0x18], 8
      0x55a658d0a6f0: jbe 0x55a660d0a0fe
      0x55a658d0a6f6: mov rax, qword ptr [r13 + 0x18]
      # guard known class
      0x55a658d0a6fa: movabs rcx, 0x7fbb2af48f20
      0x55a658d0a704: cmp qword ptr [rax + 8], rcx
      0x55a658d0a708: jne 0x55a660d0a117
      0x55a658d0a70e: mov rax, qword ptr [r13 + 0x18]
      0x55a658d0a712: cmp qword ptr [rax + 0x10], 0
      0x55a658d0a717: jbe 0x55a660d0a0cc
      # guard embedded getivar
      0x55a658d0a71d: test word ptr [rax], 0x2000
      0x55a658d0a722: je 0x55a660d0a130
      0x55a658d0a728: cmp qword ptr [rax + 0x18], 0x34
      0x55a658d0a72d: mov ecx, 8
      0x55a658d0a732: cmovne rcx, qword ptr [rax + 0x18]
      0x55a658d0a737: mov qword ptr [rbx], rcx
  23. Shape Transitions on Write

    Shapes form a tree representing Object properties

    Sample Code:

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    Shape Tree:

    Root (id: 0)
      @foo (id: 1, index: 0)     transition: from: 0, to: 1, iv index: 0
        @bar (id: 2, index: 1)   transition: from: 1, to: 2, iv index: 1

    In each transition, "from" is the Cache Key, "to" is the Destination Shape, and "iv index" is the IV Index.
  24. Shape Transitions on Write

    Shape ID is used as the cache key

    Sample Code:

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    Shape Tree:

    Root (id: 0)
      @foo (id: 1, index: 0)     transition: from: 0, to: 1, iv index: 0
        @bar (id: 2, index: 1)   transition: from: 1, to: 2, iv index: 1
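
A shape tree of this kind can be modeled in a few lines of Ruby. This is a sketch under stated assumptions: the `Shape` class, its fields, and the `id_source` lambda are hypothetical names invented here, and the real implementation is written in C with its own bookkeeping.

```ruby
# Hypothetical model of a shape tree node. Each node records the instance
# variable that was added to reach it and the slot index that variable uses.
class Shape
  attr_reader :id, :parent, :ivar_name, :index, :ivar_count

  def initialize(id, parent = nil, ivar_name = nil)
    @id = id
    @parent = parent
    @ivar_name = ivar_name
    @index = parent ? parent.ivar_count : nil        # the root adds no ivar
    @ivar_count = parent ? parent.ivar_count + 1 : 0
    @transitions = {}
  end

  # Follow the edge for a newly set instance variable, creating the
  # destination shape only the first time that edge is taken.
  def transition(ivar_name, id_source)
    @transitions[ivar_name] ||= Shape.new(id_source.call, self, ivar_name)
  end
end

next_id = 0
id_source = -> { next_id += 1 }

root = Shape.new(0)
foo_shape = root.transition(:@foo, id_source)      # from: 0, to: 1
bar_shape = foo_shape.transition(:@bar, id_source) # from: 1, to: 2
same_foo = root.transition(:@foo, id_source)       # reuses the existing edge

p [foo_shape.id, foo_shape.index]
p [bar_shape.id, bar_shape.index]
p same_foo.equal?(foo_shape)
```

Because the edge from the root for `@foo` already exists, the second object that sets `@foo` first lands on the same shape, so a cache keyed on that shape id stays valid for both.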
  25. Objects can share shapes

    Hello and World can share caches

    Sample Code:

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end
    end

    class World < Hello
      def initialize
        super
        @baz = 3
      end
    end

    Shape Tree:

    Root (id: 0)
      @foo (id: 1, index: 0)       transition: from: 0, to: 1, iv index: 0
        @bar (id: 2, index: 1)     transition: from: 1, to: 2, iv index: 1
          @baz (id: 3, index: 2)   transition: from: 2, to: 3, iv index: 2

    Shapes 1 and 2 are shared between Hello and World instances.
  26. Shape Tree is Shared

    All objects use the shape tree, so more types can share info

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    class World < Hello
    end

    hello = Hello.new
    world = World.new

    loop do
      hello.foo
      world.foo
    end

    Hello Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1

    World Class, IV Index Table:

    Name    Index
    :@foo   0
    :@bar   1

    Shared Shape Tree:

    Root (id: 0)
      @foo (id: 1, index: 0)
        @bar (id: 2, index: 1)

    Same shape on both instances. Cache: Shape 2 and 2
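
Keying the cache on shape id instead of class can be sketched as follows. `ShapeCache`, `FakeObject`, and `SHAPE_INDEXES` are hypothetical stand-ins made up for this example, and the miss counter exists only to make the result observable.

```ruby
# Hypothetical inline cache keyed on shape id rather than class.
ShapeCache = Struct.new(:shape_id, :index, :misses)

# Hypothetical object model: a class tag, a shape id, and a slot array.
FakeObject = Struct.new(:klass, :shape_id, :slots)

# Shape 2 stands for "@foo at index 0, then @bar at index 1", as on the slide.
SHAPE_INDEXES = { 2 => { :@foo => 0, :@bar => 1 } }

def getivar(obj, name, cache)
  unless cache.shape_id == obj.shape_id
    cache.misses += 1
    cache.shape_id = obj.shape_id
    cache.index = SHAPE_INDEXES.fetch(obj.shape_id)[name]
  end
  obj.slots[cache.index]
end

hello = FakeObject.new(:Hello, 2, [1, 2])
world = FakeObject.new(:World, 2, [1, 2])

cache = ShapeCache.new(nil, nil, 0)
10.times do
  getivar(hello, :@foo, cache)
  getivar(world, :@foo, cache)
end

puts cache.misses
```

Only the very first read misses. The two receivers have different classes but the same shape, so the class never enters the comparison.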
  27. Cross Type Cache Hits

    Microbenchmark:

    require 'harness'

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    class World < Hello
    end

    hello = Hello.new
    world = World.new

    run_benchmark(100) do
      i = 0
      while i < 90_000
        hello.foo
        world.foo
        i += 1
      end
    end

    Results:

    before: ruby 3.2.0dev (2022-09-28T14:51:38Z before-shapes a05b261464) [x86_64-linux]
    after: ruby 3.2.0dev (2022-11-22T05:20:45Z master 20b9d7b9fd) [x86_64-linux]

    bench                 before (ms)   stddev (%)   after (ms)   stddev (%)   before/after   after 1st itr
    getivar-polymorphic   12.1          1.4          4.4          2.1          2.76           2.82

    Legend:
    - before/after: ratio of before/after time. Higher is better for after. Above 1 represents a speedup.
    - after 1st itr: ratio of before/after time for the first benchmarking iteration.

    2.76x Faster
  28. Class Name is an IV

    Class names are stored as an instance variable on the class instance

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def foo
        @foo + @bar
      end
    end

    puts Hello.name # => IV read
  29. Freezing Changes Shape

    When we freeze an object, it changes shape

    Sample Code:

    class Hello
      def initialize
        @foo = 1
        @bar = 2
      end

      def set
        @baz = 3
      end
    end

    hello = Hello.new # Shape: 2
    hello.set         # Shape: 3

    hello = Hello.new # Shape: 2
    hello.freeze      # Shape: 4
    hello.set

    Shape Tree:

    Root (id: 0)
      @foo (id: 1, index: 0)       transition: from: 0, to: 1, iv index: 0
        @bar (id: 2, index: 1)     transition: from: 1, to: 2, iv index: 1
          @baz (id: 3, index: 2)   transition: from: 2, to: 3, iv index: 2
          frozen (id: 4)           transition: freeze
  30. Set Instance Variable Instruction

    Frozen check only on cache misses

    Before Shapes:

    def setinstancevariable iv_name, cache
      if get_self.frozen?
        raise "It's frozen!"
      end

      if cache.klass == get_self.class && cache.index
        # CACHE HIT!!
        # set the instance variable
      else
        cache.klass = get_self.class
        cache.index = get_self.iv_index_table[iv_name]
        # set the instance variable
      end
    end

    After Shapes:

    def setinstancevariable iv_name, cache
      if cache.from_shape_id == get_self.shape_id
        # CACHE HIT!!
        # set the instance variable
      else
        if get_self.frozen?
          raise "It's frozen!"
        end

        cache.from_shape_id = get_self.shape_id
        # set the instance variable
      end
    end
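
The "After Shapes" version works because a frozen object never has the shape id that a successful write cached. The sketch below models that. `ShapedObject`, `freeze_shape!`, `FROZEN_SHAPE_OFFSET`, and `SetCache` are all hypothetical, and the way a frozen shape id is derived here is made up purely so the ids differ.

```ruby
# Hypothetical model: freezing moves the object to a different shape id,
# so a cache filled by a successful write can never hit on a frozen object.
class ShapedObject
  attr_reader :shape_id, :slots

  FROZEN_SHAPE_OFFSET = 1000 # made-up way to give frozen shapes distinct ids

  def initialize(shape_id, slots)
    @shape_id = shape_id
    @slots = slots
    @frozen_flag = false
  end

  def freeze_shape!
    @frozen_flag = true
    @shape_id += FROZEN_SHAPE_OFFSET
  end

  def frozen_flag?
    @frozen_flag
  end
end

SetCache = Struct.new(:shape_id, :index, :frozen_checks)

def setivar(obj, index, value, cache)
  if cache.shape_id == obj.shape_id
    obj.slots[cache.index] = value # cache hit: no frozen check at all
  else
    cache.frozen_checks += 1       # only the miss path pays for the check
    raise FrozenError, "It's frozen!" if obj.frozen_flag?
    cache.shape_id = obj.shape_id
    cache.index = index
    obj.slots[index] = value
  end
end

obj = ShapedObject.new(2, [1, 2])
cache = SetCache.new(nil, nil, 0)
3.times { |i| setivar(obj, 1, i, cache) }
checks_before_freeze = cache.frozen_checks

obj.freeze_shape!
raised = begin
  setivar(obj, 1, 99, cache)
  false
rescue FrozenError
  true
end

p checks_before_freeze
p raised
p obj.slots
```

Three writes cost one frozen check, and the write after freezing falls onto the miss path where the check still catches it.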
  31. IV Write Performance Improvement

    Micro Benchmark:

    require 'harness'

    class TheClass
      def initialize
        @v0 = 1
        @v1 = 2
        @v3 = 3
        @levar = 1
      end

      def set_value_loop
        # 1M
        i = 0
        while i < 1000000
          # 10 times to de-emphasize loop overhead
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          @levar = i
          i += 1
        end
      end
    end

    obj = TheClass.new

    run_benchmark(100) do
      obj.set_value_loop
    end

    Results:

    before: ruby 3.2.0dev (2022-09-28T14:51:38Z before-shapes a05b261464) [x86_64-linux]
    after: ruby 3.2.0dev (2022-11-22T05:20:45Z master 20b9d7b9fd) [x86_64-linux]

    bench     before (ms)   stddev (%)   after (ms)   stddev (%)   before/after   after 1st itr
    setivar   64.0          0.7          53.0         2.5          1.21           1.19

    Legend:
    - before/after: ratio of before/after time. Higher is better for after. Above 1 represents a speedup.
    - after 1st itr: ratio of before/after time for the first benchmarking iteration.

    21% Faster
  32. Object Layout

    All objects have 2 common fields: "flags" and "class"

    Basic Object Layout:

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16
    24
    32

    T_OBJECT:

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16     Instance Variable
    24     Instance Variable
    32     Instance Variable

    T_ARRAY:

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16     Array Element
    24     Array Element
    32     Array Element
  33. Flags Bitmap Layout

    Bottom 5 bits represent Object Type

    [Figure: Flags Bitmap, bits 31 down to 0, with the lowest bits labeled "Object Type"]

    enum ruby_value_type {
        RUBY_T_OBJECT   = 0x01, /**< @see struct ::RObject */
        RUBY_T_CLASS    = 0x02, /**< @see struct ::RClass and ::rb_cClass */
        RUBY_T_MODULE   = 0x03, /**< @see struct ::RClass and ::rb_cModule */
        RUBY_T_FLOAT    = 0x04, /**< @see struct ::RFloat */
        RUBY_T_STRING   = 0x05, /**< @see struct ::RString */
        RUBY_T_REGEXP   = 0x06, /**< @see struct ::RRegexp */
        RUBY_T_ARRAY    = 0x07, /**< @see struct ::RArray */
        RUBY_T_HASH     = 0x08, /**< @see struct ::RHash */
        RUBY_T_STRUCT   = 0x09, /**< @see struct ::RStruct */
        RUBY_T_BIGNUM   = 0x0a, /**< @see struct ::RBignum */
        RUBY_T_FILE     = 0x0b, /**< @see struct ::RFile */
        RUBY_T_DATA     = 0x0c, /**< @see struct ::RTypedData */
        RUBY_T_MATCH    = 0x0d, /**< @see struct ::RMatch */
        RUBY_T_COMPLEX  = 0x0e, /**< @see struct ::RComplex */
        RUBY_T_RATIONAL = 0x0f, /**< @see struct ::RRational */
    }
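
Reading the object type out of the flags word is a single mask. The sketch below assumes the type occupies the bottom 5 bits, as the slide says; the sample flag words are made up, and `object_type` is a hypothetical helper rather than anything exposed to Ruby code.

```ruby
# The bottom 5 bits of the flags word hold the object type.
RUBY_T_MASK   = 0x1f
RUBY_T_OBJECT = 0x01
RUBY_T_ARRAY  = 0x07

def object_type(flags)
  flags & RUBY_T_MASK
end

# Made-up flag words: an upper bit is set in each to show the mask ignores it.
object_flags = (1 << 13) | RUBY_T_OBJECT
array_flags  = (1 << 13) | RUBY_T_ARRAY

p object_type(object_flags) == RUBY_T_OBJECT
p object_type(array_flags) == RUBY_T_ARRAY
```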
  34. Flags Bitmap Layout

    Bottom 12 bits have a common "meaning" (see fl_type.h)

    [Figure: Flags Bitmap, bits 31 down to 0. Labeled regions: Object Type; Object ID has been seen?]

    Object.new.object_id
    [].object_id
  35. Flags Bitmap Layout

    Bottom 12 bits have a common "meaning" (see fl_type.h)

    [Figure: Flags Bitmap, bits 31 down to 0. Labeled regions: Object Type; Object ID has been seen?]

    Object.new.object_id
    [].object_id
  36. Flags Bitmap Layout

    Object Type gives upper bits meaning

    [Figure: Flags Bitmap, bits 31 down to 0. Labeled regions: Object Type; Common Stuff (see fl_type.h); Object Extended?; Depends on Object Type]
  37. T_OBJECT Extended Layout

    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
        @hoge = 4
      end
    end

    Hello.new

    T_OBJECT Layout (extended):

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16     Pointer to Buffer
    24
    32

    IV Array:

    Byte   Value
    0      Instance Variable
    8      Instance Variable
    16     Instance Variable
    24     Instance Variable
    32     Instance Variable
    ...    ...

    T_OBJECT Layout:

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16     Instance Variable
    24     Instance Variable
    32     Instance Variable
  38. Flags Bitmap Layout

    Extended Bit means "read from external table"

    [Figure: Flags Bitmap, bits 31 down to 0. Labeled regions: Object Type; Common Stuff (see fl_type.h); Object Extended?; Depends on Object Type]
  39. JIT Compilation

    JIT compilation must write guards for assumptions

    class Hello
      def initialize
        @foo = 1
        @bar = 2
        @baz = 3
        @hoge = 4
      end

      def foo
        @foo + @bar
      end
    end

    • What is the type?
    • Is it embedded or extended?
    • Is the IV Qundef?
    • Is the Class correct?
  40. Runtime Check Locations

    We need to test object type, extended bit, IV value

    [Figure: Flags Bitmap, bits 31 down to 0. Labeled regions: Object Type; Common Stuff (see fl_type.h); Object Extended?; Depends on Object Type]

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16     Instance Variable
    24     Instance Variable
    32     Instance Variable

    Checks marked on the layout: Object Type, Qundef?, Right class?
  41. Machine Code for reading one IV

    Generated Machine Code:

    == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================
      # getinstancevariable
      # guard not immediate
      0x55a658d0a6dd: test qword ptr [r13 + 0x18], 7
      0x55a658d0a6e5: jne 0x55a660d0a0e5
      0x55a658d0a6eb: cmp qword ptr [r13 + 0x18], 8
      0x55a658d0a6f0: jbe 0x55a660d0a0fe
      0x55a658d0a6f6: mov rax, qword ptr [r13 + 0x18]
      # guard known class
      0x55a658d0a6fa: movabs rcx, 0x7fbb2af48f20
      0x55a658d0a704: cmp qword ptr [rax + 8], rcx
      0x55a658d0a708: jne 0x55a660d0a117
      0x55a658d0a70e: mov rax, qword ptr [r13 + 0x18]
      0x55a658d0a712: cmp qword ptr [rax + 0x10], 0
      0x55a658d0a717: jbe 0x55a660d0a0cc
      # guard embedded getivar
      0x55a658d0a71d: test word ptr [rax], 0x2000
      0x55a658d0a722: je 0x55a660d0a130
      0x55a658d0a728: cmp qword ptr [rax + 0x18], 0x34
      0x55a658d0a72d: mov ecx, 8
      0x55a658d0a732: cmovne rcx, qword ptr [rax + 0x18]
      0x55a658d0a737: mov qword ptr [rbx], rcx
  42. Shape ID Storage

    Shape id is stored in the upper 32 bits

    [Figure: Flags Bitmap, bits 31 down to 0, plus the upper 32 bits. Labeled regions: Object Type; Common Stuff (see fl_type.h); Depends on Object Type; Shape ID]
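
Packing the shape id into the upper half of the flags word comes down to a shift and a mask. This sketch assumes a 64-bit flags word with the shape id in the upper 32 bits, as the slide states; `shape_id`, `with_shape_id`, and the sample flag value are hypothetical.

```ruby
# Assumption from the slide: a 64-bit flags word whose upper 32 bits
# hold the shape id and whose lower 32 bits hold the ordinary flags.
SHAPE_ID_SHIFT = 32
LOWER_32_MASK  = 0xffff_ffff

def shape_id(flags)
  flags >> SHAPE_ID_SHIFT
end

# Replace the shape id while leaving the lower 32 flag bits untouched.
def with_shape_id(flags, id)
  (flags & LOWER_32_MASK) | (id << SHAPE_ID_SHIFT)
end

flags = with_shape_id(0x2001, 2) # 0x2001 is a made-up lower flag word
moved = with_shape_id(flags, 3)  # e.g. after one more ivar is set

p shape_id(flags)
p shape_id(moved)
p (moved & LOWER_32_MASK) == 0x2001
```

Since the flags word is the first field of every object, a guard on the shape id reads memory the generated code already has to touch.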
  43. Class Check Isn't Necessary

    Shapes are independent of class

    [Figure: Flags Bitmap, bits 31 down to 0, plus the upper 32 bits. Labeled regions: Object Type; Common Stuff (see fl_type.h); Object Extended?; Depends on Object Type; Shape ID]

    Byte   Value
    0      Flags (a 64 bit bitmap)
    8      Pointer to Class
    16     Instance Variable
    24     Instance Variable
    32     Instance Variable

    Checks marked on the layout: Object Type, Qundef?, Right class?
  44. Handling "Undefined" Instance Variables

    Shapes care about IV set order

    class Hello
      def initialize(set_bar)
        @foo = 1
        @bar = 2 if set_bar
        @baz = 3
      end

      def foo
        if !instance_variable_defined?(:@bar)
          puts "oh!"
        end
        @foo + @bar.to_i
      end
    end

    p Hello.new(true).foo  # => 3
    p Hello.new(false).foo # => 1

    Shape Tree:

    Root (id: 0)
      @foo (id: 1)
        @bar (id: 2)
          @baz (id: 3)   Shape 3
        @baz (id: 4)     Shape 4
  45. Handling "Undefined" Instance Variables

    Shape 3 has a "bar" instance variable

    (Same code and shape tree as slide 44, with Shape 3 highlighted:
    Root id: 0 -> @foo id: 1 -> @bar id: 2 -> @baz id: 3)
  46. Handling "Undefined" Instance Variables

    Shape 4 doesn't have a "bar" instance variable

    (Same code and shape tree as slide 44, with Shape 4 highlighted:
    Root id: 0 -> @foo id: 1 -> @baz id: 4)
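The shape tree on these slides can be sketched in plain Ruby. This is a toy model, not CRuby's implementation: the `Shape` class, its `transition` and `defines?` methods, and the id counter are all assumptions made for illustration. It shows the two properties the slides rely on: the same IV names set in the same order reach the same shape, and whether an IV is defined can be answered from the shape alone.

```ruby
# Toy shape tree: each node records the ivar that was added to reach it.
class Shape
  attr_reader :id, :parent, :name

  @next_id = 0

  # Hand out sequential ids, starting at 0 for the root.
  def self.take_id
    id = @next_id
    @next_id += 1
    id
  end

  def initialize(parent = nil, name = nil)
    @id = Shape.take_id
    @parent = parent
    @name = name
    @children = {}
  end

  # Follow the edge for `name`, creating the child shape on first use.
  def transition(name)
    @children[name] ||= Shape.new(self, name)
  end

  # An ivar is defined if some shape on the path back to the root added it.
  def defines?(name)
    shape = self
    while shape
      return true if shape.name == name
      shape = shape.parent
    end
    false
  end
end

ROOT = Shape.new # id 0

# Hello.new(true): @foo, @bar, @baz
with_bar = ROOT.transition(:@foo).transition(:@bar).transition(:@baz)
# Hello.new(false): @foo, @baz
without_bar = ROOT.transition(:@foo).transition(:@baz)

puts with_bar.id                  # prints 3
puts without_bar.id               # prints 4
puts with_bar.defines?(:@bar)     # prints true
puts without_bar.defines?(:@bar)  # prints false
```

In this model, code specialized for shape 3 may assume `@bar` exists, and code specialized for shape 4 may assume it does not, without inspecting the object any further.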
  47. Class Check Isn't Necessary

    Shapes are independent of class

    Flags Bitmap (bits 31 to 0)
      Fields: Object Type, Common Stuff (see fl_type.h), Object Extended?, Depends on Object Type
      Upper 32 bits: Shape ID

    Object layout:
      Byte  Value
      0     Flags (a 64 bit bitmap)
      8     Pointer to Class
      16    Instance Variable
      24    Instance Variable
      32    Instance Variable

    Callouts: Object Type, Qundef?
  48. Multiple Possible Layouts

    Objects can vary in width, so there are 2 possible layouts

      class Hello
        def initialize
          @foo = 1
          @bar = 2
          @baz = 3
        end

        def foo
          @foo + @bar
        end
      end

      Hello.new

    Embedded Layout:
      Byte  Value
      0     Flags (a 64 bit bitmap)
      8     Pointer to Class
      16    1
      24    2
      32    3

    Extended Layout:
      Byte  Value
      0     Flags (a 64 bit bitmap)
      8     Pointer to Class
      16    Pointer to Buffer
      24
      32

    IV Array:
      Byte  Value
      0     1
      8     2
      16    3
      24    ...
      32    ...
      ...   ...
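The two layouts can be modeled with a small Ruby class. This is only a sketch under stated assumptions: `ModelObject`, its `embedded_capacity` parameter, and the rule "switch to a buffer when the inline slots are full" are invented here to illustrate the slide and do not mirror CRuby's allocator.

```ruby
# Toy object whose ivar values live either inline (embedded) or in a
# separate buffer reached through a pointer (extended).
class ModelObject
  attr_reader :layout

  def initialize(embedded_capacity)
    @capacity = embedded_capacity
    @slots = []           # stands in for the words after the class pointer
    @layout = :embedded
    @count = 0
  end

  # Append a value, moving to the extended layout when inline room runs out.
  def append(value)
    if @layout == :embedded && @count == @capacity
      @slots = [@slots.dup] # slot 0 now holds a "pointer" to the buffer
      @layout = :extended
    end
    buffer << value
    @count += 1
  end

  # Reading needs to know the layout: direct slot, or one extra hop.
  def read(index)
    buffer[index]
  end

  private

  def buffer
    @layout == :embedded ? @slots : @slots.first
  end
end

wide = ModelObject.new(3)
narrow = ModelObject.new(2)
[1, 2, 3].each do |value|
  wide.append(value)
  narrow.append(value)
end

puts wide.layout     # prints embedded
puts narrow.layout   # prints extended
puts narrow.read(2)  # prints 3
```

The `read` method has to branch on the layout every time, which is the check that shape-specific code can avoid once the layout is implied by the shape.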
  49. Multiple Possible Layouts

    "Extending" adds a shape transition

      class Hello
        def initialize
          @foo = 1
          @bar = 2
          @baz = 3
        end

        def foo
          @foo + @bar
        end
      end

      Hello.new

    Extended Layout:
      Byte  Value
      0     Flags
      8     Class
      16    IV Ptr
      24

    IV buffer:
      Byte  Val
      0     1
      8     2
      16    3
      24    ...
      32    ...
      ...   ...

    Shape tree:
      Root (id: 0)
        @foo (id: 1)
          @bar (id: 2)
            EXTEND (id: 3)
              @baz (id: 4)   <- Shape 4
  50. Multiple Possible Layouts

    "Extending" adds a shape transition

    (Same code as slide 49.)

    Shape tree:
      Root (id: 0)
        @foo (id: 1)
          @bar (id: 2)
            EXTEND (id: 3)
              @baz (id: 4)
            @baz (id: 5)     <- Shape 5

    Embedded Layout:
      Byte  Value
      0     Flags
      8     Class
      16    1
      24    2
      32    3
  51. Different Layouts Have Different Shapes

    JIT Compiler can differentiate based on shape id

    (Same code as slide 49.)

    Shape tree:
      Root (id: 0)
        @foo (id: 1)
          @bar (id: 2)
            EXTEND (id: 3)
              @baz (id: 4)
            @baz (id: 5)

    Embedded Layout:
      Byte  Value
      0     Flags
      8     Class
      16    1
      24    2
      32    3

    Extended Layout:
      Byte  Value
      0     Flags
      8     Class
      16    PTR
      24

    IV buffer:
      Byte  Val
      0     1
      8     2
      16    3
      24    ...
      32    ...
      ...   ...
  52. Extended Check Isn't Necessary

    Shapes differ depending on embedded vs extended

    Flags Bitmap (bits 31 to 0)
      Fields: Object Type, Common Stuff (see fl_type.h), Object Extended?, Depends on Object Type
      Upper 32 bits: Shape ID

    Callouts: Object Type
  53. Different Types, Same Shape

    Different types can have the same shape, but IV storage is different

    Sample Code:

      class Hello
        def initialize
          @foo = 1
          @bar = 2
          @baz = 3
        end
      end

      Hello.new

      ary = []
      ary.instance_variable_set(:@foo, 4)
      ary.instance_variable_set(:@bar, 5)
      ary.instance_variable_set(:@baz, 6)
      ary

    Shape Tree:
      Root (id: 0)
        @foo (id: 1)
          @bar (id: 2)
            @baz (id: 3)   <- Shape 3 (Hello instance) and Shape 3 (ary)
  54. Assign Shape at Allocation Time

    When a T_OBJECT is allocated, immediately set a new shape

    (Same sample code as slide 53.)

    Shape Tree:
      Root (id: 0)
        T_OBJECT (id: 1)
          @foo (id: 2)
            @bar (id: 3)
              @baz (id: 4)   <- Shape 4 (Hello instance)
        @foo (id: 5)
          @bar (id: 6)
            @baz (id: 7)     <- Shape 7 (ary)
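The effect of the allocation-time transition can be sketched with a tiny edge table. Everything here is hypothetical (`ShapeTree`, `shape_for`, and the `:T_OBJECT` / `:T_ARRAY` symbols are stand-ins for this example): a plain object first takes a `T_OBJECT` edge from the root, while any other type starts at the root itself, so identical IV names end up at different shape ids.

```ruby
# Toy shape table keyed by [parent id, edge label].
class ShapeTree
  ROOT_ID = 0

  def initialize
    @edges = {}
    @next_id = ROOT_ID + 1
  end

  # Return the child reached from parent_id via edge, creating it if needed.
  def transition(parent_id, edge)
    @edges[[parent_id, edge]] ||= begin
      id = @next_id
      @next_id += 1
      id
    end
  end

  # T_OBJECT instances take an extra edge at allocation; other types do not.
  def shape_for(type, ivar_names)
    start = type == :T_OBJECT ? transition(ROOT_ID, :T_OBJECT) : ROOT_ID
    ivar_names.reduce(start) { |shape_id, name| transition(shape_id, name) }
  end
end

tree = ShapeTree.new
names = %i[@foo @bar @baz]

puts tree.shape_for(:T_OBJECT, names) # prints 4
puts tree.shape_for(:T_ARRAY, names)  # prints 7
```

With the ids separated like this, a guard on the shape id also tells the reader which storage scheme the object uses, so no separate type test is needed.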
  55. Object Type Check Isn't Necessary

    Shapes differ depending on object type

    Flags Bitmap (bits 31 to 0)
      Fields: Object Type, Common Stuff (see fl_type.h), Depends on Object Type
      Upper 32 bits: Shape ID

    Callouts: Object Type
  56. JIT Code Comparison

    Machine code for reading 1 instance variable

    Before Object Shapes:

      == BLOCK 1/5, ISEQ RANGE [0,3), 93 bytes ======================
      # getinstancevariable
      # guard not immediate
      0x55ce5998b6dd: test qword ptr [r13 + 0x18], 7
      0x55ce5998b6e5: jne 0x55ce6198b0e5
      0x55ce5998b6eb: cmp qword ptr [r13 + 0x18], 8
      0x55ce5998b6f0: jbe 0x55ce6198b0fe
      0x55ce5998b6f6: mov rax, qword ptr [r13 + 0x18]
      # guard known class
      0x55ce5998b6fa: movabs rcx, 0x7f09fd1e8f30
      0x55ce5998b704: cmp qword ptr [rax + 8], rcx
      0x55ce5998b708: jne 0x55ce6198b117
      0x55ce5998b70e: mov rax, qword ptr [r13 + 0x18]
      0x55ce5998b712: cmp qword ptr [rax + 0x10], 0
      0x55ce5998b717: jbe 0x55ce6198b0cc
      # guard embedded getivar
      0x55ce5998b71d: test word ptr [rax], 0x2000
      0x55ce5998b722: je 0x55ce6198b130
      0x55ce5998b728: cmp qword ptr [rax + 0x18], 0x34
      0x55ce5998b72d: mov ecx, 8
      0x55ce5998b732: cmovne rcx, qword ptr [rax + 0x18]
      0x55ce5998b737: mov qword ptr [rbx], rcx

    After Object Shapes:

      == BLOCK 1/5, ISEQ RANGE [0,3), 40 bytes ======================
      # getinstancevariable
      0x5594850ba13a: mov rax, qword ptr [r13 + 0x18]
      # guard object is heap
      0x5594850ba13e: test al, 7
      0x5594850ba141: jne 0x5594850bc090
      0x5594850ba147: cmp rax, 0
      0x5594850ba14b: je 0x5594850bc090
      # guard shape
      0x5594850ba151: cmp dword ptr [rax + 4], 0x19
      0x5594850ba155: jne 0x5594850bc0a9
      0x5594850ba15b: mov rax, qword ptr [rax + 0x10]
      0x5594850ba15f: mov qword ptr [rbx], rax
  57. JIT Code Comparison

    Machine code for reading 1 instance variable

    Before Object Shapes: (same 93-byte listing as slide 56)

    After Object Shapes:

      == BLOCK 1/5, ISEQ RANGE [0,3), 40 bytes ======================
      # getinstancevariable
      0x5594850ba13a: mov rax, qword ptr [r13 + 0x18]
      # guard object is heap
      0x5594850ba13e: test al, 7
      0x5594850ba141: jne 0x5594850bc090
      0x5594850ba147: cmp rax, 0
      0x5594850ba14b: je 0x5594850bc090
      # guard shape
      0x5594850ba151: cmp dword ptr [rax + 4], 0x19
      0x5594850ba155: jne 0x5594850bc0a9

    Callout (guard shape): Make sure it's shape 0x19
  58. JIT Code Comparison

    Machine code for reading 1 instance variable

    Before Object Shapes: (same 93-byte listing as slide 56)

    After Object Shapes:

      == BLOCK 1/5, ISEQ RANGE [0,3), 40 bytes ======================
      # getinstancevariable
      0x5594850ba13a: mov rax, qword ptr [r13 + 0x18]
      # guard object is heap
      0x5594850ba13e: test al, 7
      0x5594850ba141: jne 0x5594850bc090
      0x5594850ba147: cmp rax, 0
      0x5594850ba14b: je 0x5594850bc090
      # guard shape
      0x5594850ba151: cmp dword ptr [rax + 4], 0x19
      0x5594850ba155: jne 0x5594850bc0a9
      0x5594850ba15b: mov rax, qword ptr [rax + 0x10]
      0x5594850ba15f: mov qword ptr [rbx], rax

    Callout (last two instructions): Read the IV, and push on the stack
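The "guard shape, then read" pattern in the shorter listing can be imitated in Ruby as a read site that remembers one shape id and one index. This is a loose model, not how YJIT is written: `FakeObject`, `SHAPE_INDEXES`, and `CachedIvarRead` are hypothetical names, the second shape `0x1a` is made up, and real generated code side-exits rather than updating a cache in place.

```ruby
# Hypothetical stand-in for a heap object: a shape id plus a flat IV array.
FakeObject = Struct.new(:shape_id, :ivars)

# Hypothetical shape table: shape id => { ivar name => index in the IV array }.
SHAPE_INDEXES = {
  0x19 => { :@levar => 0 },
  0x1a => { :@other => 0, :@levar => 1 }
}.freeze

# A read site specialized for the last shape it saw.
class CachedIvarRead
  attr_reader :misses

  def initialize(name)
    @name = name
    @cached_shape_id = nil
    @cached_index = nil
    @misses = 0
  end

  def read(obj)
    # Fast path ("guard shape"): one integer compare, then an indexed load.
    return obj.ivars[@cached_index] if obj.shape_id == @cached_shape_id

    # Slow path: look the index up by name and remember it for this shape.
    @misses += 1
    index = SHAPE_INDEXES.fetch(obj.shape_id)[@name]
    return nil if index.nil?

    @cached_shape_id = obj.shape_id
    @cached_index = index
    obj.ivars[index]
  end
end

site = CachedIvarRead.new(:@levar)
puts site.read(FakeObject.new(0x19, [1]))  # prints 1 (slow path, fills the cache)
puts site.read(FakeObject.new(0x19, [42])) # prints 42 (fast path)
puts site.misses                           # prints 1
```

The fast path never consults the class, the layout flag, or the object type, because in this model all of that is implied by the shape id matching.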
  59. Benchmark Comparison

    Measure the cost of fetching an instance variable

      class TheClass
        def initialize
          @v0 = 1
          @v1 = 2
          @v3 = 3
          @levar = 1
        end

        def get_value_loop
          sum = 0

          # 1M
          i = 0
          while i < 1000000
            # 10 times to de-emphasize loop overhead
            sum += (@levar + @levar + @levar + @levar + @levar + @levar + @levar + @levar + @levar + @levar)
            i += 1
          end

          return sum
        end
      end

      obj = TheClass.new

      run_benchmark(100) do
        obj.get_value_loop
      end
  60. Benchmark Results

    before: ruby 3.2.0dev (2022-09-28T14:51:38Z before-shapes a05b261464) +YJIT [x86_64-linux]
    after: ruby 3.2.0dev (2022-11-22T05:20:45Z master 20b9d7b9fd) +YJIT [x86_64-linux]

    -------  -----------  ----------  ----------  ----------  ------------  -------------
    bench    before (ms)  stddev (%)  after (ms)  stddev (%)  before/after  after 1st itr
    getivar  17.4         0.5         12.0        0.3         1.45          0.97
    -------  -----------  ----------  ----------  ----------  ------------  -------------

    Legend:
    - before/after: ratio of before/after time. Higher is better for after. Above 1 represents a speedup.
    - after 1st itr: ratio of before/after time for the first benchmarking iteration.

    45% Speed up!
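As a quick check on how the headline number relates to the table, the sketch below recomputes the before/after ratio from the two reported times. The helper names are made up for this example. It also prints the equivalent reduction in run time, which is a different way of stating the same measurement.

```ruby
# Ratio of old time to new time; above 1 means the new build is faster.
def speedup_ratio(before_ms, after_ms)
  before_ms / after_ms
end

# Share of the original run time that was eliminated, in percent.
def time_saved_percent(before_ms, after_ms)
  (1 - after_ms / before_ms) * 100
end

before_ms = 17.4 # from the results table
after_ms = 12.0  # from the results table

puts format("before/after: %.2f", speedup_ratio(before_ms, after_ms))    # prints before/after: 1.45
puts format("time saved: %.0f%%", time_saved_percent(before_ms, after_ms)) # prints time saved: 31%
```

So "45% speed up" here means the old build takes 1.45 times as long per iteration, which corresponds to each iteration finishing in roughly 31% less time.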