Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Compacting GC in MRI v2

Compacting GC in MRI v2

About compacting GC in MRI (v2). For Ruby Kaigi 2019

Aaron Patterson

April 18, 2019
Tweet

More Decks by Aaron Patterson

Other Decks in Technology

Transcript

  1. CoW Friendliness Computer Memory Allocated Memory Allocated Memory Allocated Memory

    Free Memory Free Memory Allocated Memory Allocated Memory Allocated Memory Free Memory Free Memory Parent Process Child Process
  2. CoW Friendliness Computer Memory Allocated Memory Allocated Memory Allocated Memory

    Free Memory Free Memory Allocated Memory Allocated Memory Allocated Memory Free Memory Free Memory Parent Process Child Process
  3. CoW Friendliness Computer Memory Allocated Memory Allocated Memory Allocated Memory

    Free Memory Free Memory Allocated Memory Allocated Memory Allocated Memory Free Memory Free Memory Parent Process Child Process
  4. CoW Friendliness Computer Memory Allocated Memory Allocated Memory Allocated Memory

    Free Memory Free Memory Allocated Memory Allocated Memory Allocated Memory Free Memory Free Memory Parent Process Child Process
  5. Ruby Heaps System Memory Malloc Heap Ruby’s Object Heap String.new

    "The Quick Brown Fox Jumps Over The Lazy Dog"
  6. Ruby’s Heap Layout 40 bytes Each chunk is a "slot"

    Em pty Filled Empty Filled Moved
  7. Ruby’s Heap Layout Each slot has a unique address 1

    2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
  8. Moving Objects A B C D E F Free Scan

    1 2 3 4 5 6 7 8 9 10 Empty Filled Moved 4 5
  9. Updating References A B C D E F 1 2

    3 4 5 6 7 8 9 10 Empty Filled Moved Before Moving Objects
  10. Updating References A B C D 5 4 1 2

    3 4 5 6 7 8 9 10 Empty Filled F Moved E After Moving Objects
  11. Updating References A B C D 5 4 1 2

    3 4 5 6 7 8 9 10 Empty Filled F Moved E After Moving Objects
  12. Object Movement def compact heap = [ ... ] #

    many slots left = 0 right = heap.length - 1 while left < right left_slot = heap[left] right_slot = heap[right] if is_empty?(left_slot) && !is_empty?(right_slot) && can_move?(right_slot) swap(left, right) heap[right] = T_MOVED.new(left) # leave forwarding address end while !is_empty?(heap[left]) left += 1 end while is_empty?(heap[right]) || !can_move?(heap[right]) right -= 1 end end end Pointers Met? Copy / Forward Advance "free" Retreat "scan"
  13. Reference Updating def update_references heap.each do |slot| next if is_empty?(slot)

    || is_moved?(slot) slot.references.each_with_index do |child, i| if is_moved?(child) slot.set_reference(i, child.new_location) end end end end How are references stored?
  14. Finding References • How do Hashes hold references? • How

    do Arrays hold references? • How do Objects hold references? • … • … • …
  15. Reference Updating static void gc_ref_update_array(rb_objspace_t * objspace, VALUE v) {

    long i, len; if (FL_TEST(v, ELTS_SHARED)) return; len = RARRAY_LEN(v); if (len > 0) { VALUE *ptr = (VALUE *)RARRAY_CONST_PTR_TRANSIENT(v); for(i = 0; i < len; i++) { UPDATE_IF_MOVED(objspace, ptr[i]); } } } static void gc_ref_update_object(rb_objspace_t * objspace, VALUE v) { uint32_t i, len = ROBJECT_NUMIV(v); VALUE *ptr = ROBJECT_IVPTR(v); for (i = 0; i < len; i++) { UPDATE_IF_MOVED(objspace, ptr[i]); } } static int hash_replace_ref(st_data_t *key, st_data_t *value, st_data_t argp, int existing)
  16. Yajl typedef struct { VALUE builderStack; VALUE parse_complete_callback; int nestedArrayLevel;

    int nestedHashLevel; int objectsFound; int symbolizeKeys; yajl_handle parser; } yajl_parser_wrapper; C Code (yajl_ext.h) malloc(yajl_parser_wrapper) Ruby Object T_DATA Ruby Object Ruby Object builderStack parse_complete_callback
  17. Yajl typedef struct { VALUE builderStack; VALUE parse_complete_callback; int nestedArrayLevel;

    int nestedHashLevel; int objectsFound; int symbolizeKeys; yajl_handle parser; } yajl_parser_wrapper; C Code (yajl_ext.h) malloc(yajl_parser_wrapper) Ruby Object T_DATA Ruby Object Ruby Object builderStack parse_complete_callback GC: "idk, " MOVED!
  18. Yajl Mark Function void yajl_parser_wrapper_mark(void * wrapper) { yajl_parser_wrapper *

    w = wrapper; if (w) { rb_gc_mark(w->builderStack); rb_gc_mark(w->parse_complete_callback); } } malloc(yajl_parser_wrapper) Ruby Object T_DATA Ruby Object Ruby Object rb_gc_mark(builderStack) rb_gc_mark(parse_complete_callback)
  19. Pinning Bits 1 2 3 4 5 6 7 8

    9 10 Yajl [ ] ? "foo" "bar" ? Address Content Pinned x = [ "foo", "bar" ] y = Yajl.new Ruby Code rb_gc_m ark rb_gc_m ark gc_m ark_no_pin gc_m ark_no_pin
  20. Pinning Bits 1 2 3 4 5 6 7 8

    9 10 Yajl [ ] ? ? Address Content Pinned x = [ "foo", "bar" ] y = Yajl.new Ruby Code Free Scan "bar" "foo" 4 5 Move Step
  21. Pinning Bits 1 2 3 4 5 6 7 8

    9 10 Yajl [ ] ? ? Address Content Pinned x = [ "foo", "bar" ] y = Yajl.new Ruby Code 4 5 Reference Update Step "bar" "foo" Update
  22. Compaction Callback static const rb_data_type_t yajl_parser_type = { "Yajl/parser", {yajl_parser_wrapper_mark,

    yajl_parser_wrapper_free, NULL,}, 0, 0, RUBY_TYPED_FREE_IMMEDIATELY, }; Mark No Compaction Callback static const rb_data_type_t yajl_parser_type = { "Yajl/parser", {yajl_parser_wrapper_mark, yajl_parser_wrapper_free, NULL, yajl_parser_compact}, 0, 0, RUBY_TYPED_FREE_IMMEDIATELY, }; Compact With Compaction Callback Sweep
  23. "No Pin" Marking void yajl_parser_wrapper_mark(void * wrapper) { yajl_parser_wrapper *

    w = wrapper; if (w) { rb_gc_mark(w->builderStack); rb_gc_mark(w->parse_complete_callback); } } No Compaction Support void yajl_parser_wrapper_mark(void * wrapper) { yajl_parser_wrapper * w = wrapper; if (w) { rb_gc_mark_no_pin(w->builderStack); rb_gc_mark_no_pin(w->parse_complete_callback); } } With Compaction Support
  24. Compaction Callback void yajl_parser_compact(void *wrapper) { yajl_parser_wrapper * w =

    wrapper; if (w) { w->builderStack = rb_gc_new_location(w->builderStack); w->parse_complete_callback = rb_gc_new_location(w->parse_complete_callback); } } New Location
  25. Problem Object Graph Object Implemented in Ruby Object Implemented in

    C Some Object Automatically Marked!! (gc_mark_no_pin) Not Marked
  26. Compaction 1 2 3 4 5 6 7 8 9

    10 Ruby Obj C Obj ? 4 5 3
  27. RubyVM Instruction Sequence ISeq Object (in C) def foo "bar"

    end Code Mark Array (Ruby) "bar" Marked Marked NOT Marked
  28. "Direct Marking" in Ruby 2.6 ISeq Object (in C) def

    foo "bar" end Code "bar" Marked https://bugs.ruby-lang.org/issues/14370
  29. MsgPack Object Implemented in Ruby Object Implemented in C Some

    Object Automatically Marked!! (gc_mark) Not Marked
  30. object_id is based on location 1 2 3 4 5

    6 7 8 9 10 Ruby Obj Ruby Obj Ruby Obj 5 object#object_id => 1 object#object_id => 2 object#object_id => 9 object#object_id => ?
  31. "Seen" Object IDs $seen_object_id = {} class Object def object_id

    $seen_object_id[memory_location] ||= memory_location end end
  32. Object ID After Move x = Object.new x.object_id GC.compact x.object_id

    1 2 3 4 X Heap Object ID Table Memory Location Object ID 4 4 Updated Object ID Table Memory Location Object ID 1 4
  33. Object ID Collisions x = Object.new x.object_id GC.compact x.object_id y

    = Object.new y.object_id 1 2 3 4 X Heap Object ID Table Memory Location Object ID 4 4 Updated Object ID Table Memory Location Object ID 1 4 y.object_id => ??? x.object_id => 4 Y
  34. Collision Resolution $seen_object_id = {} $location_to_object_id = {} class Object

    def object_id id = memory_location while $seen_object_id[id] id += 1 end $seen_object_id[id] = id $location_to_object_id[memory_location] = id end end
  35. Object ID Collisions x = Object.new x.object_id GC.compact x.object_id y

    = Object.new y.object_id 1 2 3 4 X Heap Object ID Table Memory Location Object ID 4 4 Updated Object ID Table Memory Location Object ID 1 4 y.object_id => 5 x.object_id => 4 Y Updated Object ID Table- Memory Location Object ID 1 4 4 5
  36. GC Cleanup $seen_object_id = {} $location_to_object_id = {} def free_obj(obj)

    if $location_to_object_id[obj.memory_location] id = $location_to_object_id.delete(obj.memory_location) $seen_object_id.delete(id) end end
  37. Sliding Compaction 1 2 3 4 5 6 7 8

    9 10 Yajl [ ] ? "foo" "bar" ? Address Content