Upgrade to Pro — share decks privately, control downloads, hide ads and more …

[SnowOne 2025] Роман Артемьев: ART Memory Manag...

[SnowOne 2025] Роман Артемьев: ART Memory Management

ART — Android RunTime — виртуальная машина в ОС Android, в которой исполняются пользовательские приложения. ART можно считать альтернативной реализацией VM для языка Java, в котором есть свои интересные особенности, в частности механизм управления памятью и сборкой мусора. К сожалению, эту тему все еще слабо освещают. Предлагаю исправить это недоразумение и погрузиться в Java-мир, альтернативный мейнстримному OpenJDK, и узнать, как бывает еще.

Видео: https://youtu.be/yRtfXxN6KWk

Avatar for jugnsk

jugnsk

May 08, 2025
Tweet

More Decks by jugnsk

Other Decks in Programming

Transcript

  1. About me ▶ JVM for E2K (Elbrus) Arch ▶ Kotlin

    Compiler ▶ Android Runtime for RISC-V arch 2 / 47
  2. Plan 1. ART Overview 2. General GC information 3. Mark

    & Sweep 4. Semi-Space 5. Concurrent Copying 6. Concurrent Mark Compact 3 / 47
  3. Plan 1. ART Overview 2. General GC information 3. Mark

    & Sweep 4. Semi-Space 5. Concurrent Copying 6. Concurrent Mark Compact 4 / 47
  4. DEX bytecode public int calculate(int a, int b, int c)

    { return (a + b) * c; } JVM 0: iload_1 // Load a on stack 1: iload_2 // Load b on stack 2: iadd // push a + b 3: iload_3 // Load c on stack 4: imul // push TOP * c 5: ireturn // return 6 / 47
  5. DEX bytecode public int calculate(int a, int b, int c)

    { return (a + b) * c; } JVM 0: iload_1 // Load a on stack 1: iload_2 // Load b on stack 2: iadd // push a + b 3: iload_3 // Load c on stack 4: imul // push TOP * c 5: ireturn // return DEX move v0 , p1 // Move a -> v0 move v1 , p2 // Move b -> v1 move v2 , p3 // Move c -> v2 add -int v0 , v0 , v1 // v0+v1 => v0 mul -int v0 , v0 , v2 // v0*v2 => v0 return v0 // return 6 / 47
  6. ART Overview ▶ Android RunTime ▶ Executes DEX bytecode (classes.dex)

    ▶ Primary VM on Android devices ▶ Runs Android applications (.apk/apex) 7 / 47
  7. ART Overview ▶ Android RunTime ▶ Executes DEX bytecode (classes.dex)

    ▶ Primary VM on Android devices ▶ Runs Android applications (.apk/apex) ▶ Supports AOT compilation 7 / 47
  8. ART Overview ▶ Android RunTime ▶ Executes DEX bytecode (classes.dex)

    ▶ Primary VM on Android devices ▶ Runs Android applications (.apk/apex) ▶ Supports AOT compilation ▶ Targets arm/aarch64/x86/x86 64/riscv64 7 / 47
  9. Plan 1. ART Overview 2. General GC information 3. Mark

    & Sweep 4. Semi-Space 5. Concurrent Copying 6. Concurrent Mark Compact 8 / 47
  10. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 9 / 47
  11. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system 9 / 47
  12. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory 9 / 47
  13. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 9 / 47
  14. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 1. Foreground 9 / 47
  15. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 1. Foreground ▶ Runs when Application is in active 9 / 47
  16. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 1. Foreground ▶ Runs when Application is in active ▶ Focuses on latency 9 / 47
  17. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 1. Foreground ▶ Runs when Application is in active ▶ Focuses on latency 2. Background 9 / 47
  18. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 1. Foreground ▶ Runs when Application is in active ▶ Focuses on latency 2. Background ▶ Run when Application is in background 9 / 47
  19. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 1. Foreground ▶ Runs when Application is in active ▶ Focuses on latency 2. Background ▶ Run when Application is in background ▶ Focuses on efficiency 9 / 47
  20. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 1. Foreground ▶ Runs when Application is in active ▶ Focuses on latency 2. Background ▶ Run when Application is in background ▶ Focuses on efficiency ▶ Application itself notifies runtime about state switch 9 / 47
  21. Garbage Collection ▶ 4GB Max Heap size 1. Mapped to

    the low addresses (0x0000000-0xFFFFFFFF) 2. Uses compressed 32-bit pointer even on 64-bit system ▶ Uses so-called Space abstraction to represent heap memory ▶ 2 GC working states: 1. Foreground ▶ Runs when Application is in active ▶ Focuses on latency 2. Background ▶ Run when Application is in background ▶ Focuses on efficiency ▶ Application itself notifies runtime about state switch ▶ VMRuntime::updateProcessState() 9 / 47
  22. Plan 1. ART Overview 2. General GC information 3. Mark

    & Sweep 4. Semi-Space 5. Concurrent Copying 6. Concurrent Mark Compact 10 / 47
  23. Mark & Sweep Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/mark_sweep.h ▶ Uses

    Malloc Space ▶ Non-moving collector (preserves object references) ▶ 2 Versions: 11 / 47
  24. Mark & Sweep Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/mark_sweep.h ▶ Uses

    Malloc Space ▶ Non-moving collector (preserves object references) ▶ 2 Versions: 1. Serial (-Xgc:MS), JVM SerialGC, ParallelGC 11 / 47
  25. Mark & Sweep Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/mark_sweep.h ▶ Uses

    Malloc Space ▶ Non-moving collector (preserves object references) ▶ 2 Versions: 1. Serial (-Xgc:MS), JVM SerialGC, ParallelGC 2. Concurrent Marking (-Xgc:CMS), JVM Concurrent Mark&Sweep 11 / 47
  26. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 12 / 47
  27. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 1.3 many other minor places like native handlers, image/jit references and so on 12 / 47
  28. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 1.3 many other minor places like native handlers, image/jit references and so on 2. Mark reachable objects 12 / 47
  29. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 1.3 many other minor places like native handlers, image/jit references and so on 2. Mark reachable objects 2.1 DFS on memory graph (opt. Concurrently) 12 / 47
  30. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 1.3 many other minor places like native handlers, image/jit references and so on 2. Mark reachable objects 2.1 DFS on memory graph (opt. Concurrently) 2.2 Re-mark roots & dirty objects (STW) 12 / 47
  31. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 1.3 many other minor places like native handlers, image/jit references and so on 2. Mark reachable objects 2.1 DFS on memory graph (opt. Concurrently) 2.2 Re-mark roots & dirty objects (STW) ▶ Scan dirty cards 12 / 47
  32. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 1.3 many other minor places like native handlers, image/jit references and so on 2. Mark reachable objects 2.1 DFS on memory graph (opt. Concurrently) 2.2 Re-mark roots & dirty objects (STW) ▶ Scan dirty cards 3. Sweep garbage i.e. unreachable objects calling Free(unreachable object) 12 / 47
  33. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 1.3 many other minor places like native handlers, image/jit references and so on 2. Mark reachable objects 2.1 DFS on memory graph (opt. Concurrently) 2.2 Re-mark roots & dirty objects (STW) ▶ Scan dirty cards 3. Sweep garbage i.e. unreachable objects calling Free(unreachable object) ▶ is gargabe = live[obj] & ~mark[obj] 12 / 47
  34. Mark & Sweep Algorithm 1. Collect root set 1.1 Thread

    stacks & registers 1.2 Static objects 1.3 many other minor places like native handlers, image/jit references and so on 2. Mark reachable objects 2.1 DFS on memory graph (opt. Concurrently) 2.2 Re-mark roots & dirty objects (STW) ▶ Scan dirty cards 3. Sweep garbage i.e. unreachable objects calling Free(unreachable object) ▶ is gargabe = live[obj] & ~mark[obj] 13 / 47
  35. Mark & Sweep Problems ▶ Slow allocations ▶ Non-constant complexity

    ▶ Thread unsafe ▶ Memory fragmentation 15 / 47
  36. Mark & Sweep Problems ▶ Slow allocations ▶ Non-constant complexity

    ▶ Thread unsafe ▶ Memory fragmentation 15 / 47
  37. Plan 1. ART Overview 2. General GC information 3. Mark

    & Sweep 4. Semi-Space 5. Concurrent Copying 6. Concurrent Mark Compact 16 / 47
  38. Semi-Space Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/semi_space.h ▶ Uses 2 Bump

    Pointer Spaces ▶ from and to spaces ▶ Moving collector ▶ -Xgc:SS ▶ Uses mark-is-relocate concept 17 / 47
  39. Semi-Space Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/semi_space.h ▶ Uses 2 Bump

    Pointer Spaces ▶ from and to spaces ▶ Moving collector ▶ -Xgc:SS ▶ Uses mark-is-relocate concept ▶ Always STW 17 / 47
  40. Semi-Space Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/semi_space.h ▶ Uses 2 Bump

    Pointer Spaces ▶ from and to spaces ▶ Moving collector ▶ -Xgc:SS ▶ Uses mark-is-relocate concept ▶ Always STW 17 / 47
  41. Semi-Space Algorithm 1. Collect root set 1.1 Stop threads 2.

    Relocate any reached object in from-space into to-space 19 / 47
  42. Semi-Space Algorithm 1. Collect root set 1.1 Stop threads 2.

    Relocate any reached object in from-space into to-space 2.1 Update reference 19 / 47
  43. Semi-Space Algorithm 1. Collect root set 1.1 Stop threads 2.

    Relocate any reached object in from-space into to-space 2.1 Update reference 3. Exchange from and to spaces 19 / 47
  44. Semi-Space Algorithm 1. Collect root set 1.1 Stop threads 2.

    Relocate any reached object in from-space into to-space 2.1 Update reference 3. Exchange from and to spaces 3.1 Any object in from-space is garbage 19 / 47
  45. Semi-Space Algorithm 1. Collect root set 1.1 Stop threads 2.

    Relocate any reached object in from-space into to-space 2.1 Update reference 3. Exchange from and to spaces 3.1 Any object in from-space is garbage 3.2 All object in to-space is alive 19 / 47
  46. Semi-Space Algorithm 1. Collect root set 1.1 Stop threads 2.

    Relocate any reached object in from-space into to-space 2.1 Update reference 3. Exchange from and to spaces 3.1 Any object in from-space is garbage 3.2 All object in to-space is alive 3.3 to-space becomes Allocation space 19 / 47
  47. Semi-Space Algorithm 1. Collect root set 1.1 Stop threads 2.

    Relocate any reached object in from-space into to-space 2.1 Update reference 3. Exchange from and to spaces 3.1 Any object in from-space is garbage 3.2 All object in to-space is alive 3.3 to-space becomes Allocation space 4. Resume threads 19 / 47
  48. Semi-Space Problems ▶ STW Compation ▶ UI Stalls ▶ Network

    package loses ▶ ... ▶ 2x times smaller heap size 21 / 47
  49. Plan 1. ART Overview 2. General GC information 3. Mark

    & Sweep 4. Semi-Space 5. Concurrent Copying 6. Concurrent Mark Compact 22 / 47
  50. Concurrent Copying Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/concurrent_copying.h ▶ Uses Region

    Space ▶ Moving collector ▶ -Xgc:CC ▶ Very close to HotSpot’s G1 GC ▶ Uses mark-is-relocate concept 23 / 47
  51. Concurrent Copying Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/concurrent_copying.h ▶ Uses Region

    Space ▶ Moving collector ▶ -Xgc:CC ▶ Very close to HotSpot’s G1 GC ▶ Uses mark-is-relocate concept ▶ Fully Concurrent 23 / 47
  52. Concurrent Copying Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/concurrent_copying.h ▶ Uses Region

    Space ▶ Moving collector ▶ -Xgc:CC ▶ Very close to HotSpot’s G1 GC ▶ Uses mark-is-relocate concept ▶ Fully Concurrent ▶ Requires Read Barrier 23 / 47
  53. Concurrent Copying Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/concurrent_copying.h ▶ Uses Region

    Space ▶ Moving collector ▶ -Xgc:CC ▶ Very close to HotSpot’s G1 GC ▶ Uses mark-is-relocate concept ▶ Fully Concurrent ▶ Requires Read Barrier ▶ Supports generational & young collections 23 / 47
  54. Concurrent Copying Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/concurrent_copying.h ▶ Uses Region

    Space ▶ Moving collector ▶ -Xgc:CC ▶ Very close to HotSpot’s G1 GC ▶ Uses mark-is-relocate concept ▶ Fully Concurrent ▶ Requires Read Barrier ▶ Supports generational & young collections 23 / 47
  55. Concurrent Copying Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/concurrent_copying.h ▶ Uses Region

    Space ▶ Moving collector ▶ -Xgc:CC ▶ Very close to HotSpot’s G1 GC ▶ Uses mark-is-relocate concept ▶ Fully Concurrent ▶ Requires Read Barrier ▶ Supports generational & young collections 23 / 47
  56. Concurrent Copying Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/concurrent_copying.h ▶ Uses Region

    Space ▶ Moving collector ▶ -Xgc:CC ▶ Very close to HotSpot’s G1 GC ▶ Uses mark-is-relocate concept ▶ Fully Concurrent ▶ Requires Read Barrier ▶ Supports generational & young collections 23 / 47
  57. Concurrent Copying Read Barrier Executed as 2nd stage of reference

    field reading Without RB A f = o.f; 27 / 47
  58. Concurrent Copying Read Barrier Executed as 2nd stage of reference

    field reading Without RB A f = o.f; With RB A f = o.f; if (threadrb enabled) { f = rb resolve(f); } 27 / 47
  59. Concurrent Copying Read Barrier Executed as 2nd stage of reference

    field reading Without RB A f = o.f; With RB A f = o.f; if (threadrb enabled) { f = rb resolve(f); } 28 / 47
  60. Concurrent Copying Read Barrier Object rb resolve(Object o) { if

    (isToRegion(o)) return o; if (isMarked(o)) return toRegionPtr(o); return mark and relocate(o); } 29 / 47
  61. Concurrent Copying Read Barrier Object rb resolve(Object o) { if

    (isToRegion(o)) return o; if (isMarked(o)) return toRegionPtr(o); return mark and relocate(o); } 29 / 47
  62. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 31 / 47
  63. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 2.1 During this phase root set objects are relocated 31 / 47
  64. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 2.1 During this phase root set objects are relocated 2.2 Enable Read Barrier 31 / 47
  65. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 2.1 During this phase root set objects are relocated 2.2 Enable Read Barrier 3. Walk mark stack and collect alive subgraph 31 / 47
  66. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 2.1 During this phase root set objects are relocated 2.2 Enable Read Barrier 3. Walk mark stack and collect alive subgraph 3.1 Relocate objects in from-regions to to-regions 31 / 47
  67. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 2.1 During this phase root set objects are relocated 2.2 Enable Read Barrier 3. Walk mark stack and collect alive subgraph 3.1 Relocate objects in from-regions to to-regions 3.2 Once stack walking is done there is no alive object left in from-region 31 / 47
  68. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 2.1 During this phase root set objects are relocated 2.2 Enable Read Barrier 3. Walk mark stack and collect alive subgraph 3.1 Relocate objects in from-regions to to-regions 3.2 Once stack walking is done there is no alive object left in from-region 4. Sync mutators and GC threads 31 / 47
  69. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 2.1 During this phase root set objects are relocated 2.2 Enable Read Barrier 3. Walk mark stack and collect alive subgraph 3.1 Relocate objects in from-regions to to-regions 3.2 Once stack walking is done there is no alive object left in from-region 4. Sync mutators and GC threads 4.1 Disable Read Barrier 31 / 47
  70. Concurrent Copying Algorithm 1. Determine which regions are to be

    evacuated (from-region) 2. Collect root set (STW) 2.1 During this phase root set objects are relocated 2.2 Enable Read Barrier 3. Walk mark stack and collect alive subgraph 3.1 Relocate objects in from-regions to to-regions 3.2 Once stack walking is done there is no alive object left in from-region 4. Sync mutators and GC threads 4.1 Disable Read Barrier 31 / 47
  71. Concurrent Copying Generations 1. Minor (young gen) collection ▶ Evacuate

    only newly allocated regions 2. Major collection ▶ Evacuate newly allocated and regions filled > 75% 32 / 47
  72. Concurrent Copying Generations 1. Minor (young gen) collection ▶ Evacuate

    only newly allocated regions 2. Major collection ▶ Evacuate newly allocated and regions filled > 75% 3. Full collection ▶ Evacuate all regions 32 / 47
  73. Concurrent Copying Problems ▶ Read barrier overhead ▶ Lower app

    throughput compared to other algorithms 33 / 47
  74. Plan 1. ART Overview 2. General GC information 3. Mark

    & Sweep 4. Semi-Space 5. Concurrent Copying 6. Concurrent Mark Compact 34 / 47
  75. Concurrent Mark Compact Overview ▶ https://cs.android.com/android/platform/ superproject/main/+/main: art/runtime/gc/collector/mark_compact.h ▶ Uses

    Single Bump Pointer Space ▶ Moving collector ▶ -Xgc:CMC ▶ Fully Concurrent ▶ The Pauseless GC Algorithm by Click, Tene & Wolf 35 / 47
  76. Concurrent Mark Compact BitMaps To track alive bytes, object bytes

    and so on Total size = 4Gb / 8 bytes / 8bit = 64Mb 37 / 47
  77. Concurrent Mark Compact Read Barrier Executed as 2nd stage of

    reference field reading Without RB A f = o.f; With RB A f = o.f; if (threadrb enabled) { f = rb resolve(f); } 39 / 47
  78. Concurrent Mark Compact MMU Trigger Thread 1 mprotect(mem, NONE) //

    t=0 // Do something ... // 1<t<4 mprotect(mem, READ | WRITE) // t=5 40 / 47
  79. Concurrent Mark Compact MMU Trigger Thread 1 mprotect(mem, NONE) //

    t=0 // Do something ... // 1<t<4 mprotect(mem, READ | WRITE) // t=5 Thread 2 Value = *mem // t=1, SIGSEGV // Do something else mprotect(mem, READ | WRITE) 40 / 47
  80. Concurrent Mark Compact MMU Trigger Thread 1 mprotect(mem, NONE) //

    t=0 // Do something ... // 1<t<4 mprotect(mem, READ | WRITE) // t=5 Thread 2 Value = *mem // t=1, SIGSEGV // Do something else mprotect(mem, READ | WRITE) 1. Granularity – memory page 40 / 47
  81. Concurrent Mark Compact MMU Trigger Thread 1 mprotect(mem, NONE) //

    t=0 // Do something ... // 1<t<4 mprotect(mem, READ | WRITE) // t=5 Thread 2 Value = *mem // t=1, SIGSEGV // Do something else mprotect(mem, READ | WRITE) 1. Granularity – memory page 2. Uses heavy VMA structures 40 / 47
  82. Concurrent Mark Compact MMU Trigger Thread 1 mprotect(mem, NONE) //

    t=0 // Do something ... // 1<t<4 mprotect(mem, READ | WRITE) // t=5 Thread 2 Value = *mem // t=1, SIGSEGV // Do something else mprotect(mem, READ | WRITE) 1. Granularity – memory page 2. Uses heavy VMA structures 2.1 userfaultfd kernel feature (5.15+) does not use VMA 40 / 47
  83. Concurrent Mark Compact MMU Trigger Thread 1 mprotect(mem, NONE) //

    t=0 // Do something ... // 1<t<4 mprotect(mem, READ | WRITE) // t=5 Thread 2 Value = *mem // t=1, SIGSEGV // Do something else mprotect(mem, READ | WRITE) 1. Granularity – memory page 2. Uses heavy VMA structures 2.1 userfaultfd kernel feature (5.15+) does not use VMA 2.2 https: //man7.org/linux/man-pages/man2/userfaultfd.2.html 40 / 47
  84. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 41 / 47
  85. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 41 / 47
  86. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 41 / 47
  87. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 41 / 47
  88. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 41 / 47
  89. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 41 / 47
  90. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 4. Process each heap page separately (GC thread) 41 / 47
  91. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 4. Process each heap page separately (GC thread) 4.1 If mutator triggers SIGBUS it processes requested page itself 41 / 47
  92. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 4. Process each heap page separately (GC thread) 4.1 If mutator triggers SIGBUS it processes requested page itself 5. Release userfaultfd and GC structs 41 / 47
  93. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 4. Process each heap page separately (GC thread) 4.1 If mutator triggers SIGBUS it processes requested page itself 5. Release userfaultfd and GC structs 41 / 47
  94. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 4. Process each heap page separately (GC thread) 4.1 If mutator triggers SIGBUS it processes requested page itself 5. Release userfaultfd and GC structs 41 / 47
  95. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 4. Process each heap page separately (GC thread) 4.1 If mutator triggers SIGBUS it processes requested page itself 5. Release userfaultfd and GC structs 43 / 47
  96. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 4. Process each heap page separately (GC thread) 4.1 If mutator triggers SIGBUS it processes requested page itself 5. Release userfaultfd and GC structs 43 / 47
  97. Concurrent Mark Compact Algorithm 1. Collect roots (STW) 2. Walk

    mark stack 2.1 Collect information for further stages (Heap Live BitMap) 3. Prepare for compaction (STW) 3.1 Pre-compute evacuated objects addresses 3.2 Remap relocating heap to shadow space 3.3 Lock relocating virtual space using userfaultfd 3.4 Update root references (no relocation) 4. Process each heap page separately (GC thread) 4.1 If mutator triggers SIGBUS it processes requested page itself 5. Release userfaultfd and GC structs 45 / 47
  98. Q/A ▶ AOSP repo https://cs.android.com/android/platform/ superproject/main ▶ The Pauseless GC

    Algorithm by Click, Tene & Wolf https://www.researchgate.net/publication/ 221137840_The_pauseless_GC_algorithm 47 / 47