Fundamentals of Memory Management in Go: Learning Through the History

24/10/2025 Slides for the presentation at Go West Conference in Lehi, UT.

Takuto Nagami

October 24, 2025

Transcript

  1. Why does memory management matter? • For better performance • Who

    takes care of Go itself? • Memory management is fun to learn!
  2. Learning process: Why, What and How When you want to

    add a new feature, Why Why do you need it?
  3. Learning process: Why, What and How Why Why do you

    need it? What What is it? When you want to add a new feature,
  4. Learning process: Why, What and How Why Why do you

    need it? What What is it? How How will you implement it? When you want to add a new feature,
  5. When you want to add a new feature, Learning process:

    Why, What and How Why Why do you need it? What What is it? How How will you implement it? Typical sessions focus on this!
  6. Learning process: Why, What and How Why Why do you

    need it? What What is it? How How will you implement it? When you want to add a new feature,
  7. Prerequisites: what Go provides • Compiler ◦ Converts your code

    into binary • Runtime ◦ Runs together with your code during execution
  8. Prerequisites: what Go provides Your code go build • Compiler

    ◦ Converts your code into binary • Runtime ◦ Runs together with your code during execution
  9. Prerequisites: what Go provides Your code binary go build •

    Compiler ◦ Converts your code into binary • Runtime ◦ Runs together with your code during execution
  10. Prerequisites: what Go provides Your code binary go build compiler’s

    duty • Compiler ◦ Converts your code into binary • Runtime ◦ Runs together with your code during execution
  11. Prerequisites: what Go provides Your code binary go build compiler’s

    duty (execute binary) • Compiler ◦ Converts your code into binary • Runtime ◦ Runs together with your code during execution
  12. Prerequisites: what Go provides Your code binary go build compiler’s

    duty (execute binary) runtime’s duty • Compiler ◦ Converts your code into binary • Runtime ◦ Runs together with your code during execution
  13. Prerequisites: what Go provides Your code binary go build compiler’s

    duty (execute binary) runtime’s duty go run • Compiler ◦ Converts your code into binary • Runtime ◦ Runs together with your code during execution
  14. Memory visualization . . . • Key-value store ◦ Key:

    Memory address ◦ Value: 1 byte each • The OS maps RAM into a separate virtual memory space for each process! (Look up “virtual memory” to learn more) 0x0000 0x0001 0x0002 0x0003
  15. Read/Write with assembly . . . . . . 0x0000

    0x0001 0x0002 0x0003 {address}
  16. Read/Write with assembly . . . . . . Write

    0x0000 0x0001 0x0002 0x0003 {address} 1 1
  17. Read/Write with assembly . . . . . . Read

    0x0000 0x0001 0x0002 0x0003 {address} 1 1 %eax (CPU register)
  18. High-level languages arose • 1957: Fortran was released ◦ “First”

    high-level programming language • COBOL, BASIC, PASCAL, C... and eventually Go
  19. Abstraction of data: Variables . . . . . .

    . . . . . {a addr} 12 a (Variables) 12
  20. Abstraction of data: Variables a (Variables) 12 b 34 {a

    addr} {b addr} . . . . . . . . . 12 34 Programmers recognize only these!
  21. We want some rules to place them! {a addr} ?

    {b addr} ? . . . . . . . . . 12 34
  22. Data structure of function calls: Last In, First Out (LIFO)...?

    main() a() Call main() a() Return Yes! That’s a STACK!!
  23. Stack frame • Representation of a “function” in memory ◦

    Each frame = scope • Structure is defined in the ABI ◦ Local variables ◦ (Some of the) arguments ◦ Position of the previous function’s stack frame ◦ Position where the previous function stopped ◦ etc... a() stack frame • call on line 10 • return to main() 5 7
  24. . . . main() stack frame 5 Stack memory area

    {main() s.f. addr} {v addr}
  25. . . . main() stack frame • call on line

    3 5 Stack memory area {main() s.f. addr} {v addr}
  26. . . . main() stack frame • call on line

    3 5 a() stack frame Stack memory area {a() s.f. addr} {main() s.f. addr} {v addr}
  27. . . . main() stack frame • call on line

    3 5 a() stack frame • return to main() Stack memory area {a() s.f. addr} {main() s.f. addr} {v addr}
  28. . . . main() stack frame • call on line

    3 5 a() stack frame • return to main() 5 Stack memory area {a() s.f. addr} {arg addr} {main() s.f. addr} {v addr}
  29. . . . main() stack frame • call on line

    3 5 a() stack frame • return to main() 5 arg+2= 7 Stack memory area {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr}
  30. . . . main() stack frame • call on line

    3 5 a() stack frame • return to main() 5 7 Stack memory area {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr}
  31. . . . main() stack frame • call on line

    3 5 a() stack frame • return to main() 5 7 Stack memory area {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr}
  32. . . . main() stack frame • call on line

    3 5 a() stack frame • call on line 10 • return to main() 5 7 Stack memory area {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr}
  33. . . Stack memory area main() stack frame • call

    on line 3 5 a() stack frame • call on line 10 • return to main() {b() s.f. addr} {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 b() stack frame
  34. . . Stack memory area main() stack frame • call

    on line 3 5 a() stack frame • call on line 10 • return to main() {b() s.f. addr} {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 b() stack frame • return to a()
  35. . . Stack memory area main() stack frame • call

    on line 3 5 a() stack frame • call on line 10 • return to main() {b() s.f. addr} {arg addr} {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 b() stack frame • return to a() 7
  36. . . Stack memory area main() stack frame • call

    on line 3 5 a() stack frame • call on line 10 • return to main() {b() s.f. addr} {arg addr} {b addr} {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 b() stack frame • return to a() 7 1
  37. . . Stack memory area main() stack frame • call

    on line 3 5 a() stack frame • call on line 10 • return to main() {b() s.f. addr} {arg addr} {b addr} {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 b() stack frame • return to a() 7 1
  38. . . Stack memory area main() stack frame • call

    on line 3 5 a() stack frame • call on line 10 • return to main() {b() s.f. addr} {arg addr} {b addr} {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 b() stack frame • return to a() 7 1 arg+b= 8 (CPU register)
  39. . . Stack memory area main() stack frame • call

    on line 3 5 a() stack frame • call on line 10 • return to main() {b() s.f. addr} {arg addr} {b addr} {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 b() stack frame • return to a() 7 1 8 (CPU register)
  40. . . . Stack memory area main() stack frame •

    call on line 3 5 a() stack frame • return to main() {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 8 (CPU register)
  41. . . . Stack memory area main() stack frame •

    call on line 3 5 a() stack frame • return to main() {a() s.f. addr} {arg addr} {a addr} {main() s.f. addr} {v addr} 5 7 8 (CPU register)
  42. . . . Stack memory area main() stack frame 5

    {main() s.f. addr} {v addr} 8 (CPU register)
  43. . . . Stack memory area main() stack frame 8

    {main() s.f. addr} {v addr} 8 (CPU register)
  44. . . . Stack memory area main() stack frame 8

    {main() s.f. addr} {v addr}
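The frame-by-frame walkthrough above can be matched to a small program. This is a sketch reconstructed from the values on the slides (5, 7, 1, 8). The local names `x` and `y` are assumptions standing in for the slots labelled {a addr} and {b addr}, and the compiler is free to inline these calls, so real frames may not appear exactly as drawn.

```go
package main

import "fmt"

// b mirrors the b() frame: it receives 7, holds one local, and returns arg+y.
func b(arg int) int {
	y := 1         // local variable in b()'s frame
	return arg + y // 7 + 1 = 8, passed back to the caller
}

// a mirrors the a() frame: it receives 5, computes arg+2, then calls b().
func a(arg int) int {
	x := arg + 2 // 5 + 2 = 7
	return b(x)  // b()'s frame is pushed on top of a()'s frame
}

func main() {
	v := 5   // lives in main()'s frame
	v = a(v) // a()'s frame is pushed, then popped when it returns
	fmt.Println(v) // prints 8
}
```

Each call pushes a frame and each return pops the most recent one, which is why the last frame created is always the first to disappear.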
  45. The stack is not a silver bullet... • Can’t handle data whose size is

    unknown at compile time (maps/slices) ◦ Stack allocation is the compiler’s duty • Sometimes can’t share variables between functions
  46. Heap memory area . . . . . . .

    [] {slice_h addr} {main() s.f. addr} main() stack frame
  47. Heap memory area . . . . . . .

    [] {slice_h addr} {main() s.f. addr} main() stack frame Address is determined by runtime
  48. Heap memory area . . . . . . .

    [] {slice_h addr} {main() s.f. addr} main() stack frame main() doesn’t know where the slice is...
  49. . . . . . . . [] {slice_h addr}

    {main() s.f. addr} {slice addr} main() stack frame Heap memory area {slice_h addr}
  50. . . . . . . . [] {slice_h addr}

    {main() s.f. addr} {slice addr} main() stack frame Heap memory area This is a pointer!! {slice_h addr}
  51. . . . . . . . [] {slice_h addr}

    {main() s.f. addr} {slice addr} {num addr} main() stack frame Heap memory area 5 This is a pointer!! {slice_h addr}
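A slice variable holds a pointer to its data rather than the data itself, which is what the {slice_h addr} slot represents. The sketch below shows that pointer changing when the data is moved to a new allocation. It assumes Go 1.20 or later for `unsafe.SliceData`. The helper name `backingMoved` is made up for illustration, and whether a particular backing array ends up on the heap is still the compiler's decision.

```go
package main

import (
	"fmt"
	"unsafe"
)

// backingMoved reports whether appending past capacity gave the slice
// a new backing array at a different address.
func backingMoved() bool {
	s := make([]int, 1, 1)        // length 1, capacity 1: already full
	before := unsafe.SliceData(s) // pointer stored inside the slice value
	s = append(s, 2)              // no room left, so a new array is allocated
	after := unsafe.SliceData(s)
	return before != after
}

func main() {
	fmt.Println("backing array moved:", backingMoved())
}
```

The function that owns the slice never picks the address; it only stores whatever pointer the allocation hands back.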
  52. . . . main() stack frame • call on line

    3 5 Value arg {main() s.f. addr} {num addr}
  53. . . . main() stack frame • call on line

    3 5 Value arg f() stack frame • call on line 3 {f() s.f. addr} {main() s.f. addr} {num addr}
  54. . . . main() stack frame • call on line

    3 5 Value arg f() stack frame • call on line 3 5 {f() s.f. addr} {num addr} {main() s.f. addr} {num addr}
  55. {f() s.f. addr} {num addr} {main() s.f. addr} {num addr}

    . . . main() stack frame • call on line 3 5 Value arg f() stack frame • call on line 3 10
  56. {f() s.f. addr} {num addr} {main() s.f. addr} {num addr}

    . . . main() stack frame • call on line 3 5 Value arg f() stack frame • call on line 3 10 Doesn’t affect num in main()!
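A minimal sketch of the value-argument case, using the same numbers as the slides. The body of `f` is an assumption, since the slides only show the frames.

```go
package main

import "fmt"

// f receives a copy of the caller's value in its own stack frame.
func f(num int) int {
	num = 10 // changes only the copy inside f()'s frame
	return num
}

func main() {
	num := 5
	f(num)
	fmt.Println(num) // prints 5: main()'s num is untouched
}
```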
  57. . . . Pointer arg main() stack frame • call

    on line 3 5 {main() s.f. addr} {num addr}
  58. . . . Pointer arg main() stack frame • call

    on line 3 5 f() stack frame • call on line 3 {f() s.f. addr} {main() s.f. addr} {num addr}
  59. . . . main() stack frame • call on line

    3 f() stack frame • call on line 3 Pointer arg 5 {num addr} {f() s.f. addr} {num addr} {main() s.f. addr} {num addr}
  60. . . . main() stack frame • call on line

    3 f() stack frame • call on line 3 Pointer arg {f() s.f. addr} {num addr} {main() s.f. addr} {num addr} 10 {num addr}
  61. . . . {f() s.f. addr} {num addr} {main() s.f.

    addr} {num addr} main() stack frame • call on line 3 10 f() stack frame • call on line 3 Pointer arg {num addr} Does affect num in main()!
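The pointer-argument case can be sketched the same way. Again the body of `f` is an assumption based on the values shown in the frames.

```go
package main

import "fmt"

// f receives the address of the caller's variable, not a copy of its value.
func f(num *int) {
	*num = 10 // writes through the pointer into main()'s frame
}

func main() {
	num := 5
	f(&num)
	fmt.Println(num) // prints 10: main()'s num was changed
}
```

The pointer itself is still copied into `f`'s frame. What the two frames share is the address it holds.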
  62. Heap management in the early era . . . . .

    . . {main() s.f. addr} main() stack frame Especially in C • func malloc(int size) pointer • func free(pointer memory) (Conceptual pseudocode)
  63. Heap management in the early era Especially in C • func

    malloc(int size) pointer • func free(pointer memory) (Conceptual pseudocode) . . . . . . . ~~~ {some addr} {main() s.f. addr} main() stack frame {some addr}
  64. Heap management in the early era . . . . .

    . . {main() s.f. addr} main() stack frame {some addr} Especially in C • func malloc(int size) pointer • func free(pointer memory) (Conceptual pseudocode)
  65. 【Pain】What if we forget to call free()? • Memory usage keeps

    growing without bound ◦ Memory Leak • Eventually, OOM kill
  66. Garbage collection (GC) • Cleans up heap variables that are

    no longer in use • Executed by runtime
  67. Heap Area Stack Area Mark and sweep algorithm main() stack

    frame a() stack frame b() stack frame a b c e {a addr} d {b addr} {var1 addr} var1 {d addr}
  68. Heap Area Stack Area Mark phase main() stack frame a()

    stack frame b() stack frame a b c e {a addr} d {b addr} {var1 addr} var1 {d addr}
  69. Heap Area Stack Area Mark phase main() stack frame a()

    stack frame b() stack frame a b c e {a addr} d {b addr} {var1 addr} var1 {d addr}
  70. Heap Area Stack Area Mark phase main() stack frame a()

    stack frame b() stack frame a b c e {a addr} d {b addr} {var1 addr} var1 {d addr}
  71. Heap Area Stack Area Mark phase main() stack frame a()

    stack frame b() stack frame a b c e {a addr} d {b addr} {var1 addr} var1 {d addr}
  72. Heap Area Stack Area Mark phase main() stack frame a()

    stack frame b() stack frame a b c e {a addr} d {b addr} {var1 addr} var1 {d addr}
  73. Heap Area Stack Area Mark phase main() stack frame a()

    stack frame b() stack frame a b c e {a addr} d {b addr} {var1 addr} var1 {d addr}
  74. Heap Area Stack Area Mark phase main() stack frame a()

    stack frame b() stack frame a b c e {a addr} d {b addr} {var1 addr} var1 {d addr}
  75. Heap Area Stack Area main() stack frame a() stack frame

    b() stack frame a b e {a addr} d {b addr} {var1 addr} var1 {d addr} Sweep phase c
  76. Heap Area Stack Area main() stack frame a() stack frame

    b() stack frame a b e {a addr} d {b addr} {var1 addr} var1 {d addr} Sweep phase
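The mark and sweep phases can be sketched as a toy collector over a hand-built object graph. This is an illustration only, not how the Go runtime implements its GC. The `object` type, the helper names, and the reference graph (roots point to a and b, a points to d, b points to e, nothing points to c) are assumptions chosen to resemble the diagram.

```go
package main

import "fmt"

// object is a toy heap object with outgoing references and a mark bit.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark sets the mark bit on obj and on everything reachable from it.
func mark(obj *object) {
	if obj == nil || obj.marked {
		return
	}
	obj.marked = true
	for _, r := range obj.refs {
		mark(r)
	}
}

// sweep keeps marked objects, drops the rest, and resets the mark bits.
func sweep(heap []*object) []*object {
	var live []*object
	for _, o := range heap {
		if o.marked {
			o.marked = false // ready for the next cycle
			live = append(live, o)
		}
	}
	return live
}

// collect runs one full cycle: mark from the roots, then sweep the heap.
func collect(roots, heap []*object) []*object {
	for _, r := range roots {
		mark(r)
	}
	return sweep(heap)
}

// demoHeap builds the assumed graph; c is unreachable from the roots.
func demoHeap() (roots, heap []*object) {
	d, e := &object{name: "d"}, &object{name: "e"}
	a := &object{name: "a", refs: []*object{d}}
	b := &object{name: "b", refs: []*object{e}}
	c := &object{name: "c"}
	return []*object{a, b}, []*object{a, b, c, d, e}
}

func main() {
	roots, heap := demoHeap()
	for _, o := range collect(roots, heap) {
		fmt.Print(o.name, " ") // c is not among the survivors
	}
	fmt.Println()
}
```

The roots here play the role of the pointers stored in stack frames: anything the mark phase cannot reach from them is treated as garbage.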
  77. Very simple, but hard to optimize • Marking concurrently with

    other goroutines ◦ Tri-color marking algorithm etc... • Adjusting triggers • Green Tea!!!
  78. Very simple, but hard to optimize • Could be optimized

    further with more complex algorithms • Go doesn’t = philosophy of simplicity
  79. 【Pain】GC is great, but expensive • Placing a lot of data on

    the heap = high GC cost ◦ Especially marking
  80. Most GC languages decide heap placement by type • Old Java: all

    objects are in heap • Python: all objects are in heap • JS: all non-primitives are in heap
  81. • Old Java: all objects are in heap • Python:

    all objects are in heap • JS: all non-primitives are in heap Most GC languages decide heap placement by type Non-primitive types tend to be in the heap!
  82. Heap escape • Put as much data in the stack

    as possible • Specific data “escapes” to the heap
  83. There are some common cases... • Returning pointers • Interface-type

    variables • map, channel, string, most slices
  84. Only the compiler & runtime know... • Many cases are decided

    by the compiler • The rest are decided by the runtime
  85. Heap escape analysis • Compiler analyzes variables ◦ It can

    output the analysis log go {run/build/test} -gcflags="-m" **.go
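A small program to try the `-gcflags="-m"` flag on. The function names are made up for illustration. `newCounter` returns a pointer to a local variable, which is the "returning pointers" case, so the analysis is expected to report that `n` is moved to the heap, while `total` in `sum` has no reason to leave the stack. The exact wording and number of diagnostic lines vary between Go versions and with inlining.

```go
package main

import "fmt"

// newCounter returns the address of a local variable, so that variable
// has to outlive this function's stack frame.
func newCounter() *int {
	n := 42
	return &n
}

// sum uses its local only inside the function, so it can stay on the stack.
func sum(a, b int) int {
	total := a + b
	return total
}

func main() {
	p := newCounter()
	fmt.Println(*p, sum(1, 2)) // prints 42 3
}
```

Saving this as a single file and building it with the flag shown on the slide prints the compiler's decisions to standard error without changing the resulting binary.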
  86. Object management in heap • Variables in the heap (objects) vary in size...

    int (8 bytes) struct (dynamically sized) pointer (8 bytes)
  87. Object management in heap • Objects of the

    same size are grouped into a span int (8 bytes) struct (8 bytes) struct struct (32 bytes) struct struct int int span 1 (for 8 bytes) span 2 (for 8 bytes) span 3 (for 32 bytes) int int
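Grouping by size can be pictured as rounding each object up to a size class and filing it under that class. The sketch below uses a short, hypothetical class table and a made-up `classFor` helper. The runtime's real table is longer and its spans are pages of memory, not Go maps.

```go
package main

import "fmt"

// sizeClasses is a hypothetical, shortened table, not the runtime's real one.
var sizeClasses = []int{8, 16, 32, 64}

// classFor returns the smallest class that fits size, or -1 if none does.
func classFor(size int) int {
	for _, c := range sizeClasses {
		if size <= c {
			return c
		}
	}
	return -1
}

func main() {
	// Group some assumed object sizes by the class that would hold them.
	spans := map[int][]int{}
	for _, size := range []int{8, 8, 32, 24, 8} {
		c := classFor(size)
		spans[c] = append(spans[c], size)
	}
	for _, c := range sizeClasses {
		fmt.Printf("span class %2d bytes: %v\n", c, spans[c])
	}
}
```

Keeping same-sized objects together means a freed slot can be reused by the next object of that class without searching for a hole of the right size.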
  88. Object management in heap Spans are grouped into larger units

    • mcentral ↓ • arena ↓ • mheap
  89. Zero allocation • Zero allocation = no heap escape ◦

    Data is allocated on the stack instead; it doesn’t mean “zero” memory is used
  90. Zero allocation isn't a silver bullet... • No heap =

    involves argument copying ◦ Big copy -> a lot of work • Zero allocation isn't always the right answer ◦ Measure the effect with a benchmark main() • call on line 3 Var foo() • call on line 3 Var bar() • call on line 3 Var
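One way to check whether a function is allocation-free is to count heap allocations per call with `testing.AllocsPerRun` from the standard library. The sketch below compares a function whose local stays on the stack with one that is forced to escape through a package-level variable. The type and function names are assumptions, and `//go:noinline` is used only to keep the two cases separate.

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y int }

// sink is package-level, so storing a pointer here forces an escape.
var sink *point

// onStack keeps its point inside the function.
//
//go:noinline
func onStack() int {
	p := point{1, 2}
	return p.x + p.y
}

// onHeap publishes its point through sink, so it must be heap allocated.
//
//go:noinline
func onHeap() int {
	p := &point{1, 2}
	sink = p
	return p.x + p.y
}

// stackAllocs and heapAllocs report average heap allocations per call.
func stackAllocs() float64 { return testing.AllocsPerRun(1000, func() { onStack() }) }
func heapAllocs() float64  { return testing.AllocsPerRun(1000, func() { onHeap() }) }

func main() {
	fmt.Println("onStack allocs per call:", stackAllocs())
	fmt.Println("onHeap  allocs per call:", heapAllocs())
}
```

Allocation counts say nothing about copy cost, so a timing benchmark is still needed before concluding that the stack version is faster.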
  91. Stack overflow • The language decides the border between heap and stack

    ◦ e.g., for C programs on Linux, the default stack size is typically 8 MB . . . . . Heap var main() stack frame
  92. 【Pain】The heap may also explode... • Preparing a huge stack area

    sounds good... • More stack area = Less heap area ◦ ✅ Fewer stack overflows ◦ ❌ Higher risk of OOM kill 😭 . . . . . Heap var main() stack frame
  93. Go has flexible stacks • Go creates a stack for

    each goroutine • Each stack grows and shrinks on demand ◦ Initialized with 2 kB (the minimum in current Go releases)
  94. Rough idea main goroutine main() stack frame a() stack frame

    a2() stack frame a3() stack frame a4() stack frame m2() stack frame b() stack frame a goroutine b goroutine
  95. How is it implemented? • Originally, Go used segmented stacks

    • Now, Go uses stack copying (optimized way)
  96. Stack copying . . . main() stack frame m2() stack

    frame main goroutine’s stack
  97. Stack copying . . . main() stack frame m2() stack

    frame x2 size New main goroutine’s stack
  98. Stack copying . . . main() stack frame m2() stack

    frame main() stack frame m2() stack frame
  99. Stack copying . . . main() stack frame m2() stack

    frame m3() stack frame Finally able to execute m3()
  100. Maximum stack size • Each goroutine’s stack has a limit

    of 1 GB (default on 64-bit systems) ◦ If the stack exceeds the limit, a stack overflow happens
  101. A flexibly sized stack per goroutine makes the stack in Go special!

    This is what makes goroutines special, too
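Stack growth is invisible to the program, which the following sketch tries to show. A goroutine starts with a small stack yet can recurse 100,000 levels deep because the runtime enlarges its stack as needed. The function names, the padding size, and the depth are arbitrary assumptions. The depth is kept far below the maximum stack size so the program does not overflow.

```go
package main

import "fmt"

// depth recurses n times. Each call adds a frame holding a small buffer,
// so deep recursion needs far more stack than a goroutine starts with.
func depth(n int) int {
	var pad [128]byte
	pad[0] = byte(n)
	if n == 0 {
		return int(pad[0]) // 0 at the bottom of the recursion
	}
	return depth(n-1) + 1
}

// runDeep runs the recursion on a fresh goroutine with its own stack.
func runDeep(n int) int {
	done := make(chan int)
	go func() { done <- depth(n) }()
	return <-done
}

func main() {
	fmt.Println(runDeep(100_000)) // prints 100000
}
```

Nothing in the code asks for a bigger stack; the copying described on the slides happens inside the runtime whenever a call finds too little room left.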
  102. Bad performance: too many escapes • Periodic spikes in CPU

    consumption • App seems to have many short-lived heap objects ◦ GC likely to be triggered many times • Bottleneck: Filtering function
  103. Benchmarking • Benchmarking both functions ◦ Using a list of

    1-100 as the original slice ◦ Filter: is number even
  104. x2 faster! Benchmarking • Benchmarking both functions ◦ Using a

    list of 1-100 as the original slice ◦ Filter: is number even
  105. x2 faster! Benchmarking • Benchmarking both functions ◦ Using a

    list of 1-100 as the original slice ◦ Filter: is number even Caused by only a single allocation
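The two filtering functions are not shown in the transcript, so the following is a guess at what such a pair could look like: one grows its result with `append` from nothing, the other reserves capacity once. All names are hypothetical. The sketch counts allocations with `testing.AllocsPerRun` rather than timing, so it demonstrates the "single allocation" point but not the speed-up itself.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps results reachable so the work cannot be optimized away.
var sink []int

func isEven(n int) bool { return n%2 == 0 }

// filterAppend grows the result on demand, reallocating as capacity runs out.
func filterAppend(in []int, keep func(int) bool) []int {
	var out []int
	for _, v := range in {
		if keep(v) {
			out = append(out, v)
		}
	}
	return out
}

// filterPrealloc reserves room for the worst case once, up front.
func filterPrealloc(in []int, keep func(int) bool) []int {
	out := make([]int, 0, len(in))
	for _, v := range in {
		if keep(v) {
			out = append(out, v)
		}
	}
	return out
}

// input returns the numbers 1 through 100, matching the slide's data set.
func input() []int {
	in := make([]int, 100)
	for i := range in {
		in[i] = i + 1
	}
	return in
}

// allocsPerCall reports the average heap allocations of one filter call.
func allocsPerCall(filter func([]int, func(int) bool) []int) float64 {
	in := input()
	return testing.AllocsPerRun(100, func() { sink = filter(in, isEven) })
}

func main() {
	fmt.Println("append allocs per call:  ", allocsPerCall(filterAppend))
	fmt.Println("prealloc allocs per call:", allocsPerCall(filterPrealloc))
}
```

Reserving `len(in)` trades some unused capacity for fewer allocations, so whether it pays off depends on how selective the filter is.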
  106. As a result... • Across the whole workload ◦ 57%

    lower CPU!! ◦ 99% lower memory!!
  107. Conclusion • Memory management has been developed to satisfy programmers’

    needs • Knowledge about memory management is power • This session is just a start ◦ Let’s dive deeper into each concept!
  108. Heap escape condition and analysis Escape Analysis in Go: Understanding

    and Optimizing Memory Allocation @ Go Conference 2023 Online