The path to memory reduction in RBS

Money Forward Tech LT大会 (lightning talk meetup) vol.2 at Fukuoka, Oct. 15th 2024

pp self
• Pocke
• Works for Money Forward
• Ruby committer (RBS maintainer)
• Rails application developer
• From Okayama
  ◦ My favorite ramen in Okayama🍜

Agenda
The main theme is reducing the memory usage of RBS and Steep.
• Why I need to reduce the memory usage of RBS
• Memory profiling for Ruby
• Future plan

Glossary
• RBS
  ◦ A library for static typing of Ruby
  ◦ It provides the RBS language, tools, and so on
• Steep
  ◦ A static type checker for Ruby
  ◦ It uses RBS
  ◦ It provides CLI tools and an LSP server

Why do I need to reduce memory usage of RBS?

Why is the memory improvement necessary?
Steep uses too much memory because:
• Steep keeps resident processes because it works as an LSP server
• Steep launches many processes
  ◦ One set of processes per project using Steep
  ◦ One worker per CPU, because Steep launches workers for parallelization
  ◦ total_memory = projects.size * CPUs.size * memory_per_process

Why is the memory improvement necessary?
One Steep worker process consumes ~1.5GB of memory in a mid-size Rails application.
e.g. 8 cores * 5 projects * 1.5GB/proc = 60GB
We need to decrease the memory usage in order for Steep to be widely used.
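The estimate above can be reproduced in a few lines of Ruby. This is only a sketch of the arithmetic on the slide: the variable names are illustrative, and the numbers are the slide's example values rather than measurements.

```ruby
# Rough estimate of the total memory Steep uses on one machine.
# All values are the example numbers from the slide, not measurements.
cpu_cores             = 8    # workers are launched per CPU for parallelization
projects              = 5    # projects opened with Steep at the same time
memory_per_process_gb = 1.5  # one worker in a mid-size Rails application

total_memory_gb = cpu_cores * projects * memory_per_process_gb
puts "#{total_memory_gb}GB" # prints "60.0GB"
```

Because the three factors multiply, lowering memory_per_process helps every worker of every project at once.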

Memory Profiling for Ruby

Measure. Don't guess
Profiling is important to identify the bottleneck

SamSaffron/memory_profiler
Ruby has the memory_profiler gem.

    require 'memory_profiler'

    arr = []
    r = MemoryProfiler.report do
      Object.new          # (1) allocated, then becomes garbage
      arr.push Object.new # (2) allocated and still referenced by arr
    end
    r.pretty_print

Allocated: (1), (2)
Retained: (2)

It's a really useful gem, but…
It is not enough for Steep because:
• I want to reduce the "peak" memory usage of Steep
• memory_profiler is not well suited to profiling peak memory usage

Allocated Memory by memory_profiler
• It traces all memory/objects allocated during profiling
• Pros: It's helpful for finding an execution-time bottleneck caused by memory allocation
• Cons: It's not helpful for finding the cause of peak memory usage
  ◦ Too noisy
  ◦ Example: "I meant to improve Steep's memory usage, but ended up improving its execution speed" - Money Forward Developers Blog https://moneyforward-dev.jp/entry/2024/07/29/improve-steep-performance

Retained Memory by memory_profiler
• It traces all memory/objects still retained when profiling finishes
• Pros: It's helpful for finding a memory leak
• Cons: It's not helpful for finding the cause of peak memory usage
  ◦ We would need to stop profiling at the peak, but the peak is not obvious

New memory profiler: Majo🧙
I created a new memory profiler for Ruby to profile peak memory usage.
https://github.com/pocke/majo

The strategy of Majo
• I assumed that peak memory usage can be approximated by the memory usage of long-lived objects
• Majo collects allocation info only for long-lived objects
  ◦ It defines object lifetime by how many GC runs the object has survived
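The idea of measuring lifetime in survived GC runs can be tried out with Ruby's standard objspace library. The sketch below is not how Majo is implemented. It only approximates the idea with `ObjectSpace.trace_object_allocations` and `ObjectSpace.allocation_generation`, and `SURVIVED_GC_THRESHOLD` and `long_lived?` are hypothetical names introduced for illustration.

```ruby
require 'objspace'

# Hypothetical threshold: an object counts as "long-lived" once it has
# survived at least this many GC runs. Majo's real criterion may differ.
SURVIVED_GC_THRESHOLD = 2

# Compare the GC count recorded when the object was allocated with the
# current GC count to see how many GC runs the object has lived through.
def long_lived?(obj, threshold = SURVIVED_GC_THRESHOLD)
  born_at = ObjectSpace.allocation_generation(obj)
  return false if born_at.nil? # allocated while tracing was off

  GC.count - born_at >= threshold
end

old_is_long_lived = nil
new_is_long_lived = nil

ObjectSpace.trace_object_allocations do
  old_object = Object.new
  3.times { GC.start }       # old_object survives three GC runs
  new_object = Object.new    # allocated after those GC runs

  old_is_long_lived = long_lived?(old_object)
  new_is_long_lived = long_lived?(new_object)
end

puts "old object long-lived? #{old_is_long_lived}"
puts "new object long-lived? #{new_is_long_lived}"
```

Filtering on survived GC runs drops the short-lived objects that make the "allocated" report noisy, without needing to know when the peak occurs.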

How Majo casts a spell on Ruby
• Ruby provides hooks on Ruby object allocation and `free`
• Use TracePoint events
  ◦ `RUBY_INTERNAL_EVENT_NEWOBJ`
  ◦ `RUBY_INTERNAL_EVENT_FREEOBJ`

CSV format output
Majo supports CSV format. It's really useful with a spreadsheet.
https://docs.google.com/spreadsheets/d/1TnlnLXQTnuDfB3Bhw0sNp9y2iZObqpkKVeqE--eAdlk/edit?gid=331894152#gid=331894152

CSV format output on a spreadsheet

The result by Majo
• Reduce Array allocation during parsing
  ◦ https://github.com/ruby/rbs/pull/1950
• Reduce Hash allocation during parsing
  ◦ "Chasing the mystery of unnecessary processing making execution faster" - Money Forward Developers Blog https://moneyforward-dev.jp/entry/2024/09/26/removing-steps-makes-it-slower
  ◦ I will introduce this patch in the next Ruby version

Future plan

Future plan
I will make Steep's process management more Copy on Write (CoW) friendly.

What's Copy on Write
It's a technique that defers copying until a write happens.
These slides focus on CoW for memory management by *nix on `fork`.
Note: `fork` is an API to duplicate a process on *nix OS🍴

Copy on Write Example (1)

    # Memory
    [1, 2, 3]

    # Process A
    x = [1, 2, 3]
    if fork
      x.push(42)
      p x
    else
      p x
    end

Copy on Write Example (2)

    # Memory
    [1, 2, 3]

    # Process A
    x = [1, 2, 3]
    if fork
      x.push(42)
      p x
    else
      p x
    end

    # Process A'
    x = [1, 2, 3]
    if fork
      x.push(42)
      p x
    else
      p x
    end

Copy on Write Example (3)

    # Memory
    [1, 2, 3, 42] # Copying!
    [1, 2, 3]

    # Process A
    x = [1, 2, 3]
    if fork
      x.push(42)
      p x
    else
      p x
    end

    # Process A'
    x = [1, 2, 3]
    if fork
      x.push(42)
      p x
    else
      p x
    end
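The three steps above can be observed from a single script. The following is a minimal sketch, assuming a *nix system where `fork` is available. It extends the slide's example with a pipe (an addition for illustration) so the parent can collect what the child saw.

```ruby
x = [1, 2, 3]
reader, writer = IO.pipe

pid = fork do
  # Child (Process A'): it sees the array as it was at the time of fork.
  reader.close
  writer.write(x.inspect)
  writer.close
end

# Parent (Process A): the write below only changes the parent's memory.
# On a CoW system the touched pages are copied at this point, so the
# child's view of x is unaffected.
writer.close
x.push(42)
child_view = reader.read
Process.wait(pid)

puts "parent: #{x.inspect}" # parent: [1, 2, 3, 42]
puts "child:  #{child_view}" # child:  [1, 2, 3]
```

Until one side writes, both processes read the same physical memory, which is why data loaded before `fork` costs memory only once.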

The current process management of Steep
Steep LSP uses a Master-Workers structure.

    Steep Master --fork--> Steep Worker 1
    Steep Master --fork--> Steep Worker 2
    Steep Master --fork--> Steep Worker 3

• Master
  ◦ Communicates with the LSP client and the workers
• Workers
  ◦ Process LSP features
  ◦ Type checking, completion, hover, …

The current process management of Steep
All workers have different RBS::Environment instances.

    Steep Master --fork--> Steep Worker 1 (RBS::Env 1)
    Steep Master --fork--> Steep Worker 2 (RBS::Env 2)
    Steep Master --fork--> Steep Worker 3 (RBS::Env 3)

Solution: Fork-Worker and Reforking
A CoW friendly process management structure for the Master-Worker model
• In the traditional Master-Worker model, workers are forked from the master process
• In Fork-Worker, workers are forked from a worker process
I borrowed this idea from puma and pitchfork (HTTP servers for Ruby)
https://github.com/puma/puma/blob/master/docs/fork_worker.md

Fork-Worker
All workers share the same memory

    Steep Master --fork--> Steep Worker 1 (RBS::Env)
    Steep Worker 1 --fork--> Steep Worker 2
    Steep Worker 1 --fork--> Steep Worker 3
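The Fork-Worker shape can be sketched with plain `fork` calls. This is not Steep's code, only an illustration under stated assumptions: `load_environment` is a hypothetical stand-in for building `RBS::Environment`, and the pipe exists only so that each worker can report which process built the environment it uses.

```ruby
# Hypothetical stand-in for building RBS::Environment; not Steep's real code.
def load_environment
  { loaded_by: Process.pid, types: Array.new(1_000) { |i| "Type#{i}" } }
end

reader, writer = IO.pipe
writer.sync = true

# The master forks only the first worker.
first_worker = fork do
  reader.close
  env = load_environment # built once, inside the first worker

  # The remaining workers are forked from the first worker, so they
  # inherit `env` through copy-on-write instead of rebuilding it.
  siblings = Array.new(2) do
    fork { writer.write("#{Process.pid} #{env[:loaded_by]}\n") }
  end
  writer.write("#{Process.pid} #{env[:loaded_by]}\n")
  siblings.each { |pid| Process.wait(pid) }
end

writer.close
reports = reader.read.lines.map { |line| line.split.map(&:to_i) }
Process.wait(first_worker)

reports.each do |worker_pid, loader_pid|
  puts "worker #{worker_pid} uses the environment built by #{loader_pid}"
end
```

All three workers report the same builder process, which means the environment was constructed once and then shared rather than built separately in each worker.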

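Reforking
Restart workers after a while
• Step 1: Steep Worker 2 and Steep Worker 3 are killed; Steep Worker 1 (forked from Steep Master, holding RBS::Env) stays alive
• Step 2: only Steep Worker 1 (RBS::Env) remains
• Step 3: Steep Worker 2 and Steep Worker 3 are reforked from Steep Worker 1 (RBS::Env)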

Conclusion

Conclusion
• New memory profiler: Majo
  ◦ It collects long-lived object allocations
• Steep will have a more CoW-friendly structure
  ◦ Fork-Worker and Reforking
Thanks for listening!
