A JVM Threading Model for the containerized times

Luiz Hespanha
September 21, 2023

Presented at Strange Loop 2023.

Transcript

  1. A JVM threading
    model for the
    containerized
    times
    Flavio Brasil, Principal Engineer
    Luiz Hespanha, Principal Engineer
    Systems Performance @ Nubank

  2. Nubank

  3. Nubank
    Hespanha

  4. The perfect storm
    01

  5. PIX
    Running since the end of 2020, Pix is an instant payment platform created and
    managed by the monetary authority of Brazil, the Central Bank of Brazil (BCB), which
    enables the quick execution (max 10 seconds) of payments and transfers 24/7.

  6. PIX
    Monthly transfers (thousands) - 4 Billion in July!

  7. PIX
    Essential for people's day-to-day

  8. Payday madness
    Nubank's PIX
    down today?
    Service
    experiencing
    instability
    Number of
    failures (RED)
    increasing

  9. Understanding the
    problem
    02

  10. The crash resolution paradox

  11. The system normally operates
    at a low CPU usage
    The crash resolution paradox

  12. The system normally operates
    at a low CPU usage
    But when there's a load spike, the CPU
    becomes a bottleneck
    The crash resolution paradox
    Flavio

  13. The crash resolution paradox
    The system normally operates
    at a low CPU usage
    But when there's a load spike, the CPU
    becomes a bottleneck
    Latencies skyrocket, system sometimes
    becomes unresponsive

  14. The crash resolution paradox
    The system normally operates
    at a low CPU usage
    But when there's a load spike, the CPU
    becomes a bottleneck
    Latencies skyrocket, system sometimes
    becomes unresponsive
    Resolution: more CPU capacity!?

  15. Cluster-wide instability

  16. Some crashes escalated to k8s nodes
    becoming saturated
    Cluster-wide instability

  17. Some crashes escalated to k8s nodes
    becoming saturated
    A few systems consumed all CPU resources
    and became noisy neighbors
    Cluster-wide instability

  18. Cluster-wide instability
    Some crashes escalated to k8s nodes
    becoming saturated
    A few systems consumed all CPU resources
    and became noisy neighbors
    Several nodes became saturated and
    instability spread to collocated services

  19. Symptoms of a bottleneck
    Flavio

  20. As described by the Universal Scalability
    Law (USL), efficiency can drop
    significantly when a bottleneck is
    reached
    Symptoms of a bottleneck

  21. As described by the Universal Scalability
    Law (USL), efficiency can drop
    significantly when a bottleneck is
    reached
    The median CPU usage (blue line) was
    dropping over time and the number of
    cores (bars) growing at a higher pace
    than the load
    Symptoms of a bottleneck
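
For reference, the USL is commonly written in the form below. The symbols are not on the slides and are introduced here only for illustration: N is the load or concurrency level, C(N) the relative capacity, alpha the contention penalty, and beta the coherency (crosstalk) penalty.

```latex
C(N) = \frac{N}{1 + \alpha\,(N - 1) + \beta\,N\,(N - 1)},
\qquad
\text{efficiency}(N) = \frac{C(N)}{N} = \frac{1}{1 + \alpha\,(N - 1) + \beta\,N\,(N - 1)}
```

With beta > 0 the denominator grows quadratically in N, so past a certain point adding cores lowers efficiency instead of raising throughput. That is consistent with the chart described above, where cores grow faster than load while median CPU usage drops.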

  22. Our goals

  23. Our goals
    Systems remain
    functional even
    if overloaded

  24. Our goals
    Problematic
    systems can't
    affect others
    Systems remain
    functional even
    if overloaded

  25. Seeing through
    the noise
    03

  26. CPU isolation
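    |              | Linux config | Nature | Mechanism | Prevents node saturation |
    |--------------|--------------|--------|-----------|--------------------------|
    | CPU requests |              |        |           |                          |
    | CPU limits   |              |        |           |                          |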

  27. CPU isolation
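    |              | Linux config | Nature | Mechanism | Prevents node saturation |
    |--------------|--------------|--------|-----------|--------------------------|
    | CPU requests | cpu.shares   |        |           |                          |
    | CPU limits   | cpu.quota    |        |           |                          |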

  28. CPU isolation
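    |              | Linux config | Nature     | Mechanism | Prevents node saturation |
    |--------------|--------------|------------|-----------|--------------------------|
    | CPU requests | cpu.shares   | Soft limit |           |                          |
    | CPU limits   | cpu.quota    | Hard limit |           |                          |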
    Hespanha

  29. CPU isolation
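    |              | Linux config | Nature     | Mechanism                               | Prevents node saturation |
    |--------------|--------------|------------|-----------------------------------------|--------------------------|
    | CPU requests | cpu.shares   | Soft limit | Prioritization when all CPUs are busy   |                          |
    | CPU limits   | cpu.quota    | Hard limit | Enforced even if node has available CPU |                          |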

  30. CPU isolation
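    |              | Linux config | Nature     | Mechanism                               | Prevents node saturation |
    |--------------|--------------|------------|-----------------------------------------|--------------------------|
    | CPU requests | cpu.shares   | Soft limit | Prioritization when all CPUs are busy   | No                       |
    | CPU limits   | cpu.quota    | Hard limit | Enforced even if node has available CPU | Yes**                    |
    ** Partially since it's a quota and not a concurrency limit, but it's generally enough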
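
As a rough illustration of the two rows above, the sketch below converts Kubernetes CPU requests and limits (in millicores) into the cgroup v1 values the slide calls cpu.shares and cpu.quota. The class and method names are hypothetical. The constants (1024 shares per core, a 100ms default period, the minimum values) follow the commonly documented kubelet conversion and should be treated as assumptions to verify against your Kubernetes version; on cgroup v2 the corresponding files are cpu.weight and cpu.max.

```java
// Hypothetical helper: maps Kubernetes CPU settings to cgroup v1 CFS values.
final class CgroupCpu {
    static final long DEFAULT_PERIOD_US = 100_000; // assumed default CFS period (100ms)
    static final long MIN_SHARES = 2;              // assumed lower bound for cpu.shares
    static final long MIN_QUOTA_US = 1_000;        // assumed lower bound for the quota

    // CPU request -> cpu.shares: a relative weight that only matters under contention.
    static long sharesFor(long requestMillicores) {
        return Math.max(MIN_SHARES, requestMillicores * 1024 / 1000);
    }

    // CPU limit -> quota in microseconds of CPU time allowed per period (hard cap).
    static long quotaMicrosFor(long limitMillicores, long periodUs) {
        return Math.max(MIN_QUOTA_US, limitMillicores * periodUs / 1000);
    }

    public static void main(String[] args) {
        System.out.println(sharesFor(500));                           // 512
        System.out.println(quotaMicrosFor(4000, DEFAULT_PERIOD_US));  // 400000
    }
}
```

The quota is a budget of CPU time per period, not a cap on how many threads may run at once, which is what the footnote on the slide hints at.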

  31. CPU throttling oddness

  32. CPU throttling oddness

  33. CPU throttling intuition
    CPU quota:
    4 cores
    Each square:
    20ms of CPU

  34. CPU throttling intuition
    CPU quota:
    4 cores
    Each square:
    20ms of CPU

  35. CPU throttling intuition
    CPU quota:
    4 cores
    Each square:
    20ms of CPU

  36. CPU throttling intuition
    CPU quota:
    4 cores
    Each square:
    20ms of CPU

  37. CPU throttling intuition
    CPU quota:
    4 cores
    Each square:
    20ms of CPU
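
To put numbers on this intuition, here is a minimal sketch under stated assumptions: the CFS period is the default 100ms (the slides do not state it), all busy threads are CPU-bound and start at the beginning of the period, and the node has enough free cores to run them all in parallel. Under those assumptions a 4-core quota is a budget of 400ms of CPU per period, or 20 squares of 20ms. The class and method names are hypothetical.

```java
// Hypothetical model of CFS quota throttling within a single period.
final class ThrottleIntuition {
    // Returns the wall-clock milliseconds per period during which the
    // container is throttled, given a quota and a number of busy threads.
    static double throttledMillisPerPeriod(double quotaCores, double periodMs, int busyThreads) {
        double budgetMs = quotaCores * periodMs;   // CPU time available per period
        double demandMs = busyThreads * periodMs;  // CPU time the threads would consume
        if (demandMs <= budgetMs) {
            return 0.0;                            // demand fits in the quota, no throttling
        }
        double runWallMs = budgetMs / busyThreads; // wall time until the budget is spent
        return periodMs - runWallMs;               // rest of the period is spent throttled
    }

    public static void main(String[] args) {
        System.out.println(throttledMillisPerPeriod(4, 100, 4));  // 0.0
        System.out.println(throttledMillisPerPeriod(4, 100, 8));  // 50.0
        System.out.println(throttledMillisPerPeriod(4, 100, 16)); // 75.0
    }
}
```

In this model, 8 busy threads burn the 400ms budget in 50ms of wall time and then stall for the remaining 50ms, so a request that lands in the stalled half waits even though average CPU usage still equals the quota.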

  38. Current schools of thought

  39. Current schools of thought
    CPU pinning is
    the one true way!
    Just disable
    limits!

  40. ● How can we prevent node
    saturation?
    ● Environment-dependent
    performance
    Current schools of thought
    CPU pinning is
    the one true way!
    Just disable
    limits!

  41. ● How can we prevent node
    saturation?
    ● Environment-dependent
    performance
    Current schools of thought
    ● What about small systems
    with fractional quotas?
    ● No possibility of bursts
    ● k8s scheduling pressure
    CPU pinning is
    the one true way!
    Just disable
    limits!

  42. ● How can we prevent node
    saturation?
    ● Environment-dependent
    performance
    Current schools of thought
    ● What about small systems
    with fractional quotas?
    ● No possibility of bursts
    ● k8s scheduling pressure
    CPU pinning is
    the one true way!
    Just disable
    limits!
    What is it hiding? 🤔

  43. The danger of averages

  44. The danger of averages

  45. The danger of averages

  46. The danger of averages

  47. The danger of averages
    Flavio

  48. The danger of averages
    Flavio

  49. The danger of averages

  50. The danger of averages

  51. The danger of averages
    What is it hiding? 🤔

  52. The
    CPU avg
    is a lie!
    😱

  53. Fine-grained CPU metrics

  54. Based on https://github.com/sqshq/sampler
    Fine-grained CPU metrics

  55. Fine-grained CPU metrics
    Does not require a shorter Prometheus scrape interval
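
The slides do not show how these metrics are collected. One way to get fine-grained visibility without shortening the scrape interval is to sample in-process at a high frequency and export aggregates at each scrape. The sketch below is an assumption-laden illustration of that idea, not the tooling from the talk: `PeakCpuGauge` and its methods are hypothetical names, and the CPU reading is injected as a supplier so any source can be plugged in.

```java
import java.util.function.DoubleSupplier;

// Hypothetical gauge: samples often, reports average and peak once per scrape.
final class PeakCpuGauge {
    private final DoubleSupplier cpuLoad; // assumed to return utilization in [0, 1]
    private double peak = 0.0;
    private double sum = 0.0;
    private long count = 0;

    PeakCpuGauge(DoubleSupplier cpuLoad) {
        this.cpuLoad = cpuLoad;
    }

    // Intended to be called on a short fixed interval by a sampler thread.
    synchronized void sample() {
        double value = cpuLoad.getAsDouble();
        peak = Math.max(peak, value);
        sum += value;
        count++;
    }

    // Intended to be called once per scrape: returns {average, peak}
    // observed since the previous scrape, then resets the window.
    synchronized double[] scrapeAndReset() {
        double average = count == 0 ? 0.0 : sum / count;
        double[] result = {average, peak};
        peak = 0.0;
        sum = 0.0;
        count = 0;
        return result;
    }
}
```

A window with one saturated sample and three idle ones reports an average of 0.25 and a peak of 1.0, which is the kind of burst that an average-only metric hides.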

  56. How can we
    make systems
    behave within
    the CPU quota?

  57. Building an adaptive
    threading model
    04

  58. Nauvoo
    01
    02
    03

  59. You can't improve what you don't measure!
    Nauvoo
    01
    02
    03
    Fine-grained perf metrics

  60. You can't improve what you don't measure!
    Nauvoo
    No more manual thread pool tuning, avoids CPU
    throttling on the fly
    01
    02
    03
    Fine-grained perf metrics
    Adaptive concurrency

  61. You can't improve what you don't measure!
    Nauvoo
    No more manual thread pool tuning, avoids CPU
    throttling on the fly
    Rejects work above the system's capacity,
    avoids unbounded queuing and GC death spirals
    01
    02
    03
    Fine-grained perf metrics
    Adaptive concurrency
    Reactive backpressure
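
The slides do not show how Nauvoo rejects work. As a simplified stand-in for the general idea of rejecting instead of queuing without bound, the sketch below uses a plain `ThreadPoolExecutor` with a bounded queue and the standard abort policy. `BoundedExecutor`, `create` and `trySubmit` are hypothetical names; the capacity values are left to the caller.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: a fixed-size pool that rejects work once its queue is full.
final class BoundedExecutor {
    static ThreadPoolExecutor create(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),   // bounded: no unbounded queuing
                new ThreadPoolExecutor.AbortPolicy());     // throws when saturated
    }

    // Returns true if the task was accepted, false if it was rejected,
    // so the caller can fail fast (for example with an overload response).
    static boolean trySubmit(ThreadPoolExecutor pool, Runnable task) {
        try {
            pool.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }
}
```

Rejecting early keeps memory bounded: queued tasks that will time out anyway are never allocated, which is one way to avoid the GC pressure the slide mentions.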

  62. Detecting degradation

  63. Detecting degradation
    v0: check all
    the things
    Hespanha

  64. Detecting degradation
    Multiple checks:
    ● CPU usage
    ● Throttled %
    ● Memory
    Tries to avoid degradation
    v0: check all
    the things

  65. Detecting degradation
    Multiple checks:
    ● CPU usage
    ● Throttled %
    ● Memory
    Tries to avoid degradation
    v1: heartbeat
    mode
    v0: check all
    the things

  66. Detecting degradation
    v0: check all the things
    Multiple checks:
    ● CPU usage
    ● Throttled %
    ● Memory
    Tries to avoid degradation
    v1: heartbeat mode
    ● Inspired by jHiccup
    ● while (true) { measure Thread.sleep(1) }
    ● Allows a configurable level of degradation
    ● Also detects GC pauses, safepoints, allocation stalls
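
The heartbeat loop on this slide can be sketched as follows. This is a minimal illustration of the jHiccup-style idea (sleep for a fixed time and measure how late the thread wakes up), not Nauvoo's implementation. The class name, method names and the tolerance parameter are hypothetical.

```java
// Hypothetical heartbeat: measures how much longer than requested a sleep takes.
final class Heartbeat {
    // Sleeps for sleepMs and returns the overshoot in nanoseconds. The overshoot
    // includes scheduling delay, CPU throttling, GC pauses and safepoints.
    static long measureHiccupNanos(long sleepMs) {
        long start = System.nanoTime();
        try {
            Thread.sleep(sleepMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        long elapsed = System.nanoTime() - start;
        return Math.max(0L, elapsed - sleepMs * 1_000_000L);
    }

    // Runs a number of 1ms beats and returns the worst overshoot observed.
    static long worstHiccupNanos(int beats) {
        long worst = 0L;
        for (int i = 0; i < beats; i++) {
            worst = Math.max(worst, measureHiccupNanos(1));
        }
        return worst;
    }

    // The "configurable level of degradation": anything above the tolerance counts.
    static boolean isDegraded(long hiccupNanos, long toleratedNanos) {
        return hiccupNanos > toleratedNanos;
    }

    public static void main(String[] args) {
        long worst = worstHiccupNanos(100);
        System.out.println("worst hiccup (ns): " + worst);
        System.out.println("degraded at 5ms tolerance: " + isDegraded(worst, 5_000_000L));
    }
}
```

Because the probe is an ordinary thread, anything that delays application threads delays it too, which is why a single measurement can stand in for the separate v0 checks.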

  67. Controlling degradation
    Linux
    Scheduler

  68. Controlling degradation
    Linux
    Scheduler
    Executor
    Executor
    Executor

  69. Controlling degradation
    Linux
    Scheduler
    Executor
    Executor
    Executor
    Nauvoo

  70. Controlling degradation
    If there's
    degradation,
    reduce concurrency

  71. Controlling degradation
    If there's
    degradation,
    reduce concurrency
    If threads are
    reliably scheduled,
    allow more concurrency
    Flavio

  72. Main challenges
    - Reaction time
    Start with small changes and
    escalate via exponential steps
    - Control loop stability
    Introduce metastable state
    thresholds to stabilize changes
    Controlling degradation
    If there's
    degradation,
    reduce concurrency
    If threads are
    reliably scheduled,
    allow more concurrency

  73. Main challenges
    - Reaction time
    Start with small changes and
    escalate via exponential steps
    - Control loop stability
    Introduce metastable state
    thresholds to stabilize changes
    Controlling degradation
    If there's
    degradation,
    reduce concurrency
    If threads are
    reliably scheduled,
    allow more concurrency

  74. Main challenges
    - Reaction time
    Start with small changes and
    escalate via exponential steps
    - Control loop stability
    Introduce metastable state
    thresholds to stabilize changes
    Controlling degradation
    If there's
    degradation,
    reduce concurrency
    If threads are
    reliably scheduled,
    allow more concurrency
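
The control loop described on these slides can be sketched as below. This is one possible reading, not the talk's implementation: the step starts at 1 and doubles while the signal keeps pointing the same way (reaction time), and a dead band between two thresholds holds the limit steady (stability). `AdaptiveLimit`, its parameters and the threshold values are all hypothetical.

```java
// Hypothetical controller: adjusts a concurrency limit from a hiccup measurement.
final class AdaptiveLimit {
    private final int min;
    private final int max;
    private final long lowNanos;   // below this, threads are considered reliably scheduled
    private final long highNanos;  // above this, the system is considered degraded
    private int limit;
    private int step = 1;
    private int lastDirection = 0; // -1 shrinking, +1 growing, 0 holding

    AdaptiveLimit(int min, int max, int initial, long lowNanos, long highNanos) {
        if (min < 1 || max < min || lowNanos > highNanos) {
            throw new IllegalArgumentException("invalid bounds or thresholds");
        }
        this.min = min;
        this.max = max;
        this.lowNanos = lowNanos;
        this.highNanos = highNanos;
        this.limit = Math.max(min, Math.min(max, initial));
    }

    // Feeds one measurement and returns the new limit.
    synchronized int update(long hiccupNanos) {
        int direction = hiccupNanos > highNanos ? -1 : (hiccupNanos < lowNanos ? 1 : 0);
        if (direction == 0) {          // inside the dead band: hold and reset the step
            step = 1;
            lastDirection = 0;
            return limit;
        }
        // Start small, then escalate exponentially while the direction persists.
        step = (direction == lastDirection) ? (int) Math.min((long) step * 2, max) : 1;
        lastDirection = direction;
        limit = Math.max(min, Math.min(max, limit + direction * step));
        return limit;
    }

    synchronized int limit() {
        return limit;
    }
}
```

Starting from a limit of 16, three consecutive degraded readings move it to 15, 13 and 9; a reading inside the dead band then holds it at 9, and healthy readings grow it again from a step of 1. The resulting limit would then be applied to the executors, for example by resizing their pools.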

  75. Demo
    Adapting # of threads to a load with low CPU usage and thread blocking
    Load
    increasing
    Number of
    Threads
    increasing to handle
    the load
    CPU
    Throttling
    under control

  76. Demo
    Adapting # of threads to a CPU intensive load + rejections
    Number of
    Threads
    decreasing
    Service
    suffering with
    CPU Throttling
    Tasks being
    rejected

  77. Results

  78. Results
    Social - Stability improvement

  79. Results
    Magnitude - Latency reduction

  80. Results
    Stormshield - Cost reduction

  81. The path ahead
    05

  82. Further optimization

  83. Further optimization
    Hespanha

  84. Further optimization

  85. Nauvoo v2
    01
    02
    03

  86. Nauvoo v2
    01
    02
    03
    Lower overhead

  87. Nauvoo v2
    01
    02
    03
    Lower overhead
    Loom integration

  88. Nauvoo v2
    01
    02
    03
    Lower overhead
    Loom integration
    Prepare for open source

  89. CREDITS: This presentation template was created by
    Slidesgo, and includes icons by Flaticon, and
    infographics & images by Freepik
    Optimizations by
    several teams!
    major rewrites, database
    migration, tunings, ...
    Flavio

  90. Optimizations by
    Optimizations by
    several teams!
    major rewrites, database
    migration, tunings, ...
    Nauvoo

  91. Payday sanity

  92. CREDITS: This presentation template was created by
    Slidesgo, and includes icons by Flaticon, and
    infographics & images by Freepik
    Thanks!
    @fbrasisil
    @luiz_hespanha
