AuthZed Office Hours: Perf & Load Testing for SpiceDB

Verónica López

March 15, 2025

Transcript

  1. Overview: Where do we go from here? The role of

    perf & load testing in authorization systems. Agenda: The Problem; Benchmark vs. Perf; Tools (Why k6); Real World Scenarios; Analysis (How Fast vs. How Well); The Good, the Bad & the Ugly; Conclusions.
  2. Origin: releases. From my experience releasing large distributed systems,

    including databases: if you want to learn the nitty-gritty details of a distributed system, get familiar with the release process. Release process != Deployment.
  3. Origin: releases. Identifying bottlenecks in release processes helps you understand

    whether you need a) more tests, b) better communication, c) more automation, or d) to address tech debt, etc.
  4. Common Pitfalls: 1) Prod: most of the

    gnarly issues only show up in production; folks test in “hello world” scenarios. 2) DBs: relationships & interactions. 3) Infra: “unrelated” changes, scaling, etc.
  5. Tests for Release Stability: • Identify regressions • Highlight bottlenecks •

    Understand trade-offs • Spend less time fixing and more time solving problems to fulfill our mission.
  6. Perf & load testing in authorization systems ensures that frequent

    permission checks remain fast and stable under realistic workloads. It provides continuous feedback on the impact of code changes, so any performance regressions are caught immediately. This helps identify even minor slowdowns that can degrade the developer experience or expose security gaps.
  7. Benchmarks vs. Perf: Sometimes, engineers mistakenly believe benchmarks and

    performance/load tests are equivalent, since both measure metrics such as response time and throughput. Benchmarks: • Isolated code paths in controlled environments • Measure raw speed (latency, throughput) • Offer a best-case performance snapshot. Perf/Load: • Simulates real-world concurrency and multi-service interactions • Monitors system stability, error rates, and resource usage • Identifies bottlenecks and regressions under stress.
  8. Benchmarks vs. Perf: Examples! Benchmarks: measuring the latency of

    a single permission check on an optimized Postgres instance, or running an isolated query against SpiceDB to record the maximum throughput (queries per second) when no other operations are active. Perf/Load: simulating 1,000 concurrent checks across multiple resources to capture latency distributions and identify bottlenecks, or stress-testing cache invalidation by rapidly updating policies and then immediately querying permissions, exposing real-world delays.
  9. DISCLAIMER! The code you are about to see is an

    exercise to illustrate the ideas shared in this presentation, simplified for brevity. These snippets aren’t comprehensive enough to describe the full capabilities of SpiceDB.
  10. This test uses a ramping-arrival-rate executor to simulate increasing load,

    reaching 75 queries per second. The positive_checking function picks a random relationship from the dataset and performs a single permission check, isolating the operation to provide a clear measurement of raw latency and throughput under controlled conditions.
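A k6 scenario along these lines might be configured as follows. This is a simplified sketch: the `positive_checking` name comes from the talk, while the ramp stages, durations, and VU counts are illustrative assumptions. It is a k6 configuration fragment and only runs under the k6 binary (`k6 run script.js`), not Node:

```javascript
// k6 scenario config: ramp the arrival rate up to 75 iterations/sec.
export const options = {
  scenarios: {
    positive_checks: {
      executor: 'ramping-arrival-rate',
      exec: 'positive_checking',       // test function named on the slide
      startRate: 5,                    // illustrative starting rate
      timeUnit: '1s',
      preAllocatedVUs: 50,             // illustrative VU pool
      maxVUs: 200,
      stages: [
        { target: 25, duration: '1m' }, // illustrative ramp step
        { target: 75, duration: '2m' }, // peak: 75 queries per second
        { target: 75, duration: '2m' }, // hold at peak
      ],
    },
  },
};
```

Because the executor controls arrival rate rather than VU count, k6 keeps starting new iterations on schedule even when individual checks slow down, which is what surfaces latency degradation under load.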
  11. This function simulates a permission check that involves group-based logic.

    It calls client.invoke to send a gRPC request to the PermissionsService/CheckPermission endpoint. The request includes data specifying the resource, the permission to check, and the subject. The test alternates between two scenarios, evenly distributing different workloads across the user pool to simulate varied real-world interactions.
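The request body passed to client.invoke could be built by a small helper like this. `buildCheckRequest` is a hypothetical name, not part of any SDK; the field names follow the SpiceDB v1 API's JSON encoding:

```javascript
// Hypothetical helper that builds the body a k6 script would pass to
// client.invoke('authzed.api.v1.PermissionsService/CheckPermission', req).
// Field names follow the SpiceDB v1 API's JSON encoding.
function buildCheckRequest(resourceType, resourceId, permission, subjectType, subjectId) {
  return {
    resource: { objectType: resourceType, objectId: resourceId },
    permission: permission,
    subject: { object: { objectType: subjectType, objectId: subjectId } },
  };
}

// Example: can user "alice" view document "readme"?
const req = buildCheckRequest('document', 'readme', 'view', 'user', 'alice');
```

Keeping the request construction in a plain function makes it easy to swap in random resources and subjects when alternating between workload scenarios.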
  12. This test uses a ramping-arrival-rate executor to simulate increasing load

    on write operations. It generates a random batch of updates to simulate concurrent writes and verifies that each batch is processed successfully. The threshold ensures that 95th percentile write latency stays below 2000ms, capturing key performance metrics under stress.
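The random batch of updates could be generated by a plain helper like the sketch below. The names are hypothetical; the update shape mirrors SpiceDB's WriteRelationships JSON form, and the latency threshold would be declared in the k6 options, e.g. `thresholds: { grpc_req_duration: ['p(95)<2000'] }`:

```javascript
// Hypothetical generator for a random batch of relationship updates,
// shaped like a SpiceDB WriteRelationships request body.
function randomUpdateBatch(size, userPoolSize) {
  const updates = [];
  for (let i = 0; i < size; i++) {
    const userId = `user-${Math.floor(Math.random() * userPoolSize)}`;
    updates.push({
      operation: 'OPERATION_TOUCH', // upsert semantics: create or overwrite
      relationship: {
        resource: { objectType: 'document', objectId: `doc-${i}` },
        relation: 'viewer',
        subject: { object: { objectType: 'user', objectId: userId } },
      },
    });
  }
  return { updates };
}

// Each iteration of the write scenario would send one such batch.
const batch = randomUpdateBatch(10, 1000);
```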
  13. Why k6: • Virtual Users that can be ramped up

    and down • JS or TS 😛 • Built-in support for detailed metrics (e.g., latency distributions, error rates, throughput), and you can also add your own; good for diagnosing performance regressions • Easy to understand trade-offs • Open Source ❤
  14. Perf & load testing: Result interpretation requires deep analysis to

    differentiate between transient performance anomalies (or trade-offs!) and genuine regressions. We often need expert tuning and contextual understanding of production behaviours.
  15. Perf & load testing: Sometimes, the actual goal of these

    tests can be to find the breaking point of your software and plan around it: do we fix it? Do we work around it? How do we make sure we don’t hit those values in production? How does the system degrade under load, and are we OK with that?
  16. We can write tests addressing the nuances of each database.

    Simulate Kubernetes workflows: what happens if these pods die while X number of users are performing these checks, etc.
  17. Conclusions: Perf & load tests are a means to reach

    more stable releases. Set up an environment that faithfully replicates prod* conditions and interpret the data to drive actionable insights. Know your system. Forget the testers vs. devs mentality. Benchmarks vs. performance & load.
  18. Conclusions: If your tests are to really provide signal rather

    than noise, they need to be alive: constantly tweak them based on real-world scenarios with customers. Every post-mortem (or equivalent) is an opportunity to improve the project. Be patient.