


Diagnosing performance problems without the guesswork

Workshop held at LDX3 - June 2nd & 3rd 2026


Elena Tanasoiu

June 02, 2026


Transcript

  1. Your host: gh.io/elena › 15+ years - Working with Ruby

    › 4 years - Performance engineering team at GitHub › Entire life – pickle aficionado › I like to travel a lot and I've recently been adopted by a cat
  2. What you'll learn › What is a flamegraph › How

    to get one › How to spot 3 common performance problems You will need a laptop or really good eyes: https://gh.io/flamegraphs
  3. Introduce yourselves At your table, ~60 seconds each: - Your

    name - What you work on - The most expensive problem you've seen in production
  4. The Problem with Guessing Most of us treat performance like

    a guessing game. We know something is slow, but we rely on suspicions rather than evidence. "This is fine," we say, as latency spikes and users complain.
  5. Our Relationship with Performance

    › The Sensation: Metrics show latency going up. A page feels heavy. Everyone can feel the drag, but the mechanics remain hidden behind high-level dashboards. › The Suspicion: Our first instinct is to look at a spike and start guessing. We blame the database, the network, or the last deploy without verifying the actual path.
  6. The Instinctive (and Suboptimal) Fixes

    › Add an Index: Assuming the database is the bottleneck is common. We add indexes and hope they hit. › Enable Caching: Papering over the cracks with a cache often hides the inefficient logic beneath. › Scale Hardware: Throwing larger hardware at the problem is expensive and avoids the root software cause.
  7. Why Intuition is a Trap Don't Trust Your Gut Performance

    optimization is counter-intuitive. The method you've been suspicious of for years? It's probably fine. The innocuous single line of code that nobody looks at? That might be your main culprit.
  8. Intuition vs. Measurement

    Scenario        | Guessing Approach           | Flamegraph Approach
    Slow Page Load  | Add Redis cache layer       | Profile reveals N+1 query
    CPU Spike       | Upgrade to beefier machines | Finds regex backtracking bug
    Memory Leak     | Restart pods regularly      | Identifies garbage collection events
  9. The Profiling Paradigm Shift Moving from "I think" to "I

    know." Let's look at how we visualize the stack.
  10. Execution Visualized The Power of the Flamegraph Flamegraphs represent stack

    traces over time. Each box is a function; the width is the time spent on execution. By looking at the "wide" bars, you immediately see where your request is actually spending its life. Chaos becomes a map.
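To make "width is time" concrete, here is a minimal sketch, assuming a sampling profiler that records the call stack at regular intervals. The sample data and every frame name below are made up for illustration, and this is not how Vernier stores or renders profiles.

```ruby
# Each sample is one captured call stack, outermost frame first.
# All frame names are hypothetical.
SAMPLES = [
  %w[PostsController#index render_sidebar load_labels],
  %w[PostsController#index render_sidebar load_labels],
  %w[PostsController#index render_sidebar load_labels],
  %w[PostsController#index render_list],
].freeze

# Count how many samples each frame appears in. With evenly spaced
# samples, that count is proportional to the time the frame was on the
# stack, which is what the width of its box stands for.
def frame_widths(samples)
  counts = Hash.new(0)
  samples.each do |stack|
    # uniq guards against counting a recursive frame twice in one sample.
    stack.uniq.each { |frame| counts[frame] += 1 }
  end
  total = samples.size.to_f
  counts.transform_values { |n| (100.0 * n / total).round(1) }
end

frame_widths(SAMPLES).each do |frame, pct|
  puts format("%-24s %5.1f%%", frame, pct)
end
```

In this toy data the entry point spans the full width, and the widest frame below it that is not the entry point is the one worth opening first.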
  11. Stack Charts It’s an upside-down flamegraph › Y-axis: Stack depth.

    Top = entry point, bottom = where CPU time is spent. › X-axis: Time. Width = proportion of time. Wide bar = expensive. Unlike the original flamegraph, these are in chronological order of execution. › Colour: Ignore it. In most tools it's random or indicates frame type.
  12. Meet Vernier A modern Ruby profiler. Created by John Hawthorn

    at GitHub. What It Captures › All threads simultaneously › SQL queries performed › Feature flag checks › GC pauses and memory allocations › Idle time, RPC calls, cache calls etc. Where can I see it? › Output viewable at https://vernier.prof › Also viewable on https://gh.io/flameviewer
  13. Installing the Vernier gem in Rails

    # Add to your Gemfile:
    gem "vernier", group: :development

    # Profile a Rails request (add to a controller):
    around_action :vernier_profile, if: -> { params[:flamegraph] }

    # Then visit GET /posts?flamegraph=1
  14. Installing the Vernier gem in Rails The response will be

    a Vernier JSON file instead of the usual response. Next step: Upload your `*.vernier.json` file to https://vernier.prof or https://gh.io/flameviewer
  15. Open it in the Flamegraph Viewer Safe for production data

    - processes files entirely in your browser. Nothing uploaded. 1. Go to https://gh.io/flameviewer 2. Drop your `vernier.json` file 3. Explore the timeline, flamegraph, markers 4. Drop a second file to compare before/after
  16. How does GitHub generate flamegraphs?

    › Web Requests: Add a query param to any GitHub URL: ?flamegraph=1 › API Requests: Use gh api or curl with the same param: ?flamegraph=1 › Flamegraph Copilot Skill: An in-house skill that lets you ask Copilot to get the flamegraph for you.
  17. The Shapes You're Looking For You don't need a PhD

    to read a flamegraph. You just need to spot the shapes. https://gh.io/flamegraphs
  18. Pattern 1: The Comb Teeth (N+1) The same SQL frame

    stacked many times horizontally. Each tooth is a separate database call that could have been batched or preloaded. The simple approach: look for repeating vertical bars, like teeth on a comb. https://gh.io/flamegraphs
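A minimal sketch of what produces the comb, using a stand-in database that only counts round trips. FakeDb, the post data and the method names are all hypothetical; in a real Rails app the batched version would usually come from ActiveRecord's `includes` or `preload` rather than hand-written lookups.

```ruby
# Stand-in for a database: it only counts how many queries it receives.
class FakeDb
  attr_reader :query_count

  def initialize(authors)
    @authors = authors # { id => name }
    @query_count = 0
  end

  # One round trip for one row.
  def find_author(id)
    @query_count += 1
    @authors[id]
  end

  # One round trip for many rows.
  def find_authors(ids)
    @query_count += 1
    @authors.slice(*ids)
  end
end

POSTS = [
  { title: "A", author_id: 1 },
  { title: "B", author_id: 2 },
  { title: "C", author_id: 1 },
].freeze

# N+1 shape: one query per post. Each call is one tooth of the comb.
def author_names_n_plus_one(db, posts)
  posts.map { |post| db.find_author(post[:author_id]) }
end

# Batched shape: collect the ids first, then ask once.
def author_names_batched(db, posts)
  authors = db.find_authors(posts.map { |post| post[:author_id] }.uniq)
  posts.map { |post| authors[post[:author_id]] }
end

slow_db = FakeDb.new({ 1 => "Ada", 2 => "Grace" })
fast_db = FakeDb.new({ 1 => "Ada", 2 => "Grace" })
author_names_n_plus_one(slow_db, POSTS)
author_names_batched(fast_db, POSTS)
puts "N+1 queries:     #{slow_db.query_count}"
puts "batched queries: #{fast_db.query_count}"
```

Both versions return the same names; only the number of round trips differs, and that number grows with the list in the first version and stays at one in the second.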
  19. Pattern 2: A suspiciously large SQL query At the bottom

    of the flamegraph viewer we can see SQL queries. An extra long bar is likely a sign of an inefficient SQL query. In our example we spend 700ms to load labels on the Watch button. The simple approach: look for big SQL blocks. https://gh.io/flamegraphs
  20. Pattern 2: Hover over the query to see it If

    we hover over the query we can see what it was. https://gh.io/flamegraphs
  21. Pattern 2.1: A suspiciously wide bar The simple approach: find

    the fattest rectangle. That might be your problem. https://gh.io/flamegraphs
  22. Pattern 3: A lot of garbage collection events The simple

    approach: what is causing a lot of GC pauses? Try to avoid rampant creation of objects. In some cases we see the comb pattern but it doesn’t make any SQL calls. However, it does trigger “Garbage collection” events. This usually means we’re creating too many objects in Ruby, causing the garbage collector to pause the request to recover memory. This is especially common with GraphQL. https://gh.io/flamegraphs
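One way to check a suspicion about object churn outside the profiler is to count allocations around a block. The sketch below uses Ruby's built-in `GC.stat(:total_allocated_objects)` counter; the `Item` struct and both methods are hypothetical and only show two ways of getting the same result.

```ruby
# Hypothetical wrapper object, standing in for any per-row object.
Item = Struct.new(:id)
IDS = (1..10_000).to_a.freeze

# Number of Ruby objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Wraps every id in a short-lived object before summing.
def total_via_objects(ids)
  ids.map { |id| Item.new(id) }.sum(&:id)
end

# Sums the integers directly, with no per-element object.
def total_direct(ids)
  ids.sum
end

heavy = allocations { total_via_objects(IDS) }
light = allocations { total_direct(IDS) }
puts "via objects: #{heavy} allocations"
puts "direct:      #{light} allocations"
```

The exact counts vary by Ruby version, so the useful signal is the gap between the two, not the absolute numbers.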
  23. Interactive: Pair with 1-2 people 1. Go to https://gh.io/flamegraphs and

    click the Web Request link. 2. Scroll down, familiarize yourself with the code and find the widest box that isn't a framework method 3. Write down the method you think is the problem. Don't pick the controller action, drill down further. DO NOT SAY IT YET
  24. Case Study: GitHub Scale

    65M Daily Logged-in Requests › The GitHub Repo Page: At GitHub, the repository overview page is one of the most visited locations. At this scale, even a small inefficiency becomes a massive performance tax. Intuition isn't enough when every millisecond matters for 65 million page loads.
  25. What We Found A method called might_have_packages? was consuming 199ms

    on every page load. Most of that time was spent building an Elasticsearch query on the fly to answer a yes-or-no question - "does the repo have any packages?" Maybe. On one of the most visited pages on the internet. Nobody would have guessed that a packages sidebar check was the most expensive thing on this page.
  26. The Three Optimisation Strategies 1. Don't Do It Delete the

    code entirely. The fastest code is code that never runs. This was our fix. 60 lines deleted. 2. Do It Cheaper Batch, short-circuit, cache it, use a better algorithm, reduce object allocations. 3. Do It Later Background job, lazy evaluation, defer to a non-critical path. Biggest wins almost always come from #1 and #2.
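The fix in the case study was strategy 1. For work that has to stay, here is a minimal sketch of strategy 2 ("cache it"), assuming a yes-or-no check backed by an expensive lookup. The class and method names are hypothetical and are not GitHub's code.

```ruby
# Caches the answer to an expensive yes-or-no question per repository.
class PackagesCheck
  attr_reader :lookups

  # The block stands in for the expensive lookup (a search query, say).
  def initialize(&expensive_lookup)
    @expensive_lookup = expensive_lookup
    @cache = {}
    @lookups = 0
  end

  def packages?(repo_id)
    # key? (not ||=) so that a cached `false` is also served from cache.
    return @cache[repo_id] if @cache.key?(repo_id)

    @lookups += 1
    @cache[repo_id] = @expensive_lookup.call(repo_id)
  end
end

check = PackagesCheck.new { |repo_id| repo_id.even? }
3.times { check.packages?(7) }
puts "answers served: 3, expensive lookups: #{check.lookups}"
```

The `key?` guard matters for boolean answers: memoizing with `||=` would re-run the lookup every time the answer is `false`.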
  27. How does this transfer to Go? Quite well: 1. Width

    == Time 2. Flamegraphs still map to stack traces – scrolling down matches stack call depth 3. The suspiciously fat rectangle & repetitive comb patterns stay unchanged. It might say db.(*Conn).Query instead of might_have_packages? but there's nothing to re-learn about what the pattern is showing.
  28. What actually changes is upstream Different profiler, different export command,

    but the same stack representation underneath. Concretely, the profilers for the most common languages are: - Go: pprof - Node: --prof / clinic - Linux: perf - Python: py-spy - Ruby: Vernier / stackprof / rbspy. Our viewer works with all of these! https://gh.io/flameviewer
  29. Concurrency caveat A Go flamegraph aggregates across multiple goroutines, while

    Ruby shows you one thread at a time. This means a wide frame in Go can be many goroutines consuming a lot of CPU together, not one single slow thing.
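The caveat can be shown with a toy tally, written in Ruby to match the rest of the deck. The worker and function names are hypothetical, and each entry stands for one CPU sample taken on that worker.

```ruby
# CPU samples per worker: the function that was running at each tick.
SAMPLES_BY_WORKER = {
  "worker-1" => %w[decode decode write],
  "worker-2" => %w[decode decode write],
  "worker-3" => %w[decode decode write],
  "worker-4" => %w[decode decode write],
}.freeze

# Aggregated view: merge every worker's samples, then count per function.
def aggregated_counts(samples_by_worker)
  samples_by_worker.values.flatten.tally
end

# Per-worker view: count per function inside each worker separately.
def per_worker_counts(samples_by_worker)
  samples_by_worker.transform_values(&:tally)
end

puts "aggregated: #{aggregated_counts(SAMPLES_BY_WORKER)}"
puts "worker-1:   #{per_worker_counts(SAMPLES_BY_WORKER)['worker-1']}"
```

In the aggregated view `decode` is eight samples wide, but no single worker spent more than two samples in it, so the wide bar describes total CPU use and not one slow call.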
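  30. Getting to a flamegraph in Go is three lines

    # Works when ./... matches a single package; go test rejects
    # -cpuprofile across several packages, so name one package otherwise.
    go test -cpuprofile cpu.prof ./...

    # Export a text version to drop into the viewer:
    go tool pprof -raw cpu.prof > profile.txt

    # It's got a built-in flamegraph viewer:
    go tool pprof -http=:8080 cpu.prof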
  31. The Velocity Problem GitHub ships fast. Everyone is shipping faster.

    We can't manually investigate every latency spike. › 65 million requests a day on just one page › Hundreds of deploys per week › Hundreds of feature flag checks per page load › Humans make performance mistakes, and the codebase keeps growing. We need a way to keep up. We’re not saying we should replace engineers with AI. We’re saying we can equip engineers with a faster way to find the signal in the noise.
  32. Copilot + Flamegraphs 1. Generate a profile with Vernier 2.

    Convert to AI-readable format: vernier view --output=markdown WEB_REQUEST.vernier.json or use the “Copy findings” button at the top right of the flamegraph viewer. 3. Feed the summary to Copilot (not the raw JSON, it'll blow up the context). Have the profile stored locally so Copilot can explore it. 4. Ask it to identify the top bottlenecks. 5. Verify everything it says against the actual flamegraph. Pro tip: Never feed raw Vernier JSON directly. Use the markdown output. The AI needs a summary, not a 50MB JSON blob.
  33. An Example Prompt "You are a performance engineer. Given this

    flamegraph profile, identify the top 3 bottlenecks. Look for: - Repeated frames (N+1 queries) - Synchronous I/O in hot paths - Large SQL queries or methods that take a long time - Garbage collection overhead For each bottleneck, suggest which optimisation strategy applies: don't do it, do it cheaper, or do it later." Or try the Flamegraph Viewer (elenatanasoiu.com/flamegraph-viewer) which auto-detects common issues.
  34. Ask Copilot to get the flamegraph "Capture a flamegraph of

    https://github.com/github/github" "Profile the repos API endpoint https://github.com/repos/github/github" "Get a Vernier profile of this GraphQL query" "Why is this page slow? Capture a flamegraph"
  35. The Shape of the Job Has Changed, But Your Responsibility Has Not

    Before › Manually scan profiles line by line › Hunt for patterns through intuition › Investigate one spike at a time › Performance work is a specialist skill
    Now › AI summarizes the chaos › You decide what's real and what's noise › Reviewing AI output is a core engineering skill › Performance work is accessible to everyone
  36. The Human with the Face AI finds the problem. The

    human is responsible for the fix. › The AI identifies the what. You decide the why and the how. › Don't let the tool make the decision. It's a fast reader that doesn't understand your business logic. › If it hits production, it's on you, not the bot.
  37. What to Do Tomorrow Morning

    › Stop Guessing: If it's slow, profile it first. Don't cache, index, or scale until you've seen the flamegraph. › Use the Viewer: Look for the shapes: comb teeth, wide bands, GC pauses. Leverage automation to spot the patterns. › Be the Reviewer: Question the AI's suggestions. Verify with the data in your flamegraph. You own what ships.
  38. Resources

    Tools › vernier.prof: Official Vernier profile viewer › gh.io/flameviewer: Flamegraph Viewer with auto-detection › github.com/jhawthorn/vernier: Vernier gem source