Can We Measure Developer Productivity?

Often, the goal of architecture is to improve developer productivity. But what does it mean for a developer to be more productive? Can we measure it? Should we? And if we don’t, how can we make any progress?
McKinsey claimed that they could finally measure developer productivity. This was followed by extensive criticism from notable figures such as Daniel Terhorst-North, Kent Beck, and Gergely Orosz. We will look at the different viewpoints and explore whether productivity can be measured and whether it should be.

Eberhard Wolff

November 13, 2024

Transcript

  1. Today I wrote just 10 lines of code. …because I spent so much time deploying software. → Need to fix deployment!
  2. Today I wrote 1,000 lines of code. …but that was really yak shaving. → No business value.
  3. Goodhart’s Law: “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.”
  4. Goodhart’s Law: Test Coverage • Test coverage: a good measure for the quality of tests. • Higher coverage: more parts of the code are executed, so the tests can catch more errors.
  5. Goodhart’s Law: Test Coverage • Set a test coverage goal, and the metric will be manipulated. • E.g. focus on trivial parts of the code. • E.g. don’t check any results. • E.g. just make sure no exception is thrown. • … (see the sketch below)
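A minimal sketch of such a gamed test (hypothetical code, not from the deck): both tests below reach full line and branch coverage of apply_discount, but only the honest one would catch a broken discount factor.

    # Hypothetical example of coverage gaming in Python.
    def apply_discount(price: float, premium: bool) -> float:
        """Premium customers get 10% off."""
        if premium:
            return price * 0.9
        return price

    def test_gamed():
        # Executes both branches, so 100% coverage is reported ...
        apply_discount(100.0, True)
        apply_discount(100.0, False)
        # ... but with no assertions, this test also passes if the logic is wrong.

    def test_honest():
        # Checks the results and would catch a wrong discount factor.
        assert apply_discount(100.0, True) == 90.0
        assert apply_discount(100.0, False) == 100.0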
  6. Goodhart’s Law: Test Coverage • Test coverage increases. • But the tests won’t really catch more problems. • Test coverage is no longer a good metric for the quality of the tests!
  7. Goodhart’s Law: Test Coverage Solution? • Be smarter about what you measure. • E.g. mutation testing (sketched below). • E.g. review test code. • … • IMHO: this is a pointless arms race. • So don’t manage for metrics?
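The idea behind mutation testing, as a hand-rolled sketch (real tools such as mutmut or PIT automate this): inject a small defect and check whether the tests notice. A surviving mutant means the tests are weaker than their coverage suggests.

    # Hand-rolled mutation-testing sketch in Python.
    def add(a, b):
        return a + b

    def add_mutant(a, b):
        return a * b  # mutation: '+' replaced by '*'

    def tests_pass(fn) -> bool:
        try:
            assert fn(2, 2) == 4  # weak test: true for '+' and for '*'
            return True
        except AssertionError:
            return False

    assert tests_pass(add)  # the original implementation passes
    if tests_pass(add_mutant):
        print("mutant survived: the tests are too weak")
    else:
        print("mutant killed")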
  8. Goodhart’s Law: Solution • Purpose matters. • The team tries to optimize itself: probably not a case for Goodhart’s Law. • Management measures quality into software / people: probably a case for Goodhart’s Law.
  9. Dealing with Goodhart’s Law • Let the team decide whether / how they want to improve! • Help and support, i.e. pave the road. • Provide support: techniques and technologies.
  10. DORA: 4 Key Metrics • Change lead time • Deployment frequency • Change fail percentage • Failed deployment recovery time • https://dora.dev/ (a toy calculation follows below)
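As a rough sketch of how the four key metrics could be computed (the record format and numbers are invented for illustration):

    from datetime import datetime
    from statistics import median

    # Hypothetical deployment log: commit time, deploy time, outcome,
    # and recovery time for failed deployments.
    deployments = [
        {"committed": datetime(2024, 11, 1, 9), "deployed": datetime(2024, 11, 1, 15),
         "failed": False, "recovered": None},
        {"committed": datetime(2024, 11, 4, 10), "deployed": datetime(2024, 11, 5, 11),
         "failed": True, "recovered": datetime(2024, 11, 5, 12)},
        {"committed": datetime(2024, 11, 6, 8), "deployed": datetime(2024, 11, 6, 10),
         "failed": False, "recovered": None},
    ]
    days_observed = 30

    # Change lead time: from commit to running in production (median).
    lead_time = median(d["deployed"] - d["committed"] for d in deployments)
    # Deployment frequency: deployments per day.
    frequency = len(deployments) / days_observed
    # Change fail percentage: share of deployments that cause a failure.
    fail_pct = 100 * sum(d["failed"] for d in deployments) / len(deployments)
    # Failed deployment recovery time: from failure to restored service (median).
    recovery = median(d["recovered"] - d["deployed"]
                      for d in deployments if d["failed"])
    print(lead_time, frequency, fail_pct, recovery)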
  11. DORA • Good empirical evidence • The metrics have many positive consequences: • More time for new features • Less burnout • More economic success
  12. Another Good Metric: Business Value • I.e. outcome • Ideally a $ / € value • How do you measure business value? • Some organizations require a business case for a software project, i.e. they can predict business value. • Business case: a starting point to find business value?
  13. Empirical? • Empirical research in our field is generally hard. • Empirical conclusions about specific metrics? • But we must improve somehow. • Gut feeling?
  14. SPACE • SPACE is a framework of metrics. • Choose a specific set of metrics to understand a specific problem.
  15. SPACE: Matrix of Metrics by Levels & Areas • A table with areas as columns (Area 1, Area 2, …) and levels as rows (Level 1, Level 2, …); each cell holds a concrete metric. (A hypothetical filled-in selection follows below.)
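One way to sketch such a selection in code (the levels, areas, and metrics are hypothetical; SPACE prescribes no concrete set):

    # Hypothetical (level, area) -> metric selection for one specific problem.
    space_matrix = {
        ("individual", "satisfaction & well-being"): "developer satisfaction survey",
        ("team", "performance"): "code review velocity",
        ("team", "activity"): "code reviews completed",
        ("system", "efficiency & flow"): "change lead time",
    }
    for (level, area), metric in space_matrix.items():
        print(f"{level:<10} | {area:<26} | {metric}")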
  16. SPACE: Areas • Satisfaction & well-being: e.g. developer satisfaction / retention • Performance (outcome): e.g. code review velocity • Activity (count of actions): e.g. code review scores • Plus communication & collaboration and efficiency & flow
  17. SPACE: Recommendations • Multiple metrics across various dimensions • At least 3, but not too many • At least one perceptual (survey) metric • What gets measured shows what is relevant • Hard to game
  18. SPACE: Conclusion • A comprehensive and sensible framework • Many metrics • Must be tailored to the environment • Broad (e.g. communication, collaboration, satisfaction) • Goes beyond pure performance
  19. McKinsey Matrix • Columns: outcome focus, optimization focus, opportunity focus. • Rows: system level, team level, individual level. • Cells: selected DORA / SPACE metrics, plus McKinsey’s opportunity-focused metrics combined with some SPACE metrics in the opportunity column.
  20. McKinsey Matrix (the same matrix, shown again).
  21. McKinsey & DORA / SPACE •Predefined set of SPACE metrics

    for projects •I.e. no customization per organization •Eliminates tailoring …and therefore the discussion about “why?”
  22. McKinsey Matrix (the same matrix, shown again).
  23. Contribution Analysis • Measuring individual contributions to the backlog using JIRA and custom tools. • Managers can manage expectations and improve performance this way. • IMHO problematic: is this a sensible metric? • What about a task that is important but time-consuming? • What if you don’t do tickets but support other people? • Shouldn’t contribution be about created business value? (see the sketch below)
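To make the objection concrete, a deliberately naive sketch (entirely hypothetical) of the kind of ticket counting such contribution analysis can degenerate into; mentoring and support work never show up in it:

    from collections import Counter

    # Hypothetical export of resolved tickets from an issue tracker.
    tickets = [
        {"assignee": "ada"},  # one hard, high-value task
        {"assignee": "bob"},  # three trivial tasks follow
        {"assignee": "bob"},
        {"assignee": "bob"},
    ]
    contribution = Counter(t["assignee"] for t in tickets)
    print(contribution.most_common())  # [('bob', 3), ('ada', 1)]
    # bob looks three times as "productive", although ada's single task may
    # have created far more business value; any time ada spent unblocking
    # colleagues is invisible to this metric entirely.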
  24. Inner / Outer Loop: Time Spent • Inner loop: code, test, build. • Outer loop: integrate, deploy at scale, security and compliance, meetings.
  25. Inner / Outer Loop: Time Spent (same diagram) • Optimize for time in the inner loop! • Hacking away instead of a meeting to understand the problem? Really? (A toy calculation follows below.)
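A toy calculation of the inner-loop share that is supposed to be optimized (activity categories from the slide, hours invented):

    # Hypothetical week of one developer, in hours per activity.
    INNER = {"code", "test", "build"}
    hours = {"code": 10, "test": 6, "build": 2,
             "integrate": 4, "deploy at scale": 3,
             "security and compliance": 2, "meetings": 13}

    inner = sum(h for activity, h in hours.items() if activity in INNER)
    print(f"inner-loop share: {inner / sum(hours.values()):.0%}")  # 45%
    # Cutting the 13 meeting hours raises the share, but it may also cut the
    # shared understanding of the problem, which is exactly the objection above.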
  26. Developer Velocity Index • 46 drivers in 13 capability areas • Technology (architecture, public cloud, test automation) • Working practices (engineering practices, e.g. tech debt) • Organizational enablement (e.g. culture, talent management) • (A sketch of such an aggregation follows below.)
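The index boils many drivers down to scores; a rough sketch of what such an aggregation might look like (the areas, weights, and scores here are invented, not McKinsey’s actual model):

    # Hypothetical weighted aggregation of capability-area scores.
    capability_areas = {
        "architecture": {"weight": 0.4, "score": 3.0},  # scores on a 1-5 scale
        "test automation": {"weight": 0.3, "score": 4.0},
        "culture": {"weight": 0.3, "score": 2.5},
    }
    dvi = sum(a["weight"] * a["score"] for a in capability_areas.values())
    print(f"velocity index: {dvi:.2f} / 5")  # 0.4*3.0 + 0.3*4.0 + 0.3*2.5 = 3.15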
  27. Developer Velocity Index • A good foundation for an elaborate consulting project • Does it help? • Benchmarking? By industry? • Every project is different. • Can you arrive at results more pragmatically and quickly? • E.g. interviews
  28. Talent Capability Score • Individual skills • Diamond shape: on a scale of increasing skill, the majority sits in the middle. • Example: too many inexperienced individuals → training • Why not aim for the best?
  29. McKinsey Example I • Developers spend too much time on design and managing dependencies • Clarify roles • Result: more code produced • Pro: managing dependencies is annoying • Con: design can be useful • Might be a good idea!
  30. McKinsey Example II • New employees don’t achieve as much • So: better onboarding and mentoring • IMHO a good idea • High potential for poor metrics: mentors perform poorly with regard to the Developer Velocity Index
  31. McKinsey: Recommended Approach • Learn the basics for communication with the C-level • Assess your systems (e.g. to measure test coverage) • Build a plan with a concrete goal • Remember that measuring productivity is contextual: it’s about getting better.
  32. McKinsey: Conclusion • SPACE should be customized • The new metrics are questionable • In my experience, you can find the main challenges more quickly, e.g. with interviews. • However, the examples and general recommendations make sense. • It doesn’t seem to aim at identifying people to fire.
  33. Criticism • The paper has sparked quite some criticism. • The next slides show some highlights. • Not a comprehensive discussion!
  34. Dan North’s Criticism: Highlights • Contribution Analysis measures the wrong thing. • Does the outer loop really have low value? • Talent capability: depends on the organization.
  35. Dan North’s Recommendation • Theory of Constraints: identify the bottleneck, utilize it fully • Lead time or flow • I.e. Lean / DORA • If you hire the best, productivity is a problem of the organization, not the individual. • Coaching & peer feedback
  36. From “spot customer pain point” to “ship a solution”: design docs → code → feature in prod → customers behave differently → value generated.
  37. Kent Beck & Gergely Orosz: Highlights • “Absurdly naïve” • Ignores software development teams • It’s about individual performance • The CEO/CFO will override the CTO to implement the McKinsey framework • Unethical CEOs and CTOs are the target audience • Then it destroys the organization
  38. Kent Beck & Gergely Orosz: Highlights • The criticism doesn’t match what the paper says. • The paper has completely different examples and recommendations. • The criticism might be caused by the scandals around McKinsey. • Prejudice?
  39. Kent Beck & Gergely Orosz: Advice • Understand why you’re measuring and recognize power relationships. • Promote self-measurement: teams should analyze their own data. • Trust your judgement: rely on explanations that resonate and take responsibility for decisions. • Productivity metrics are misleading. • Focus on real accountability: prioritize consistent delivery of customer-valued outcomes. • IMHO a great idea!
  40. Conclusion • Beware of Goodhart’s Law! • Use metrics to support teams! • Therefore: create your own custom metrics for the problem at hand. • SPACE is a great starting point.
  41. Send an email to [email protected] → Slides + Service Mesh Primer (EN) + Microservices Primer (DE / EN) + Microservices Recipes (DE / EN) + sample of the Microservices Book (DE / EN) + sample of Practical Microservices (DE / EN) + sample of the Continuous Delivery Book (DE). Powered by Amazon Lambda & microservices. Email address logged for 14 days; wrongly addressed emails handled manually.