

Renovate or Rebuild? Architectural techniques for the million-euro trade off (35 minutes)

Most teams facing the renovate-versus-rebuild decision end up with a management-driven process that relies on intuition, outdated assumptions, and optimism rather than facts and clear rationale — yet these decisions can make or break budgets and careers.

This talk introduces an architecture-driven approach for moving beyond gut feel to analyse technical debt, operational quality, opportunity costs, and risk in a structured, data-driven way.

You'll leave with a practical approach to generating defensible options with clear tradeoffs, and the stakeholder communication skills to present them credibly to decision makers — whether you're wrestling with a creaking monolith or drowning in distributed technical debt.

(This is the 35 minute version of the longer talk of the same title)


Eoin Woods

March 11, 2026


Transcript

  1. Your system is a little like a “Heath Robinson” machine

    It is working pretty well. You can change it … cautiously … if you understand all of the moving parts. But perhaps it’s time to think about its future … One morning a senior manager says “let’s rewrite the whole thing!”
  2. DANGER! DANGER!

    A “clean sheet” rewrite can come with many risks …
    • Stalled rebuilds
    • Never ending migrations
    • Spiralling costs
    But, continuing to patch up a complex aging platform is risky too. So what to do?
  3. Eoin Woods

    • Independent consultant (software architecture, CTO)
    • 10 years as CTO in delivery consultancy - Endava
    • 10 years in capital markets - UBS and BGI
    • 10+ years in products - Bull, Sybase, InterTrust
    [Image: thesis title page. “Addressing Energy Efficiency in System Design: A Journey from Architecture to Operation”, Eoin Woods. A thesis submitted in partial fulfilment of the requirements of the University of East London for the degree of Doctor of Philosophy, December 2018]
  4. Of course, the real decision isn’t “Renovate or Rebuild?” It

    is choosing from a spectrum of options …
  5. THE REAL DECISION

    A continuum of options:
    • Tactical Improvement (localised improvements when necessary, no overall plan)
    • Strategic Improvement (localised improvements regularly, clear priorities and backlog of work)
    • Incremental Rebuild (intentional planned progress towards future state, build-in-place, item-by-item)
    • Parallel Rebuild (green field replacement built in parallel to ongoing work on the system)
  6. The question is “What would success look like?” That is,

    what needs to be different to where we are today? This, of course, is stakeholder needs analysis ... and as architects we are good at that!
  7. AN ARCHITECTURE-DRIVEN APPROACH Step 1: Where Are You? Step 2:

    Potential Benefits Step 3: Remediation Options Step 4: Risks, Constraints and Tradeoffs Step 5: Stakeholder-Led Decision
  8. STEP 1: WHERE ARE YOU?

    Goal: understand decision context
    Diagram labels: Context, Critical Concerns, Crucial Quality Attributes, Code
    • Concerns: complaints, perceptions, incidents
    • Context: organisation, system, team
    • Code: structure, complexity, evolution
    • Qualities: priorities, measurements
  9. STEP 2: FIND THE POTENTIAL BENEFITS

    Goal: explore how you can justify large scale investment and change
    Keep it simple!
    • What will you be able to do tomorrow that you can’t do today?
    • Why does that matter?
    • Who cares about this?
    • How much do they care? Can they quantify the benefit to them?
    This isn’t a company strategy exercise (perhaps you need one of those too?)
  10. STEP 2: FIND THE POTENTIAL BENEFITS

    Impact map (columns: Why? Who? How? What?)
    • Why? Improve Reputation with Clients
    • Who? Dev Team; Sales
    • How? Less than one “Late Reports” incident / month; Take better sales ideas to clients
    • What? Analytics for client risk profiles & interest areas; “Resilient Delivery” feature & add report data cache
    Impact Mapping can help make benefits of technical change clear: https://www.impactmapping.org
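The Why / Who / How / What structure of slide 10's impact map can be sketched as a small tree. This is only an illustrative sketch: the class names are hypothetical, and the pairing of each actor with a particular impact and deliverable is an assumption, because the extracted slide text does not preserve the arrows of the original diagram.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Impact:
    how: str                                               # the "How?" column
    deliverables: list[str] = field(default_factory=list)  # the "What?" column


@dataclass
class Actor:
    who: str                                               # the "Who?" column
    impacts: list[Impact] = field(default_factory=list)


@dataclass
class ImpactMap:
    why: str                                               # the "Why?" column (the goal)
    actors: list[Actor] = field(default_factory=list)

    def paths(self) -> Iterator[tuple[str, str, str, str]]:
        """Yield every (why, who, how, what) chain in the map."""
        for actor in self.actors:
            for impact in actor.impacts:
                for what in impact.deliverables:
                    yield (self.why, actor.who, impact.how, what)


# ASSUMPTION: the actor -> impact -> deliverable links below are a plausible
# reading of the slide, not something the transcript states explicitly.
reporting_map = ImpactMap(
    why="Improve Reputation with Clients",
    actors=[
        Actor("Dev Team", [
            Impact('Less than one "Late Reports" incident / month',
                   ['"Resilient Delivery" feature & add report data cache']),
        ]),
        Actor("Sales", [
            Impact("Take better sales ideas to clients",
                   ["Analytics for client risk profiles & interest areas"]),
        ]),
    ],
)

for why, who, how, what in reporting_map.paths():
    print(f"{what}  ->  {how}  ->  {who}  ->  {why}")
```

Reading each chain from right to left gives the justification for a piece of technical work in stakeholder terms, which is the point the slide makes about impact mapping.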
  11. STEP 3: UNDERSTAND THE REMEDIATION OPTIONS

    Goal: understand the realistic options for realising those benefits
    • Tactical Improvement: backlog of independent incremental changes
    • Strategic Improvement: backlog delivering incremental benefit
    • Incremental Rebuild: future state and incremental roadmap
    • Parallel Rebuild: incrementally delivered new implementation
    Understand = Identify + Characterise
  12. STEP 3: UNDERSTAND THE REMEDIATION OPTIONS

    Characterise each option clearly (benefits, size, difficulty, risk)

    | Improvement Option | Benefits | Scale (S-XL) | Difficulty (1-10) | Risk (1-10) |
    |---|---|---|---|---|
    | Refactor request handling into a new library | New APIs 25% quicker, better testing & reliability (30% fewer defects in prod), better monitoring, enable message-based interfaces | L | 4 | 3 |
    | New dev, test, UAT pipelines | Reduced release effort (~40%), increased release reliability (20% fewer defects in prod), earlier defect identification (70/20/10%), consistency | M | 6 | 1 |
    | Replace Angular with React | Faster change in the UI, easier access to skills, compatibility with corporate tooling | XL | 7 | 7 |
    | Replace synchronous calls to CRM with message-based | Resilience when CRM is slow or unavailable, easier adaptation (new CRM in Q4), monitoring | L | 5 | 4 |
    | … | | | | |
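A minimal sketch of how the characterisation on slide 12 might be held as data so the options can be sorted and compared consistently. The scale, difficulty and risk values are taken from the slide; the class name, the validation rules and the "lowest risk first" ordering are assumptions made for illustration, not a method described in the talk.

```python
from dataclasses import dataclass

# Ordering for the slide's S-XL scale, so T-shirt sizes can be compared.
SCALE_ORDER = {"S": 0, "M": 1, "L": 2, "XL": 3}


@dataclass(frozen=True)
class ImprovementOption:
    name: str
    scale: str        # S, M, L or XL
    difficulty: int   # 1-10
    risk: int         # 1-10

    def __post_init__(self) -> None:
        # Reject values outside the ranges given in the slide's column headers.
        if self.scale not in SCALE_ORDER:
            raise ValueError(f"unknown scale: {self.scale!r}")
        for label, value in (("difficulty", self.difficulty), ("risk", self.risk)):
            if not 1 <= value <= 10:
                raise ValueError(f"{label} must be between 1 and 10, got {value}")


# Values copied from the table on slide 12.
OPTIONS = [
    ImprovementOption("Refactor request handling into a new library", "L", 4, 3),
    ImprovementOption("New dev, test, UAT pipelines", "M", 6, 1),
    ImprovementOption("Replace Angular with React", "XL", 7, 7),
    ImprovementOption("Replace synchronous calls to CRM with message-based", "L", 5, 4),
]


def rank_by_caution(options: list[ImprovementOption]) -> list[ImprovementOption]:
    """Order options by risk, then difficulty, then scale (all ascending).

    ASSUMPTION: this ordering is one possible way to read the table,
    chosen only to show that a consistent characterisation makes the
    options comparable.
    """
    return sorted(options, key=lambda o: (o.risk, o.difficulty, SCALE_ORDER[o.scale]))


for option in rank_by_caution(OPTIONS):
    print(f"{option.risk:>2} {option.difficulty:>2} {option.scale:<2} {option.name}")
```

The benefits column is deliberately left out of the ordering: it is free text on the slide, and turning it into a number would need the stakeholder input the talk describes in step 2.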
  13. STEP 4: RISKS, CONSTRAINTS AND TRADEOFFS

    Goal: identify what could go wrong, what is possible and decision implications
    Risks
    • Known unknowns
    • Areas of unknown unknowns
    • Integration points
    • Functions really understood?
    • Data complexity, quality, mastering
    • Is reliable change possible?
    • …
    Constraints
    • Money
    • Risk and change appetite
    • Realistic technology options
    • Team skills, morale
    • Political landscape
    • Externals: regulators, customers
    • …
  14. STEP 4: TRADE OFF MATRICES

    Goal: Address Client Reporting Problems

    Step 1: Define Decision Factors

    | Factor | Definition | Measurement |
    |---|---|---|
    | Migration Risk | What is the risk of migrating to each option | Likelihood of outage or rollback needed (H, M, L) |
    | Bus. Change Risk | What level of risk is involved in the business process change needed for an option | Likelihood of client visible business interruption (H, M, L) |
    | User Experience | What level of UX improvement or reduction is likely for an option | Improvement or reduction in UX (from -10 to +10) |
    | Ease of Feature Change | How much faster or slower is feature delivery likely to be for each option | Faster or slower in percentage terms |

    Step 2: Capture Factors for Each Option in a Tradeoff Matrix

    | Option | Migration Risk | Bus. Change Risk | User Experience | Feature Change |
    |---|---|---|---|---|
    | Add Reporting DB | L | L | +4 | -10% |
    | Upgrade DB and Schema Extension | M | L | +2 | 0% |
    | Build New Reporting Engine on existing DB | H | M | +8 | +50% |
    | Integrate with Corporate Reporting Services | H | H | -4 | -75% |
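The tradeoff matrix on slide 14 can be sketched as data plus a simple dominance check, which flags options that are no better than some alternative on every factor. This is an illustrative sketch, not the talk's method: the class and function names are hypothetical, and it assumes that a positive "Feature Change" percentage means faster delivery, which the slide does not state outright. No weights are introduced, so the check removes only clearly inferior options and leaves the real tradeoffs for stakeholders.

```python
from dataclasses import dataclass

# Ordinal encoding of the slide's H / M / L risk ratings (lower is better).
RISK_LEVEL = {"L": 0, "M": 1, "H": 2}


@dataclass(frozen=True)
class OptionRow:
    name: str
    migration_risk: str        # H, M or L
    business_change_risk: str  # H, M or L
    user_experience: int       # -10 to +10
    feature_change_pct: int    # ASSUMPTION: positive means faster delivery

    def scores(self) -> tuple[int, int, int, int]:
        """Return the factors re-expressed so that larger is always better."""
        return (
            -RISK_LEVEL[self.migration_risk],
            -RISK_LEVEL[self.business_change_risk],
            self.user_experience,
            self.feature_change_pct,
        )


# Values copied from the matrix on slide 14.
MATRIX = [
    OptionRow("Add Reporting DB", "L", "L", 4, -10),
    OptionRow("Upgrade DB and Schema Extension", "M", "L", 2, 0),
    OptionRow("Build New Reporting Engine on existing DB", "H", "M", 8, 50),
    OptionRow("Integrate with Corporate Reporting Services", "H", "H", -4, -75),
]


def dominates(a: OptionRow, b: OptionRow) -> bool:
    """True if a is at least as good as b on every factor and differs on one."""
    sa, sb = a.scores(), b.scores()
    return all(x >= y for x, y in zip(sa, sb)) and sa != sb


def non_dominated(rows: list[OptionRow]) -> list[OptionRow]:
    """Keep the options that no other option dominates."""
    return [r for r in rows if not any(dominates(other, r) for other in rows)]


for row in non_dominated(MATRIX):
    print(row.name)
```

Under these assumptions only "Integrate with Corporate Reporting Services" is dominated. The remaining three options each win on at least one factor, so choosing between them is a genuine tradeoff.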
  15. STEP 5: STAKEHOLDER-LED DECISION

    Goal: use our data and insights to help stakeholders make a good decision
    Diagram labels: Context, Concerns, Options & Tradeoffs, Risks and Constraints, ?
  16. STEP 5: STAKEHOLDER-LED DECISION

    • Have a clear strategy: Democratic? Consensus? Consultative? Choice usually driven by culture & people
    • Focused on the facts: context, concerns, options, risks and constraints
    • Explain tradeoffs: use architectural thinking and stakeholder language; capture them in a standard form for clarity
    • Avoid distortions: preconceptions, assumptions, bias, loudest voices, …
  17. AN ARCHITECTURE-DRIVEN APPROACH Step 1: Where Are You? Step 2:

    Potential Benefits Step 3: Remediation Options Step 4: Risks, Constraints & Tradeoffs Step 5: Stakeholder-Led Decision
  18. Disclaimer: this example is a “mélange” of the characteristics of

    a number of system modernisations that I have worked on over the years, so it is realistic but not a portrayal of any specific system or organization.
  19. A SYSTEM WITH SOME PROBLEMS

    Diagram components: <<db2_db>> System DB, <<cpp_svc>> Calculation Service, <<java_svc>> Trading Support Service, <<angular_spa>> Trading Support Workbench, <<pub/sub>> Message Bus, External Systems
    • Capital Markets trading support platform
    • Highly differentiating (largely unique)
    • Delivery velocity has slowed dramatically
    • Formerly reliable, now failing regularly
    • Complex 600k LOC, multi-technology, db-centric
    • Highly connected to other parts of the bank
    The desire is to rewrite … but should they?
  20. STEP 1: WHERE ARE WE? … CONTEXT

    Diagram components: <<db2_db>> System DB, <<cpp_svc>> Calculation Service, <<java_svc>> Trading Support Service, <<angular_spa>> Trading Support Workbench, <<pub/sub>> Message Bus
    • Highly interconnected part of the organisation’s technology
    • Used by a specialised and influential user community
  21. STEP 1: WHERE ARE WE? … CONCERNS

    Diagram components: <<db2_db>> System DB, <<cpp_svc>> Calculation Service, <<java_svc>> Trading Support Service, <<angular_spa>> Trading Support Workbench, <<pub/sub>> Message Bus
    • Love the specialised UI … but want reliability and new features
    • Many DB related incidents
    • Defect and reliability incidents
    • Surveys indicate concerns are delivery speed, performance, reliability
  22. STEP 1: WHERE ARE WE? … QUALITY ATTRIBUTES & CODE

    • Survey & Metrics => Performance, resilience, reliability
    • Operational analysis => regular failures at peak demand
    • Code analysis => complexity, but good modularity
    • Code analysis => security vulnerabilities
    • Code analysis => complex database structure & queries
    • Dev survey => little test automation, fragile tests
    • Dev survey => limited deployment automation
  23. STEP 2: THE BENEFITS

    • “The Business” – every outage costs money and every new feature makes money
    • TechOps – better reliability => easier management => reduce cost (and stress!)
    • Dev Team – reliable testing, easier code changes, easier deployment => faster feature delivery
    It appears that there are clear financial benefits … what are our options?
  24. STEP 3: REMEDIATION OPTIONS

    Strategic Improvements (partitioning and splitting databases, partitioning the Java monolith into three, introducing a mobile app for experimental new features, automation everywhere).
    Diagram components: <<oracle_db>> Trading DB, <<cpp_svc>> Calculation Service, <<java_svc>> Trading Support Service, <<angular_spa>> Trading Support Workbench, Message Bus, <<java_svc>> UI Services, <<java_svc>> Risk Service, <<oracle_db>> Risk DB, <<mobile_app>> Trading Support Assistant
    Parallel Rebuild (green field, service-based rewrite using more modern technology and decomposition to services with new UI and mobile app).
    Diagram components: <<oracle_db>> Trading DB, <<cpp_svc>> CalcNG, <<react_app>> Trading Support Workbench, <<java_svc>> UI Services, <<mobile_app>> Trading Support Assistant, <<java_svc>> Trading Service, <<java_svc>> Mobile Services, <<oracle_db>> Risk DB, <<java_svc>> Risk Service, <<oracle_db>> Structure DB, <<java_svc>> Structuring Service, <<oracle_db>> Account DB, <<java_svc>> Account Service, API Gateway
  25. STEP 4: RISKS AND CONSTRAINTS

    Risks
    • Correctness – rewriting 600k LOC of code without logic errors
    • Complexity – making major changes is error prone
    • Migration – migrating existing business is risky
    • Reliability – major changes could cause more outages
    • …
  26. STEP 4: RISKS AND CONSTRAINTS

    Constraints
    • Team – the team has little automation experience and skills are split across Java / JS / C++
    • Integration – bank clients do not want changes to integration points
    • User Experience – existing users know and like the user interface
    • Technology – current technology is understood and supported
    • …
  27. STEP 5: STAKEHOLDER-LED DECISION

    Diagram labels: Current State, Estimated benefits, Options & Tradeoffs, Risks and Constraints, Business, Ops, Development, ?
    Context Specific Tradeoffs
    • Refactor vs migration risks
    • Refactor risk vs new defects
    • Disruption vs benefits
    • Business change risk vs new opportunities
    • Team risks with new tech
    • …
  28. THE DECISION

    The team decided to perform a strategic improvement programme. Over 12-18 months the system was systematically improved, resolving many of the perceived problems and salvaging the reputation of the system and the team. (The risks of migration, external integration changes and rewriting a lot of complex logic reliably in reasonable time were the main factors in the decision.)
  29. CONCLUSIONS: KEY PRINCIPLES

    • Aim for an architecture-driven but stakeholder-led decision
    • Data collection and analysis are key investments for good decisions
    • Stakeholder communication is key
    • Consider all the options, understand their tradeoffs
    • Find the benefits early
  30. AN ARCHITECTURE-DRIVEN DECISION Step 1: Where Are You? Step 2:

    Potential Benefits Step 3: Remediation Options Step 4: Risks, Constraints and Tradeoffs Step 5: Stakeholder-Led Decision
  31. CONCLUSIONS: ARCHITECTURE SKILLS ARE CRUCIAL

    • Data & Analysis
    • Understanding Tradeoffs
    • Stakeholder Communication
    • Quality Attributes
    • Identifying Options
    Systematic Stakeholder-Led Decision Making