Getting to Know Your Legacy (System) with AI-Driven Software Archeology (WeAreDevelopers World Congress 2025)

Maintaining old software systems can be challenging. The code is often hard to read, documentation may be missing, and the original authors might no longer be available. However, with the data we already have at our fingertips, we can quickly dig deeper into existing codebases and track how they have evolved over time.

By using artificial intelligence, especially large language models, alongside modern data analysis techniques and freely available open-source tools, you too can gain deeper insights into a system’s architecture. Uncover hidden patterns that help you better understand the legacy entrusted to you and its evolution.

Join me to see how data-driven analysis helps you work more efficiently, reduce uncertainty, and confidently take the first step toward moving your legacy system into the future.

Markus Harrer

July 10, 2025

Transcript

  1. Getting to Know Your Legacy (System) with AI-Driven Software Archeology

    WeAreDevelopers World Congress 2025, July 10, 2025
    Markus Harrer, Software Evolutionist
  2. What happened in between?

    Timeline from Jan '25 (initial idea) to Jul '25 (conference): Devstral, ChatGPT Deep Research, GPT-4.5, o3 / o4-mini, AlphaEvolve, Veo 3, Claude 4, Context Engineering, Apple not delivering AI (only a few things stayed the same …)
  3. All the cool kids are playing with AI

    B/w drawing by Daniel Storii {turnoff.us}. Thanks to Michael Tharrington.
  4. Legacy systems are different

    "We have to deal with the real world here, AI!"
    Drawing by Daniel Storii {turnoff.us}. Thanks to Michael Tharrington.
  5. Archeologist: Detectives of the past

    “Archaeologists try to find the clues left by people who lived before us, and they try to make sense of them.”
    Source: “Archaeology at work”, English Heritage Education Service, https://www.youtube.com/watch?v=TFejIkYDH9Q
  6. Modern archeology techniques

    Techniques: Excavation, Typology, Chaîne Opératoire
    Guiding questions: What is there? How was it made? What is it?
    Scale shown on the slide: amount of AI usage, from little to high
  7. I had my fun (and LLMs know that!)

    My Software Analytics repositories on GitHub: pandas, matplotlib, Plotly, Jupyter Notebook
  8. Typology creation with LLMs: Using existing naming schemas

    .
    ├── model
    │   ├── Owner.java
    │   ├── Person.java
    │   ├── Pet.java
    │   ├── Specialty.java
    │   ├── Vet.java
    │   └── Visit.java
    ├── repository
    │   ├── OwnerRepository.java
    │   ├── PetRepository.java
    │   ├── VetRepository.java
    │   └── VisitRepository.java
    └── web
        ├── OwnerController.java
        ├── PetController.java
        ├── PetValidator.java
        ├── VetController.java
        └── VisitController.java

    Repository Pattern (repository/*Repository.java): Abstracts data access by encapsulating the logic needed to retrieve, store, and query data.
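The naming-schema idea on this slide can be tried out in a few lines of Python. This is only a minimal sketch: it assumes the file paths from the tree are available as relative POSIX paths, and `files_for_concept` is a hypothetical helper name, not part of any tool shown in the talk.

```python
from pathlib import PurePosixPath

# File paths taken from the directory tree on the slide (relative to the package root).
FILES = [
    "model/Owner.java", "model/Person.java", "model/Pet.java",
    "model/Specialty.java", "model/Vet.java", "model/Visit.java",
    "repository/OwnerRepository.java", "repository/PetRepository.java",
    "repository/VetRepository.java", "repository/VisitRepository.java",
    "web/OwnerController.java", "web/PetController.java", "web/PetValidator.java",
    "web/VetController.java", "web/VisitController.java",
]


def files_for_concept(files, glob):
    """Return all files whose path matches the concept's glob pattern."""
    return [f for f in files if PurePosixPath(f).match(glob)]


# The "Repository Pattern" concept is identified purely by its naming convention.
repository_files = files_for_concept(FILES, "repository/*Repository.java")
print(repository_files)
```

Because the concept is defined by a glob, the same helper would work for other naming conventions in the tree, for example `web/*Controller.java`.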
  9. A typology prompt for Claude Code

    Analyze the production Java code in this codebase and extract distinct concepts.

    Categorize them into two groups:
    - technical_concepts: architectural patterns, design tactics, or technical structures
    - business_concepts: domain-relevant ideas, rules, or terms that represent key business logic

    For each concept, provide:
    - name: a short, descriptive name
    - explanation: a concise description of what the concept is
    - rationale: why this concept likely exists in the codebase (technical or domain motivation)
    - file_globs: glob-style patterns of the used naming conventions to identify where this concept appears in the codebase (e.g., **/**Service.java, **/invoicing/**.java)

    Output the result as a well-structured YAML file with two top-level sections: technical_concepts and business_concepts.

    Focus only on Java production code (exclude test files, scripts, and configuration files).
  10. The expected result from an LLM

    technical_concepts:
      - name: "Boundary"
        description: "Defines the interfaces for communication between the core business logic (Interactors) and the outer layers (e.g., UI, web services). It includes Request and Response Models."
        rationale: "Separates the core application from the delivery mechanism, allowing the presentation layer to change independently of the business rules."
        file_globs:
          - "**/boundary/*.java"
      ...
    business_concepts:
      - name: "Site"
        description: "A 'Site' represents a distinct container or context for content like comments, files, and schedules. Most other business concepts are scoped within a specific site."
        rationale: "The concept of a 'Site' allows for multi-tenancy or partitioning of data, where different users or groups can have their own isolated space within the application."
        file_globs:
          - "**/site/**/*.java"

    Note on the slide: glob file patterns + some manual editing …
  11. Typology evaluation

    See how many files can be associated with a concept. If you know one file that implements a concept, you know all the other files that implement the same concept!

    Chart: Distribution for a small software system (~300 source code files)
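Counting the files per concept could look roughly like the sketch below. The concept names and globs come from the previous slide; the file list is made up for illustration, and `glob_to_regex` is a hypothetical, simplified translator (it only handles `**`, `*` and `?`), not the matching logic used in the talk's demos.

```python
import re
from collections import Counter


def glob_to_regex(glob):
    """Translate a glob into a regex: '**/' may span directories, '*' stays in one segment."""
    parts = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            parts.append("(?:.*/)?")   # zero or more leading directories
            i += 3
        elif glob.startswith("**", i):
            parts.append(".*")         # anything, including '/'
            i += 2
        elif glob[i] == "*":
            parts.append("[^/]*")      # anything within one path segment
            i += 1
        elif glob[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


# Concepts as they might look after loading the YAML result (names and globs from the slide).
CONCEPTS = {
    "Boundary": ["**/boundary/*.java"],
    "Site": ["**/site/**/*.java"],
}

# Hypothetical file list, for illustration only.
FILES = [
    "src/main/java/app/boundary/CommentRequestModel.java",
    "src/main/java/app/boundary/CommentResponseModel.java",
    "src/main/java/app/site/comment/Comment.java",
    "src/main/java/app/site/Site.java",
    "src/main/java/app/util/Dates.java",
]


def count_files_per_concept(concepts, files):
    """Count how many files match at least one glob of each concept."""
    counts = Counter()
    for name, globs in concepts.items():
        regexes = [glob_to_regex(g) for g in globs]
        counts[name] = sum(1 for f in files if any(r.match(f) for r in regexes))
    return counts


counts = count_files_per_concept(CONCEPTS, FILES)
for name, n in counts.most_common():
    print(f"{name}: {n} file(s)")
```

A concept that matches very few or suspiciously many files is a hint that its glob needs the manual editing mentioned on the previous slide.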
  12. Typology coverage analysis

    See where you need to dig deeper into the code base to get more familiar with it
  13. Typology

    Color key:
    ▪ Both concept types
    ▪ Technical concept type only
    ▪ Business concept type only
    ▪ No member of any concept type
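The four categories of the color key can be derived from two sets of files. The following is a small sketch under the assumption that the files matched by any technical concept and by any business concept have already been collected; all file names are hypothetical.

```python
from collections import Counter


def coverage_category(path, technical_files, business_files):
    """Map a file to one of the four color-key categories."""
    in_technical = path in technical_files
    in_business = path in business_files
    if in_technical and in_business:
        return "Both concept types"
    if in_technical:
        return "Technical concept type only"
    if in_business:
        return "Business concept type only"
    return "No member of any concept type"


# Hypothetical inputs: all source files plus the files matched per concept type.
ALL_FILES = ["site/SiteBoundary.java", "boundary/Request.java", "site/Site.java", "util/Dates.java"]
TECHNICAL = {"site/SiteBoundary.java", "boundary/Request.java"}
BUSINESS = {"site/SiteBoundary.java", "site/Site.java"}

categories = {f: coverage_category(f, TECHNICAL, BUSINESS) for f in ALL_FILES}
summary = Counter(categories.values())
for category, n in summary.items():
    print(f"{category}: {n}")
```

Files that end up in "No member of any concept type" are the places where it is worth digging deeper into the code base.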
  14. Advanced Typology: Evaluating the conceptual integrity

    Source: L. Adams Gilmour, Early Medieval Pottery from Flaxengate Lincoln.
    Image: https://pixabay.com/de/photos/arch%C3%A4ologie-arch%C3%A4ologische-ausgrabung-59150/
  15. Advanced Typology

    Detect files within concepts that don't do what others do.
    Example: Concept “Service”: Serve cold beers
    Files shown: CService, BService, AService, DService
  16. Advanced Typology: Evaluating the conceptual integrity using LLMs

    Prompt excerpt: [...] Please analyze the following source code and assess how well it implements the specified concept. [...]
    Demo: https://github.com/feststelltaste/software-analytics/blob/master/demos/20250710_WADWC/Conceptual%20Integrity%20Analysis.ipynb
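Once an LLM has assessed each file of a concept, the outliers still have to be picked out. The sketch below assumes the assessment was reduced to a score between 0 and 1 per file; the score format, the threshold, and the values are all made up for illustration and are not taken from the linked demo.

```python
# Made-up integrity scores for the files of the "Service" concept (1.0 = implements it fully).
SCORES = {
    "AService.java": 0.90,
    "BService.java": 0.85,
    "CService.java": 0.90,
    "DService.java": 0.20,
}

THRESHOLD = 0.5  # assumption: files scoring below this deserve a closer look


def integrity_outliers(scores, threshold):
    """Return the files whose score falls below the threshold, lowest score first."""
    flagged = [(score, name) for name, score in scores.items() if score < threshold]
    return [name for score, name in sorted(flagged)]


outliers = integrity_outliers(SCORES, THRESHOLD)
print(outliers)
```

A fixed threshold is the simplest choice; comparing each file against the median score of its concept would be an alternative when the scores of different concepts are not comparable.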
  17. Chaîne opératoire

    The operating chain of all the steps of the lifecycle of an artifact:
    1. Creation
    2. Usage
    3. Maintenance
    4. Repair
    5. Deposition

    Image: https://fr.wikipedia.org/wiki/Cha%C3%AEne_op%C3%A9ratoire#/media/Fichier:Cha%C3%AEne_op%C3%A9ratoire.png
  18. Chaîne Opératoire

    The operating chain of all of the steps of the lifecycle of an artifact:
    - create public class Customer
    - add testNameCheck()
    - change to BusinessPartner
    - fix tech debt
    - add calculateBonus()
    - refactor testNameCheck()
    - delete BusinessPartner

    Code shown on the slide (cut off):
    public class BusinessPartner {
        private String name;
        private double bonus;

        public BusinessPartner(String name) {
            this.name = name;
        }

        public double calculateBonus() { ...
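The change messages on this slide can be mapped onto the lifecycle steps of the chaîne opératoire. This is a rough sketch: the status letters follow the convention of `git log --name-status` (A added, M modified, R renamed, D deleted), but which letter belongs to which change is my assumption, and the rule "message starts with fix means Repair" is a deliberately naive heuristic. The "Usage" step is not visible in a file's change history and is therefore left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    status: str   # git name-status letter: A (added), M (modified), R (renamed), D (deleted)
    message: str  # commit message


def lifecycle_phase(change):
    """Assign a change to a lifecycle step (naive heuristic, see assumptions above)."""
    if change.status == "A":
        return "Creation"
    if change.status == "D":
        return "Deposition"
    if change.message.lower().startswith("fix"):
        return "Repair"
    return "Maintenance"


# Messages taken from the slide; the status letters are assumed for illustration.
HISTORY = [
    Change("A", "create public class Customer"),
    Change("M", "add testNameCheck()"),
    Change("R", "change to BusinessPartner"),
    Change("M", "fix tech debt"),
    Change("M", "add calculateBonus()"),
    Change("M", "refactor testNameCheck()"),
    Change("D", "delete BusinessPartner"),
]

phases = [lifecycle_phase(change) for change in HISTORY]
for change, phase in zip(HISTORY, phases):
    print(f"{phase:<12} {change.message}")
```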
  19. But before we move on to the last demo

    Do we even need human-written glue code for ArchAIology??
  20. 41

  21. 43

  22. ArchAIology

    It works, if you know how to dig deeper into your legacy (system)!
  23. More on the topic

    My collection about the topic: https://github.com/feststelltaste/awesome-software-analytics
  24. AI for Legacy Modernization: When and How to Use (or not)

    Light Version 1.1a, Markus Harrer

    Approaches: Manual Work, Transformation Tools, Guided AI, AI assistants, AI agents

    General Idea
    - Manual Work: Developers manually analyze, reason about, and fix issues (based on deep domain and system knowledge)
    - Transformation Tools: Human-based creation of formal rules and recipes to perform consistent, automated code transformations
    - Guided AI: Human-led detection of issues or anti-patterns, followed by localized AI-generated fixes within defined areas
    - AI assistants: Human-guided AI-based task execution for fixing code in smaller areas / clearly scoped contexts
    - AI agents: Autonomous systems orchestrate analysis, transformation and validation of modernization workflows

    Typical Use Cases
    - Manual Work: Special issues like redesign of critical parts of business logic or performance optimization
    - Transformation Tools: Framework migrations, API upgrades, bulk renames, restructurings
    - Guided AI: Identifying systemic issues and using AI to propose or apply localized solutions
    - AI assistants: Summarizing code, generating tests & comments, renaming identifiers, writing code snippets
    - AI agents: Cleanup ideation, automated, multistep refactoring, bug fixing across multiple code bases

    Ratings
    Criterion      Manual Work   Transformation Tools   Guided AI   AI assistants   AI agents
    Control        ++            +                      o           -               --
    Risk           --            -                      -           +               ++
    Breadth        --            -                      -           o               ++
    Accuracy       ++            ++                     +           o               o
    Traceability   o             ++                     ++          o               -
    Efforts        ~             -                      o           o               o
    Volume         --            ++                     o           -               +

    Criteria
    - Control: How much humans can be in the loop
    - Risk: How likely changes go wrong
    - Breadth: How wide the method can operate
    - Accuracy: How well problematic spots are addressed
    - Traceability: How well actions can be tracked
    - Efforts: How much work setup and use need
    - Volume: How much can be processed
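The ratings of the light version can also be queried programmatically, for example to shortlist approaches for a given constraint. The sketch below copies the symbols from the slide; the numeric encoding of `++` to `--` and the helper name `approaches_with_at_least` are my assumptions. Note that for Risk and Efforts a higher rating means more risk or effort, so "at least" is not automatically "better".

```python
# Assumed numeric encoding of the rating symbols; "~" (variable) has no number.
SCALE = {"++": 2, "+": 1, "o": 0, "-": -1, "--": -2}

APPROACHES = ["Manual Work", "Transformation Tools", "Guided AI", "AI assistants", "AI agents"]

# Ratings per criterion, in the order of APPROACHES (symbols as on the slide).
RATINGS = {
    "Control":      ["++", "+", "o", "-", "--"],
    "Risk":         ["--", "-", "-", "+", "++"],
    "Breadth":      ["--", "-", "-", "o", "++"],
    "Accuracy":     ["++", "++", "+", "o", "o"],
    "Traceability": ["o", "++", "++", "o", "-"],
    "Efforts":      ["~", "-", "o", "o", "o"],
    "Volume":       ["--", "++", "o", "-", "+"],
}


def approaches_with_at_least(criterion, minimum):
    """List approaches rated at least `minimum` for a criterion; '~' entries are skipped."""
    threshold = SCALE[minimum]
    return [
        approach
        for approach, rating in zip(APPROACHES, RATINGS[criterion])
        if rating in SCALE and SCALE[rating] >= threshold
    ]


print(approaches_with_at_least("Traceability", "+"))
print(approaches_with_at_least("Volume", "+"))
```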
  25. AI for Legacy Modernization: When and How to Use (or not)

    Full Version 1.1a, Markus Harrer

    Approaches: Manual Work, Transformation Tools [1], Guided AI, AI assistants, AI agents

    General Idea
    - Manual Work: Developers manually analyze, reason about, and fix issues (based on deep domain and system knowledge)
    - Transformation Tools: Human-based creation of formal rules and recipes to perform consistent, automated code transformations
    - Guided AI: Human-led detection of issues or anti-patterns, followed by localized AI-generated fixes within defined areas
    - AI assistants: Human-guided AI-based task execution for fixing code in smaller areas / clearly scoped contexts
    - AI agents: Autonomous systems orchestrate analysis, transformation and validation of modernization workflows

    Typical Use Cases
    - Manual Work: Special issues like redesign of critical parts of business logic or performance optimization
    - Transformation Tools: Framework migrations, API upgrades, bulk renames, restructurings
    - Guided AI: Identifying systemic issues and using AI to propose or apply localized solutions
    - AI assistants: Summarizing code, generating tests & comments, renaming identifiers, writing code snippets
    - AI agents: Cleanup ideation, automated, multistep refactoring, bug fixing across multiple code bases

    Control (How much humans can be in the loop)
    - Manual Work: Very High (humans drive everything)
    - Transformation Tools: High (humans define transformation logic, execution is automatic)
    - Guided AI: Medium (humans guide focus, agents generate and apply solutions)
    - AI assistants: Low (humans initiate, roughly guide and review AI’s results)
    - AI agents: Very Low (agents make decisions and act with minimal intervention)

    Risk (How likely changes go wrong)
    - Manual Work: Low to Medium (may suffer from outdated assumptions, overconfidence or unclear goals)
    - Transformation Tools: Medium (when creating recipes) to none (during execution, but also depends on recipe quality)
    - Guided AI: Low (with good problem localization that allows suggestions in limited contexts)
    - AI assistants: High to medium (depends on scope and tasks)
    - AI agents: Very high to high (esp. with broad tasks and high autonomy + wrong tool use)

    Breadth (How wide the method can operate)
    - Manual Work: Very narrow (limited by developers’ cognitive capacities)
    - Transformation Tools: Narrow (limits defined by AST, LST or recipe capabilities)
    - Guided AI: Narrow (scoped to recognizable patterns or metrics)
    - AI assistants: Limited (current file, code block or interaction context)
    - AI agents: Very broad (across files, services and task types)

    Accuracy (How well problematic spots are addressed)
    - Manual Work: Human-level quality (varies by experience)
    - Transformation Tools: High (precise and deterministic)
    - Guided AI: High (during analysis), medium (during fixing)
    - AI assistants: Medium (but error-prone outside narrow, familiar contexts / training)
    - AI agents: Medium (depends on prompt quality, feedback loops, available tools)

    Traceability (How well actions can be tracked)
    - Manual Work: High (with peer review and diffs)
    - Transformation Tools: Very high (rules, recipes, diffs)
    - Guided AI: High (analysis steps, reports [3], diffs)
    - AI assistants: Medium (prompt history, diffs)
    - AI agents: Medium (prompts, execution paths, diffs)

    Efforts (How much work setup and use require)
    - Manual Work: Variable (depends on task difficulty)
    - Transformation Tools: Low to medium [4] (depends on reusing existing recipes or creating new ones)
    - Guided AI: Medium (because data analysis needed)
    - AI assistants: Medium (prompt writing, instruction definition, model tuning)
    - AI agents: First low (“it’s just prompts”), later high (MCPs, orchestration, validation, security, …)

    Volume (How much can be processed)
    - Manual Work: Limited due to the need for deep contextual understanding
    - Transformation Tools: High-volume, homogeneous code bases
    - Guided AI: Mid-sized codebases (with structural issues)
    - AI assistants: Localized impact (limited by context window)
    - AI agents: Large, heterogeneous systems (with recurring issues)

    Footnotes
    [1] e.g. Codemods, OpenRewrite, Rector
    [2] e.g. using jQAssistant, Semgrep, CodeScene
    [3] e.g. using Jupyter Notebooks
    [4] for new recipes, AI might be used

    Abbreviations
    MCP: Model Context Protocol
    AST: Abstract Syntax Tree
    LST: Lossless Semantic Tree