Source Code Diff Revolution

Encyclopedia 3000: The picture shows humans, known as “software engineers”
using source code diff tools to perform a “code review” in the year 2025.

Why is source code diff so important in software engineering?

When do we need diff?

3-way merge
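
As a rough illustration of why diff sits underneath merging, the core decision rule of a 3-way merge can be sketched per region: compare each side against the common base, keep whichever side changed, and report a conflict when both sides changed differently. The function name `merge_region` and the list-of-lines representation are illustrative assumptions, not the API of any merge tool.

```python
from typing import List, Optional


def merge_region(base: List[str], ours: List[str], theirs: List[str]) -> Optional[List[str]]:
    """Return the merged region, or None when both sides changed it differently."""
    if ours == theirs:   # both sides agree (or neither changed)
        return ours
    if ours == base:     # only "theirs" changed this region
        return theirs
    if theirs == base:   # only "ours" changed this region
        return ours
    return None          # conflicting edits: needs manual resolution


base = ["int x = 1;"]
print(merge_region(base, ["int x = 2;"], base))            # our change wins
print(merge_region(base, ["int x = 2;"], ["int x = 3;"]))  # conflict
```

Finding which regions of the two sides correspond to which regions of the base is exactly the job of the diff, so a poor diff leads directly to spurious conflicts.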

How frequently do we use diff tools?

41 minutes of code reviewing per day
Global Code Time Report, 2022 (250K+ developers)
https://www.software.com/reports/code-time-report

M. Codoban, S. S. Ragavan, D. Dig, and B. Bailey, “Software history under the lens: A study on why and how developers examine it,” ICSME 2015.
Survey with 217 developers:
• 85% of them consider software history important to their development activities
• 61% need to refer to history at least several times a day

Eugene Myers
• Language independent
• Super-fast and scalable
• Line-level granularity
• Does not handle moves well
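
The last point can be seen with any line-level differ. The sketch below uses Python's standard `difflib.SequenceMatcher`, which is not an implementation of the Myers algorithm but shares the relevant property: it works on whole lines and has no notion of a move. The three statements are made-up sample input.

```python
import difflib

# The first statement is moved to the end; nothing else changes.
old = ["a();", "b();", "c();"]
new = ["b();", "c();", "a();"]

matcher = difflib.SequenceMatcher(None, old, new)
ops = [tag for tag, *_ in matcher.get_opcodes()]

# The move is reported as an unrelated deletion plus an insertion.
print(ops)
```

A reviewer reading this output has to work out on their own that the deleted line and the inserted line are the same statement.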

Implications of low-quality diff
✘ Makes blame fail
  ✘ git log for block statements has 73% precision and 83% recall [Hasan et al., TSE 2024]
  ✘ The SZZ algorithm, which finds the change that introduced a bug, has over 9 variations fixing issues related to blame
✘ Makes code reviewing slower
  ✘ “Understanding the code takes most of the reviewing time” [Bacchelli and Bird, ICSE 2013]
  ✘ “Understanding the code’s purpose, the motivations for the change, and how the change was implemented” [MacLeod et al., IEEE Software 2017]
✘ Makes merge conflict resolution a nightmare
  ✘ Structure-aware and refactoring-aware merging tools [IntelliMerge, OOPSLA 2019] [RefMerge, TSE 2022] [JDime, ASE 2017]

Myers algorithm 1986

Abstract Syntax Tree diff
• Fine-grained diff between AST nodes (not just lines)
• Supports moves and updates (not just additions and deletions)
• Still has limitations (coming up soon…)

2007 Change Distiller (Fluri et al.)
2014 GumTree (Falleri et al.)
2016 MTDiff (Dotzler et al.)
2018 IJM (Frick et al.)
2023 iASTMapper (Zhang et al.)
2024 RMiner 3 (Alikhanifard et al.)

Slide labels: Language aware, Partial matching; Language independent, Largest identical subtrees; Language independent, Move action optimizations; Tree Matching; Statement Mapping

AST diff is NOT perfect

Limitation #1: No support for multi-mappings
“A given node can only belong to one mapping”

GumTree Simple

GumTree Greedy

RefactoringMiner calls to extracted method
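
A minimal sketch of what the one-mapping-per-node restriction costs, assuming a hypothetical case where two duplicated statements are extracted into a single method. The node identifiers and both mapping stores are invented for illustration and do not reflect the data structures of GumTree or RefactoringMiner.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical node ids: two duplicated statements before the change,
# one statement inside the extracted method after the change.
candidates: List[Tuple[str, str]] = [
    ("before:stmt@method1", "after:stmt@extracted"),
    ("before:stmt@method2", "after:stmt@extracted"),
]

# One-to-one store: a node may belong to a single mapping only.
one_to_one: Dict[str, str] = {}
used_targets: Set[str] = set()
for src, dst in candidates:
    if src not in one_to_one and dst not in used_targets:
        one_to_one[src] = dst
        used_targets.add(dst)

# Multi-mapping store: a set of pairs, so a node can be mapped several times.
multi: Set[Tuple[str, str]] = set(candidates)

print(len(one_to_one), len(multi))
```

With the one-to-one store, the second duplicated statement is left unmapped and would show up as a plain deletion.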

Limitation #2: Incorrect matching of program declarations
“The algorithm is independent of any language specificity”

GumTree Simple renamed new method

RefactoringMiner call to extracted method inverted if condition

Limitation #3: Semantic ignorance
“Mappings involve two nodes with identical labels”

GumTree Greedy
• Type matched with variable
• Variable matched with method call
• Variable matched with lambda parameter
• For body block matched with method body block

RefactoringMiner
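
The mismatches listed for Limitation #3 follow from comparing node labels only. The sketch below contrasts a label-only check with a check that also compares a semantic role. The `Node` class, its `role` field, and both predicates are hypothetical and are meant only to illustrate the idea of a semantically compatible match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str  # AST node type, e.g. "SimpleName"
    role: str   # hypothetical semantic role: "type", "variable", "method_call"
    text: str


def label_match(a: Node, b: Node) -> bool:
    """Label-only compatibility: identical node types may be mapped."""
    return a.label == b.label


def semantic_match(a: Node, b: Node) -> bool:
    """Stricter check: the nodes must also play the same role in the program."""
    return a.label == b.label and a.role == b.role


type_name = Node("SimpleName", "type", "Record")
variable = Node("SimpleName", "variable", "record")

print(label_match(type_name, variable))     # a type may be matched with a variable
print(semantic_match(type_name, variable))  # the role check rejects the pair
```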

Limitation #4: Refactoring un-awareness

GumTree Simple vs. RefactoringMiner
Refactorings: Rename object to item; Extract variable itemKey
Mappings shown: object matched with itemKey; object matched with item

Limitation #5: No support for commit-level analysis

GumTree Greedy

RefactoringMiner
• Pulled up to superclass AbstractDQLPlanNode
• Moved from class NestedLoop

Limitation #6: Poor evaluation standards

Is shorter (edit script) better?

The path of Virtue or Vice
Benchmarks
Shorter edit script

The first AST Diff benchmark
• Process (6 months):
  1. Run all ASTDiff tools (GumTree 3.0, GumTree 2.1, IJM, MTDiff, RMiner)
  2. Manually validate the diffs
  3. Construct the “perfect” diff
• Datasets:
  • 800 bug fixing commits from Defects4J
  • 187 refactoring commits from Refactoring Oracle
Pouria Alikhanifard

Approach

[Diagram: version1 and version2 are the inputs. RefactoringMiner provides statement mappings, program declaration mappings, import declaration mappings, and refactoring mappings based on mechanics. Tree Matcher steps produce AST mappings; conflicting mappings are overwritten to obtain the final AST mappings, which lead to the edit script.]

Evaluation results

AST mapping accuracy (%)

Dataset       RMiner 3        GumTree greedy   GumTree simple   iASTMapper
              Prec.  Recall   Prec.  Recall    Prec.  Recall    Prec.  Recall
Defects4J     99.7   99.3     97.5   93.1      98.4   97.8      98.5   99
Refactoring   99.6   99.2     84.1   70.2      86.7   72.4      91.8   79.2
Overall       99.7   99.3     93.8   86.1      95.2   90       96.7   92.9

Ranking based on F-score (overall / refactoring only):
1. RMiner 3.0       99.5% / 99.4%
2. iASTMapper       94.8% / 85.0%
3. GumTree simple   92.6% / 78.9%
4. GumTree greedy   89.8% / 76.5%

Slide annotations: Tree Matching, Statement Mapping, ±1-6%, ±8-29%
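
The "Refactoring only" F-scores can be reproduced from the precision and recall figures on the slide, assuming the F-score is the usual harmonic mean of precision and recall (the slide does not spell out the formula). The dictionary below simply copies the Refactoring row.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) on the Refactoring dataset, as reported on the slide.
refactoring = {
    "RMiner 3": (99.6, 99.2),
    "GumTree greedy": (84.1, 70.2),
    "GumTree simple": (86.7, 72.4),
    "iASTMapper": (91.8, 79.2),
}

ranking = sorted(refactoring, key=lambda tool: f_score(*refactoring[tool]), reverse=True)
for tool in ranking:
    print(f"{tool}: {f_score(*refactoring[tool]):.1f}%")
```

Under that assumption the computed values round to 99.4, 85.0, 78.9, and 76.5, matching the ranking shown for the refactoring commits.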

How can I use your tool?
1. Dependency
2. Command line tool
3. Docker image
4. git rmd
5. GitHub action

APIs:
1. With a commit of a locally cloned git repository
2. With a commit fetched directly from GitHub
3. With the files changed in a GitHub Pull Request
4. With two directories

https://github.com/tsantalis/RefactoringMiner

• modified file
• moved/renamed file
• moved code between files

File split to multiple files

Code from different files merged to a single file

Moved code between files with overlapping refactorings

1. Extract and Move Method
2. Move Attributes
3. Local variable theRecord renamed to result
4. Inherited attribute allFields renamed to allReportedFields
5. Moved attribute UNKOWN_FIELD_AT_ENTRY_TYPE_CELL_ENTRY renamed to fix typo
6. String literal “-” extracted to an attribute

Is everything so perfect?

Test code is very challenging to correctly match

Mixed method study on test refactoring
firehouse interviews
commit mining for refactoring type patterns
Victor Veloso

55 test-specific refactorings
• 31 totally new
• 10 found by all 3 methods
• 30 found by at least 2 different methods

The two extremes in AST diff

Language specific (RefactoringMiner)
• High accuracy
• Complex algorithm
• Hard to generalize

Language independent
• Lower accuracy
• Simple algorithm
• Blind matching

Intelligent Code Review Assistants

ChangeViz Gasparini et al. VISSOFT 2021

CLDiff Huang et al. ASE 2018

Variable url is extracted to construct the pagination url based on the value of the pageNumber parameter.
This line has been added to the extracted method to return whether the coursesContainer includes more pages with courses.
This newly added while loop calls the extracted method by incrementing the pageNumber argument by 1 in each iteration and terminates when the extracted method returns false (i.e., there are no more pages left).

More benchmarks

DiffBenchmark
1. Generate “perfect diff” programmatically
   • Combining diffs from different tools
   • Discarding and injecting mappings
2. Translate the output diff of any tool to a common format based on offset information (e.g., gumtree-spoon)
3. Extend existing tools with missing features
   • enable multi-mappings
   • enable inter-file mappings
   • force semantically compatible matches
4. Compute precision/recall based on “perfect diff”
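
Step 4 can be sketched once every tool's output is expressed in a common offset-based format. The representation below, a mapping as a pair of (start, end) character ranges, and the function `precision_recall` are assumptions made for illustration and are not the actual DiffBenchmark format or API. The offsets are made-up sample data.

```python
from typing import Set, Tuple

Span = Tuple[int, int]       # (start offset, end offset) of an AST node
Mapping = Tuple[Span, Span]  # (node in version 1, node in version 2)


def precision_recall(tool: Set[Mapping], perfect: Set[Mapping]) -> Tuple[float, float]:
    """Compare a tool's mappings against the mappings of the 'perfect' diff."""
    true_positives = len(tool & perfect)
    precision = true_positives / len(tool) if tool else 0.0
    recall = true_positives / len(perfect) if perfect else 0.0
    return precision, recall


perfect = {((0, 10), (0, 10)), ((12, 30), (40, 58)), ((32, 50), (12, 30))}
tool = {((0, 10), (0, 10)), ((12, 30), (12, 30))}

p, r = precision_recall(tool, perfect)
print(f"precision={p:.2f} recall={r:.2f}")
```

Because mappings are compared as plain offset pairs, the same function works for any tool whose output can be translated to this format, including tools extended with multi-mappings.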

https://github.com/tsantalis/RefactoringMiner
https://github.com/pouryafard75/DiffBenchmark
