Upgrading Multi-Agent Pathfinding for the Real World
An extended version of my invited talk at EurMAPF-25, held in Bologna, Italy. This talk aims to bring you directly to the forefront of MAPF research 🚀
What the industry expects goes beyond classical MAPF: asynchrony, heterogeneity, continuous spaces, imperfect execution (delays and, sometimes, crashes), communication delays, task interaction, and settings where aggressive moves are unavailable.
Okumura, K., Tamura, Y., & Défago, X. Offline time-independent multiagent path planning. T-RO. 2023 (extended from IJCAI'22): plans that are tolerant to any timing delays.
Okumura, K. & Tixeuil, S. Fault-Tolerant Offline Multi-Agent Path Planning. AAAI. 2023: plans tolerant to agent crashes. Finding a solution is NP-hard; verifying a solution is co-NP-complete.
Hart, P. E., Nilsson, N. J., & Raphael, B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. on Systems Science and Cybernetics. 1968.
A* searches from start to goal, growing a search tree and expanding nodes in order of f = g + h, where g is the cost-so-far and h a heuristic estimate of the remaining cost.
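For concreteness, here is a minimal A* sketch in Python; the `neighbors` callback and the admissible heuristic `h` are illustrative parameters of my own, not part of any specific MAPF codebase.

```python
import heapq

def astar(start, goal, neighbors, h):
    """A* search: expand nodes in order of f = g + h.

    neighbors(v) yields (next_vertex, edge_cost); h(v) is an
    admissible heuristic (it never overestimates the cost-to-goal).
    """
    open_list = [(h(start), 0, start)]        # entries are (f, g, vertex)
    g_best = {start: 0}
    parent = {start: None}
    while open_list:
        f, g, v = heapq.heappop(open_list)
        if v == goal:                         # reconstruct the path
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        if g > g_best.get(v, float("inf")):
            continue                          # stale heap entry
        for u, c in neighbors(v):
            if g + c < g_best.get(u, float("inf")):
                g_best[u] = g + c
                parent[u] = v
                heapq.heappush(open_list, (g + c + h(u), g + c, u))
    return None                               # no path: report non-existence
```

Returning `None` on exhaustion is what makes the search complete on finite graphs: it reports non-existence rather than running forever.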
Algorithm properties: Complete algorithms return solutions for solvable instances in finite time; otherwise, they report the non-existence. Optimal algorithms always return solutions having minimum costs. A* is complete, and it is optimal with admissible heuristics.
(Many practical solvers relax completeness: they are unable to identify unsolvable instances.)
Sharon, G., Stern, R., Felner, A., & Sturtevant, N. R. Conflict-based search for optimal multi-agent pathfinding. AIJ. 2015.
CBS is a two-level search. The high-level search identifies conflicts in solution candidates and queries single-agent paths that avoid the detected conflicts. The low-level search finds a path satisfying the given constraints (e.g., with A*) and returns it.
(Others relax optimality, allowing bounded-suboptimal solutions.)
Erdmann, M. & Lozano-Pérez, T. On multiple moving objects. Algorithmica. 1987; Silver, D. Cooperative pathfinding. AIIDE. 2005.
Prioritized planning (PP) is simple, quick, and scalable, with reasonable solution quality:
1. Assign priorities to each agent.
2. Perform single-agent pathfinding for each agent in priority order, while avoiding collisions with already computed paths.
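The two steps above can be sketched as follows. This is an illustrative toy (my names, vertex conflicts only, fixed planning horizon); note that PP may fail even on solvable instances, i.e., it is incomplete.

```python
from collections import deque

def st_path(start, goal, nbrs, occupied, horizon=30):
    """Space-time BFS avoiding (vertex, t) cells reserved by
    higher-priority agents; waiting in place is allowed."""
    queue = deque([(start, 0, (start,))])
    seen = {(start, 0)}
    while queue:
        v, t, path = queue.popleft()
        if v == goal and all((goal, t2) not in occupied
                             for t2 in range(t + 1, horizon)):
            return list(path)
        if t + 1 >= horizon:
            continue
        for u in list(nbrs(v)) + [v]:        # move or wait
            if (u, t + 1) not in occupied and (u, t + 1) not in seen:
                seen.add((u, t + 1))
                queue.append((u, t + 1, path + (u,)))
    return None

def prioritized_planning(starts, goals, nbrs, horizon=30):
    # Step 1: assign priorities (here: simply the agent index).
    # Step 2: plan one agent at a time, reserving its cells for later agents.
    occupied, paths = set(), []
    for s, g in zip(starts, goals):
        path = st_path(s, g, nbrs, occupied, horizon)
        if path is None:
            return None                      # PP is incomplete: may fail here
        for t, v in enumerate(path):
            occupied.add((v, t))
        for t in range(len(path), horizon):
            occupied.add((path[-1], t))      # agent stays at its goal afterwards
        paths.append(path)
    return paths
```

Reserving the goal cell for all later timesteps is what keeps lower-priority agents from planning through a parked agent.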
Li, J., Chen, Z., Harabor, D., Stuckey, P. J., & Koenig, S. MAPF-LNS2: fast repairing for multi-agent path finding via large neighborhood search. AAAI. 2022.
MAPF-LNS2 also splits into two levels. The high-level search identifies a subset of agents (e.g., by random selection) and queries paths for that subset. The low-level search finds paths for the subset with a smaller number of collisions within the subset (e.g., via PP) and returns them.
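The destroy-and-repair loop can be sketched as below. This is a toy variant of mine: it counts vertex collisions, replans a random subset with a space-time BFS that hard-avoids the remaining agents, and keeps a candidate if collisions do not increase. MAPF-LNS2 itself uses more refined neighborhood selection and low-level search.

```python
import random
from collections import deque

def count_collisions(paths):
    """Vertex collisions over goal-padded paths."""
    T = max(map(len, paths))
    pad = [p + [p[-1]] * (T - len(p)) for p in paths]
    return sum(pad[i][t] == pad[j][t]
               for t in range(T)
               for i in range(len(paths))
               for j in range(i + 1, len(paths)))

def replan_avoiding(start, goal, nbrs, other_paths, horizon=30):
    """Space-time BFS treating other agents' (vertex, t) cells as obstacles."""
    occupied = {(p[min(t, len(p) - 1)], t)
                for p in other_paths for t in range(horizon)}
    queue, seen = deque([(start, 0, (start,))]), {(start, 0)}
    while queue:
        v, t, path = queue.popleft()
        if v == goal and all((goal, t2) not in occupied
                             for t2 in range(t + 1, horizon)):
            return list(path)
        if t + 1 >= horizon:
            continue
        for u in list(nbrs(v)) + [v]:        # move or wait
            if (u, t + 1) not in occupied and (u, t + 1) not in seen:
                seen.add((u, t + 1))
                queue.append((u, t + 1, path + (u,)))
    return None

def lns_repair(paths, starts, goals, nbrs, iters=20, k=2, seed=0):
    """Destroy-and-repair loop: pick a random subset of k agents, replan
    them against the remaining paths, keep the result if the number of
    collisions does not increase."""
    rng = random.Random(seed)
    best = count_collisions(paths)
    for _ in range(iters):
        if best == 0:
            break
        candidate = list(paths)
        for a in rng.sample(range(len(paths)), k):
            others = [p for i, p in enumerate(candidate) if i != a]
            new = replan_avoiding(starts[a], goals[a], nbrs, others)
            if new is not None:
                candidate[a] = new
        c = count_collisions(candidate)
        if c <= best:
            paths, best = candidate, c
    return paths, best
```

Accepting only non-worsening candidates makes the collision count monotonically non-increasing, which is the "fast repairing" idea in miniature.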
Lifelong MAPF Competition. Chan, S.-H., et al. The League of Robot Runners competition: Goals, designs, and implementation. ICAPS. 2024.
The best overall submission (2023) was based on PIBT. I was not involved! Default planner: PIBT-based; winner: PIBT-based. Winner prizes (total): $10,000; submissions: 688 (2023) → 1,513 (2024).
& Nagai, H. Lightweight and Effective Preference Construction in PIBT for Large-Scale Multi-Agent Pathfinding. SoCS. 2025.
Tiebreaking does matter! [Figure: two preference orderings over an agent's candidate moves; with a poor tiebreak the agent needs 5 steps, with the desired tiebreak only 3 steps.]
Adding hindrance + regret terms to PIBT's preference construction improves cost by roughly 20%. [Figure: cost vs. #agents.] PIBT and LaCAM are minimal, leaving plenty of room for improvement!
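For reference, here is one PIBT timestep with priority inheritance and backtracking, in a simplified form of my own (function names are illustrative; the full algorithm also rotates priorities across timesteps). Candidates are ranked by `dist(u, goal)`, and that ranking is exactly where the tiebreaking discussed above plugs in.

```python
def run_pibt_step(positions, goals, nbrs, dist, order):
    """One PIBT timestep (sketch). Agents plan in priority order; an agent
    blocked by an unplanned agent lends it its priority (priority
    inheritance) and backtracks if that agent cannot make way."""
    nxt = {}                                  # agent index -> reserved vertex
    at = {v: i for i, v in enumerate(positions)}

    def pibt(i, parent=None):
        # Candidate vertices = neighbors + stay, ranked by preference.
        # This ranking is where tiebreaking enters.
        cands = sorted(list(nbrs(positions[i])) + [positions[i]],
                       key=lambda u: dist(u, goals[i]))
        for u in cands:
            if u in nxt.values():             # vertex reserved this step
                continue
            if parent is not None and u == positions[parent]:
                continue                      # would swap with the parent
            nxt[i] = u
            j = at.get(u)
            if j is not None and j != i and j not in nxt:
                if not pibt(j, i):            # ask the occupant to move first
                    continue                  # it could not; try next candidate
            return True
        nxt[i] = positions[i]                 # stuck: stay put
        return False

    for i in sorted(range(len(positions)), key=lambda a: order[a]):
        if i not in nxt:
            pibt(i)
    return [nxt[i] for i in range(len(positions))]
```

In the classic corridor example, a high-priority agent heading into an occupied cell pushes the occupant out of the way via the recursive call.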
& Yokomachi, N. Congestion Mitigation Path Planning for Large-Scale Multi-Agent Navigation in Dense Environments. RA-L. 2025.
Task: decentralized navigation in continuous spaces, with congestion-aware guidance.
(…) space utilization for more effective multi-robot path planning. ICRA. 2022; Okumura, K. Engineering LaCAM∗: Towards Real-Time, Large-Scale, and Near-Optimal Multi-Agent Pathfinding. AAMAS. 2024.
Global Guidance. Step 1: compute spatially dispersed paths; collisions are allowed, and we repeatedly pick an agent and recompute its path while minimizing collisions. Step 2: run PIBT, while making agents follow the precomputed paths as much as possible.
[Figure: runtime (sec) vs. cost/LB on random-32-32-20 with 409 agents; Global Guidance improves cost over vanilla LaCAM by about 12%.]
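Step 1 can be approximated by congestion-penalized shortest-path queries. The sketch below is illustrative, not the papers' actual algorithm: each agent is rerouted with Dijkstra where entering a vertex costs more the more other agents already route through it, over a few rounds so paths spread out.

```python
import heapq
from collections import Counter

def dispersed_paths(starts, goals, nbrs, rounds=3, penalty=1.0):
    """Sketch of guidance-path construction: plan each agent with Dijkstra
    where entering a vertex costs 1 plus a penalty per other agent already
    routed through it, then repeat so the paths disperse."""
    usage = Counter()
    paths = [None] * len(starts)

    def dijkstra(s, g, own):
        dist, prev, pq = {s: 0}, {}, [(0, s)]
        while pq:
            d, v = heapq.heappop(pq)
            if v == g:                        # reconstruct the path
                path = [g]
                while path[-1] != s:
                    path.append(prev[path[-1]])
                return path[::-1]
            if d > dist[v]:
                continue
            for u in nbrs(v):
                # penalize vertices used by OTHER agents (exclude own old path)
                c = d + 1 + penalty * (usage[u] - (1 if u in own else 0))
                if c < dist.get(u, float("inf")):
                    dist[u], prev[u] = c, v
                    heapq.heappush(pq, (c, u))
        return None

    for _ in range(rounds):
        for i, (s, g) in enumerate(zip(starts, goals)):
            own = set(paths[i]) if paths[i] else set()
            new = dijkstra(s, g, own)
            if new is None:
                continue
            for v in own:                     # update vertex-usage counts
                usage[v] -= 1
            for v in set(new):
                usage[v] += 1
            paths[i] = new
    return paths
```

The resulting paths are then only guidance: agents may still collide, which is why Step 2 hands them to PIBT.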
Arita, T. & Okumura, K. Local Guidance for Configuration-Based Multi-Agent Pathfinding. AAAI. 2026.
Mitigate local congestion by windowed decoupled planning with soft collisions.
LG: Local Guidance. [Figure: solution cost on random-64-64-20 and room-64-64-8, each with 1000 agents, comparing vanilla LaCAM, GG (3.1% improvement), LG (31.5% improvement), and LNS2.]
Silver, D., et al. Mastering the game of Go without human knowledge. Nature. 2017; lecture by Demis Hassabis at Cambridge, 2025.
Search & Learning: learn a neural network model to efficiently guide the search. Neither search only nor learning only, but learning and search.
Alkazzi, J.-M. & Okumura, K. A Comprehensive Review on Leveraging Machine Learning for Multi-Agent Path Finding. IEEE Access. 2024.
Typical recipe: train a policy that maps local observations to actions, via reinforcement learning or imitation learning, aiming at decentralization: scalability, real-time responsiveness, partial observation.
We have lots of such learned planners; I cannot keep track of them all nowadays.
Learning-based planners still lag behind search-based solvers: “when compared with LNS2 and LaCAM3, two of the most reliable state-of-the-art MAPF solvers, SYLPH and all other (learning) planners fail to achieve comparable levels of performance.”
He, C., Duhan, T., Tulsyan, P., Kim, P., & Sartoretti, G. Social behavior as a key to learning-based multi-agent pathfinding dilemmas. AIJ. 2025; Skrynnik, A., Andreychuk, A., Borzilov, A., Chernyavskiy, A., Yakovlev, K., & Panov, A. POGEMA: A Benchmark Platform for Cooperative Multi-Agent Pathfinding. ICLR. 2025.
Neural-guided PIBT: “(they) require non-trivial data collection and complex models but perform worse than LaCAM” (Veerapaneni, R., Jakobsson, A., Ren, K., Kim, S., Li, J., Likhachev, M. Improving Learnt Local MAPF Policies with Heuristic Search. ICAPS. 2024).
Neural-guided LNS: “our results demonstrate that current learning-based methods fail to exhibit a clear advantage over simple rule-based heuristics” (Tan, J., Luo, Y., Li, J., Ma, H. Reevaluation of Large Neighborhood Search for MAPF. SoCS. 2025).
Neural components so far remain an interesting aid, not yet a practical one.
Okumura, K., Amir, M. & Prorok, A. Graph Attention-Guided Search for Dense Multi-Agent Pathfinding. AAAI. 2026.
LaGAT vs. LaCAM3, an anytime algorithm and the leading search-based planner. For the first time, neuro-hybrid search clearly beats leading search-based planners! [Figure: runtime (s) vs. solution cost (flowtime/LB) on a dense warehouse with 128 agents and dense maze/room maps with 160 agents.]
Li, Q., Lin, W., Liu, Z., Prorok, A. Message-aware graph attention networks for large-scale multi-robot path planning. RA-L. 2021.
Architecture: encoder, graph attention layer with message passing over the communication graph, decoder; trained with an imitation loss from expert trajectories (LaCAM3 for MAGAT+). (MAGAT+ uses an improved attention architecture over the original version.)
The model-size tradeoff again: since LaCAM calls the NN many times, a large model is not appropriate; a small model is ideal for runtime but sometimes misguides the search (the observed behavior of a 12K-parameter model). Final model size: 760K parameters.
Message passing reasons about agents in a pairwise manner, but MAPF is a group interaction problem! For instance, the need for a 'stay' action cannot be deduced via pairwise reasoning.
Jain, R., Okumura, K., Amir, M., Lio, P. & Prorok, A. Pairwise is Not Enough: Hypergraph Neural Networks for Multi-Agent Pathfinding. arXiv preprint. 2025.
Architecture: a directed hypergraph feeds a hypergraph neural layer and a decoder, trained with an imitation loss from expert trajectories (LaCAM3 for HMAGAT).
Prorok, A. & Hönig, W. db-LaCAM: Fast and Scalable Multi-Robot Kinodynamic Motion Planning with Discontinuity-Bounded Search and Lightweight MAPF. arXiv preprint. 2025.
Fast kinodynamic motion planning for arbitrary multi-robot systems.
Multi-Robot Trajectory Optimization: numerical optimization (direct collocation) with nonlinear programming (IPOPT).

\begin{aligned}
\min_{x^i_t,\, u^i_t} \quad & \sum_{i \in A} \sum_{t=0}^{T} J(x^i_t, u^i_t) && \text{(cost)} \\
\text{s.t.} \quad & x^i_{t+1} = f(x^i_t, u^i_t, t), \quad x^i_0 = s^i, \quad x^i_T \in g^i, \\
& R(x^i_t) \cap R(x^j_t) = \emptyset && \text{(collision-freeness; similarly } V(x^i_t) \cap V(x^j_t) = \emptyset\text{)}
\end{aligned}

Smoother trajectories, but slow! With 9 agents, computation takes ~15 s. We need better tools.
Okumura, K., Woo, H., Shankar, A. & Prorok, A. D4orm: Multi-Robot Trajectories with Dynamics-aware Diffusion Denoised Deformations. IROS. 2025.
Optimizing multi-robot trajectories by iteratively optimizing the deformation of controls u with a GPU-parallelizable Monte-Carlo gradient approximation – no learning!
[Figure: runtime (s) vs. #robots, D4orm vs. IPOPT, 2D holonomic circular scenario.]
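The Monte-Carlo gradient approximation can be illustrated with a generic zeroth-order update: sample perturbations of the control sequence, weight them by exponentiated negative cost, and deform the nominal controls toward the weighted average. This is a sketch under my own names and hyperparameters, not D4orm's exact update rule.

```python
import numpy as np

def mc_gradient_step(u, cost_fn, n_samples=64, sigma=0.1, temp=1.0, rng=None):
    """One Monte-Carlo deformation step (sketch): sample perturbed control
    sequences, weight them by exponentiated negative cost, and move the
    nominal controls toward the weighted average. No learned model involved;
    the per-sample cost evaluations are trivially GPU-parallelizable."""
    rng = np.random.default_rng(0) if rng is None else rng  # pass rng to vary noise
    noise = rng.normal(0.0, sigma, size=(n_samples,) + u.shape)
    costs = np.array([cost_fn(u + e) for e in noise])
    # Softmax weights: low-cost samples pull the update harder.
    w = np.exp(-(costs - costs.min()) / temp)
    w /= w.sum()
    return u + np.tensordot(w, noise, axes=1)  # weighted average deformation
```

Iterating this step drives the cost down without ever computing an analytic gradient, which is the appeal for arbitrary (non-differentiable) multi-robot dynamics and collision costs.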
What remains: task planning; parallelization and GPGPU; environment optimization; central-to-decentralized deployment; divide-and-conquer; hardware testbeds; and, above all, visible successful examples in real industrial setups (I may bring some in a few years).