Upgrading Multi-Agent Pathfinding for the Real World
An extended version of my invited talk at EurMAPF-25, held in Bologna, Italy. This talk aims to bring you directly to the forefront of MAPF research 🚀
What the industry expects goes beyond classical MAPF: asynchrony, heterogeneity, continuous spaces, imperfect execution (delays and, sometimes, crashes), communication delays, task interaction, and settings where aggressive moves are unavailable.
Okumura, K., Tamura, Y., & Défago, X. Offline time-independent multiagent path planning. T-RO. 2023 (extended from IJCAI'22): plans that are tolerant to any timing delays.
Okumura, K. & Tixeuil, S. Fault-Tolerant Offline Multi-Agent Path Planning. AAAI. 2023: plans tolerant to agent crashes. Finding a solution is NP-hard; verifying a solution is co-NP-complete.
Hart, P. E., Nilsson, N. J., & Raphael, B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. on Systems Science and Cybernetics. 1968.
A* searches from start to goal, growing a search tree and expanding nodes in order of f = g + h, where g is the cost-so-far and h a heuristic estimate of the remaining cost.
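For concreteness, here is a minimal A* sketch in Python; the `neighbors` callback and the admissible heuristic `h` are illustrative parameters of my own, not part of any specific MAPF codebase.

```python
import heapq

def astar(start, goal, neighbors, h):
    """A* search: expand nodes in order of f = g + h.

    neighbors(v) yields (next_vertex, edge_cost); h(v) is an
    admissible heuristic (it never overestimates the cost-to-goal).
    """
    open_list = [(h(start), 0, start)]        # entries are (f, g, vertex)
    g_best = {start: 0}
    parent = {start: None}
    while open_list:
        f, g, v = heapq.heappop(open_list)
        if v == goal:                         # reconstruct the path
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        if g > g_best.get(v, float("inf")):
            continue                          # stale heap entry
        for u, c in neighbors(v):
            if g + c < g_best.get(u, float("inf")):
                g_best[u] = g + c
                parent[u] = v
                heapq.heappush(open_list, (g + c + h(u), g + c, u))
    return None                               # no path: report non-existence
```

Returning `None` on exhaustion is what makes the search complete on finite graphs: it reports non-existence rather than running forever.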
Algorithm properties: Complete algorithms return solutions for solvable instances in finite time; otherwise, they report the non-existence. Optimal algorithms always return solutions having minimum costs. A* is complete, and it is optimal with admissible heuristics.
(Many practical solvers relax completeness: they are unable to identify unsolvable instances.)
Sharon, G., Stern, R., Felner, A., & Sturtevant, N. R. Conflict-based search for optimal multi-agent pathfinding. AIJ. 2015.
CBS is a two-level search. The high-level search identifies conflicts in solution candidates and queries single-agent paths that avoid the detected conflicts. The low-level search finds a path satisfying the given constraints (e.g., with A*) and returns it.
(Others relax optimality, allowing bounded-suboptimal solutions.)
Erdmann, M. & Lozano-Pérez, T. On multiple moving objects. Algorithmica. 1987; Silver, D. Cooperative pathfinding. AIIDE. 2005.
Prioritized planning (PP) is simple, quick, and scalable, with reasonable solution quality:
1. Assign priorities to each agent.
2. Perform single-agent pathfinding for each agent in priority order, while avoiding collisions with already computed paths.
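The two steps above can be sketched as follows. This is an illustrative toy (my names, vertex conflicts only, fixed planning horizon); note that PP may fail even on solvable instances, i.e., it is incomplete.

```python
from collections import deque

def st_path(start, goal, nbrs, occupied, horizon=30):
    """Space-time BFS avoiding (vertex, t) cells reserved by
    higher-priority agents; waiting in place is allowed."""
    queue = deque([(start, 0, (start,))])
    seen = {(start, 0)}
    while queue:
        v, t, path = queue.popleft()
        if v == goal and all((goal, t2) not in occupied
                             for t2 in range(t + 1, horizon)):
            return list(path)
        if t + 1 >= horizon:
            continue
        for u in list(nbrs(v)) + [v]:        # move or wait
            if (u, t + 1) not in occupied and (u, t + 1) not in seen:
                seen.add((u, t + 1))
                queue.append((u, t + 1, path + (u,)))
    return None

def prioritized_planning(starts, goals, nbrs, horizon=30):
    # Step 1: assign priorities (here: simply the agent index).
    # Step 2: plan one agent at a time, reserving its cells for later agents.
    occupied, paths = set(), []
    for s, g in zip(starts, goals):
        path = st_path(s, g, nbrs, occupied, horizon)
        if path is None:
            return None                      # PP is incomplete: may fail here
        for t, v in enumerate(path):
            occupied.add((v, t))
        for t in range(len(path), horizon):
            occupied.add((path[-1], t))      # agent stays at its goal afterwards
        paths.append(path)
    return paths
```

Reserving the goal cell for all later timesteps is what keeps lower-priority agents from planning through a parked agent.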
Li, J., Chen, Z., Harabor, D., Stuckey, P. J., & Koenig, S. MAPF-LNS2: fast repairing for multi-agent path finding via large neighborhood search. AAAI. 2022.
MAPF-LNS2 also splits into two levels. The high-level search identifies a subset of agents (e.g., by random selection) and queries paths for that subset. The low-level search finds paths for the subset with a smaller number of collisions within the subset (e.g., via PP) and returns them.
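The destroy-and-repair loop can be sketched as below. This is a toy variant of mine: it counts vertex collisions, replans a random subset with a space-time BFS that hard-avoids the remaining agents, and keeps a candidate if collisions do not increase. MAPF-LNS2 itself uses more refined neighborhood selection and low-level search.

```python
import random
from collections import deque

def count_collisions(paths):
    """Vertex collisions over goal-padded paths."""
    T = max(map(len, paths))
    pad = [p + [p[-1]] * (T - len(p)) for p in paths]
    return sum(pad[i][t] == pad[j][t]
               for t in range(T)
               for i in range(len(paths))
               for j in range(i + 1, len(paths)))

def replan_avoiding(start, goal, nbrs, other_paths, horizon=30):
    """Space-time BFS treating other agents' (vertex, t) cells as obstacles."""
    occupied = {(p[min(t, len(p) - 1)], t)
                for p in other_paths for t in range(horizon)}
    queue, seen = deque([(start, 0, (start,))]), {(start, 0)}
    while queue:
        v, t, path = queue.popleft()
        if v == goal and all((goal, t2) not in occupied
                             for t2 in range(t + 1, horizon)):
            return list(path)
        if t + 1 >= horizon:
            continue
        for u in list(nbrs(v)) + [v]:        # move or wait
            if (u, t + 1) not in occupied and (u, t + 1) not in seen:
                seen.add((u, t + 1))
                queue.append((u, t + 1, path + (u,)))
    return None

def lns_repair(paths, starts, goals, nbrs, iters=20, k=2, seed=0):
    """Destroy-and-repair loop: pick a random subset of k agents, replan
    them against the remaining paths, keep the result if the number of
    collisions does not increase."""
    rng = random.Random(seed)
    best = count_collisions(paths)
    for _ in range(iters):
        if best == 0:
            break
        candidate = list(paths)
        for a in rng.sample(range(len(paths)), k):
            others = [p for i, p in enumerate(candidate) if i != a]
            new = replan_avoiding(starts[a], goals[a], nbrs, others)
            if new is not None:
                candidate[a] = new
        c = count_collisions(candidate)
        if c <= best:
            paths, best = candidate, c
    return paths, best
```

Accepting only non-worsening candidates makes the collision count monotonically non-increasing, which is the "fast repairing" idea in miniature.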
Lifelong MAPF Competition. Chan, S.-H., et al. The League of Robot Runners competition: Goals, designs, and implementation. ICAPS. 2024.
The best overall submission (2023) was based on PIBT. I was not involved! Default planner: PIBT-based; winner: PIBT-based. Winner prizes (total): $10,000; submissions: 688 (2023) → 1,513 (2024).
& Nagai, H. Lightweight and Effective Preference Construction in PIBT for Large-Scale Multi-Agent Pathfinding. SoCS. 2025.
Tiebreaking does matter! [Figure: two preference orderings over an agent's candidate moves; with a poor tiebreak the agent needs 5 steps, with the desired tiebreak only 3 steps.]
Adding hindrance + regret terms to PIBT's preference construction improves cost by roughly 20%. [Figure: cost vs. #agents.] PIBT and LaCAM are minimal, leaving plenty of room for improvement!
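For reference, here is one PIBT timestep with priority inheritance and backtracking, in a simplified form of my own (function names are illustrative; the full algorithm also rotates priorities across timesteps). Candidates are ranked by `dist(u, goal)`, and that ranking is exactly where the tiebreaking discussed above plugs in.

```python
def run_pibt_step(positions, goals, nbrs, dist, order):
    """One PIBT timestep (sketch). Agents plan in priority order; an agent
    blocked by an unplanned agent lends it its priority (priority
    inheritance) and backtracks if that agent cannot make way."""
    nxt = {}                                  # agent index -> reserved vertex
    at = {v: i for i, v in enumerate(positions)}

    def pibt(i, parent=None):
        # Candidate vertices = neighbors + stay, ranked by preference.
        # This ranking is where tiebreaking enters.
        cands = sorted(list(nbrs(positions[i])) + [positions[i]],
                       key=lambda u: dist(u, goals[i]))
        for u in cands:
            if u in nxt.values():             # vertex reserved this step
                continue
            if parent is not None and u == positions[parent]:
                continue                      # would swap with the parent
            nxt[i] = u
            j = at.get(u)
            if j is not None and j != i and j not in nxt:
                if not pibt(j, i):            # ask the occupant to move first
                    continue                  # it could not; try next candidate
            return True
        nxt[i] = positions[i]                 # stuck: stay put
        return False

    for i in sorted(range(len(positions)), key=lambda a: order[a]):
        if i not in nxt:
            pibt(i)
    return [nxt[i] for i in range(len(positions))]
```

In the classic corridor example, a high-priority agent heading into an occupied cell pushes the occupant out of the way via the recursive call.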
& Yokomachi, N. Congestion Mitigation Path Planning for Large-Scale Multi-Agent Navigation in Dense Environments. RA-L. 2025.
Task: decentralized navigation in continuous spaces, with congestion-aware guidance.
(…) space utilization for more effective multi-robot path planning. ICRA. 2022; Okumura, K. Engineering LaCAM∗: Towards Real-Time, Large-Scale, and Near-Optimal Multi-Agent Pathfinding. AAMAS. 2024.
Global Guidance. Step 1: compute spatially dispersed paths; collisions are allowed, and we repeatedly pick an agent and recompute its path while minimizing collisions. Step 2: run PIBT, while making agents follow the precomputed paths as much as possible.
[Figure: runtime (sec) vs. cost/LB on random-32-32-20 with 409 agents; Global Guidance improves cost over vanilla LaCAM by about 12%.]
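Step 1 can be approximated by congestion-penalized shortest-path queries. The sketch below is illustrative, not the papers' actual algorithm: each agent is rerouted with Dijkstra where entering a vertex costs more the more other agents already route through it, over a few rounds so paths spread out.

```python
import heapq
from collections import Counter

def dispersed_paths(starts, goals, nbrs, rounds=3, penalty=1.0):
    """Sketch of guidance-path construction: plan each agent with Dijkstra
    where entering a vertex costs 1 plus a penalty per other agent already
    routed through it, then repeat so the paths disperse."""
    usage = Counter()
    paths = [None] * len(starts)

    def dijkstra(s, g, own):
        dist, prev, pq = {s: 0}, {}, [(0, s)]
        while pq:
            d, v = heapq.heappop(pq)
            if v == g:                        # reconstruct the path
                path = [g]
                while path[-1] != s:
                    path.append(prev[path[-1]])
                return path[::-1]
            if d > dist[v]:
                continue
            for u in nbrs(v):
                # penalize vertices used by OTHER agents (exclude own old path)
                c = d + 1 + penalty * (usage[u] - (1 if u in own else 0))
                if c < dist.get(u, float("inf")):
                    dist[u], prev[u] = c, v
                    heapq.heappush(pq, (c, u))
        return None

    for _ in range(rounds):
        for i, (s, g) in enumerate(zip(starts, goals)):
            own = set(paths[i]) if paths[i] else set()
            new = dijkstra(s, g, own)
            if new is None:
                continue
            for v in own:                     # update vertex-usage counts
                usage[v] -= 1
            for v in set(new):
                usage[v] += 1
            paths[i] = new
    return paths
```

The resulting paths are then only guidance: agents may still collide, which is why Step 2 hands them to PIBT.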
Arita, T. & Okumura, K. Local Guidance for Configuration-Based Multi-Agent Pathfinding. AAAI. 2026.
Mitigate local congestion by windowed decoupled planning with soft collisions.
LG: Local Guidance. [Figure: solution cost on random-64-64-20 and room-64-64-8, each with 1000 agents, comparing vanilla LaCAM, GG (3.1% improvement), LG (31.5% improvement), and LNS2.]
Silver, D., et al. Mastering the game of Go without human knowledge. Nature. 2017; lecture by Demis Hassabis at Cambridge, 2025.
Search & Learning: learn a neural network model to efficiently guide the search. Neither search only nor learning only, but learning and search.
Alkazzi, J.-M. & Okumura, K. A Comprehensive Review on Leveraging Machine Learning for Multi-Agent Path Finding. IEEE Access. 2024.
Typical recipe: train a policy that maps local observations to actions, via reinforcement learning or imitation learning, aiming at decentralization: scalability, real-time responsiveness, partial observation.
We have lots of such learned planners; I cannot keep track of them all nowadays.
Learning-based planners still lag behind search-based solvers: “when compared with LNS2 and LaCAM3, two of the most reliable state-of-the-art MAPF solvers, SYLPH and all other (learning) planners fail to achieve comparable levels of performance.”
He, C., Duhan, T., Tulsyan, P., Kim, P., & Sartoretti, G. Social behavior as a key to learning-based multi-agent pathfinding dilemmas. AIJ. 2025; Skrynnik, A., Andreychuk, A., Borzilov, A., Chernyavskiy, A., Yakovlev, K., & Panov, A. POGEMA: A Benchmark Platform for Cooperative Multi-Agent Pathfinding. ICLR. 2025.
Neural-guided PIBT: “(they) require non-trivial data collection and complex models but perform worse than LaCAM” (Veerapaneni, R., Jakobsson, A., Ren, K., Kim, S., Li, J., Likhachev, M. Improving Learnt Local MAPF Policies with Heuristic Search. ICAPS. 2024).
Neural-guided LNS: “our results demonstrate that current learning-based methods fail to exhibit a clear advantage over simple rule-based heuristics” (Tan, J., Luo, Y., Li, J., Ma, H. Reevaluation of Large Neighborhood Search for MAPF. SoCS. 2025).
Neural components so far remain an interesting aid, not yet a practical one.
Okumura, K., Amir, M. & Prorok, A. Graph Attention-Guided Search for Dense Multi-Agent Pathfinding. AAAI. 2026.
LaGAT vs. LaCAM3, an anytime algorithm and the leading search-based planner. For the first time, neuro-hybrid search clearly beats leading search-based planners! [Figure: runtime (s) vs. solution cost (flowtime/LB) on a dense warehouse with 128 agents and dense maze/room maps with 160 agents.]
Li, Q., Lin, W., Liu, Z., Prorok, A. Message-aware graph attention networks for large-scale multi-robot path planning. RA-L. 2021.
Architecture: encoder, graph attention layer with message passing over the communication graph, decoder; trained with an imitation loss from expert trajectories (LaCAM3 for MAGAT+). (MAGAT+ uses an improved attention architecture over the original version.)
The model-size tradeoff again: since LaCAM calls the NN many times, a large model is not appropriate; a small model is ideal for runtime but sometimes misguides the search (the observed behavior of a 12K-parameter model). Final model size: 760K parameters.
Message passing reasons about agents in a pairwise manner, but MAPF is a group interaction problem! For instance, the need for a 'stay' action cannot be deduced via pairwise reasoning.
Jain, R., Okumura, K., Amir, M., Lio, P. & Prorok, A. Pairwise is Not Enough: Hypergraph Neural Networks for Multi-Agent Pathfinding. arXiv preprint. 2025.
Architecture: a directed hypergraph feeds a hypergraph neural layer and a decoder, trained with an imitation loss from expert trajectories (LaCAM3 for HMAGAT).
Prorok, A. & Hönig, W. db-LaCAM: Fast and Scalable Multi-Robot Kinodynamic Motion Planning with Discontinuity-Bounded Search and Lightweight MAPF. arXiv preprint. 2025.
Fast kinodynamic motion planning for arbitrary multi-robot systems.
Multi-Robot Trajectory Optimization: numerical optimization (direct collocation) with nonlinear programming (IPOPT).

\begin{aligned}
\min_{x^i_t,\, u^i_t} \quad & \sum_{i \in A} \sum_{t=0}^{T} J(x^i_t, u^i_t) && \text{(cost)} \\
\text{s.t.} \quad & x^i_{t+1} = f(x^i_t, u^i_t, t), \quad x^i_0 = s^i, \quad x^i_T \in g^i, \\
& R(x^i_t) \cap R(x^j_t) = \emptyset && \text{(collision-freeness; similarly } V(x^i_t) \cap V(x^j_t) = \emptyset\text{)}
\end{aligned}

Smoother trajectories, but slow! With 9 agents, computation takes ~15 s. We need better tools.
Okumura, K., Woo, H., Shankar, A. & Prorok, A. D4orm: Multi-Robot Trajectories with Dynamics-aware Diffusion Denoised Deformations. IROS. 2025.
Optimizing multi-robot trajectories by iteratively optimizing the deformation of controls u with a GPU-parallelizable Monte-Carlo gradient approximation – no learning!
[Figure: runtime (s) vs. #robots, D4orm vs. IPOPT, 2D holonomic circular scenario.]
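The Monte-Carlo gradient approximation can be illustrated with a generic zeroth-order update: sample perturbations of the control sequence, weight them by exponentiated negative cost, and deform the nominal controls toward the weighted average. This is a sketch under my own names and hyperparameters, not D4orm's exact update rule.

```python
import numpy as np

def mc_gradient_step(u, cost_fn, n_samples=64, sigma=0.1, temp=1.0, rng=None):
    """One Monte-Carlo deformation step (sketch): sample perturbed control
    sequences, weight them by exponentiated negative cost, and move the
    nominal controls toward the weighted average. No learned model involved;
    the per-sample cost evaluations are trivially GPU-parallelizable."""
    rng = np.random.default_rng(0) if rng is None else rng  # pass rng to vary noise
    noise = rng.normal(0.0, sigma, size=(n_samples,) + u.shape)
    costs = np.array([cost_fn(u + e) for e in noise])
    # Softmax weights: low-cost samples pull the update harder.
    w = np.exp(-(costs - costs.min()) / temp)
    w /= w.sum()
    return u + np.tensordot(w, noise, axes=1)  # weighted average deformation
```

Iterating this step drives the cost down without ever computing an analytic gradient, which is the appeal for arbitrary (non-differentiable) multi-robot dynamics and collision costs.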
What remains: task planning; parallelization and GPGPU; environment optimization; central-to-decentralized deployment; divide-and-conquer; hardware testbeds; and, above all, visible successful examples in real industrial setups (I may bring some in a few years).