Lossless Clustering of Histories in Decentralized POMDPs

Frans Oliehoek, Shimon Whiteson and Matthijs Spaan

Abstract

Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, planning optimally is difficult because solutions map local observation histories to actions, and the number of such histories grows exponentially in the planning horizon. In this work, we identify a criterion that allows for lossless clustering of observation histories: i.e., we prove that when two histories satisfy the criterion, they have the same optimal value and thus can be treated as one. We show how this result can be exploited in optimal policy search and demonstrate empirically that it can provide a speed-up of multiple orders of magnitude, allowing the optimal solution of significantly larger problems. These results include the first-ever solution of the well-known Dec-Tiger problem for horizon h=5. We also perform an empirical analysis of the generality of our clustering method, which suggests that it may also be useful in other (approximate) Dec-POMDP solution methods.
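The core idea is that two local observation histories can be merged without loss whenever they induce the same distribution over the hidden state and the other agents' histories. As a minimal illustration (not the paper's actual algorithm), the sketch below groups histories whose induced joint beliefs agree within a tolerance; the representation of histories as tuples and beliefs as dictionaries is an assumption for this example.

```python
def cluster_histories(beliefs, tol=1e-9):
    """Group observation histories with identical induced joint beliefs.

    beliefs: dict mapping a history (tuple of observations) to a dict
             mapping joint outcomes (e.g., states) to probabilities.
    Returns a list of clusters, each a list of equivalent histories.
    """
    clusters = []
    for history, belief in beliefs.items():
        for cluster in clusters:
            representative = beliefs[cluster[0]]
            keys = set(belief) | set(representative)
            # Histories are equivalent when their beliefs match pointwise.
            if all(abs(belief.get(k, 0.0) - representative.get(k, 0.0)) <= tol
                   for k in keys):
                cluster.append(history)
                break
        else:
            clusters.append([history])
    return clusters


# Hypothetical example: two one-step histories induce the same belief
# and are clustered together; a third history stays separate.
beliefs = {
    ("o1",): {"s1": 0.5, "s2": 0.5},
    ("o2",): {"s1": 0.5, "s2": 0.5},
    ("o1", "o1"): {"s1": 1.0},
}
print(cluster_histories(beliefs))
# → [[('o1',), ('o2',)], [('o1', 'o1')]]
```

Because clustering collapses equivalent histories into one, the number of policies that must be evaluated shrinks accordingly, which is the source of the reported speed-up.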

Book Title
AAMAS 2009: Proceedings of the Eighth International Joint Conference on Autonomous Agents and Multi-Agent Systems
Month
May
Pages
577–584
Year
2009