We consider the problem of finding an n-agent joint-policy for the optimal
finite-horizon control of a decentralized Pomdp (Dec-Pomdp). This is a problem
of very high complexity (NEXP-hard for n ≥ 2). In this paper, we propose a new
mathematical programming approach for the problem. Our approach is based on two
ideas: First, we represent each agent's policy in the sequence-form and not in
the tree-form, thereby obtaining a very compact representation of the set of
joint-policies. Second, using this compact representation, we solve this
problem as an instance of combinatorial optimization for which we formulate a
mixed integer linear program (MILP). The optimal solution of the MILP directly
yields an optimal joint-policy for the Dec-Pomdp. Computational experiments
show that formulating and solving the MILP takes significantly less time on
benchmark Dec-Pomdp problems than existing algorithms. For example, the
multi-agent tiger problem for horizon 4 is solved in 72 seconds with the MILP,
whereas existing algorithms require several hours to solve it.
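
To give a rough sense of why the sequence-form is more compact than the tree-form, the sketch below counts both kinds of object for a single agent. It is an illustration, not code from the paper. It assumes that a history of length t consists of t actions interleaved with t - 1 observations, and that each agent in the multi-agent tiger problem has 3 actions and 2 observations; neither assumption is stated in the abstract. The function names are hypothetical.

```python
def count_tree_policies(num_actions: int, num_observations: int, horizon: int) -> int:
    """Number of deterministic policy trees for one agent.

    A tree of depth `horizon` has one decision node per observation
    sequence of length 0 .. horizon - 1, and each node picks one action.
    """
    num_nodes = sum(num_observations ** t for t in range(horizon))
    return num_actions ** num_nodes


def count_histories(num_actions: int, num_observations: int, horizon: int) -> int:
    """Number of histories of length 1 .. horizon for one agent.

    Assumption: a history of length t is a sequence of t actions
    interleaved with t - 1 observations. A sequence-form policy assigns
    one weight to each such history.
    """
    return sum(
        num_actions ** t * num_observations ** (t - 1)
        for t in range(1, horizon + 1)
    )


if __name__ == "__main__":
    # Assumed sizes for one agent of the multi-agent tiger problem.
    actions, observations, horizon = 3, 2, 4
    print("policy trees:", count_tree_policies(actions, observations, horizon))
    print("histories:   ", count_histories(actions, observations, horizon))
```

Under these assumptions, a single agent at horizon 4 has 3^15 = 14,348,907 candidate policy trees but only 777 histories, so a program with one variable per history stays small even though the set of policies it describes is very large.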

Aras, Raghav

Charpillet, François

Dutech, Alain

English

arXiv

INRIA a CCSD electronic archive server

Mixed Integer Linear Programming For Exact Finite-Horizon Planning In Decentralized Pomdps

HAL-Rennes 1

CiteSeerX

https://hal.inria.fr/inria-00163372/file/raw-milp-icap-final.pdf
