2,011 research outputs found

Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs

Author: Seuken Sven
Zilberstein Shlomo
Publication date: 20/06/2012

Memory-Bounded Dynamic Programming (MBDP) has proved extremely effective in solving decentralized POMDPs with large horizons. We generalize the algorithm and improve its scalability by reducing the complexity with respect to the number of observations from exponential to polynomial. We derive error bounds on solution quality with respect to this new approximation and analyze the convergence behavior. To evaluate the effectiveness of the improvements, we introduce a new, larger benchmark problem. Experimental results show that despite the high complexity of decentralized POMDPs, scalable solution techniques such as MBDP perform surprisingly well.

Comment: Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007)

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst

Global Optimization for Value Function Approximation

Author: Petrik Marek
Zilberstein Shlomo
Publication date: 14/06/2010

Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms of the Bellman residual. Solving a bilinear program optimally is NP-hard, but this is unavoidable because the Bellman-residual minimization itself is NP-hard. We describe and analyze both optimal and approximate algorithms for solving bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. We also briefly analyze the behavior of bilinear programming algorithms under incomplete samples. Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on simple benchmark problems.

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst
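
The abstract above refers to norms of the Bellman residual. As a minimal sketch of what that quantity is (not the paper's bilinear program), the Python below computes the residual v - Tv for a candidate value vector and reports two common norms of it. The two-state MDP, its numbers, and every function name here are hypothetical and chosen only for illustration.

```python
# Illustrative only: a made-up 2-state, 2-action MDP (not from the paper).
GAMMA = 0.9
# P[a][s][t] = probability of moving from state s to state t under action a
P = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.6, 0.4]],
]
# R[a][s] = expected immediate reward for taking action a in state s
R = [
    [1.0, 0.0],
    [0.0, 2.0],
]


def bellman_backup(v):
    """Apply the Bellman optimality operator T to the value vector v."""
    n = len(v)
    return [
        max(
            R[a][s] + GAMMA * sum(P[a][s][t] * v[t] for t in range(n))
            for a in range(len(P))
        )
        for s in range(n)
    ]


def bellman_residual(v):
    """Return the per-state residual v - T v."""
    return [vs - ts for vs, ts in zip(v, bellman_backup(v))]


def residual_norms(v):
    """Return (max-norm, unweighted L1 norm) of the Bellman residual."""
    res = bellman_residual(v)
    return max(abs(r) for r in res), sum(abs(r) for r in res)


def value_iteration(tol=1e-10):
    """Iterate T from zero until successive iterates differ by less than tol."""
    v = [0.0] * len(R[0])
    while True:
        nv = bellman_backup(v)
        if max(abs(a - b) for a, b in zip(v, nv)) < tol:
            return nv
        v = nv


if __name__ == "__main__":
    print(residual_norms([0.0, 0.0]))       # residual of a poor guess
    print(residual_norms(value_iteration()))  # near zero at the fixed point
```

A value vector with zero residual is a fixed point of T, which is why the size of the residual is used as a proxy for solution quality.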

The locust frontal ganglion: a central pattern generator network controlling foregut rhythmic motor patterns

Author: Ayali A.
Cohen N.
Zilberstein Y.
Publication venue: The Company of Biologists Ltd
Publication date: 15/09/2002

The frontal ganglion (FG) is part of the insect stomatogastric nervous system and is found in most insect orders. Previous work has shown that in the desert locust, Schistocerca gregaria, the FG constitutes a major source of innervation to the foregut. In an in vitro preparation, isolated from all descending and sensory inputs, the FG spontaneously generated rhythmic multi-unit bursts of action potentials that could be recorded from all its efferent nerves. The consistent endogenous FG rhythmic pattern indicates the presence of a central pattern generator network. We found the appearance of in vitro rhythmic activity to be strongly correlated with the physiological state of the donor locust. A robust pattern emerged only after a period of saline superfusion, if the locust had a very full foregut and crop, or if the animal was close to ecdysis. Accordingly, haemolymph collected at these stages inhibited an ongoing rhythmic pattern when applied onto the ganglion. We present this novel central pattern generating system as a basis for future work on the neural network characterisation and its role in generating and controlling behaviour.

White Rose Research Online

MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs

Author: Charpillet Francois
Szer Daniel
Zilberstein Shlomo
Publication date: 01/01/2012

We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment such as multirobot coordination, network traffic control, or distributed resource allocation. Solving such problems effectively is a major challenge in the area of planning under uncertainty. Our solution is based on a synthesis of classical heuristic search and decentralized control theory. Experimental results show that MAA* has significant advantages. We introduce an anytime variant of MAA* and conclude with a discussion of promising extensions such as an approach to solving infinite horizon problems.

Comment: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst

Optimizing Memory-Bounded Controllers for Decentralized POMDPs

Author: Amato Christopher
Bernstein Daniel S
Zilberstein Shlomo
Publication date: 01/01/2012

We present a memory-bounded optimization approach for solving infinite-horizon decentralized POMDPs. Policies for each agent are represented by stochastic finite state controllers. We formulate the problem of optimizing these policies as a nonlinear program, leveraging powerful existing nonlinear optimization techniques for solving the problem. While existing solvers only guarantee locally optimal solutions, we show that our formulation produces higher quality controllers than the state-of-the-art approach. We also incorporate a shared source of randomness in the form of a correlation device to further increase solution quality with only a limited increase in space and time. Our experimental results show that nonlinear optimization can be used to provide high quality, concise solutions to decentralized decision problems under uncertainty.

Comment: Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007)

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst
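
The abstract above represents each agent's policy as a stochastic finite state controller. The sketch below shows one plausible way to store and execute such a controller for a single agent; it does not implement the paper's nonlinear program or its correlation device. The class name, the dictionary layout, and the "listen"/"open-right" example are all assumptions made for illustration.

```python
import random


class StochasticController:
    """Hypothetical single-agent stochastic finite state controller.

    action_probs[node]              -> {action: probability}
    next_probs[(node, action, obs)] -> {next_node: probability}
    """

    def __init__(self, action_probs, next_probs, start=0, seed=None):
        self.action_probs = action_probs
        self.next_probs = next_probs
        self.node = start
        self.rng = random.Random(seed)
        # Reject tables whose rows are not probability distributions.
        for dist in list(action_probs.values()) + list(next_probs.values()):
            if any(p < 0 for p in dist.values()):
                raise ValueError("probabilities must be non-negative")
            if abs(sum(dist.values()) - 1.0) > 1e-9:
                raise ValueError("each distribution must sum to 1")

    def _sample(self, dist):
        """Draw one key from a {key: probability} mapping."""
        keys = list(dist)
        weights = [dist[k] for k in keys]
        return self.rng.choices(keys, weights=weights)[0]

    def act(self):
        """Sample an action from the current controller node."""
        return self._sample(self.action_probs[self.node])

    def observe(self, action, obs):
        """Sample the next node given the action taken and observation seen."""
        self.node = self._sample(self.next_probs[(self.node, action, obs)])


if __name__ == "__main__":
    # A degenerate (deterministic) two-node controller, for demonstration.
    ctrl = StochasticController(
        action_probs={0: {"listen": 1.0}, 1: {"open-right": 1.0}},
        next_probs={
            (0, "listen", "left"): {1: 1.0},
            (0, "listen", "right"): {0: 1.0},
            (1, "open-right", "left"): {0: 1.0},
            (1, "open-right", "right"): {0: 1.0},
        },
    )
    a = ctrl.act()
    ctrl.observe(a, "left")
    print(a, ctrl.node, ctrl.act())
```

The number of nodes fixes the memory each agent uses, which is what makes the representation memory-bounded; the probabilities in the two tables are the quantities an optimizer would adjust.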