Optimal agent cooperation with limited information

De Mot, Jan

oai:dspace.mit.edu:1721.1/30360

Authors: Jan De Mot
Publication date: 1 January 2005
Publisher: Massachusetts Institute of Technology

Abstract

Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2005. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 199-204).

Multi-agent systems are in general believed to be more efficient, robust, and versatile than their single-agent equivalents. However, it is not easy to design strategies that fully exploit the multi-agent benefits, and with this in mind we address several multi-agent system design issues. Specifically, it is of central importance to determine the optimal agent group composition, which involves a trade-off between the cost and the performance increase per additional agent. Further, truly autonomous agents rely solely on on-board environment measurements, the design of which requires quantifying the multi-agent performance as a function of the locally observed environment areas.

In this thesis, we focus on the collaborative search for individually rewarding resources, i.e. it is possible for multiple agents to collect the same reward. The system objective is to maximize the aggregate rewards collected. Motivated by a cooperative surveillance context, we formulate a graph traversal problem on an unbounded structured graph, and constrain the agent motion spatially so that only the lateral agent separation is controlled. We model the problem mathematically as a discrete, infinite-state, infinite-horizon Dynamic Program and convert it using standard techniques to an equivalent Linear Program (LP) with infinitely many constraints. The spatial invariance of the graph allows us to decompose the LP into a set of infinitely many coupled LPs, each with finitely many constraints. We establish that the unique bounded function that simultaneously satisfies the latter LPs is the optimal value function of the problem.

Based on this, we compute the two-agent optimal value function explicitly as the solution of an LP with finitely many constraints for small agent separations, and implicitly in the form of a recursion for large agent separations, satisfying adequate connection constraints. Finally, we propose a similar method to compute the state probability distribution in steady state under an optimal policy, summarizing the agent behavior at large separations in a set of connection constraints, which is sufficient to compute the probability distribution at small separations.

We analyze and compare the optimal performance of various problem instances. We confirm and quantify the intuition that the performance increases with the group size. Some results stand out: for cone-shaped local observation, two agents incur 25% less cost than a single agent in a mine-field-type environment (scarce but high costs); further, for some environment specifics, a third agent provides little to no performance increase. Then, we compare various local observation zones, and quantify their effect on the overall group performance. Finally, we study the agent spatial distribution under an optimal policy, and observe that as rewards become scarcer, the agents tend to spread out in order to gather information on a larger part of the environment.

by Jan De Mot. Ph.D.
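
The abstract mentions converting a Dynamic Program into an equivalent Linear Program by standard techniques. As a rough illustration of that textbook conversion only, the sketch below uses a hypothetical three-state, two-action toy problem with a discount factor of 0.9; the states, rewards, transition probabilities, and the discounting itself are assumptions made for this example and are not taken from the thesis, whose problem has infinitely many states. For a reward-maximizing problem, the LP minimizes the sum of V(s) subject to V(s) >= R(s, a) + gamma * E[V(s')] for every state-action pair, and the optimal value function is the smallest feasible V. The code computes V by value iteration and then checks it against those LP constraints.

```python
# Hypothetical toy problem (not from the thesis).
# P[s][a] lists (next_state, probability); R[s][a] is the expected one-step reward.
GAMMA = 0.9  # assumed discount factor

P = {
    0: {0: [(0, 1.0)], 1: [(1, 0.5), (2, 0.5)]},
    1: {0: [(0, 1.0)], 1: [(2, 1.0)]},
    2: {0: [(2, 1.0)], 1: [(0, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 2.0, 1: 0.0},
    2: {0: 0.5, 1: 3.0},
}


def q_value(V, s, a):
    """One-step lookahead: reward plus discounted expected value of the next state."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])


def value_iteration(tol=1e-12, max_iter=100000):
    """Iterate the Bellman operator until successive iterates differ by less than tol."""
    V = {s: 0.0 for s in P}
    for _ in range(max_iter):
        V_new = {s: max(q_value(V, s, a) for a in P[s]) for s in P}
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new
    return V


def lp_feasible(V, slack=1e-9):
    """Check the LP constraints V(s) >= R(s, a) + GAMMA * E[V(s')] for all (s, a)."""
    return all(V[s] >= q_value(V, s, a) - slack for s in P for a in P[s])


def lp_objective(V):
    """LP objective to be minimized: the sum of V over all states."""
    return sum(V.values())


if __name__ == "__main__":
    V_star = value_iteration()
    print("value function:", V_star)
    print("satisfies LP constraints:", lp_feasible(V_star))
```

Shifting the computed values upward keeps every constraint satisfied but increases the objective, while shifting them downward violates the constraint at each state's maximizing action. This is the sense in which the optimal value function is the LP solution in this finite, discounted setting.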

Last time updated on 11/06/2012

This paper was published in DSpace@MIT.
