
    Coordinating decentralized learning and conflict resolution across agent boundaries

    It is crucial for embedded systems to adapt to the dynamics of open environments. This adaptation process becomes especially challenging in the context of multiagent systems because of scalability, partial information accessibility, and complex interactions among agents. It is a challenge for agents to learn good policies when they need to plan and coordinate in uncertain, dynamic environments, especially when they have large state spaces. It is also critical for agents operating in a multiagent system (MAS) to resolve conflicts among the learned policies of different agents, since such conflicts may have a detrimental influence on the overall performance. The focus of this research is to use a reinforcement-learning-based local optimization algorithm within each agent to learn multiagent policies in a decentralized fashion. These policies will allow each agent to adapt to changes in environmental conditions while reorganizing the underlying multiagent network when needed. The research takes an adaptive approach to resolving conflicts that can arise between locally optimal agent policies. First, an algorithm that uses heuristic rules to locally resolve simple conflicts is presented. When the environment is more dynamic and uncertain, a mediator-based mechanism is used to resolve more complicated conflicts and selectively expand the agents' state space during the learning process. For scenarios where mediator-based mechanisms with partially global views are ineffective, a more rigorous approach for global conflict resolution that synthesizes multiagent reinforcement learning (MARL) and distributed constraint optimization (DCOP) is developed. These mechanisms are evaluated in the context of a multiagent tornado tracking application called NetRads. Empirical results show that these mechanisms significantly improve the performance of the tornado tracking network for a variety of weather scenarios.
The major contributions of this work are: a state-of-the-art decentralized learning approach that supports agent interactions and reorganizes the underlying network when needed; the use of abstract classes of scenarios/states/actions that efficiently manages the exploration of the search space; novel conflict resolution algorithms of increasing complexity that use heuristic rules, sophisticated automated negotiation mechanisms, and distributed constraint optimization methods, respectively; and finally, a rigorous study of the interplay between two popular theories used to solve multiagent problems, namely decentralized Markov decision processes and distributed constraint optimization.

    Experiences Building a Distributed Sensor Network
    Victor Lesser

    Extended Abstract. A central challenge in building advanced sensor networks will be the development of distributed and robust control for such networks that scales to thousands of intelligent sensors. Together with this adaptive re-structuring of long-term roles and responsibilities, there is also a need for short-term adaptivity related to the dynamic allocation of sensors. This involves not only allocating the appropriate configuration of sensing/processing resources for effectively sensing the phenomena, but also resolving conflicting resource assignments that may occur when multiple phenomena in the environment need to be tracked concurrently. More generally, this structuring can be thought of as organizational control. Organizational control is a multilevel control approach in which organizational goals, roles, and responsibilities are dynamically developed, distributed, and maintained to serve as guidelines for making detailed operational control decisions by the individual agents. The parameters guiding the creation and adaptation of the organization can have a dramatic impact on the performance of the sensor network. We have recently completed work on a small-scale sensor network (approximately 36 low-cost, adjustable radar nodes) for multivehicle tracking. Our approach is built upon a soft, real-time agent architecture called SRTA, which we constructed as part of this effort. Built upon this agent architecture is a virtual agent organization based on partitioning the environment into geographically self-contained sectors, each with its own local management. Each of these sectors has a sector manager, a role in the organization which has several responsibilities associated with information flow and activity within the sector.
Among these responsibilities is the dissemination of a scan schedule to each of the sensors in its sector, specifying the rate and frequency that should be used to scan for new targets. This information is used by each sensor to create a description of the scanning task, which is in turn used by the SRTA architecture to schedule local activities. When a new target is detected, the sector manager selects a track manager, a different organizational role responsible for tracking that target as it moves through the environment. This allocation process uses an abstract view of what activities are presently being conducted in the sector to make a choice that load balances processor and communication requirements. Track manager activities entail estimating future location and heading, gathering available sensor information, requesting and negotiating over the sensors, and fusing the data they produce. Upon receipt of such a commitment to perform tracking, a sensor takes on a data collection role. Like the scan schedule, these commitments are used to generate task descriptions used by SRTA to schedule local activities. If a sensor receives conflicting commitments that imply the agent has been asked to perform multiple concurrent data collection roles, SRTA will attempt to satisfy all requests as best it can. This provides a window of marginal quality in which a conflict can be detected and then potentially resolved through negotiation with the competing agent to find an equitable long-term solution. As data is gathered, it is fused and interpreted to estimate the target's location, which allows the process to continue. We call this a virtual agent organization since a particular sensor/processor node may be multiplexing among different roles, e.g. sector manager and data collection. The SRTA architecture does the detailed scheduling of activities associated with different roles based on their priority and deadline.
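The load-balancing choice of a track manager described above can be illustrated with a minimal sketch. This is not the authors' allocation procedure: the `AgentLoad` record, the `select_track_manager` function, the normalized load fractions, and the fixed cost of a tracking role are all hypothetical, chosen only to show one way a sector manager might balance processor and communication requirements from an abstract view of current activity.

```python
from dataclasses import dataclass


@dataclass
class AgentLoad:
    """Abstract view of one agent's current activity (hypothetical fields)."""
    name: str
    cpu: float   # fraction of processor budget already in use, 0.0 to 1.0
    comm: float  # fraction of communication budget already in use, 0.0 to 1.0


def select_track_manager(agents, cpu_cost=0.2, comm_cost=0.3):
    """Pick the agent whose busiest resource stays lowest after adding tracking.

    cpu_cost and comm_cost are assumed, illustrative costs of the track
    manager role. Returns None when no agent has spare capacity.
    """
    def projected_peak(agent):
        return max(agent.cpu + cpu_cost, agent.comm + comm_cost)

    feasible = [a for a in agents if projected_peak(a) <= 1.0]
    if not feasible:
        return None
    return min(feasible, key=projected_peak)


sector = [
    AgentLoad("s1", cpu=0.7, comm=0.2),
    AgentLoad("s2", cpu=0.3, comm=0.4),
    AgentLoad("s3", cpu=0.6, comm=0.1),
]
chosen = select_track_manager(sector)
print(chosen.name if chosen else "no agent has spare capacity")
```

Minimizing the projected peak load, rather than the sum, is one simple design choice among several: it avoids assigning the role to an agent that has idle communication capacity but an almost saturated processor.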
The planning and scheduling ability of the SRTA architecture also allows us to approach the dynamic allocation of sensors to tracking tasks at an abstract level. Commitments made at this abstract level are then mapped into detailed allocations of sensor resources and data processing activities. The organizational structuring we have discussed so far involves setting up long-term patterns of control and information processing. There is also a need for setting up more short-term and dynamic patterns involving the allocation of groups of sensors (sensor platforms and sensor heads) to the tracking of the movement of a specific vehicle. Since sensor heads have limited sensing range and orientation and the vehicle is moving, this allocation process must be repeated as the current group of sensors becomes inappropriate for tracking the vehicle. Further, the need for this allocation process may occur simultaneously in different parts of the sensor network when there are multiple vehicles moving in the environment. Finally, this allocation process is intimately tied to the information fusing activities that track the current locations of vehicles and predict where they are likely to be going. The real-time ability to do this prediction accurately is key to having sensing resources appropriately allocated to sense the vehicle when it arrives in their sensing region. Resource contention is introduced when more than one target enters the viewable range of the same sensor platform. This type of resource allocation can be too complex and time consuming to perform in a centralized manner when the environmental characteristics are both distributed and dynamic, because the costs associated with continuously centralizing the necessary information are impractical.
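As a rough illustration of how resource contention arises, the sketch below flags every sensor platform that has more than one target inside its viewable range. It is a simplification under stated assumptions: the `contended_sensors` function, the 2-D coordinates, and the circular sensing range are hypothetical, and the orientation limits of individual sensor heads mentioned above are ignored.

```python
import math


def contended_sensors(sensors, targets):
    """Map each contended sensor to the targets currently inside its range.

    sensors: {sensor_id: (x, y, sensing_range)}  (hypothetical layout)
    targets: {target_id: (x, y)}
    A sensor is contended when more than one target is within its range.
    """
    conflicts = {}
    for sensor_id, (sx, sy, sensing_range) in sensors.items():
        in_view = [
            target_id
            for target_id, (tx, ty) in targets.items()
            if math.hypot(tx - sx, ty - sy) <= sensing_range
        ]
        if len(in_view) > 1:
            conflicts[sensor_id] = sorted(in_view)
    return conflicts


sensors = {"A": (0.0, 0.0, 5.0), "B": (10.0, 0.0, 5.0)}
targets = {"t1": (1.0, 1.0), "t2": (3.0, 0.0), "t3": (12.0, 0.0)}
print(contended_sensors(sensors, targets))  # {'A': ['t1', 't2']}
```

Note that this check is computed from a global view for clarity only; the text argues that continuously centralizing this information is impractical, which is why the conflicts are resolved through distributed negotiation instead.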
Negotiation, a form of distributed search [...] Our approach, called SPAM (the Scalable Protocol for Anytime Multi-level negotiation) [...] In summary, the use of a sophisticated agent architecture (that includes capabilities for planning and scheduling) and distributed resource allocation mechanisms for short-term agent control and resource allocation, together with an organization structure for long-term agent control, creates a powerful paradigm for building the next generation of large-scale and intelligent sensor networks. More generally, we see these techniques as applicable to the building of advanced multi-agent applications.