
    Stick-Breaking Policy Learning in Dec-POMDPs

    Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to maxima that are far from optimal. This paper considers a variable-size FSC to represent the local policy of each agent. These variable-size FSCs are constructed using a stick-breaking prior, leading to a new framework called decentralized stick-breaking policy representation (Dec-SBPR). This approach learns the controller parameters with a variational Bayesian algorithm without having to assume that the Dec-POMDP model is available. The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods.
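
    As background, the stick-breaking construction the paper builds on can be sketched in a few lines of Python. This is a generic illustration of how weights over a variable number of controller nodes can be drawn from a stick-breaking prior; the concentration parameter, truncation level, and function name are illustrative assumptions, not details taken from Dec-SBPR.

        import numpy as np

        def stick_breaking_weights(alpha, max_nodes, rng=None):
            """Draw weights over controller nodes via stick breaking.

            Each weight is a piece broken off the remaining stick:
                v_k ~ Beta(1, alpha),  w_k = v_k * prod_{j<k} (1 - v_j)
            Small alpha concentrates mass on a few nodes; larger alpha spreads
            it out, which is how the prior lets the effective controller size vary.
            """
            rng = rng or np.random.default_rng()
            v = rng.beta(1.0, alpha, size=max_nodes)                 # broken-off fractions
            remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
            return v * remaining                                     # truncated weights, sum < 1

        weights = stick_breaking_weights(alpha=2.0, max_nodes=10)
        print(weights.round(3), float(weights.sum()))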

    Topology-Aware Approach for the Emergence of Social Norms in Multiagent Systems

    Social norms facilitate agent coordination and conflict resolution without explicit communication. Norms generally involve restrictions on a set of actions or behaviors of agents to a particular strategy and can significantly reduce the cost of coordination. There has been recent progress in multiagent systems (MAS) research to develop a deep understanding of the social norm formation process. This includes developing mechanisms to create social norms in an effective and efficient manner. The hypothesis of this dissertation is that equipping agents in networked MAS with “network thinking” capabilities and using this contextual knowledge to form social norms in an effective and efficient manner improves the performance of the MAS. This dissertation investigates the social norm emergence problem in conventional norms (where there is no conflict between individual and collective interests) and essential norms (where agents need to explicitly cooperate to achieve socially-efficient behavior) from a game-theoretic perspective. First, a comprehensive investigation of the social norm formation problem is performed in various types of networked MAS with an emphasis on the effect of the topological structures on the process. Based on the insights gained from these network-theoretic investigations, novel topology-aware decentralized mechanisms are developed that facilitate the emergence of social norms suitable for various environments. The dissertation addresses the convention emergence problem in both small and large conventional norm spaces and equips agents to predict the topological structure in order to use the suitable convention mechanisms. It addresses the cooperation emergence problem in the essential norm space by harnessing agent commitments and altruism where appropriate. Extensive simulation-based experimentation has been conducted on different network topologies by varying the topological features and agent interaction models. Comparisons with state-of-the-art norm formation techniques show that the proposed mechanisms facilitate significant improvement in performance in a variety of networks.
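
    As a generic illustration of the convention-emergence dynamics studied in this setting (not the dissertation's topology-aware mechanism), the sketch below simulates agents on a ring network who repeatedly adopt the convention most common among their neighbours; the topology, update rule, and parameters are illustrative assumptions.

        import random

        def simulate_convention_emergence(n_agents=50, n_conventions=2,
                                          rounds=2000, seed=0):
            """Agents on a ring imitate the convention most common among neighbours."""
            rng = random.Random(seed)
            state = [rng.randrange(n_conventions) for _ in range(n_agents)]
            neighbours = {i: [(i - 1) % n_agents, (i + 1) % n_agents]
                          for i in range(n_agents)}
            for _ in range(rounds):
                i = rng.randrange(n_agents)                  # asynchronous updates
                counts = [0] * n_conventions
                for j in neighbours[i]:
                    counts[state[j]] += 1
                state[i] = counts.index(max(counts))         # adopt the local majority
            # fraction of agents sharing the most widespread convention
            return max(state.count(c) for c in range(n_conventions)) / n_agents

        print(simulate_convention_emergence())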

    Bayesian learning for multi-agent coordination

    Multi-agent systems draw together a number of significant trends in modern technology: ubiquity, decentralisation, openness, dynamism and uncertainty. As work in these fields develops, such systems face increasing challenges. Two particular challenges are decision making in uncertain and partially-observable environments, and coordination with other agents in such environments. Although uncertainty and coordination have been tackled as separate problems, formal models for an integrated approach are typically restricted to simple classes of problem and are not scalable to problems with tens of agents and millions of states. We improve on these approaches by extending a principled Bayesian model into more challenging domains, using Bayesian networks to visualise specific cases of the model and thus as an aid in deriving the update equations for the system. One approach which has been shown to scale well for networked offline problems uses finite state machines to model other agents. We used this insight to develop an approximate scalable algorithm applicable to our general model, in combination with adapting a number of existing approximation techniques, including state clustering. We examine the performance of this approximate algorithm on several cases of an urban rescue problem with respect to differing problem parameters. Specifically, we consider first scenarios where agents are aware of the complete situation, but are not certain about the behaviour of others; that is, our model with all elements but the actions observable. Secondly, we examine the more complex case where agents can see the actions of others, but cannot see the full state and thus are not sure about the beliefs of others. Finally, we look at the performance of the partially observable state model when the system is dynamic or open. We find that our best response algorithm consistently outperforms a handwritten strategy for the problem, more noticeably as the number of agents and the number of states involved in the problem increase.
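
    The general idea of maintaining a Bayesian model of another agent's behaviour can be illustrated with a minimal sketch, assuming a Dirichlet-multinomial model over the other agent's actions followed by a best response; the payoff matrix, prior, and class name are hypothetical and not taken from the thesis.

        import numpy as np

        # Hypothetical 2x2 coordination payoffs for "our" agent:
        # rows = our action, columns = the other agent's action.
        PAYOFF = np.array([[4.0, 0.0],
                           [0.0, 3.0]])

        class DirichletOpponentModel:
            """Dirichlet posterior over the other agent's action frequencies."""
            def __init__(self, n_actions, prior=1.0):
                self.counts = np.full(n_actions, prior)   # Dirichlet pseudo-counts

            def observe(self, action):
                self.counts[action] += 1.0                # conjugate posterior update

            def predictive(self):
                return self.counts / self.counts.sum()    # expected action distribution

            def best_response(self, payoff):
                return int(np.argmax(payoff @ self.predictive()))

        model = DirichletOpponentModel(n_actions=2)
        for observed in [1, 1, 0, 1]:                     # other agent's observed actions
            model.observe(observed)
        print(model.predictive().round(2), model.best_response(PAYOFF))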

    A Comprehensive Survey of Multiagent Reinforcement Learning


    Multiagent systems: games and learning from structures

    Multiple agents have become increasingly utilized in various fields for both physical robots and software agents, such as search and rescue robots, automated driving, auctions and electronic commerce agents, and so on. In multiagent domains, agents interact and co-adapt with other agents. Each agent's choice of policy depends on the others' joint policy to achieve the best available performance. During this process, the environment evolves and is no longer stationary, as each agent adapts to proceed towards its target. Each micro-level step in time may present a different learning problem which needs to be addressed. However, in this non-stationary environment, a holistic phenomenon forms along with the rational strategies of all players; we define this phenomenon as structural properties. In our research, we demonstrate the importance of analyzing these structural properties and show how to extract them in multiagent environments. According to the agents' objectives, a multiagent environment can be classified as self-interested, cooperative, or competitive. We examine the structure in these three general multiagent environments: self-interested random graphical game playing, distributed cooperative team playing, and competitive group survival. In each scenario, we analyze the structure in its environmental setting and demonstrate the learned structure as a comprehensive representation: the structure of players' action influence, the structure of constraints in teamwork communication, and the structure of inter-connections among strategies. This structure represents macro-level knowledge arising in a multiagent system, and provides critical, holistic information for each problem domain. Last, we present some open issues and point toward future research.
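
    For readers unfamiliar with the game-theoretic setting, a minimal, generic sketch of enumerating pure-strategy Nash equilibria in a two-player normal-form game is shown below; the payoff matrices are arbitrary illustrative values and the routine is not taken from this work.

        import numpy as np

        def pure_nash_equilibria(payoff_row, payoff_col):
            """Enumerate pure-strategy Nash equilibria of a two-player game.

            payoff_row[i, j] / payoff_col[i, j] are the payoffs when the row
            player plays i and the column player plays j.
            """
            equilibria = []
            for i in range(payoff_row.shape[0]):
                for j in range(payoff_row.shape[1]):
                    row_best = payoff_row[i, j] >= payoff_row[:, j].max()
                    col_best = payoff_col[i, j] >= payoff_col[i, :].max()
                    if row_best and col_best:        # neither player gains by deviating
                        equilibria.append((i, j))
            return equilibria

        # Illustrative coordination game with two pure equilibria.
        A = np.array([[2, 0], [0, 1]])
        B = np.array([[2, 0], [0, 1]])
        print(pure_nash_equilibria(A, B))            # [(0, 0), (1, 1)]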

    The Role of Models and Communication in the Ad Hoc Multiagent Team Decision Problem

    Ad hoc teams are formed of members who have little or no information regarding one another. In order to achieve a shared goal, agents are tasked with learning the capabilities of their teammates such that they can coordinate effectively. Typically, the capabilities of the agent teammates encountered are constrained by the particular domain specifications. However, for wide application, it is desirable to develop systems that are able to coordinate with general ad hoc agents independent of the choice of domain. We propose examining ad hoc multiagent teamwork from a generalized perspective and discuss existing domains within the context of our framework. Furthermore, we consider how communication of agent intentions can provide a means of reducing teammate model uncertainty at key junctures, requiring an agent to consider its own information deficiencies in order to form communicative acts improving team coordination.
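
    A minimal sketch of the communication idea above, assuming a hypothetical belief over teammate types: the agent requests a teammate's intention only when the entropy of its belief exceeds a threshold, trading communication cost against reduced model uncertainty. The function names and threshold are illustrative assumptions, not the paper's mechanism.

        import math

        def belief_entropy(belief):
            """Shannon entropy (in bits) of a distribution over teammate types."""
            return -sum(p * math.log2(p) for p in belief if p > 0)

        def should_communicate(belief, threshold_bits=0.8):
            """Query the teammate's intention only when our model is too uncertain."""
            return belief_entropy(belief) > threshold_bits

        # Belief over three hypothetical teammate types after a few observations.
        belief = [0.6, 0.3, 0.1]
        print(round(belief_entropy(belief), 3), should_communicate(belief))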