6,582 research outputs found

    Emergence and resilience in multi-agent reinforcement learning

    Our world represents an enormous multi-agent system (MAS), consisting of a plethora of agents that make decisions under uncertainty to achieve certain goals. The interaction of agents constantly affects our world in various ways, leading to the emergence of interesting phenomena like life forms and civilizations that can last for many years while withstanding various kinds of disturbances. Building artificial MAS that are able to adapt and survive similarly to natural MAS is a major goal in artificial intelligence as a wide range of potential real-world applications like autonomous driving, multi-robot warehouses, and cyber-physical production systems can be straightforwardly modeled as MAS. Multi-agent reinforcement learning (MARL) is a promising approach to build such systems which has achieved remarkable progress in recent years. However, state-of-the-art MARL commonly assumes very idealized conditions to optimize performance in best-case scenarios while neglecting further aspects that are relevant to the real world. In this thesis, we address emergence and resilience in MARL which are important aspects to build artificial MAS that adapt and survive as effectively as natural MAS do. We first focus on emergent cooperation from local interaction of self-interested agents and introduce a peer incentivization approach based on mutual acknowledgments. We then propose to exploit emergent phenomena to further improve coordination in large cooperative MAS via decentralized planning or hierarchical value function factorization. To maintain multi-agent coordination in the presence of partial changes similar to classic distributed systems, we present adversarial methods to improve and evaluate resilience in MARL. 
Finally, we briefly cover a selection of further topics that are relevant to advance MARL towards real-world applicability.

    Agents for educational games and simulations

    This book consists mainly of revised papers that were presented at the Agents for Educational Games and Simulation (AEGS) workshop held on May 2, 2011, as part of the Autonomous Agents and MultiAgent Systems (AAMAS) conference in Taipei, Taiwan. The 12 full papers presented were carefully reviewed and selected from various submissions. The papers are organized in topical sections on middleware applications, dialogues and learning, adaption and convergence, and agent applications.

    Towards adaptive multi-robot systems: self-organization and self-adaptation

    This publication is freely accessible with the permission of the rights owner under an Alliance licence or a national licence, respectively (funded by the DFG, German Research Foundation). The development of complex system ensembles that operate in uncertain environments is a major challenge, because system designers cannot fully specify the system during specification and development, before it is deployed. Natural swarm systems share similar characteristics; yet, being self-adaptive and able to self-organize, they show beneficial emergent behaviour. Similar concepts can be extremely helpful for artificial systems, especially in multi-robot scenarios, which require such solutions in order to be applicable to highly uncertain real-world applications. In this article, we present a comprehensive overview of state-of-the-art solutions in emergent systems, self-organization, self-adaptation, and robotics. We discuss these approaches in the light of a framework for multi-robot systems and identify similarities, differences, missing links, and open gaps that have to be addressed in order to make this framework possible.

    Dynamic Switching Mechanism to Support Self-organization in ADACOR Holonic Control System

    Evolvable control systems face the demands for modularity, decentralization, reconfigurability and responsiveness pointed out by the Industrie 4.0 initiative. In these systems, the self-organization model is critical to ensuring the correct evolution of the system structure into different operating configurations. The ADACOR holonic manufacturing control architecture introduces an adaptive production control mechanism that balances between two states, combining the optimization provided by hierarchical structures with the agility and responsiveness to changing conditions offered by decentralized structures. This paper describes the switching mechanism that supports this dynamic balance, particularly the local and global driving forces for the self-organization model. The proposed model was experimentally tested in a small-scale production system.

    Predicting the expected behavior of agents that learn about agents: the CLRI framework

    We describe a framework and equations used to model and predict the behavior of multi-agent systems (MASs) with learning agents. A difference equation is used for calculating the progression of an agent's error in its decision function, thereby telling us how the agent is expected to fare in the MAS. The equation relies on parameters which capture the agent's learning abilities, such as its change rate, learning rate and retention rate, as well as relevant aspects of the MAS such as the impact that agents have on each other. We validate the framework with experimental results using reinforcement learning agents in a market system, as well as with other experimental results gathered from the AI literature. Finally, we use PAC-theory to show how to calculate bounds on the values of the learning parameters.

    Games for a new climate: experiencing the complexity of future risks

    This repository item contains a single issue of the Pardee Center Task Force Reports, a publication series launched in 2009 by the Boston University Frederick S. Pardee Center for the Study of the Longer-Range Future. This report is a product of the Pardee Center Task Force on Games for a New Climate, which met at Pardee House at Boston University in March 2012. The 12-member Task Force was convened on behalf of the Pardee Center by Visiting Research Fellow Pablo Suarez in collaboration with the Red Cross/Red Crescent Climate Centre to “explore the potential of participatory, game-based processes for accelerating learning, fostering dialogue, and promoting action through real-world decisions affecting the longer-range future, with an emphasis on humanitarian and development work, particularly involving climate risk management.” Compiled and edited by Janot Mendler de Suarez, Pablo Suarez and Carina Bachofen, the report includes contributions from all of the Task Force members and provides a detailed exploration of the current and potential ways in which games can be used to help a variety of stakeholders – including subsistence farmers, humanitarian workers, scientists, policymakers, and donors – to both understand and experience the difficulty and risks involved in decision-making in a complex and uncertain future. The dozen Task Force experts who contributed to the report represent academic institutions, humanitarian organizations, other non-governmental organizations, and game design firms, with backgrounds ranging from climate modeling and anthropology to community-level disaster management and national and global policymaking as well as game design.

    Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests

    To what degree should we ascribe cognitive capacities to Large Language Models (LLMs), such as the ability to reason about intentions and beliefs known as Theory of Mind (ToM)? Here we add to this emerging debate by (i) testing 11 base- and instruction-tuned LLMs on capabilities relevant to ToM beyond the dominant false-belief paradigm, including non-literal language usage and recursive intentionality; (ii) using newly rewritten versions of standardized tests to gauge LLMs' robustness; (iii) prompting and scoring for open besides closed questions; and (iv) benchmarking LLM performance against that of children aged 7-10 on the same tasks. We find that instruction-tuned LLMs from the GPT family outperform other models, and often also children. Base-LLMs are mostly unable to solve ToM tasks, even with specialized prompting. We suggest that the interlinked evolution and development of language and ToM may help explain what instruction-tuning adds: rewarding cooperative communication that takes into account interlocutor and context. We conclude by arguing for a nuanced perspective on ToM in LLMs. Comment: 14 pages, 4 figures, forthcoming in Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)

    Dynamic learning of the environment for eco-citizen behavior

    The development of sustainable smart cities requires the deployment of Information and Communication Technology (ICT) to ensure better services and information available at any time and everywhere. Although IoT devices are becoming more powerful and cheaper, implementing an extensive sensor network in an urban context can be expensive. This thesis proposes a technique for estimating missing environmental information in large-scale environments. Our technique provides information for areas of the environment that are not covered by sensing devices.
The contributions of our proposal are summarized in the following points: limiting the number of sensing devices to be deployed in an urban environment; exploiting heterogeneous data acquired from intermittent devices; real-time processing of information; and self-calibration of the system. Our proposal uses the Adaptive Multi-Agent System (AMAS) approach to solve the problem of information unavailability. In this approach, an exception is considered a Non-Cooperative Situation (NCS) that has to be solved locally and cooperatively. HybridIoT exploits both homogeneous information (information of the same type) and heterogeneous information (information of different types or units) acquired from available sensing devices to provide accurate estimates at the point of the environment where a sensing device is not available. The proposed technique enables accurate estimation of environmental information under conditions of uncertainty that arise from the urban application context in which the project is situated and that have not been explored by state-of-the-art solutions: openness (sensors can enter or leave the system at any time without the need for any reconfiguration); large scale (the system can be deployed in a large urban context and ensure correct operation with a significant number of devices); and heterogeneity (the system handles different types of information without any a priori configuration). Our proposal does not require any input parameters or reconfiguration. The system can operate in open, dynamic environments such as cities, where a large number of sensing devices can appear or disappear at any time and without any prior notification. We carried out different experiments to compare the obtained results to various standard techniques to assess the validity of our proposal. We also developed a pipeline of standard techniques to produce baseline results that will be compared to those obtained by our multi-agent proposal.

    A Survey of Agent-Based Modeling Practices (January 1998 to July 2008)

    In the 1990s, Agent-Based Modeling (ABM) began gaining popularity; it represents a departure from the more classical simulation approaches. This departure, its recent development and its increasing application by non-traditional simulation disciplines indicate the need to continuously assess the current state of ABM and identify opportunities for improvement. To begin to satisfy this need, we surveyed and collected data from 279 articles from 92 unique publication outlets in which the authors had constructed and analyzed an agent-based model. From this large data set we establish the current practice of ABM in terms of year of publication, field of study, simulation software used, purpose of the simulation, acceptable validation criteria, validation techniques and complete description of the simulation. Based on the current practice we discuss six improvements needed to advance ABM as an analysis tool. These improvements include the development of ABM-specific tools that are independent of software, the development of ABM as an independent discipline with a common language that extends across domains, the establishment of expectations for ABM that match their intended purposes, the requirement of complete descriptions of the simulation so others can independently replicate the results, the requirement that all models be completely validated and the development and application of statistical and non-statistical validation techniques specifically for ABM. Keywords: Agent-Based Modeling, Survey, Current Practices, Simulation Validation, Simulation Purpose

    A Framework for Collaborative Multi-task, Multi-robot Missions

    Robotics is a transformative technology that will empower our civilization for a new scale of human endeavors. Massive scale is only possible through the collaboration of individual or groups of robots. Collaboration allows specialization, meaning a multirobot system may accommodate heterogeneous platforms including human partners. This work develops a unified control architecture for collaborative missions comprised of multiple, multi-robot tasks. Using kinematic equations and Jacobian matrices, the system states are transformed into alternative control spaces which are more useful for the designer or more convenient for the operator. The architecture allows multiple tasks to be combined, composing tightly coordinated missions. Using this approach, the designer is able to compensate for non-ideal behavior in the appropriate space using whatever control scheme they choose. This work presents a general design methodology, including analysis techniques for relevant control metrics like stability, responsiveness, and disturbance rejection, which were missing in prior work. Multiple tasks may be combined into a collaborative mission. The unified motion control architecture merges the control space components for each task into a concise federated system to facilitate analysis and implementation. The task coordination function defines task commands as functions of mission commands and state values to create explicit closed-loop collaboration. This work presents analysis techniques to understand the effects of cross-coupling tasks. This work analyzes system stability for the particular control architecture and identifies an explicit condition to ensure stable switching when reallocating robots. We are unaware of any other automated control architectures that address large-scale collaborative systems composed of task-oriented multi-robot coalitions where relative spatial control is critical to mission performance. 
This architecture and methodology have been validated in experiments and in simulations, repeating earlier work and exploring new scenarios. It can perform large-scale, complex missions via a rigorous design methodology.