265,843 research outputs found

    Innate-Values-driven Reinforcement Learning for Cooperative Multi-Agent Systems

    Innate values describe agents' intrinsic motivations, which reflect their inherent interests and preferences to pursue goals and drive them to develop diverse skills satisfying their various needs. The essence of reinforcement learning (RL) is learning from interaction based on reward-driven (such as utilities) behaviors, much like natural agents. It is an excellent model to describe the innate-values-driven (IV) behaviors of AI agents. Especially in multi-agent systems (MAS), building the awareness of AI agents to balance the group utilities and system costs and satisfy group members' needs in their cooperation is a crucial problem for individuals learning to support their community and integrate into human society in the long term. This paper proposes a hierarchical compound intrinsic value reinforcement learning model, innate-values-driven reinforcement learning (IVRL), to describe the complex behaviors of multi-agent interaction in their cooperation. We implement the IVRL architecture in the StarCraft Multi-Agent Challenge (SMAC) environment and compare the cooperative performance within three characteristics of innate value agents (Coward, Neutral, and Reckless) through three benchmark multi-agent RL algorithms: QMIX, IQL, and QTRAN. The results demonstrate that by organizing individuals' various needs rationally, the group can achieve better performance with lower costs. Comment: This paper was accepted by the 38th AAAI 2024 workshop: "Cooperative Multi-Agent Systems Decision-Making and Learning: From Individual Needs to Swarm Intelligence"

    Soft behaviour modelling of user communities

    A soft modelling approach for describing behaviour in on-line user communities is introduced in this work. Behaviour models of individual users in dynamic virtual environments have been described in the literature in terms of timed transition automata; they have various drawbacks. Soft multi-agent behaviour automata are defined and proposed to describe multiple user behaviours and to recognise larger classes of user group histories, such as group histories which contain unexpected behaviours. The notion of deviation from the user community model allows defining a soft parsing process which assesses and evaluates the dynamic behaviour of a group of users interacting in virtual environments, such as e-learning and e-business platforms. The soft automaton model can describe virtually infinite sequences of actions due to multiple users and subject to temporal constraints. Soft measures assess a form of distance of observed behaviours by evaluating the amount of temporal deviation, additional or omitted actions contained in an observed history, as well as actions performed by unexpected users. The proposed model allows the soft recognition of user group histories even when the observed actions only partially meet the given behaviour model constraints. This approach is more realistic for real-time user community support systems than standard Boolean model recognition when more than one user model is potentially available, and the extent of deviation from community behaviour models can be used as a guide to generate the system support by anticipation, projection and other known techniques. Experiments based on logs from an e-learning platform and plan compilation of the soft multi-agent behaviour automaton show the expressiveness of the proposed model.

    MARBLER: An Open Platform for Standardized Evaluation of Multi-Robot Reinforcement Learning Algorithms

    Multi-agent reinforcement learning (MARL) has enjoyed significant recent progress, thanks to deep learning. This is naturally starting to benefit multi-robot systems (MRS) in the form of multi-robot RL (MRRL). However, existing infrastructure to train and evaluate policies predominantly focuses on challenges in coordinating virtual agents, and ignores characteristics important to robotic systems. Few platforms support realistic robot dynamics, and fewer still can evaluate Sim2Real performance of learned behavior. To address these issues, we contribute MARBLER: Multi-Agent RL Benchmark and Learning Environment for the Robotarium. MARBLER offers a robust and comprehensive evaluation platform for MRRL by marrying Georgia Tech's Robotarium (which enables rapid prototyping on physical MRS) and OpenAI's Gym framework (which facilitates standardized use of modern learning algorithms). MARBLER offers a highly controllable environment with realistic dynamics, including barrier certificate-based obstacle avoidance. It allows anyone across the world to train and deploy MRRL algorithms on a physical testbed with reproducibility. Further, we introduce five novel scenarios inspired by common challenges in MRS and provide support for new custom scenarios. Finally, we use MARBLER to evaluate popular MARL algorithms and provide insights into their suitability for MRRL. In summary, MARBLER can be a valuable tool to the MRS research community by facilitating comprehensive and standardized evaluation of learning algorithms on realistic simulations and physical hardware. Links to our open-source framework and the videos of real-world experiments can be found at https://shubhlohiya.github.io/MARBLER/. Comment: 7 pages, 3 figures, submitted to MRS 2023, for the associated website, see https://shubhlohiya.github.io/MARBLER

    Hypothesis Generation Using Network Structures on Community Health Center Cancer-Screening Performance

    RESEARCH OBJECTIVES: Nationally sponsored cancer-care quality-improvement efforts have been deployed in community health centers to increase breast, cervical, and colorectal cancer-screening rates among vulnerable populations. Despite several immediate and short-term gains, screening rates remain below national benchmark objectives. Overall improvement has been difficult to sustain over time in some organizational settings and challenging to diffuse to other settings as repeatable best practices. Reasons for this include facility-level changes, which typically occur in dynamic organizational environments that are complex, adaptive, and unpredictable. This study seeks to understand the factors that shape community health center facility-level cancer-screening performance over time. This study applies a computational-modeling approach, combining principles of health-services research, health informatics, network theory, and systems science. METHODS: To investigate the roles of knowledge acquisition, retention, and sharing within the setting of the community health center and to examine their effects on the relationship between clinical decision support capabilities and improvement in cancer-screening rates, we employed Construct-TM to create simulated community health centers using previously collected point-in-time survey data. Construct-TM is a multi-agent model of network evolution. Because social, knowledge, and belief networks co-evolve, groups and organizations are treated as complex systems to capture the variability of human and organizational factors. In Construct-TM, individuals and groups interact by communicating, learning, and making decisions in a continuous cycle. Data from the survey was used to differentiate high-performing simulated community health centers from low-performing ones based on computer-based decision support usage and self-reported cancer-screening improvement. 
RESULTS: This virtual experiment revealed that patterns of overall network symmetry, agent cohesion, and connectedness varied by community health center performance level. Visual assessment of both the agent-to-agent knowledge sharing network and agent-to-resource knowledge use network diagrams demonstrated that community health centers labeled as high performers typically showed higher levels of collaboration and cohesiveness among agent classes, faster knowledge-absorption rates, and fewer agents that were unconnected to key knowledge resources. CONCLUSIONS AND RESEARCH IMPLICATIONS: Using the point-in-time survey data outlining community health center cancer-screening practices, our computational model successfully distinguished between high and low performers. Results indicated that high-performance environments displayed distinctive network characteristics in patterns of interaction among agents, as well as in the access and utilization of key knowledge resources. Our study demonstrated how non-network-specific data obtained from a point-in-time survey can be employed to forecast community health center performance over time, thereby enhancing the sustainability of long-term strategic-improvement efforts. Our results revealed a strategic profile for community health center cancer-screening improvement via simulation over a projected 10-year period. The use of computational modeling allows additional inferential knowledge to be drawn from existing data when examining organizational performance in increasingly complex environments.

    Engineering Local Electricity Markets for Residential Communities

    In line with the progressing decentralization of electricity generation, local electricity markets (LEMs) support electricity end customers in becoming active market participants instead of passive price takers. They provide a market platform for trading locally generated (renewable) electricity between residential agents (consumers, prosumers, and producers) within their community. Based on a structured literature review, a market engineering framework for LEMs is developed. The work focuses on two of the framework's eight components, namely the agent behavior and the (micro) market structure. Residential agent behavior is evaluated in two steps. Firstly, two empirical studies, a structural equation model-based survey with 195 respondents and an adaptive choice-based conjoint study with 656 respondents, are developed, conducted and evaluated. Secondly, a discount price LEM is designed following the surveys' results. Theoretical solutions of the LEM bi-level optimization problem with complete information and heuristic reinforcement learning with incomplete information are investigated in a multi-agent simulation to find the profit-maximizing market allocations. The (micro) market structure is investigated with regard to LEM business models, information systems and real-world application projects. Potential business models and their characteristics are combined in a taxonomy based on the results of 14 expert interviews. Then, the Smart Grid Architecture Model is utilized to derive the organizational, informational, and technical requirements for centralized and distributed information systems in LEMs. After providing an overview of current LEM implementation projects in Germany, the Landau Microgrid Project is used as an example to test the derived requirements. In conclusion, the work recommends that current LEM projects focus on overall discount electricity trading. 
Premium priced local electricity should be offered to subgroups of households with individual higher valuations for local generation. Automated self-learning algorithms are needed to mitigate the trading effort for residential LEM agents in order to ensure participation. The utilization of regulatory niches is suggested until specific regulations for LEMs are established. Further, the development of specific business models for LEMs should become a prospective (research) focus

    Building Ethically Bounded AI

    The more AI agents are deployed in scenarios with possibly unexpected situations, the more they need to be flexible, adaptive, and creative in achieving the goal we have given them. Thus, a certain level of freedom to choose the best path to the goal is inherent in making AI robust and flexible enough. At the same time, however, the pervasive deployment of AI in our life, whether AI is autonomous or collaborating with humans, raises several ethical challenges. AI agents should be aware of and follow appropriate ethical principles and should thus exhibit properties such as fairness or other virtues. These ethical principles should define the boundaries of AI's freedom and creativity. However, it is still a challenge to understand how to specify and reason with ethical boundaries in AI agents and how to combine them appropriately with subjective preferences and goal specifications. Some initial attempts employ either a data-driven example-based approach for both, or a symbolic rule-based approach for both. We envision a modular approach where any AI technique can be used for any of these essential ingredients in decision making or decision support systems, paired with a contextual approach to define their combination and relative weight. In a world where neither humans nor AI systems work in isolation, but are tightly interconnected, e.g., the Internet of Things, we also envision a compositional approach to building ethically bounded AI, where the ethical properties of each component can be fruitfully exploited to derive those of the overall system. In this paper we define and motivate the notion of ethically-bounded AI, we describe two concrete examples, and we outline some outstanding challenges. Comment: Published at AAAI Blue Sky Track, winner of Blue Sky Awar