
    Trial without Error: Towards Safe Reinforcement Learning via Human Intervention

    AI systems are increasingly applied to complex tasks that involve interaction with humans. During training, such systems are potentially dangerous, as they have not yet learned to avoid actions that could cause serious harm. How can an AI system explore and learn without making a single mistake that harms humans or otherwise causes serious damage? For model-free reinforcement learning, having a human "in the loop" and ready to intervene is currently the only way to prevent all catastrophes. We formalize human intervention for RL and show how to reduce the human labor required by training a supervised learner to imitate the human's intervention decisions. We evaluate this scheme on Atari games, with a deep RL agent overseen by a human for four hours. When the class of catastrophes is simple, we are able to prevent all catastrophes without affecting the agent's learning (whereas an RL baseline fails due to catastrophic forgetting). However, the scheme is less successful when catastrophes are more complex: it reduces but does not eliminate catastrophes, and the supervised learner fails on adversarial examples found by the agent. Extrapolating to more challenging environments, we show that our implementation would not scale, due to the infeasible amount of human labor required. We outline the extensions of the scheme that would be necessary to train model-free agents without a single catastrophe.
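    The scheme lends itself to a simple sketch: a learned "blocker" sits between the agent and the environment, vetoing actions it predicts to be catastrophic and penalizing the agent for proposing them. The Python sketch below is illustrative only, assuming generic env/agent interfaces; the names Blocker, safe_action, and the penalty term are ours, not the paper's code.

```python
# Minimal sketch of human-intervention RL with a learned blocker.
# Assumes: env.reset()/env.step() return (obs, reward, done)-style values,
# agent has act()/learn() methods, and `classifier` was trained on human
# veto decisions. All interfaces here are illustrative assumptions.

class Blocker:
    """Imitates a human overseer: vetoes actions predicted to be catastrophic."""

    def __init__(self, classifier, safe_action, penalty=-1.0):
        self.classifier = classifier    # supervised model of human interventions
        self.safe_action = safe_action  # action substituted when a veto fires
        self.penalty = penalty          # extra reward discouraging blocked actions

    def filter(self, observation, action):
        """Return the (possibly substituted) action plus an extra reward term."""
        if self.classifier(observation, action):  # True => predicted catastrophe
            return self.safe_action, self.penalty
        return action, 0.0

def run_episode(env, agent, blocker):
    obs, done, total = env.reset(), False, 0.0
    while not done:
        proposed = agent.act(obs)
        action, extra = blocker.filter(obs, proposed)
        next_obs, reward, done = env.step(action)
        # The agent learns from the penalized reward, so it is trained away
        # from catastrophic actions it never actually got to execute.
        agent.learn(obs, proposed, reward + extra, next_obs, done)
        obs, total = next_obs, total + reward
    return total
```

    The key design point is that the blocker intercepts the *proposed* action before execution, which is what lets training proceed without a single realized catastrophe, at the cost of a classifier that must generalize to the agent's adversarial probing.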

    Smart Residential Buildings as Learning Agent Organizations in the Internet of Things

    Background: Smart buildings are one of the major application areas of technologies bound to embedded systems and the Internet of Things. Such systems have to be adaptable and flexible in order to provide better services to their residents, and modelling them is an open research question. Herein, the question is approached using an organizational modelling methodology bound to the principles of the learning organization. Objectives: To provide a higher level of abstraction for understanding, developing, and maintaining smart residential buildings in a more human-understandable form. Methods/Approach: Organization theory provides the necessary concepts and methodology for approaching complex organizational systems. Results: A set of principles for building learning agent organizations, a formalization of learning processes for agents, a framework for modelling knowledge transfer between agents and the environment, and a tailored organizational structure for smart residential buildings based on Nonaka's hypertext organizational form. Conclusions: Organization theory is a promising field of research when dealing with complex engineering systems.
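    As a rough illustration of what knowledge transfer between agents and the environment might look like in such an organization, the sketch below models a Nonaka-style conversion between tacit experience and explicit, shareable knowledge. The Agent class and its internalize/externalize/combine methods are hypothetical, not the paper's formalization.

```python
# Hypothetical sketch of agent-to-agent knowledge transfer in a smart
# building, loosely following Nonaka's tacit/explicit distinction.
# Class and method names are assumptions for illustration.

class Agent:
    def __init__(self, name):
        self.name = name
        self.tacit = {}     # experience accumulated from the environment
        self.explicit = {}  # codified knowledge that can be shared

    def internalize(self, sensor, reading):
        """Learn from the environment (e.g., a room temperature sensor)."""
        self.tacit[sensor] = reading

    def externalize(self):
        """Codify tacit experience into explicit, shareable knowledge."""
        self.explicit.update(self.tacit)
        return dict(self.explicit)

    def combine(self, shared):
        """Absorb explicit knowledge published by another agent."""
        self.explicit.update(shared)

# Two room controllers sharing what they have learned.
living_room = Agent("living_room")
bedroom = Agent("bedroom")
living_room.internalize("temperature", 23.5)
bedroom.combine(living_room.externalize())
```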

    A case study of agent programmability in an online learning environment

    Software agents are well suited to assisting users with routine, repetitive, and time-consuming tasks in various educational environments. In order to achieve complex tasks effectively, humans and agents sometimes need to work together. However, some issues in human-agent interaction, such as delegation, trust, and privacy, have not been properly solved. The agent research community has focused on technologies for constructing autonomous agents and techniques for collaboration among agents; little attention has been paid to supporting interactions between humans and agents.

    The objectives of this research are to investigate how easy it might be for a user to program his or her agent, how users behave when given the ability to program their agents, whether access to necessary help resources can be improved, and whether such a system can facilitate collaborative learning. Studying users' concerns about their privacy, and how an online learning environment can be built to protect user privacy, are also of interest to us. In this thesis, two alternative systems for programmable agents were developed, in which a human user can define a set of rules to direct an agent's activities at execution time. The systems were built on top of a multi-agent collaborative learning environment that enables a user to program his or her agent to communicate with other agents and to monitor the activities of other users and their agents. These systems for end-user programmable agents were evaluated and compared. The results demonstrated that an end-user programming environment is able to meet users' individual needs for awareness information, facilitate information exchange among users, and enhance communication between users within a virtual learning environment. This research provides a platform for investigating concerns over user privacy caused by agent programmability.
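    A rule-directed agent of the kind described can be sketched as a small condition-action engine: the user adds rules at execution time, and the agent evaluates them against incoming events. The Rule/ProgrammableAgent API below is an assumption for illustration, not the thesis's actual system.

```python
# Sketch of an end-user-programmable agent: users supply condition-action
# rules at runtime. API names are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Rule:
    condition: Callable[[dict], bool]  # predicate over an incoming event
    action: Callable[[dict], None]     # what the agent does when it fires

@dataclass
class ProgrammableAgent:
    rules: List[Rule] = field(default_factory=list)

    def add_rule(self, rule: Rule):
        """Users extend their agent at execution time by adding rules."""
        self.rules.append(rule)

    def on_event(self, event: dict):
        """Evaluate every user-defined rule against the event."""
        for rule in self.rules:
            if rule.condition(event):
                rule.action(event)

# Example: notify the user when a classmate posts in a watched forum.
agent = ProgrammableAgent()
agent.add_rule(Rule(
    condition=lambda e: e.get("type") == "forum_post" and e.get("author") != "me",
    action=lambda e: print(f"New post by {e['author']}"),
))
agent.on_event({"type": "forum_post", "author": "alice"})  # prints a notification
```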

    A study on like-attracts-like versus elitist selection criterion for human-like social behavior of memetic multiagent systems

    Memetic multiagent systems emerge as an enhanced form of multiagent systems through the implementation of meme-inspired computational agents. They aim to evolve human-like behavior in multiple agents by exploiting Dawkins' notion of a meme and Universal Darwinism. Previous research has developed a computational framework in which a series of memetic operations have been designed for implementing human-like agents. This paper focuses on improving the human-like behavior of multiple agents when they are engaged in social interactions. The improvement mainly concerns how an agent should learn from others and adapt its behavior in a complex, dynamic environment. In particular, we design a new mechanism that supervises how an agent selects one of the other agents to learn from. The selection is a trade-off between the elitist and like-attracts-like principles. We demonstrate the desirable interactions of multiple agents in two problem domains.
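    The selection trade-off can be illustrated with a single scoring function that interpolates between the two principles: fitness (elitist) versus behavioral similarity (like-attracts-like). The weighting scheme and field names below are assumptions for illustration, not the paper's mechanism; fitness and similarity are assumed to share a [0, 1] scale.

```python
# Illustrative teacher selection blending elitist and like-attracts-like
# principles. The scoring formula and parameters are assumptions.

def similarity(a, b):
    """Behavioral similarity between two agents' trait vectors, in (0, 1]."""
    d = sum((x - y) ** 2 for x, y in zip(a["traits"], b["traits"])) ** 0.5
    return 1.0 / (1.0 + d)

def select_teacher(learner, others, alpha=0.5):
    """alpha=1 -> purely elitist (imitate the fittest agent);
    alpha=0 -> purely like-attracts-like (imitate the most similar agent)."""
    def score(other):
        return alpha * other["fitness"] + (1 - alpha) * similarity(learner, other)
    return max(others, key=score)

agents = [
    {"name": "a", "traits": [0.1, 0.9], "fitness": 0.8},
    {"name": "b", "traits": [0.4, 0.5], "fitness": 0.3},
]
learner = {"name": "c", "traits": [0.5, 0.4], "fitness": 0.2}
print(select_teacher(learner, agents, alpha=0.2)["name"])  # "b": similarity wins at low alpha
```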

    Exploring the Benefits of Teams in Multiagent Learning

    For problems requiring cooperation, many multiagent systems implement solutions either among individual agents or across an entire population towards a common goal. Multiagent teams are primarily studied when in conflict; however, organizational psychology (OP) highlights the benefits of teams among human populations for learning how to coordinate and cooperate. In this paper, we propose a new model of multiagent teams for reinforcement learning (RL) agents inspired by OP and early work on teams in artificial intelligence. We validate our model using complex social dilemmas that are popular in recent multiagent RL and find that agents divided into teams develop cooperative pro-social policies despite incentives not to cooperate. Furthermore, agents are better able to coordinate and learn emergent roles within their teams and achieve higher rewards compared to when the interests of all agents are aligned.
    Comment: 10 pages, 6 figures; published at IJCAI 2022.
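    One common way to realize such teams in multiagent RL is reward sharing: each agent trains on a blend of its own payoff and its team's average. The sketch below illustrates that idea under assumed names (team_rewards, w_self); it is a generic formulation, not the paper's exact model.

```python
# Illustrative team-based reward shaping for multiagent RL.
# The blending scheme and names are assumptions, not the paper's.

from collections import defaultdict

def team_rewards(rewards, teams, w_self=0.5):
    """rewards: {agent: individual reward}; teams: {agent: team id}.
    Returns the shaped reward each agent would train on."""
    totals, sizes = defaultdict(float), defaultdict(int)
    for agent, r in rewards.items():
        totals[teams[agent]] += r
        sizes[teams[agent]] += 1
    shaped = {}
    for agent, r in rewards.items():
        team_avg = totals[teams[agent]] / sizes[teams[agent]]
        shaped[agent] = w_self * r + (1 - w_self) * team_avg
    return shaped

# Two teams of two in one step of a social dilemma: the defector a1 earns
# more individually, but sharing reward with a cooperating teammate dilutes
# the incentive to defect.
print(team_rewards(
    {"a1": 3.0, "a2": 0.0, "b1": 1.0, "b2": 1.0},
    {"a1": "A", "a2": "A", "b1": "B", "b2": "B"},
))  # {'a1': 2.25, 'a2': 0.75, 'b1': 1.0, 'b2': 1.0}
```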