5,201 research outputs found

    Using the online cross-entropy method to learn relational policies for playing different games

    Get PDF
    By defining a video-game environment as a collection of objects, relations, actions and rewards, the relational reinforcement learning algorithm presented in this paper generates and optimises a set of concise, human-readable relational rules for achieving maximal reward. Rule learning is achieved using a combination of incremental specialisation of rules and a modified online cross-entropy method, which dynamically adjusts the rate of learning as the agent progresses. The algorithm is tested on the Ms. Pac-Man and Mario environments, with results indicating the agent learns an effective policy for acting within each environment

    Embodied Active Learning of Relational State Abstractions for Bilevel Planning

    Full text link
    State abstraction is an effective technique for planning in robotics environments with continuous states and actions, long task horizons, and sparse feedback. In object-oriented environments, predicates are a particularly useful form of state abstraction because of their compatibility with symbolic planners and their capacity for relational generalization. However, to plan with predicates, the agent must be able to interpret them in continuous environment states (i.e., ground the symbols). Manually programming predicate interpretations can be difficult, so we would instead like to learn them from data. We propose an embodied active learning paradigm where the agent learns predicate interpretations through online interaction with an expert. For example, after taking actions in a block stacking environment, the agent may ask the expert: "Is On(block1, block2) true?" From this experience, the agent learns to plan: it learns neural predicate interpretations, symbolic planning operators, and neural samplers that can be used for bilevel planning. During exploration, the agent plans to learn: it uses its current models to select actions towards generating informative expert queries. We learn predicate interpretations as ensembles of neural networks and use their entropy to measure the informativeness of potential queries. We evaluate this approach in three robotic environments and find that it consistently outperforms six baselines while exhibiting sample efficiency in two key metrics: number of environment interactions, and number of queries to the expert. Code: https://tinyurl.com/active-predicatesComment: Conference on Lifelong Learning Agents (CoLLAs) 202

    GLIB: Efficient Exploration for Relational Model-Based Reinforcement Learning via Goal-Literal Babbling

    Full text link
    We address the problem of efficient exploration for transition model learning in the relational model-based reinforcement learning setting without extrinsic goals or rewards. Inspired by human curiosity, we propose goal-literal babbling (GLIB), a simple and general method for exploration in such problems. GLIB samples relational conjunctive goals that can be understood as specific, targeted effects that the agent would like to achieve in the world, and plans to achieve these goals using the transition model being learned. We provide theoretical guarantees showing that exploration with GLIB will converge almost surely to the ground truth model. Experimentally, we find GLIB to strongly outperform existing methods in both prediction and planning on a range of tasks, encompassing standard PDDL and PPDDL planning benchmarks and a robotic manipulation task implemented in the PyBullet physics simulator. Video: https://youtu.be/F6lmrPT6TOY Code: https://git.io/JIsTBComment: AAAI 202

    Learning Symbolic Models of Stochastic Domains

    Full text link
    In this article, we work towards the goal of developing agents that can learn to act in complex worlds. We develop a probabilistic, relational planning rule representation that compactly models noisy, nondeterministic action effects, and show how such rules can be effectively learned. Through experiments in simple planning domains and a 3D simulated blocks world with realistic physics, we demonstrate that this learning algorithm allows agents to effectively model world dynamics

    Active learning of manipulation sequences

    Get PDF
    We describe a system allowing a robot to learn goal-directed manipulation sequences such as steps of an assembly task. Learning is based on a free mix of exploration and instruction by an external teacher, and may be active in the sense that the system tests actions to maximize learning progress and asks the teacher if needed. The main component is a symbolic planning engine that operates on learned rules, defined by actions and their pre- and postconditions. Learned by model-based reinforcement learning, rules are immediately available for planning. Thus, there are no distinct learning and application phases. We show how dynamic plans, replanned after every action if necessary, can be used for automatic execution of manipulation sequences, for monitoring of observed manipulation sequences, or a mix of the two, all while extending and refining the rule base on the fly. Quantitative results indicate fast convergence using few training examples, and highly effective teacher intervention at early stages of learning.Peer ReviewedPostprint (author’s final draft

    Neurons and Symbols: A Manifesto

    Get PDF
    We discuss the purpose of neural-symbolic integration including its principles, mechanisms and applications. We outline a cognitive computational model for neural-symbolic integration, position the model in the broader context of multi-agent systems, machine learning and automated reasoning, and list some of the challenges for the area of neural-symbolic computation to achieve the promise of effective integration of robust learning and expressive reasoning under uncertainty

    That's Mine! Learning Ownership Relations and Norms for Robots

    Full text link
    The ability for autonomous agents to learn and conform to human norms is crucial for their safety and effectiveness in social environments. While recent work has led to frameworks for the representation and inference of simple social rules, research into norm learning remains at an exploratory stage. Here, we present a robotic system capable of representing, learning, and inferring ownership relations and norms. Ownership is represented as a graph of probabilistic relations between objects and their owners, along with a database of predicate-based norms that constrain the actions permissible on owned objects. To learn these norms and relations, our system integrates (i) a novel incremental norm learning algorithm capable of both one-shot learning and induction from specific examples, (ii) Bayesian inference of ownership relations in response to apparent rule violations, and (iii) percept-based prediction of an object's likely owners. Through a series of simulated and real-world experiments, we demonstrate the competence and flexibility of the system in performing object manipulation tasks that require a variety of norms to be followed, laying the groundwork for future research into the acquisition and application of social norms.Comment: 9 pg., 2 fig., accepted for AAAI-2019. Video demo: https://bit.ly/2z8obET GitHub: https://github.com/OwnageBot/ownage_bo
    corecore