627 research outputs found

    A Review of Symbolic, Subsymbolic and Hybrid Methods for Sequential Decision Making

    Full text link
    The field of Sequential Decision Making (SDM) provides tools for solving Sequential Decision Processes (SDPs), where an agent must make a series of decisions in order to complete a task or achieve a goal. Historically, two competing SDM paradigms have view for supremacy. Automated Planning (AP) proposes to solve SDPs by performing a reasoning process over a model of the world, often represented symbolically. Conversely, Reinforcement Learning (RL) proposes to learn the solution of the SDP from data, without a world model, and represent the learned knowledge subsymbolically. In the spirit of reconciliation, we provide a review of symbolic, subsymbolic and hybrid methods for SDM. We cover both methods for solving SDPs (e.g., AP, RL and techniques that learn to plan) and for learning aspects of their structure (e.g., world models, state invariants and landmarks). To the best of our knowledge, no other review in the field provides the same scope. As an additional contribution, we discuss what properties an ideal method for SDM should exhibit and argue that neurosymbolic AI is the current approach which most closely resembles this ideal method. Finally, we outline several proposals to advance the field of SDM via the integration of symbolic and subsymbolic AI

    A survey on active simultaneous localization and mapping: state of the art and new frontiers

    Get PDF
    Active simultaneous localization and mapping (SLAM) is the problem of planning and controlling the motion of a robot to build the most accurate and complete model of the surrounding environment. Since the first foundational work in active perception appeared, more than three decades ago, this field has received increasing attention across different scientific communities. This has brought about many different approaches and formulations, and makes a review of the current trends necessary and extremely valuable for both new and experienced researchers. In this article, we survey the state of the art in active SLAM and take an in-depth look at the open challenges that still require attention to meet the needs of modern applications. After providing a historical perspective, we present a unified problem formulation and review the well-established modular solution scheme, which decouples the problem into three stages that identify, select, and execute potential navigation actions. We then analyze alternative approaches, including belief-space planning and deep reinforcement learning techniques, and review related work on multirobot coordination. This article concludes with a discussion of new research directions, addressing reproducible research, active spatial perception, and practical applications, among other topics

    Helping humans and agents avoid undesirable consequences with models of intervention

    Get PDF
    2021 Fall.Includes bibliographical references.When working in an unfamiliar online environment, it can be helpful to have an observer that can intervene and guide a user toward a desirable outcome while avoiding undesirable outcomes or frustration. The Intervention Problem is deciding when to intervene in order to help a user. The Intervention Problem is similar to, but distinct from, Plan Recognition because the observer must not only recognize the intended goals of a user but also when to intervene to help the user when necessary. In this dissertation, we formalize a family of intervention problems to address two sub-problems: (1) The Intervention Recognition Problem, and (2) The Intervention Recovery Problem. The Intervention Recognition Problem views the environment as a state transition system where an agent (or a human user), in order to achieve a desirable outcome, executes actions that change the environment from one state to the next. Some states in the environment are undesirable and the user does not have the ability to recognize them and the intervening agent wants to help the user in the environment avoid the undesirable state. In this dissertation, we model the environment as a classical planning problem and discuss three intervention models to address the Intervention Recognition Problem. The three models address different dimensions of the Intervention Recognition Problem, specifically the actors in the environment, information hidden from the intervening agent, type of observations and noise in the observations. The first model: Intervention by Recognizing Actions Enabling Multiple Undesirable Consequences, is motivated by a study where we observed how home computer users practice cyber-security and take action to unwittingly put their online safety at risk. The model is defined for an environment where three agents: the user, the attacker and the intervening agent are present. The intervening agent helps the user reach a desirable goal that is hidden from the intervening agent by recognizing critical actions that enable multiple undesirable consequences. We view the problem of recognizing critical actions as a multi-factor decision problem of three domain-independent metrics: certainty, timeliness and desirability. The three metrics simulate the trade-off between the safety and freedom of the observed agent when selecting critical actions to intervene. The second model: Intervention as Classical Planning, we model scenarios where the intervening agent observes a user and a competitor attempting to achieve different goals in the same environment. A key difference in this model compared to the first model is that the intervening agent is aware of the user's desirable goal and the undesirable state. The intervening agent exploits the classical planning representation of the environment and uses automated planning to project the possible outcomes in the environment exactly and approximately. To recognize when intervention is required, the observer analyzes the plan suffixes leading to the user's desirable goal and the undesirable state and learns the differences between the plans that achieve the desirable goal and plans that achieve the undesirable state using machine learning. Similar to the first model, learning the differences between the safe and unsafe plans allows the intervening agent to balance specific actions with those that are necessary for the user to allow some freedom. The third model: Human-aware Intervention, we assume that the user is a human solving a cognitively engaging planning task. When human users plan, unlike an automated planner, they do not have the ability to use heuristics to search for the best solution. They often make mistakes and spend time exploring the search space of the planning problem. The complication this adds to the Intervention Recognition Problem is that deciding to intervene by analyzing plan suffixes generated by an automated planner is no longer feasible. Using a cognitively engaging puzzle solving task (Rush Hour) we study how human users solve the puzzle as a planning task and develop the Human-aware Intervention model combining automated planning and machine learning. The intervening agent uses a domain specific feature set more appropriate for human behavior to decide in real time whether to intervene the human user. Our experiments using the benchmark planning domains and human subject studies show that the three intervention recognition models out performs existing plan recognition algorithms in predicting when intervention is required. Our solution to address the Intervention Recovery Problem goes beyond the typical preventative measures to help the human user recover from intervention. We propose the Interactive Human-aware Intervention where a human user solves a cognitively engaging planning task with the assistance of an agent that implements the Human-aware Intervention. The Interactive Human-aware Intervention is different from typical preventive measures where the agent executes actions to modify the domain such that the undesirable plan can not progress (e.g., block an action). Our approach interactively guides the human user toward the solution to the planning task by revealing information about the remaining planning task. We evaluate the Interactive Human-aware Intervention using both subjective and objective measures in a human subject study
    corecore