
    A POMDP framework for modelling human interaction with assistive robots

    This paper presents a framework for modelling the interaction between a human operator and a robotic device that enables the robot to collaborate with the human to jointly accomplish tasks. States of the system are captured in a model based on a partially observable Markov decision process (POMDP). States representing the human operator are motivated by behaviours from the psychology of the human action cycle. The hierarchical nature of these states allows the exploitation of data structures based on algebraic decision diagrams (ADD) to efficiently solve the resulting POMDP. The proposed framework is illustrated using two examples from assistive robotics: a robotic wheelchair and an intelligent walking device. Experimental results from trials conducted in an office environment with the wheelchair are used to demonstrate the proposed technique. © 2011 IEEE

    Reinforcement Learning Approaches in Social Robotics

    This article surveys reinforcement learning approaches in social robotics. Reinforcement learning is a framework for decision-making problems in which an agent interacts through trial and error with its environment to discover an optimal behavior. Since interaction is a key component in both reinforcement learning and social robotics, it can be a well-suited approach for real-world interactions with physically embodied social robots. The scope of the paper is focused particularly on studies that include social physical robots and real-world human-robot interactions with users. We present a thorough analysis of reinforcement learning approaches in social robotics. In addition to a survey, we categorize existing reinforcement learning approaches based on the method used and the design of the reward mechanisms. Moreover, since communication capability is a prominent feature of social robots, we discuss and group the papers based on the communication medium used for reward formulation. Considering the importance of designing the reward function, we also provide a categorization of the papers based on the nature of the reward. This categorization includes three major themes: interactive reinforcement learning, intrinsically motivated methods, and task performance-driven methods. The paper also covers the benefits and challenges of reinforcement learning in social robotics, the evaluation methods of the surveyed papers (whether or not they use subjective and algorithmic measures), a discussion in view of real-world reinforcement learning challenges and proposed solutions, and the points that remain to be explored, including approaches that have thus far received less attention. This paper thus aims to become a starting point for researchers interested in using and applying reinforcement learning methods in this particular research field.

    Planning and sequential decision making for human-aware robots

    University of Technology, Sydney. Faculty of Engineering and Information Technology. This thesis explores the use of probabilistic techniques for enhancing the interaction between a human and a robotic assistant. The human in this context is regarded as an integral part of the system, providing a major contribution to the decision making process, and is able to overwrite, re-evaluate and correct decisions made by the robot to fulfil her or his true intentions and ultimate goals and needs. Conversely, the robot is expected to behave as an intelligent collaborative agent that predicts human intentions and makes decisions by merging learned behaviours with the information it currently possesses. The work is motivated by the rapid increase of the application domains in which robotic systems operate, and the presence of humans in many of these domains. The proposed framework facilitates human-robot social integration by increasing the synergy between the robot's capabilities and human needs, primarily during assistive navigational tasks. The first part of the thesis sets the groundwork by developing a path-planning/re-planning strategy able to produce smooth feasible paths to address the issue of navigating a robotic wheelchair in cluttered indoor environments. This strategy integrates a global path-planner that operates as a mission controller, and a local reactive planner that navigates locally in an optimal manner while preventing collisions with static and dynamic obstacles in the local area. The proposed strategy also encapsulates social behaviour, such as navigating through preferred routes, in order to generate socially and behaviourally acceptable plans. The work then focuses on predicting and responding to human interactions with a robotic agent by exploiting probabilistic techniques for sequential decision making and planning under uncertainty. Dynamic Bayesian networks and partially observable Markov decision processes are examined for estimating human intention in order to minimise the flow of information between the human and the robot during navigation tasks. A framework to capture human behaviour, motivated by the human action cycle as derived from the psychology domain, is developed. This framework embeds a human-robot interaction layer, which defines variables and procedures to model interaction scenarios, and facilitates the transfer of information during human-robot collaborative tasks. Experiments using a human-operated robotic wheelchair carrying out navigational daily routines are conducted to demonstrate the capacity of the proposed methodology to understand human intentions and comply with their long term plans. The results obtained are presented as the outcome of a set of trials conducted with actor users, or simulated experiments based on real scenarios.

    Data-driven robotic manipulation of cloth-like deformable objects : the present, challenges and future prospects

    Manipulating cloth-like deformable objects (CDOs) is a long-standing problem in the robotics community. CDOs are flexible (non-rigid) objects that do not show a detectable level of compression strength while two points on the article are pushed towards each other and include objects such as ropes (1D), fabrics (2D) and bags (3D). In general, CDOs’ many degrees of freedom (DoF) introduce severe self-occlusion and complex state–action dynamics as significant obstacles to perception and manipulation systems. These challenges exacerbate existing issues of modern robotic control methods such as imitation learning (IL) and reinforcement learning (RL). This review focuses on the application details of data-driven control methods on four major task families in this domain: cloth shaping, knot tying/untying, dressing and bag manipulation. Furthermore, we identify specific inductive biases in these four domains that present challenges for more general IL and RL algorithms.

    Shared Autonomy via Hindsight Optimization

    In shared autonomy, user input and robot autonomy are combined to control a robot to achieve a goal. Often, the robot does not know a priori which goal the user wants to achieve, and must both predict the user's intended goal, and assist in achieving that goal. We formulate the problem of shared autonomy as a Partially Observable Markov Decision Process with uncertainty over the user's goal. We utilize maximum entropy inverse optimal control to estimate a distribution over the user's goal based on the history of inputs. Ideally, the robot assists the user by solving for an action which minimizes the expected cost-to-go for the (unknown) goal. As solving the POMDP to select the optimal action is intractable, we use hindsight optimization to approximate the solution. In a user study, we compare our method to a standard predict-then-blend approach. We find that our method enables users to accomplish tasks more quickly while utilizing less input. However, when asked to rate each system, users were mixed in their assessment, citing a tradeoff between maintaining control authority and accomplishing tasks quickly.
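The two steps this abstract describes (maintaining a distribution over the user's goal, then picking the action with the lowest expected cost-to-go under that distribution) can be illustrated with a toy sketch. This is not the authors' implementation: the 2D point robot, the Euclidean-distance cost, the `beta` rationality parameter, and the function names `update_goal_belief` and `hindsight_action` are all assumptions made for illustration. In particular, the exponentiated-cost likelihood only stands in for the maximum entropy inverse optimal control model the paper uses.

```python
import math

def step(position, move):
    """Apply a 2D displacement to a 2D position."""
    return (position[0] + move[0], position[1] + move[1])

def update_goal_belief(belief, position, user_input, goals, beta=1.0):
    """Bayesian update of the goal distribution from one user input.

    Assumed likelihood: p(input | goal) is proportional to
    exp(-beta * distance to that goal after applying the input).
    """
    next_pos = step(position, user_input)
    weights = {
        name: belief[name] * math.exp(-beta * math.dist(next_pos, goal))
        for name, goal in goals.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def hindsight_action(belief, position, goals, actions):
    """Pick the action minimizing the belief-weighted cost-to-go.

    Each goal's cost-to-go is computed as if that goal were known
    (here simply the remaining distance), then averaged over the belief.
    """
    def expected_cost(action):
        next_pos = step(position, action)
        return sum(belief[g] * math.dist(next_pos, goals[g]) for g in goals)
    return min(actions, key=expected_cost)

# Hypothetical scenario: two candidate goals, user pushes to the right.
goals = {"left": (-5.0, 0.0), "right": (5.0, 0.0)}
belief = {"left": 0.5, "right": 0.5}
actions = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]

belief = update_goal_belief(belief, (0.0, 0.0), (1.0, 0.0), goals)
chosen = hindsight_action(belief, (0.0, 0.0), goals, actions)
```

After one rightward input the belief shifts toward the "right" goal, and the chosen assistive action moves in that direction. The robot can act before it is certain of the goal, because the expected cost is taken over the whole belief rather than over a single predicted goal.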

    Multi-target detection and recognition by UAVs using online POMDPs

    This paper tackles high-level decision-making techniques for robotic missions, which involve both active sensing and symbolic goal reaching, under uncertain probabilistic environments and strong time constraints. Our case study is a POMDP model of an online multi-target detection and recognition mission by an autonomous UAV. The POMDP model of the multi-target detection and recognition problem is generated online from a list of areas of interest, which are automatically extracted at the beginning of the flight from a coarse-grained high altitude observation of the scene. The POMDP observation model relies on a statistical abstraction of an image processing algorithm's output used to detect targets. As the POMDP problem cannot be known and thus optimized before the beginning of the flight, our main contribution is an "optimize-while-execute" algorithmic framework: it drives a POMDP sub-planner to optimize and execute the POMDP policy in parallel under action duration constraints. We present new results from real outdoor flights and SAIL simulations, which highlight both the benefits of using POMDPs in multi-target detection and recognition missions, and of our "optimize-while-execute" paradigm.

    NeBula: TEAM CoSTAR’s robotic autonomy solution that won phase II of DARPA subterranean challenge

    This paper presents and discusses algorithms, hardware, and software architecture developed by TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), competing in the DARPA Subterranean Challenge. Specifically, it presents the techniques utilized within the Tunnel (2019) and Urban (2020) competitions, where CoSTAR achieved second and first place, respectively. We also discuss CoSTAR’s demonstrations in Martian-analog surface and subsurface (lava tubes) exploration. The paper introduces our autonomy solution, referred to as NeBula (Networked Belief-aware Perceptual Autonomy). NeBula is an uncertainty-aware framework that aims at enabling resilient and modular autonomy solutions by performing reasoning and decision making in the belief space (space of probability distributions over the robot and world states). We discuss various components of the NeBula framework, including (i) geometric and semantic environment mapping, (ii) a multi-modal positioning system, (iii) traversability analysis and local planning, (iv) global motion planning and exploration behavior, (v) risk-aware mission planning, (vi) networking and decentralized reasoning, and (vii) learning-enabled adaptation. We discuss the performance of NeBula on several robot types (e.g., wheeled, legged, flying), in various environments. We discuss the specific results and lessons learned from fielding this solution in the challenging courses of the DARPA Subterranean Challenge competition.
    Agha, A., Otsu, K., Morrell, B., Fan, D. D., Thakker, R., Santamaria-Navarro, A., Kim, S.-K., Bouman, A., Lei, X., Edlund, J., Ginting, M. F., Ebadi, K., Anderson, M., Pailevanian, T., Terry, E., Wolf, M., Tagliabue, A., Vaquero, T. S., Palieri, M., Tepsuporn, S., Chang, Y., Kalantari, A., Chavez, F., Lopez, B., Funabiki, N., Miles, G., Touma, T., Buscicchio, A., Tordesillas, J., Alatur, N., Nash, J., Walsh, W., Jung, S., Lee, H., Kanellakis, C., Mayo, J., Harper, S., Kaufmann, M., Dixit, A., Correa, G. J., Lee, C., Gao, J., Merewether, G., Maldonado-Contreras, J., Salhotra, G., Da Silva, M. S., Ramtoula, B., Fakoorian, S., Hatteland, A., Kim, T., Bartlett, T., Stephens, A., Kim, L., Bergh, C., Heiden, E., Lew, T., Cauligi, A., Heywood, T., Kramer, A., Leopold, H. A., Melikyan, H., Choi, H. C., Daftry, S., Toupet, O., Wee, I., Thakur, A., Feras, M., Beltrame, G., Nikolakopoulos, G., Shim, D., Carlone, L., & Burdick, J.