A POMDP framework for modelling human interaction with assistive robots
This paper presents a framework for modelling the interaction between a human operator and a robotic device that enables the robot to collaborate with the human to jointly accomplish tasks. States of the system are captured in a model based on a partially observable Markov decision process (POMDP). States representing the human operator are motivated by behaviours from the psychology of the human action cycle. The hierarchical nature of these states allows the exploitation of data structures based on algebraic decision diagrams (ADDs) to efficiently solve the resulting POMDP. The proposed framework is illustrated using two examples from assistive robotics: a robotic wheelchair and an intelligent walking device. Experimental results from trials conducted in an office environment with the wheelchair are used to demonstrate the proposed technique. © 2011 IEEE
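The core machinery behind such intent tracking is the discrete POMDP belief update. As a minimal sketch, assuming a hypothetical two-state human intent (e.g. "go to desk" vs. "go to door") with illustrative transition and observation probabilities that are not taken from the paper:

```python
# Minimal sketch of a discrete POMDP belief update for tracking a
# hypothetical two-state human intent. All probabilities below are
# illustrative assumptions, not values from the paper.

def belief_update(belief, T, O, obs):
    """b'(s') ∝ O[s'][obs] * sum_s T[s][s'] * b(s), then normalize."""
    n = len(belief)
    new_b = [O[s2][obs] * sum(T[s][s2] * belief[s] for s in range(n))
             for s2 in range(n)]
    z = sum(new_b)
    return [p / z for p in new_b]

# Intent tends to persist between steps (rows: current, cols: next).
T = [[0.9, 0.1],
     [0.1, 0.9]]
# Joystick observation model: observation 0 is likelier under intent 0.
O = [[0.8, 0.2],
     [0.3, 0.7]]

b = [0.5, 0.5]
for obs in [0, 0, 0]:   # three readings consistent with intent 0
    b = belief_update(b, T, O, obs)
print(b)                # belief mass concentrates on intent 0
```

Repeated observations consistent with one intent concentrate the belief there, which is what lets the robot commit to assisting with the inferred task.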
Reinforcement Learning Approaches in Social Robotics
This article surveys reinforcement learning approaches in social robotics.
Reinforcement learning is a framework for decision-making problems in which an
agent interacts through trial-and-error with its environment to discover an
optimal behavior. Since interaction is a key component in both reinforcement
learning and social robotics, it can be a well-suited approach for real-world
interactions with physically embodied social robots. The scope of the paper is
focused particularly on studies that include physically embodied social robots and
real-world human-robot interactions with users. We present a thorough analysis
of reinforcement learning approaches in social robotics. In addition to a
survey, we categorize existing reinforcement learning approaches based on the
method used and the design of the reward mechanisms. Moreover, since
communication capability is a prominent feature of social robots, we discuss
and group the papers based on the communication medium used for reward
formulation. Considering the importance of designing the reward function, we
also provide a categorization of the papers based on the nature of the reward.
This categorization includes three major themes: interactive reinforcement
learning, intrinsically motivated methods, and task performance-driven methods.
The paper also discusses the benefits and challenges of reinforcement
learning in social robotics, the evaluation methods used (subjective and
algorithmic measures), real-world reinforcement learning challenges and
their proposed solutions, and the points that remain to be explored,
including approaches that have thus far received less attention. Thus,
this paper aims to serve as a starting point for researchers interested
in applying reinforcement learning methods in this particular research field.
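The trial-and-error loop the survey builds on can be sketched with tabular Q-learning on a toy task. The five-state corridor, reward of +1 at the right end, and all hyperparameters below are illustrative assumptions; in the surveyed social-robotics work, rewards come from human interaction rather than a fixed table.

```python
# Tabular Q-learning sketch: an agent learns by trial and error to walk
# right along a hypothetical 5-state corridor (state 4 gives reward +1).
import random

N_STATES = 5                       # corridor states 0..4; state 4 is the goal
ACTIONS = (0, 1)                   # 0 = step left, 1 = step right
alpha, gamma, eps = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate
Q = [[0.0, 0.0] for _ in range(N_STATES)]
random.seed(0)

def step(s, a):
    """Deterministic corridor dynamics with +1 reward at the right end."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    done = s2 == N_STATES - 1
    return s2, (1.0 if done else 0.0), done

def greedy(s):
    """Greedy action with random tie-breaking."""
    best = max(Q[s])
    return random.choice([a for a in ACTIONS if Q[s][a] == best])

for _ in range(200):               # episodes of trial and error
    s, done = 0, False
    while not done:
        a = random.choice(ACTIONS) if random.random() < eps else greedy(s)
        s2, r, done = step(s, a)
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2

# After training, the greedy policy points right in every non-terminal state.
print([max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)])
```

The interactive and intrinsically motivated methods the survey categorizes differ mainly in where the reward signal `r` comes from, not in this underlying update rule.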
Planning and sequential decision making for human-aware robots
University of Technology, Sydney. Faculty of Engineering and Information Technology. This thesis explores the use of probabilistic techniques for enhancing the interaction between a human and a robotic assistant. The human in this context is regarded as an integral part of the system, providing a major contribution to the decision-making process, and is able to override, re-evaluate and correct decisions made by the robot to fulfil her or his true intentions and ultimate goals and needs. Conversely, the robot is expected to behave as an intelligent collaborative agent that predicts human intentions and makes decisions by merging learned behaviours with the information it currently possesses. The work is motivated by the rapid increase of the application domains in which robotic systems operate, and the presence of humans in many of these domains. The proposed framework facilitates human-robot social integration by increasing the synergy between the robot's capabilities and human needs, primarily during assistive navigational tasks. The first part of the thesis sets the groundwork by developing a path-planning/re-planning strategy able to produce smooth feasible paths to address the issue of navigating a robotic wheelchair in
cluttered indoor environments. This strategy integrates a global path-planner that operates as a mission controller, and a local reactive planner that navigates locally in an optimal manner while preventing collisions with static and dynamic obstacles in the local area. The proposed strategy also encapsulates social behaviour, such as navigating through preferred routes, in order to generate socially and behaviourally acceptable plans.
The work then focuses on predicting and responding to human interactions with a robotic agent by exploiting probabilistic techniques for sequential decision making and planning under uncertainty. Dynamic Bayesian networks and partially observable Markov decision processes are examined for estimating human intention in order to minimise the flow of information between the human and the robot during navigation tasks. A framework to capture human behaviour, motivated by the human action cycle as derived from the psychology domain, is developed. This framework embeds a human-robot interaction layer, which defines variables and procedures to model interaction scenarios, and facilitates the transfer of information during human-robot collaborative tasks. Experiments using a human-operated robotic wheelchair carrying out navigational daily routines are conducted to demonstrate the capacity of the proposed methodology to understand human intentions
and comply with their long-term plans. The results obtained are presented as the outcome
of a set of trials conducted with actor users, or of simulated experiments based on real scenarios.
Data-driven robotic manipulation of cloth-like deformable objects: the present, challenges and future prospects
Manipulating cloth-like deformable objects (CDOs) is a long-standing problem in the robotics community. CDOs are flexible (non-rigid) objects that do not show a detectable level of compression strength while two points on the article are pushed towards each other and include objects such as ropes (1D), fabrics (2D) and bags (3D). In general, CDOs’ many degrees of freedom (DoF) introduce severe self-occlusion and complex state–action dynamics as significant obstacles to perception and manipulation systems. These challenges exacerbate existing issues of modern robotic control methods such as imitation learning (IL) and reinforcement learning (RL). This review focuses on the application details of data-driven control methods on four major task families in this domain: cloth shaping, knot tying/untying, dressing and bag manipulation. Furthermore, we identify specific inductive biases in these four domains that present challenges for more general IL and RL algorithms.
Shared Autonomy via Hindsight Optimization
In shared autonomy, user input and robot autonomy are combined to control a
robot to achieve a goal. Often, the robot does not know a priori which goal the
user wants to achieve, and must both predict the user's intended goal, and
assist in achieving that goal. We formulate the problem of shared autonomy as a
Partially Observable Markov Decision Process with uncertainty over the user's
goal. We utilize maximum entropy inverse optimal control to estimate a
distribution over the user's goal based on the history of inputs. Ideally, the
robot assists the user by solving for an action which minimizes the expected
cost-to-go for the (unknown) goal. As solving the POMDP to select the optimal
action is intractable, we use hindsight optimization to approximate the
solution. In a user study, we compare our method to a standard
predict-then-blend approach. We find that our method enables users to
accomplish tasks more quickly while utilizing less input. However, when asked
to rate each system, users were mixed in their assessment, citing a tradeoff
between maintaining control authority and accomplishing tasks quickly.
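The QMDP-style action selection this abstract describes can be sketched as picking the action that minimizes expected cost-to-go under the current goal distribution. The 1-D world, candidate goals, action set, and distance cost below are hypothetical placeholders, not the paper's experimental setup:

```python
# Sketch of hindsight-optimization (QMDP-style) assistance: choose the
# robot action minimizing expected cost-to-go over the goal distribution.
# Goals, actions, costs, and beliefs are illustrative assumptions.

def assist_action(state, belief, goals, actions, cost_to_go):
    """argmin_a  sum_g b(g) * Q_g(state, a)."""
    return min(actions,
               key=lambda a: sum(belief[g] * cost_to_go(state, a, goals[g])
                                 for g in range(len(goals))))

# 1-D example: robot at x = 0 with two candidate goals.
goals = [-3.0, 4.0]
actions = [-1.0, 0.0, 1.0]          # move left, stay, move right

def cost_to_go(x, a, g):
    # Remaining distance to the goal after taking action a.
    return abs((x + a) - g)

# With 80% belief on the right-hand goal, the robot moves right.
print(assist_action(0.0, [0.2, 0.8], goals, actions, cost_to_go))  # → 1.0
```

Note how the choice hedges across goals when the belief is uncertain: the expected cost averages over all candidate goals, rather than first committing to the most likely one as a predict-then-blend scheme would.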
Multi-target detection and recognition by UAVs using online POMDPs
This paper tackles high-level decision-making techniques for robotic missions, which involve both active sensing and symbolic goal reaching, under uncertain probabilistic environments and strong time constraints. Our case study is a POMDP model of an online multi-target detection and recognition mission by an autonomous UAV. The POMDP model of the multi-target detection and recognition problem is generated online from a list of areas of interest, which are automatically extracted at the beginning of the flight from a coarse-grained high-altitude observation of the scene. The POMDP observation model relies on a statistical abstraction of the output of an image processing algorithm used to detect targets. As the POMDP problem cannot be known, and thus optimized, before the beginning of the flight, our main contribution is an "optimize-while-execute" algorithmic framework: it drives a POMDP sub-planner to optimize and execute the POMDP policy in parallel under action duration constraints. We present new results from real outdoor flights and SAIL simulations, which highlight both the benefits of using POMDPs in multi-target detection and recognition missions, and of our "optimize-while-execute" paradigm.
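The optimize-while-execute idea can be illustrated by interleaving anytime plan refinement with execution: while each action runs, its duration buys the planner a refinement budget. The toy fully observable corridor, the value-iteration sweep standing in for the POMDP sub-planner, and the fixed budget of two sweeps per action are all assumptions for the sketch, not the paper's algorithm:

```python
# Sketch of "optimize-while-execute": the executor acts greedily on value
# estimates that an anytime planner refines during each action's duration.
# The corridor world and per-action planning budget are illustrative.

GOAL, N = 4, 5
V = [0.0] * N                      # anytime value estimates, refined online

def backup_once():
    # One anytime refinement sweep (stand-in for the POMDP sub-planner).
    for s in range(N):
        if s == GOAL:
            V[s] = 10.0            # goal value
        else:
            V[s] = -1.0 + max(V[max(0, s - 1)], V[min(N - 1, s + 1)])

def best_action(s):
    # Greedy w.r.t. the current, possibly partially optimized, estimates.
    return max((-1, +1), key=lambda a: V[max(0, min(N - 1, s + a))])

s, trace = 0, []
while s != GOAL:
    for _ in range(2):             # planning budget hidden in the action's duration
        backup_once()
    a = best_action(s)
    s = max(0, min(N - 1, s + a))
    trace.append(s)
print(trace)   # → [0, 1, 2, 3, 4]
```

Early actions may be taken under a still-rough policy (the first step here goes nowhere because the values have not yet propagated from the goal), but no flight time is wasted waiting for the planner to finish, which is the point of the paradigm.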
NeBula: TEAM CoSTAR’s robotic autonomy solution that won phase II of DARPA subterranean challenge
This paper presents and discusses algorithms, hardware, and software architecture developed by TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), competing in the DARPA Subterranean Challenge. Specifically, it presents the techniques utilized within the Tunnel (2019) and Urban (2020) competitions, where CoSTAR achieved second and first place, respectively. We also discuss CoSTAR's demonstrations in Martian-analog surface and subsurface (lava tubes) exploration. The paper introduces our autonomy solution, referred to as NeBula (Networked Belief-aware Perceptual Autonomy). NeBula is an uncertainty-aware framework that aims at enabling resilient and modular autonomy solutions by performing reasoning and decision making in the belief space (the space of probability distributions over the robot and world states). We discuss various components of the NeBula framework, including (i) geometric and semantic environment mapping, (ii) a multi-modal positioning system, (iii) traversability analysis and local planning, (iv) global motion planning and exploration behavior, (v) risk-aware mission planning, (vi) networking and decentralized reasoning, and (vii) learning-enabled adaptation. We discuss the performance of NeBula on several robot types (e.g., wheeled, legged, flying) in various environments, and the specific results and lessons learned from fielding this solution in the challenging courses of the DARPA Subterranean Challenge competition.
Agha, A., Otsu, K., Morrell, B., Fan, D. D., Thakker, R., Santamaria-Navarro, A., Kim, S.-K., Bouman, A., Lei, X., Edlund, J., Ginting, M. F., Ebadi, K., Anderson, M., Pailevanian, T., Terry, E., Wolf, M., Tagliabue, A., Vaquero, T. S., Palieri, M., Tepsuporn, S., Chang, Y., Kalantari, A., Chavez, F., Lopez, B., Funabiki, N., Miles, G., Touma, T., Buscicchio, A., Tordesillas, J., Alatur, N., Nash, J., Walsh, W., Jung, S., Lee, H., Kanellakis, C., Mayo, J., Harper, S., Kaufmann, M., Dixit, A., Correa, G. J., Lee, C., Gao, J., Merewether, G., Maldonado-Contreras, J., Salhotra, G., Da Silva, M. S., Ramtoula, B., Fakoorian, S., Hatteland, A., Kim, T., Bartlett, T., Stephens, A., Kim, L., Bergh, C., Heiden, E., Lew, T., Cauligi, A., Heywood, T., Kramer, A., Leopold, H. A., Melikyan, H., Choi, H. C., Daftry, S., Toupet, O., Wee, I., Thakur, A., Feras, M., Beltrame, G., Nikolakopoulos, G., Shim, D., Carlone, L., & Burdick, J.