Learning and Reasoning for Robot Sequential Decision Making under Uncertainty
Robots frequently face complex tasks that require more than one action, making
sequential decision-making (SDM) capabilities necessary. The key
contribution of this work is a robot SDM framework, called LCORPP, that
supports the simultaneous capabilities of supervised learning for passive state
estimation, automated reasoning with declarative human knowledge, and planning
under uncertainty toward achieving long-term goals. In particular, we use a
hybrid reasoning paradigm to refine the state estimator and to provide
informative priors for the probabilistic planner. In experiments, a mobile
robot is tasked with estimating human intentions using their motion
trajectories, declarative contextual knowledge, and human-robot interaction
(dialog-based and motion-based). Results suggest that, in efficiency and
accuracy, our framework performs better than its no-learning and no-reasoning
counterparts in office environments.
Comment: In proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020.
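The fusion step described above, using knowledge-based priors to inform a probabilistic planner's initial belief, can be sketched as a Bayes combination of a reasoning-derived prior with a learned likelihood. This is a minimal illustration, not the authors' code; the state names and probabilities are assumptions:

```python
# Minimal sketch (not LCORPP itself): fuse a prior obtained from
# declarative contextual knowledge with a likelihood from a learned
# state estimator to initialize a planner's belief. Names are assumed.

def fuse_belief(prior, likelihood):
    """Bayes-combine a knowledge-based prior with a learned likelihood."""
    unnorm = {s: prior[s] * likelihood[s] for s in prior}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# Prior from declarative context (e.g., "people walking toward the robot
# usually want to interact"); likelihood from a trajectory classifier.
prior = {"wants_interaction": 0.7, "passing_by": 0.3}
likelihood = {"wants_interaction": 0.4, "passing_by": 0.9}

belief = fuse_belief(prior, likelihood)
```

The resulting normalized posterior would serve as the planner's initial belief, so contextual knowledge shifts probability mass before any action is taken.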
BWIBots: A platform for bridging the gap between AI and human–robot interaction research
Recent progress in both AI and robotics has enabled the development of general-purpose robot platforms that are capable of executing a wide variety of complex, temporally extended service tasks in open environments. This article introduces a novel, custom-designed multi-robot platform for research on AI, robotics, and especially human–robot interaction for service robots. Called BWIBots, the robots were designed as part of the Building-Wide Intelligence (BWI) project at the University of Texas at Austin. The article begins with a description of, and justification for, the hardware and software design decisions underlying the BWIBots, with the aim of informing the design of such platforms in the future. It then presents an overview of research contributions that have enabled the BWIBots to better (a) execute action sequences to complete user requests, (b) efficiently ask questions to resolve user requests, (c) understand human commands given in natural language, and (d) understand human intention from afar. The article concludes with a look toward future research opportunities and applications enabled by the BWIBot platform.
A Survey of Knowledge-based Sequential Decision Making under Uncertainty
Reasoning with declarative knowledge (RDK) and sequential decision-making
(SDM) are two key research areas in artificial intelligence. RDK methods reason
with declarative domain knowledge, including commonsense knowledge, that is
either provided a priori or acquired over time, while SDM methods
(probabilistic planning and reinforcement learning) seek to compute action
policies that maximize the expected cumulative utility over a time horizon;
both classes of methods reason in the presence of uncertainty. Despite the rich
literature in these two areas, researchers have not fully explored their
complementary strengths. In this paper, we survey algorithms that leverage RDK
methods while making sequential decisions under uncertainty. We discuss
significant developments, open problems, and directions for future work.
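The SDM objective the survey describes, computing a policy that maximizes expected cumulative utility over a time horizon, can be illustrated with tabular value iteration on a toy MDP. The corridor domain and all names below are assumptions for illustration only:

```python
# Illustrative value iteration for the SDM objective: Bellman optimality
# backups until the value function stops changing. Domain is assumed.

def value_iteration(states, actions, transition, reward, gamma=0.95, eps=1e-6):
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                sum(p * (reward(s, a, s2) + gamma * V[s2])
                    for s2, p in transition(s, a).items())
                for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

# Toy corridor: move right to reach the absorbing goal at state 3.
states = [0, 1, 2, 3]
actions = ["left", "right"]

def transition(s, a):
    if s == 3:  # goal is absorbing
        return {3: 1.0}
    return {min(3, s + 1) if a == "right" else max(0, s - 1): 1.0}

def reward(s, a, s2):
    return 1.0 if s2 == 3 and s != 3 else 0.0

V = value_iteration(states, actions, transition, reward)
```

States closer to the goal get higher values, and the greedy policy with respect to V maximizes the expected discounted return, which is the quantity both probabilistic planning and reinforcement learning optimize.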
iCORPP: Interleaved Commonsense Reasoning and Probabilistic Planning on Robots
Robot sequential decision-making in the real world is a challenge because it
requires the robots to simultaneously reason about the current world state and
dynamics, while planning actions to accomplish complex tasks. On the one hand,
declarative languages and reasoning algorithms well support representing and
reasoning with commonsense knowledge. But these algorithms are not good at
planning actions toward maximizing cumulative reward over a long, unspecified
horizon. On the other hand, probabilistic planning frameworks, such as Markov
decision processes (MDPs) and partially observable MDPs (POMDPs), well support
planning to achieve long-term goals under uncertainty. But they are
ill-equipped to represent or reason about knowledge that is not directly
related to actions.
In this article, we present a novel algorithm, called iCORPP, to
simultaneously estimate the current world state, reason about world dynamics,
and construct task-oriented controllers. In this process, robot decision-making
problems are decomposed into two interdependent (smaller) subproblems that
focus on reasoning to "understand the world" and planning to "achieve the goal"
respectively. Contextual knowledge is represented in the reasoning component,
which makes the planning component epistemic and enables active information
gathering. The developed algorithm has been implemented and evaluated both in
simulation and on real robots using everyday service tasks, such as indoor
navigation, dialog management, and object delivery. Results show significant
improvements in scalability, efficiency, and adaptiveness, compared to
competitive baselines including handcrafted action policies.
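The decomposition into reasoning ("understand the world") and planning ("achieve the goal") can be sketched as a reasoner that instantiates planner parameters from contextual facts. The rule, names, and numbers below are illustrative assumptions, not iCORPP's implementation:

```python
# Hedged sketch of the reasoning/planning decomposition (names assumed):
# reasoning maps contextual facts to transition parameters, so the
# planning model stays small and specific to the current context.

def reason_dynamics(context):
    """'Understand the world': derive action success rates from facts."""
    # Assumed rule: corridors are crowded at lunch, so moves fail more often.
    p_move = 0.6 if context.get("time") == "lunch" else 0.9
    return {"move_success": p_move}

def plan(dynamics, goal_reward=10.0, step_cost=1.0, gamma=0.9):
    """'Achieve the goal': value of retrying a move until it succeeds.

    Solves V = -step_cost + p * goal_reward + (1 - p) * gamma * V
    in closed form for the given success probability p.
    """
    p = dynamics["move_success"]
    return (p * goal_reward - step_cost) / (1.0 - gamma * (1.0 - p))

v_lunch = plan(reason_dynamics({"time": "lunch"}))      # crowded corridor
v_morning = plan(reason_dynamics({"time": "morning"}))  # clear corridor
```

Because the reasoner supplies context-specific dynamics, a change in context (here, the time of day) changes the planner's model without re-encoding the whole decision problem.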
Learning and Reasoning for Robot Dialog and Navigation Tasks
Published in the Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2020).
Guiding Robot Exploration in Reinforcement Learning via Automated Planning
Reinforcement learning (RL) enables an agent to learn from trial-and-error
experiences toward achieving long-term goals; automated planning aims to
compute plans for accomplishing tasks using action knowledge. Despite their
shared goal of completing complex tasks, the development of RL and automated
planning has been largely isolated due to their different computational
modalities. Focusing on improving RL agents' learning efficiency, we develop
Guided Dyna-Q (GDQ) to enable RL agents to reason with action knowledge to
avoid exploring less-relevant states. The action knowledge is used for
generating artificial experiences from an optimistic simulation. GDQ has been
evaluated in simulation and using a mobile robot conducting navigation tasks in
a multi-room office environment. Compared with competitive baselines, GDQ
significantly reduces the effort in exploration while improving the quality of
learned policies.
Comment: Accepted at the International Conference on Automated Planning and Scheduling (ICAPS-21).
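The GDQ idea, Q-learning on real experience plus planning updates drawn only from states that action knowledge marks as goal-relevant, can be sketched in tabular form. The corridor domain and all names below are assumptions, not the authors' code:

```python
import random

# Hedged sketch of Guided Dyna-Q: real experience updates Q as usual,
# while simulated (Dyna-style) updates sample only plan-relevant states,
# so the agent never wastes exploration on irrelevant rooms.

def guided_dyna_q(step, model, relevant, states, actions,
                  episodes=100, planning_steps=10,
                  alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in states for a in actions}
    guided = [s for s in states if relevant(s)]

    def update(s, a, r, s2, done):
        target = r if done else r + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])

    for _ in range(episodes):
        s, done = states[0], False
        while not done:
            a = (rng.choice(actions) if rng.random() < epsilon
                 else max(actions, key=lambda b: Q[(s, b)]))
            r, s2, done = step(s, a)              # real experience
            update(s, a, r, s2, done)
            for _ in range(planning_steps):       # guided simulated experience
                ps, pa = rng.choice(guided), rng.choice(actions)
                update(ps, pa, *model(ps, pa))
            s = s2
    return Q

# Toy corridor: state 4 is the goal; the (assumed) action knowledge marks
# only the non-terminal states on the path to the goal as relevant.
states = [0, 1, 2, 3, 4]
actions = ["left", "right"]

def step(s, a):
    s2 = min(4, s + 1) if a == "right" else max(0, s - 1)
    return (1.0 if s2 == 4 else 0.0), s2, s2 == 4

Q = guided_dyna_q(step, model=step, relevant=lambda s: s < 4,
                  states=states, actions=actions)
```

Here the model used for simulated experience coincides with the real dynamics; in the paper's setting it is an optimistic simulation built from action knowledge, and the `relevant` filter is what distinguishes GDQ from plain Dyna-Q.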