7 research outputs found
Answer Set Programming for Non-Stationary Markov Decision Processes
Non-stationary domains, where unforeseen changes happen, present a challenge
for agents to find an optimal policy for a sequential decision making problem.
This work investigates a solution to this problem that combines Markov Decision
Processes (MDP) and Reinforcement Learning (RL) with Answer Set Programming
(ASP) in a method we call ASP(RL). In this method, Answer Set Programming is
used to find the possible trajectories of an MDP, from where Reinforcement
Learning is applied to learn the optimal policy of the problem. Results show
that ASP(RL) is capable of efficiently finding the optimal solution of an MDP
representing non-stationary domains
Learning and Reasoning for Robot Sequential Decision Making under Uncertainty
Robots frequently face complex tasks that require more than one action, where
sequential decision-making (SDM) capabilities become necessary. The key
contribution of this work is a robot SDM framework, called LCORPP, that
supports the simultaneous capabilities of supervised learning for passive state
estimation, automated reasoning with declarative human knowledge, and planning
under uncertainty toward achieving long-term goals. In particular, we use a
hybrid reasoning paradigm to refine the state estimator, and provide
informative priors for the probabilistic planner. In experiments, a mobile
robot is tasked with estimating human intentions using their motion
trajectories, declarative contextual knowledge, and human-robot interaction
(dialog-based and motion-based). Results suggest that, in efficiency and
accuracy, our framework performs better than its no-learning and no-reasoning
counterparts in office environment.Comment: In proceedings of 34th AAAI conference on Artificial Intelligence,
202
iCORPP: Interleaved Commonsense Reasoning and Probabilistic Planning on Robots
Robot sequential decision-making in the real world is a challenge because it
requires the robots to simultaneously reason about the current world state and
dynamics, while planning actions to accomplish complex tasks. On the one hand,
declarative languages and reasoning algorithms well support representing and
reasoning with commonsense knowledge. But these algorithms are not good at
planning actions toward maximizing cumulative reward over a long, unspecified
horizon. On the other hand, probabilistic planning frameworks, such as Markov
decision processes (MDPs) and partially observable MDPs (POMDPs), well support
planning to achieve long-term goals under uncertainty. But they are
ill-equipped to represent or reason about knowledge that is not directly
related to actions.
In this article, we present a novel algorithm, called iCORPP, to
simultaneously estimate the current world state, reason about world dynamics,
and construct task-oriented controllers. In this process, robot decision-making
problems are decomposed into two interdependent (smaller) subproblems that
focus on reasoning to "understand the world" and planning to "achieve the goal"
respectively. Contextual knowledge is represented in the reasoning component,
which makes the planning component epistemic and enables active information
gathering. The developed algorithm has been implemented and evaluated both in
simulation and on real robots using everyday service tasks, such as indoor
navigation, dialog management, and object delivery. Results show significant
improvements in scalability, efficiency, and adaptiveness, compared to
competitive baselines including handcrafted action policies