Learning and Reasoning for Robot Sequential Decision Making under Uncertainty
Robots frequently face complex tasks that require more than one action, where
sequential decision-making (SDM) capabilities become necessary. The key
contribution of this work is a robot SDM framework, called LCORPP, that
supports the simultaneous capabilities of supervised learning for passive state
estimation, automated reasoning with declarative human knowledge, and planning
under uncertainty toward achieving long-term goals. In particular, we use a
hybrid reasoning paradigm to refine the state estimator, and provide
informative priors for the probabilistic planner. In experiments, a mobile
robot is tasked with estimating human intentions using their motion
trajectories, declarative contextual knowledge, and human-robot interaction
(dialog-based and motion-based). Results suggest that, in efficiency and
accuracy, our framework performs better than its no-learning and no-reasoning
counterparts in an office environment. Comment: In proceedings of the 34th AAAI Conference on Artificial Intelligence,
202
Developmental Bayesian Optimization of Black-Box with Visual Similarity-Based Transfer Learning
We present a developmental framework based on a long-term memory and
reasoning mechanisms (Vision Similarity and Bayesian Optimisation). This
architecture allows a robot to autonomously optimize hyper-parameters that need
to be tuned from any action and/or vision module, treated as a black-box. The
learning can take advantage of past experiences (stored in the episodic and
procedural memories) in order to warm-start the exploration using a set of
hyper-parameters previously optimized from objects similar to the new unknown
one (stored in a semantic memory). As an example, the system has been used to
optimize 9 continuous hyper-parameters of a professional software package (Kamido)
both in simulation and with a real robot (industrial robotic arm Fanuc) with a
total of 13 different objects. The robot is able to find a good object-specific
optimization in 68 (simulation) or 40 (real) trials. In simulation, we
demonstrate the benefit of transfer learning based on visual similarity, as
opposed to amnesic learning (i.e., learning from scratch every time).
Moreover, with the real robot, we show that the method consistently outperforms
manual optimization by an expert, with less than 2 hours of training time
to achieve a success rate of more than 88%.
Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks
This paper discusses a system that accelerates reinforcement learning by
using transfer from related tasks. Without such transfer, even if two tasks are
very similar at some abstract level, an extensive re-learning effort is
required. The system achieves much of its power by transferring parts of
previously learned solutions rather than a single complete solution. The system
exploits strong features in the multi-dimensional function produced by
reinforcement learning in solving a particular task. These features are stable
and easy to recognize early in the learning process. They generate a
partitioning of the state space and thus the function. The partition is
represented as a graph. This is used to index and compose functions stored in a
case base to form a close approximation to the solution of the new task.
Experiments demonstrate that function composition often produces more than an
order of magnitude increase in learning rate compared to a basic reinforcement
learning algorithm.
Logic, self-awareness and self-improvement: The metacognitive loop and the problem of brittleness
This essay describes a general approach to building perturbation-tolerant autonomous systems, based on the conviction that artificial agents should be able to notice when something is amiss, assess the anomaly, and guide a solution into place. We call this basic strategy of self-guided learning the metacognitive loop; it involves the system monitoring, reasoning about, and, when necessary, altering its own decision-making components. In this essay, we (a) argue that equipping agents with a metacognitive loop can help to overcome the brittleness problem, (b) detail the metacognitive loop and its relation to our ongoing work on time-sensitive commonsense reasoning, (c) describe specific, implemented systems whose perturbation tolerance was improved by adding a metacognitive loop, and (d) outline both short-term and long-term research agendas.