Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks
This paper discusses a system that accelerates reinforcement learning by
using transfer from related tasks. Without such transfer, even if two tasks are
very similar at some abstract level, an extensive re-learning effort is
required. The system achieves much of its power by transferring parts of
previously learned solutions rather than a single complete solution. The system
exploits strong features in the multi-dimensional function produced by
reinforcement learning in solving a particular task. These features are stable
and easy to recognize early in the learning process. They generate a
partitioning of the state space and thus the function. The partition is
represented as a graph. This is used to index and compose functions stored in a
case base to form a close approximation to the solution of the new task.
Experiments demonstrate that function composition often produces more than an
order of magnitude increase in learning rate compared to a basic reinforcement
learning algorithm.
An Asymptotically-Optimal Sampling-Based Algorithm for Bi-directional Motion Planning
Bi-directional search is a widely used strategy to increase the success and
convergence rates of sampling-based motion planning algorithms. Yet, few
results are available that merge both bi-directional search and asymptotic
optimality into existing optimal planners, such as PRM*, RRT*, and FMT*. The
objective of this paper is to fill this gap. Specifically, this paper presents
a bi-directional, sampling-based, asymptotically-optimal algorithm named
Bi-directional FMT* (BFMT*) that extends the Fast Marching Tree (FMT*)
algorithm to bi-directional search while preserving its key properties, chiefly
lazy search and asymptotic optimality through convergence in probability. BFMT*
performs a two-source, lazy dynamic programming recursion over a set of
randomly-drawn samples, correspondingly generating two search trees: one in
cost-to-come space from the initial configuration and another in cost-to-go
space from the goal configuration. Numerical experiments illustrate the
advantages of BFMT* over its unidirectional counterpart, as well as a number of
other state-of-the-art planners.
Comment: Accepted to the 2015 IEEE Intelligent Robotics and Systems Conference
in Hamburg, Germany. This submission represents the long version of the
conference manuscript, with additional proof details (Section IV) regarding
the asymptotic optimality of the BFMT* algorithm.
Modeling the power consumption of a Wifibot and studying the role of communication cost in operation time
Mobile robots are becoming part of our everyday lives at home, at work, and in
entertainment. Because their power capacity is limited, new
energy consumption models can support energy conservation and energy-efficient
designs. In this paper, we carry out a number of experiments focusing on
the motors' power consumption of a specific robot called Wifibot. Based on the
experimental results, we build models for different speed and acceleration
levels. We compare the motors' power consumption to that of other robot running modes.
We also create a simple robot network scenario and investigate whether
forwarding data through a closer node could lead to longer operation times. We
assess the effect of energy capacity, traveling distance, and data rate on the
operation time.
Real Time Motion Generation for Mobile Robot
International audienc
Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition
This paper presents the MAXQ approach to hierarchical reinforcement learning
based on decomposing the target Markov decision process (MDP) into a hierarchy
of smaller MDPs and decomposing the value function of the target MDP into an
additive combination of the value functions of the smaller MDPs. The paper
defines the MAXQ hierarchy, proves formal results on its representational
power, and establishes five conditions for the safe use of state abstractions.
The paper presents an online model-free learning algorithm, MAXQ-Q, and proves
that it converges with probability 1 to a kind of locally-optimal policy known
as a recursively optimal policy, even in the presence of the five kinds of
state abstraction. The paper evaluates the MAXQ representation and MAXQ-Q
through a series of experiments in three domains and shows experimentally that
MAXQ-Q (with state abstractions) converges to a recursively optimal policy much
faster than flat Q learning. The fact that MAXQ learns a representation of the
value function has an important benefit: it makes it possible to compute and
execute an improved, non-hierarchical policy via a procedure similar to the
policy improvement step of policy iteration. The paper demonstrates the
effectiveness of this non-hierarchical execution experimentally. Finally, the
paper concludes with a comparison to related work and a discussion of the
design tradeoffs in hierarchical reinforcement learning.
Comment: 63 pages, 15 figures
Dynamic collision avoidance system for a manipulator based on RGB-D data
The new paradigms of Industry 4.0 demand collaboration
between robots and humans. Unlike other manipulators, such robots can help and
work alongside humans without any additional safety measures. The robot
should be able to perceive the environment and plan (or re-plan)
its movement on the fly, avoiding obstacles and people. This paper
proposes a system that acquires the environment space using a Kinect
sensor and performs path planning for a UR5 manipulator in pick-and-place
tasks while avoiding objects, based on the point cloud from the
Kinect. The results validate the proposed system.
Project "TEC4Growth - Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER-000020" is financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF). This work is also financed by the ERDF – European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) as part of project UID/EEA/50014/2013.
Scaling Robot Motion Planning to Multi-core Processors and the Cloud
Imagine a world in which robots safely interoperate with humans, gracefully and efficiently accomplishing everyday tasks. The robot's motions for these tasks, constrained by the design of the robot and task at hand, must avoid collisions with obstacles. Unfortunately, planning a constrained obstacle-free motion for a robot is computationally complex---often resulting in slow computation of inefficient motions. The methods in this dissertation speed up this motion plan computation with new algorithms and data structures that leverage readily available parallel processing, whether that processing power is on the robot or in the cloud, enabling robots to operate safer, more gracefully, and with improved efficiency. The contributions of this dissertation that enable faster motion planning are novel parallel lock-free algorithms, fast and concurrent nearest neighbor searching data structures, cache-aware operation, and split robot-cloud computation. Parallel lock-free algorithms avoid contention over shared data structures, resulting in empirical speedup proportional to the number of CPU cores working on the problem. Fast nearest neighbor data structures speed up searching in SO(3) and SE(3) metric spaces, which are needed for rigid body motion planning. Concurrent nearest neighbor data structures improve searching performance on metric spaces common to robot motion planning problems, while providing asymptotic wait-free concurrent operation. Cache-aware operation avoids long memory access times, allowing the algorithm to exhibit superlinear speedup. Split robot-cloud computation enables robots with low-power CPUs to react to changing environments by having the robot compute reactive paths in real-time from a set of motion plan options generated in a computationally intensive cloud-based algorithm. 
We demonstrate the scalability and effectiveness of our contributions in solving motion planning problems both in simulation and on physical robots of varying design and complexity. Problems include finding a solution to a complex motion planning problem, pre-computing motion plans that converge towards the optimal, and reactive interaction with dynamic environments. Robots include 2D holonomic robots, 3D rigid-body robots, a self-driving 1/10 scale car, articulated robot arms with and without mobile bases, and a small humanoid robot.
Doctor of Philosophy