Batch Informed Trees (BIT*): Informed Asymptotically Optimal Anytime Search
Path planning in robotics often requires finding high-quality solutions to
continuously valued and/or high-dimensional problems. These problems are
challenging and most planning algorithms instead solve simplified
approximations. Popular approximations include graphs and random samples, as
respectively used by informed graph-based searches and anytime sampling-based
planners. Informed graph-based searches, such as A*, traditionally use
heuristics to search a priori graphs in order of potential solution quality.
This makes their search efficient but leaves their performance dependent on the
chosen approximation. If its resolution is too low then they may not find a
(suitable) solution but if it is too high then they may take a prohibitively
long time to do so. Anytime sampling-based planners, such as RRT*,
traditionally use random sampling to approximate the problem domain
incrementally. This allows them to increase resolution until a suitable
solution is found but makes their search dependent on the order of
approximation. Arbitrary sequences of random samples approximate the problem
domain in every direction simultaneously but may be prohibitively
inefficient at containing a solution. This paper unifies and extends these two
approaches to develop Batch Informed Trees (BIT*), an informed, anytime
sampling-based planner. BIT* solves continuous path planning problems
efficiently by using sampling and heuristics to alternately approximate and
search the problem domain. Its search is ordered by potential solution quality,
as in A*, and its approximation improves indefinitely with additional
computational time, as in RRT*. It is shown analytically to be almost-surely
asymptotically optimal and experimentally to outperform existing sampling-based
planners, especially on high-dimensional planning problems.
Comment: International Journal of Robotics Research (IJRR). 32 Pages. 16
Figure
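
As a rough illustration of ordering a search by potential solution quality, the sketch below ranks a batch of random samples by a lower bound on the cost of any path through each sample, and discards samples that cannot improve on a current solution cost. The function names, the straight-line (Euclidean) heuristic, and the 2D unit-square domain are assumptions made for this example. It is not the BIT* algorithm from the paper.

```python
import math
import random

def path_cost_estimate(x, start, goal):
    """Lower bound on the cost of any path from start to goal through x."""
    return math.dist(start, x) + math.dist(x, goal)

def ordered_useful_samples(samples, start, goal, best_cost=math.inf):
    """Keep samples that could improve on best_cost, most promising first."""
    useful = [x for x in samples
              if path_cost_estimate(x, start, goal) < best_cost]
    return sorted(useful, key=lambda x: path_cost_estimate(x, start, goal))

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    start, goal = (0.1, 0.1), (0.9, 0.9)
    batch = [(rng.random(), rng.random()) for _ in range(200)]
    kept = ordered_useful_samples(batch, start, goal, best_cost=1.5)
    print(len(batch), "sampled;", len(kept), "could still improve the solution")
```

With no solution yet (`best_cost` left at infinity) every sample is kept. As better solutions are found, the filter narrows attention to the region that can still contain an improvement.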
Batch Informed Trees (BIT*): Sampling-based Optimal Planning via the Heuristically Guided Search of Implicit Random Geometric Graphs
In this paper, we present Batch Informed Trees (BIT*), a planning algorithm
based on unifying graph- and sampling-based planning techniques. By recognizing
that a set of samples describes an implicit random geometric graph (RGG), we
are able to combine the efficient ordered nature of graph-based techniques,
such as A*, with the anytime scalability of sampling-based algorithms, such as
Rapidly-exploring Random Trees (RRT).
BIT* uses a heuristic to efficiently search a series of increasingly dense
implicit RGGs while reusing previous information. It can be viewed as an
extension of incremental graph-search techniques, such as Lifelong Planning A*
(LPA*), to continuous problem domains as well as a generalization of existing
sampling-based optimal planners. It is shown that it is probabilistically
complete and asymptotically optimal.
We demonstrate the utility of BIT* on simulated random worlds and
manipulation problems on CMU's HERB, a
14-DOF two-armed robot. On these problems, BIT* finds better solutions faster
than RRT, RRT*, Informed RRT*, and Fast Marching Trees (FMT*) with faster
anytime convergence towards the optimum, especially in high dimensions.
Comment: 8 Pages. 6 Figures. Video available at
http://www.youtube.com/watch?v=TQIoCC48gp
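
The idea that a set of samples describes an implicit random geometric graph can be sketched as follows: no edge list is stored, and the neighbours of a sample are computed on demand as the other samples within a connection radius. The function name and the fixed radius are assumptions made for this example. The abstract does not state how the radius is chosen.

```python
import math

def rgg_neighbors(x, samples, radius):
    """Samples within `radius` of x, i.e. x's edges in the implicit RGG."""
    return [y for y in samples if y != x and math.dist(x, y) <= radius]

if __name__ == "__main__":
    samples = [(0.0, 0.0), (0.3, 0.0), (1.0, 1.0)]
    print(rgg_neighbors((0.0, 0.0), samples, radius=0.5))
    # Adding a denser batch changes the graph without rebuilding anything.
    samples.append((0.0, 0.4))
    print(rgg_neighbors((0.0, 0.0), samples, radius=0.5))
```

Because edges are never materialised, appending a new batch of samples is enough to obtain a denser graph.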
Sequential change-point detection when unknown parameters are present in the pre-change distribution
In the sequential change-point detection literature, most research specifies
a required frequency of false alarms at a given pre-change distribution
and tries to minimize the detection delay for every possible
post-change distribution. In this paper, motivated by a number of
practical examples, we first consider the reverse question by specifying a
required detection delay at a given post-change distribution and trying to
minimize the frequency of false alarms for every possible pre-change
distribution. We present asymptotically optimal procedures for
one-parameter exponential families. Next, we develop a general theory for
change-point problems when both the pre-change distribution and
the post-change distribution involve unknown parameters. We also
apply our approach to the special case of detecting shifts in the mean of
independent normal observations.
Comment: Published at http://dx.doi.org/10.1214/009053605000000859 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Adaptive Policies for Sequential Sampling under Incomplete Information and a Cost Constraint
We consider the problem of sequential sampling from a finite number of
independent statistical populations to maximize the expected infinite horizon
average outcome per period, under a constraint that the expected average
sampling cost does not exceed an upper bound. The outcome distributions are not
known. We construct a class of consistent adaptive policies, under which the
average outcome converges with probability 1 to the true value under complete
information for all distributions with finite means. We also compare the rate
of convergence for various policies in this class using simulation.
Non-Bayesian Quickest Detection with Stochastic Sample Right Constraints
In this paper, we study the design and analysis of an optimal detection scheme
for sensors that are deployed to monitor the change in the environment and are
powered by the energy harvested from the environment. In this type of
application, detection delay is of paramount importance. We model this problem
as a quickest change detection problem with a stochastic energy constraint. In
particular, a wireless sensor powered by renewable energy takes observations
from a random sequence, whose distribution will change at a certain unknown
time. Such a change implies events of interest. The energy in the sensor is
consumed by taking observations and is replenished randomly. The sensor cannot
take observations if there is no energy left in the battery. Our goal is to
design a power allocation scheme and a detection strategy to minimize the worst
case detection delay, which is the difference between the time when an alarm is
raised and the time when the change occurs. Two types of average run length
(ARL) constraint, namely an algorithm level ARL constraint and a system level
ARL constraint, are considered. We propose a low complexity scheme in which the
energy allocation rule is to spend energy to take observations as long as the
battery is not empty and the detection scheme is the Cumulative Sum test. We
show that this scheme is optimal for the formulation with the algorithm level
ARL constraint and is asymptotically optimal for the formulations with the
system level ARL constraint.
Comment: 30 pages, 5 figure
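
The low-complexity scheme described above (observe whenever the battery is not empty, and run a Cumulative Sum test on the observations taken) can be sketched as below. Several details are assumptions made for this example and are not taken from the paper: energy arrives before the observation decision in each slot, one observation costs one unit, the battery has a finite capacity, and the CUSUM statistic is held constant in slots with no observation. The function name and parameters are hypothetical.

```python
def cusum_with_battery(observations, harvest, llr, threshold, capacity=5):
    """Return the 1-based slot at which the alarm is raised, or None.

    observations[t] is read only if the battery holds at least one unit.
    harvest[t] is the energy arriving in slot t (assumed integer units).
    llr(x) is the log-likelihood ratio of post- vs. pre-change at x.
    """
    battery = 0
    stat = 0.0
    for t, (x, e) in enumerate(zip(observations, harvest), start=1):
        battery = min(capacity, battery + e)   # replenish, then cap
        if battery >= 1:                       # greedy rule: observe if possible
            battery -= 1
            stat = max(0.0, stat + llr(x))     # standard CUSUM recursion
            if stat >= threshold:
                return t
    return None

if __name__ == "__main__":
    # Unit-variance Gaussian mean shift from 0 to 1: llr(x) = x - 0.5.
    obs = [0, 0, 2, 2, 2, 2]
    energy = [1, 0, 1, 1, 1, 1]
    print(cusum_with_battery(obs, energy, lambda x: x - 0.5, threshold=3.0))
```

In the usage example, slot 2 is skipped because no energy is available, so the alarm time depends on the energy arrivals as well as on the data.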
Decentralized Learning for Multi-player Multi-armed Bandits
We consider the problem of distributed online learning with multiple players
in multi-armed bandits (MAB) models. Each player can pick among multiple arms.
When a player picks an arm, it gets a reward. We consider both an i.i.d. reward
model and a Markovian reward model. In the i.i.d. model each arm is modelled as
an i.i.d. process with an unknown distribution with an unknown mean. In the
Markovian model, each arm is modelled as a finite, irreducible, aperiodic and
reversible Markov chain with an unknown probability transition matrix and
stationary distribution. The arms give different rewards to different players.
If two players pick the same arm, there is a "collision", and neither of them
gets any reward. There is no dedicated control channel for coordination or
communication among the players. Any other communication between the users is
costly and will add to the regret. We propose an online index-based distributed
learning policy that trades off
exploration vs. exploitation in the right way, and achieves expected
regret that grows at most as near-. The motivation comes from
opportunistic spectrum access by multiple secondary users in cognitive radio
networks wherein they must pick among various wireless channels that look
different to different users. This is the first distributed learning algorithm
for multi-player MABs to the best of our knowledge.
Comment: 33 pages, 3 figures. Submitted to IEEE Transactions on Information
Theor
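
The collision rule in the reward model above can be made concrete with a small sketch: each player has its own reward for each arm, and any players that choose the same arm receive nothing in that round. The function name and the `draws` table are hypothetical, introduced only for illustration. This models one round of rewards and does not implement the learning policy proposed in the paper.

```python
from collections import Counter

def round_rewards(choices, draws):
    """Rewards for one round of a multi-player bandit with collisions.

    choices[i] is the arm picked by player i.
    draws[i][k] is the reward player i would receive from arm k if alone on it.
    Players that share an arm collide and receive 0.
    """
    counts = Counter(choices)
    return [draws[i][arm] if counts[arm] == 1 else 0.0
            for i, arm in enumerate(choices)]

if __name__ == "__main__":
    draws = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # player-specific arm rewards
    print(round_rewards([0, 0, 2], draws))     # players 0 and 1 collide
    print(round_rewards([0, 1, 2], draws))     # no collision
```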
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
Most of computer science focuses on automatically solving given computational
problems. I focus on automatically inventing or discovering problems in a way
inspired by the playful behavior of animals and humans, to train a more and
more general problem solver from scratch in an unsupervised fashion. Consider
the infinite set of all computable descriptions of tasks with possibly
computable solutions. The novel algorithmic framework POWERPLAY (2011)
continually searches the space of possible pairs of new tasks and modifications
of the current problem solver, until it finds a more powerful problem solver
that provably solves all previously learned tasks plus the new one, while the
unmodified predecessor does not. Wow-effects are achieved by continually making
previously learned skills more efficient such that they require less time and
space. New skills may (partially) re-use previously learned skills. POWERPLAY's
search orders candidate pairs of tasks and solver modifications by their
conditional computational (time & space) complexity, given the stored
experience so far. The new task and its corresponding task-solving skill are
those first found and validated. The computational costs of validating new
tasks need not grow with task repertoire size. POWERPLAY's ongoing search for
novelty keeps breaking the generalization abilities of its present solver. This
is related to Goedel's sequence of increasingly powerful formal theories based
on adding formerly unprovable statements to the axioms without affecting
previously provable theorems. The continually increasing repertoire of problem
solving procedures can be exploited by a parallel search for solutions to
additional externally posed tasks. POWERPLAY may be viewed as a greedy but
practical implementation of basic principles of creativity. A first
experimental analysis can be found in separate papers [53,54].
Comment: 21 pages, additional connections to previous work, references to
first experiments with POWERPLA