Search CORE

24,293 research outputs found

Batch Informed Trees (BIT*): Informed Asymptotically Optimal Anytime Search

Author: Barfoot Timothy D.
Gammell Jonathan D.
Srinivasa Siddhartha S.
Publication venue: 'SAGE Publications'
Publication date: 01/01/2020
Field of study

Path planning in robotics often requires finding high-quality solutions to continuously valued and/or high-dimensional problems. These problems are challenging and most planning algorithms instead solve simplified approximations. Popular approximations include graphs and random samples, as respectively used by informed graph-based searches and anytime sampling-based planners. Informed graph-based searches, such as A*, traditionally use heuristics to search a priori graphs in order of potential solution quality. This makes their search efficient but leaves their performance dependent on the chosen approximation. If its resolution is too low then they may not find a (suitable) solution but if it is too high then they may take a prohibitively long time to do so. Anytime sampling-based planners, such as RRT*, traditionally use random sampling to approximate the problem domain incrementally. This allows them to increase resolution until a suitable solution is found but makes their search dependent on the order of approximation. Arbitrary sequences of random samples approximate the problem domain in every direction simultaneously and but may be prohibitively inefficient at containing a solution. This paper unifies and extends these two approaches to develop Batch Informed Trees (BIT*), an informed, anytime sampling-based planner. BIT* solves continuous path planning problems efficiently by using sampling and heuristics to alternately approximate and search the problem domain. Its search is ordered by potential solution quality, as in A*, and its approximation improves indefinitely with additional computational time, as in RRT*. It is shown analytically to be almost-surely asymptotically optimal and experimentally to outperform existing sampling-based planners, especially on high-dimensional planning problems.Comment: International Journal of Robotics Research (IJRR). 32 Pages. 16 Figure

arXiv.org e-Print Archive

Oxford University Research Archive

Batch Informed Trees (BIT*): Sampling-based Optimal Planning via the Heuristically Guided Search of Implicit Random Geometric Graphs

Author: Barfoot Timothy D.
Gammell Jonathan D.
Srinivasa Siddhartha S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/08/2015
Field of study

In this paper, we present Batch Informed Trees (BIT*), a planning algorithm based on unifying graph- and sampling-based planning techniques. By recognizing that a set of samples describes an implicit random geometric graph (RGG), we are able to combine the efficient ordered nature of graph-based techniques, such as A*, with the anytime scalability of sampling-based algorithms, such as Rapidly-exploring Random Trees (RRT). BIT* uses a heuristic to efficiently search a series of increasingly dense implicit RGGs while reusing previous information. It can be viewed as an extension of incremental graph-search techniques, such as Lifelong Planning A* (LPA*), to continuous problem domains as well as a generalization of existing sampling-based optimal planners. It is shown that it is probabilistically complete and asymptotically optimal. We demonstrate the utility of BIT* on simulated random worlds in

\mathbb{R}^2

and

\mathbb{R}^8

and manipulation problems on CMU's HERB, a 14-DOF two-armed robot. On these problems, BIT* finds better solutions faster than RRT, RRT*, Informed RRT*, and Fast Marching Trees (FMT*) with faster anytime convergence towards the optimum, especially in high dimensions.Comment: 8 Pages. 6 Figures. Video available at http://www.youtube.com/watch?v=TQIoCC48gp

arXiv.org e-Print Archive

CiteSeerX

Sequential change-point detection when unknown parameters are present in the pre-change distribution

Author: Mei Yajun
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/02/2006
Field of study

In the sequential change-point detection literature, most research specifies a required frequency of false alarms at a given pre-change distribution

f_{\theta}

and tries to minimize the detection delay for every possible post-change distribution

g_{\lambda}

. In this paper, motivated by a number of practical examples, we first consider the reverse question by specifying a required detection delay at a given post-change distribution and trying to minimize the frequency of false alarms for every possible pre-change distribution

f_{\theta}

. We present asymptotically optimal procedures for one-parameter exponential families. Next, we develop a general theory for change-point problems when both the pre-change distribution

f_{\theta}

and the post-change distribution

g_{\lambda}

involve unknown parameters. We also apply our approach to the special case of detecting shifts in the mean of independent normal observations.Comment: Published at http://dx.doi.org/10.1214/009053605000000859 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Caltech Authors

Adaptive Policies for Sequential Sampling under Incomplete Information and a Cost Constraint

Author: Burnetas Apostolos
Kanavetas Odysseas
Publication venue
Publication date: 19/01/2012
Field of study

We consider the problem of sequential sampling from a finite number of independent statistical populations to maximize the expected infinite horizon average outcome per period, under a constraint that the expected average sampling cost does not exceed an upper bound. The outcome distributions are not known. We construct a class of consistent adaptive policies, under which the average outcome converges with probability 1 to the true value under complete information for all distributions with finite means. We also compare the rate of convergence for various policies in this class using simulation

arXiv.org e-Print Archive

CiteSeerX

Non-Bayesian Quickest Detection with Stochastic Sample Right Constraints

Author: Geng Jun
Lai Lifeng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/02/2013
Field of study

In this paper, we study the design and analysis of optimal detection scheme for sensors that are deployed to monitor the change in the environment and are powered by the energy harvested from the environment. In this type of applications, detection delay is of paramount importance. We model this problem as quickest change detection problem with a stochastic energy constraint. In particular, a wireless sensor powered by renewable energy takes observations from a random sequence, whose distribution will change at a certain unknown time. Such a change implies events of interest. The energy in the sensor is consumed by taking observations and is replenished randomly. The sensor cannot take observations if there is no energy left in the battery. Our goal is to design a power allocation scheme and a detection strategy to minimize the worst case detection delay, which is the difference between the time when an alarm is raised and the time when the change occurs. Two types of average run length (ARL) constraint, namely an algorithm level ARL constraint and an system level ARL constraint, are considered. We propose a low complexity scheme in which the energy allocation rule is to spend energy to take observations as long as the battery is not empty and the detection scheme is the Cumulative Sum test. We show that this scheme is optimal for the formulation with the algorithm level ARL constraint and is asymptotically optimal for the formulations with the system level ARL constraint.Comment: 30 pages, 5 figure

arXiv.org e-Print Archive

CiteSeerX

Decentralized Learning for Multi-player Multi-armed Bandits

Author: Jain Rahul
Kalathil Dileep
Nayyar Naumaan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/06/2012
Field of study

We consider the problem of distributed online learning with multiple players in multi-armed bandits (MAB) models. Each player can pick among multiple arms. When a player picks an arm, it gets a reward. We consider both i.i.d. reward model and Markovian reward model. In the i.i.d. model each arm is modelled as an i.i.d. process with an unknown distribution with an unknown mean. In the Markovian model, each arm is modelled as a finite, irreducible, aperiodic and reversible Markov chain with an unknown probability transition matrix and stationary distribution. The arms give different rewards to different players. If two players pick the same arm, there is a "collision", and neither of them get any reward. There is no dedicated control channel for coordination or communication among the players. Any other communication between the users is costly and will add to the regret. We propose an online index-based distributed learning policy called

{\tt dUCB_4}

algorithm that trades off \textit{exploration v. exploitation} in the right way, and achieves expected regret that grows at most as near-

O(\log^2 T)

. The motivation comes from opportunistic spectrum access by multiple secondary users in cognitive radio networks wherein they must pick among various wireless channels that look different to different users. This is the first distributed learning algorithm for multi-player MABs to the best of our knowledge.Comment: 33 pages, 3 figures. Submitted to IEEE Transactions on Information Theor

arXiv.org e-Print Archive

Crossref

POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem

Author: Schmidhuber Jürgen
Publication venue
Publication date: 01/01/2011
Field of study

Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. Consider the infinite set of all computable descriptions of tasks with possibly computable solutions. The novel algorithmic framework POWERPLAY (2011) continually searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. Wow-effects are achieved by continually making previously learned skills more efficient such that they require less time and space. New skills may (partially) re-use previously learned skills. POWERPLAY's search orders candidate pairs of tasks and solver modifications by their conditional computational (time & space) complexity, given the stored experience so far. The new task and its corresponding task-solving skill are those first found and validated. The computational costs of validating new tasks need not grow with task repertoire size. POWERPLAY's ongoing search for novelty keeps breaking the generalization abilities of its present solver. This is related to Goedel's sequence of increasingly powerful formal theories based on adding formerly unprovable statements to the axioms without affecting previously provable theorems. The continually increasing repertoire of problem solving procedures can be exploited by a parallel search for solutions to additional externally posed tasks. POWERPLAY may be viewed as a greedy but practical implementation of basic principles of creativity. A first experimental analysis can be found in separate papers [53,54].Comment: 21 pages, additional connections to previous work, references to first experiments with POWERPLA

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central