5,053 research outputs found
Admissible anytime-valid sequential inference must rely on nonnegative martingales
Wald's anytime-valid p-values and Robbins' confidence sequences enable
sequential inference for composite and nonparametric classes of distributions
at arbitrary stopping times, as do more recent proposals involving Vovk's
`e-values' or Shafer's `betting scores'. Examining the literature, one finds
that at the heart of all these (quite different) approaches has been the
identification of composite nonnegative (super)martingales. Thus, informally,
nonnegative (super)martingales are known to be sufficient for \emph{valid}
sequential inference. Our central contribution is to show that martingales are
also universal---all \emph{admissible} constructions of (composite) anytime
p-values, confidence sequences, or e-values must necessarily utilize
nonnegative martingales (or so-called max-martingales in the case of
p-values). Sufficient conditions for composite admissibility are also
provided. Our proofs utilize a plethora of modern mathematical tools for
composite testing and estimation problems: max-martingales, Snell envelopes,
and new Doob-L\'evy martingales make appearances in previously unencountered
ways. Informally, if one wishes to perform anytime-valid sequential inference,
then any existing approach can be recovered or dominated using martingales. We
provide several sophisticated examples, with special focus on the nonparametric
problem of testing if a distribution is symmetric, where our new constructions
render past methods inadmissible. Comment: 35 pages
Informed RRT*: Optimal Sampling-based Path Planning Focused via Direct Sampling of an Admissible Ellipsoidal Heuristic
Rapidly-exploring random trees (RRTs) are popular in motion planning because
they find solutions efficiently to single-query problems. Optimal RRTs (RRT*s)
extend RRTs to the problem of finding the optimal solution, but in doing so
asymptotically find the optimal path from the initial state to every state in
the planning domain. This behaviour is not only inefficient but also
inconsistent with their single-query nature.
For problems seeking to minimize path length, the subset of states that can
improve a solution can be described by a prolate hyperspheroid. We show that
unless this subset is sampled directly, the probability of improving a solution
becomes arbitrarily small in large worlds or high state dimensions. In this
paper, we present an exact method to focus the search by directly sampling this
subset.
The advantages of the presented sampling technique are demonstrated with a
new algorithm, Informed RRT*. This method retains the same probabilistic
guarantees on completeness and optimality as RRT* while improving the
convergence rate and final solution quality. We present the algorithm as a
simple modification to RRT* that could be further extended by more advanced
path-planning algorithms. We show experimentally that it outperforms RRT* in
rate of convergence, final solution cost, and ability to find difficult
passages while demonstrating less dependence on the state dimension and range
of the planning problem. Comment: 8 pages, 11 figures. Videos available at
https://www.youtube.com/watch?v=d7dX5MvDYTc and
https://www.youtube.com/watch?v=nsl-5MZfwu
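The direct-sampling idea in the abstract above can be illustrated with a minimal 2D sketch. It assumes the standard construction of drawing a uniform point in the unit disc, stretching it to an ellipse whose foci are the start and goal, then rotating and translating it. The function name `sample_informed_2d` and its parameters are hypothetical and are not taken from the paper's code.

```python
import math
import random


def sample_informed_2d(start, goal, c_best, rng=random):
    """Draw one point uniformly from the ellipse of states that could
    improve a path of cost c_best between start and goal (2D only).

    Illustrative sketch: names and signature are assumptions.
    """
    c_min = math.dist(start, goal)  # straight-line lower bound on path cost
    if c_best < c_min:
        raise ValueError("c_best cannot be below the straight-line distance")

    # Uniform sample in the unit disc (sqrt keeps the density uniform in area).
    r = math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    ux, uy = r * math.cos(theta), r * math.sin(theta)

    # Stretch to the ellipse: semi-major axis c_best / 2 along the start-goal
    # line, semi-minor axis sqrt(c_best^2 - c_min^2) / 2 across it.
    ex = (c_best / 2.0) * ux
    ey = (math.sqrt(c_best ** 2 - c_min ** 2) / 2.0) * uy

    # Rotate into the world frame and translate to the ellipse centre.
    phi = math.atan2(goal[1] - start[1], goal[0] - start[0])
    cx, cy = (start[0] + goal[0]) / 2.0, (start[1] + goal[1]) / 2.0
    x = cx + ex * math.cos(phi) - ey * math.sin(phi)
    y = cy + ex * math.sin(phi) + ey * math.cos(phi)
    return (x, y)
```

Every returned point p satisfies dist(start, p) + dist(p, goal) <= c_best, so no samples are spent on states that cannot shorten the current solution. A higher-dimensional version would replace the rotation angle with a rotation matrix, which this sketch leaves out.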
Batch Informed Trees (BIT*): Informed Asymptotically Optimal Anytime Search
Path planning in robotics often requires finding high-quality solutions to
continuously valued and/or high-dimensional problems. These problems are
challenging and most planning algorithms instead solve simplified
approximations. Popular approximations include graphs and random samples, as
respectively used by informed graph-based searches and anytime sampling-based
planners. Informed graph-based searches, such as A*, traditionally use
heuristics to search a priori graphs in order of potential solution quality.
This makes their search efficient but leaves their performance dependent on the
chosen approximation. If its resolution is too low then they may not find a
(suitable) solution but if it is too high then they may take a prohibitively
long time to do so. Anytime sampling-based planners, such as RRT*,
traditionally use random sampling to approximate the problem domain
incrementally. This allows them to increase resolution until a suitable
solution is found but makes their search dependent on the order of
approximation. Arbitrary sequences of random samples approximate the problem
domain in every direction simultaneously but may be prohibitively
inefficient at containing a solution. This paper unifies and extends these two
approaches to develop Batch Informed Trees (BIT*), an informed, anytime
sampling-based planner. BIT* solves continuous path planning problems
efficiently by using sampling and heuristics to alternately approximate and
search the problem domain. Its search is ordered by potential solution quality,
as in A*, and its approximation improves indefinitely with additional
computational time, as in RRT*. It is shown analytically to be almost-surely
asymptotically optimal and experimentally to outperform existing sampling-based
planners, especially on high-dimensional planning problems. Comment: International Journal of Robotics Research (IJRR). 32 Pages. 16
Figures
How widespread are non-linear crowding out effects? The response of private transfers to income in four developing countries
This paper investigates whether there is a non-linear relationship between income and the private transfers received by households in developing countries. If private transfers are unresponsive to household income, expansion of public social security and other transfer programs is unlikely to crowd out private transfers, contrary to concerns first raised by Barro and Becker. There is little existing evidence for crowding out effects in the literature, but this may be because they have been obscured by methods that ignore non-linearities. If donors switch from altruistic motivations to exchange motivations as recipient income increases, a sharp non-linear relationship between private transfers and income may result. In fact, threshold regression techniques find such non-linearity in the Philippines, and after accounting for it there is evidence of serious crowding out, with 30 to 80 percent of private transfers potentially displaced for low-income households [Cox, Hansen and Jimenez 2004, 'How Responsive are Private Transfers to Income?' Journal of Public Economics]. To see if these non-linear effects occur more widely, semiparametric and threshold regression methods are used to model private transfers in four developing countries - China, Indonesia, Papua New Guinea and Vietnam. The results of our paper suggest that non-linear crowding-out effects are not important features of transfer behaviour in these countries. The transfer derivatives under a variety of assumptions only range between 0 and -0.08. If our results are valid, expansions of public social security to cover the poorest households need not be stymied by offsetting private responses.
Better parameter-free anytime search by minimizing time between solutions
This paper presents a new anytime search algorithm, anytime explicit estimation search (AEES). AEES attempts to minimize the time between improvements to its incumbent solution by taking advantage of the differences between solution cost and length. We provide an argument that minimizing the time between solutions is the right thing to do for an anytime search algorithm and show that when actions have differing costs, many state-of-the-art search algorithms, including the search strategy of LAMA11 and anytime nonparametric A*, do not minimize the time between solutions. An empirical evaluation on seven domains shows that AEES often has both the shortest time between incumbent solutions and the best solution in hand for a wide variety of cutoffs.
Game-theoretic statistics and safe anytime-valid inference
Safe anytime-valid inference (SAVI) provides measures of statistical evidence and certainty (e-processes for testing and confidence sequences for estimation) that remain valid at all stopping times, accommodating continuous monitoring and analysis of accumulating data and optional stopping or continuation for any reason. These measures crucially rely on test martingales, which are nonnegative martingales starting at one. Since a test martingale is the wealth process of a player in a betting game, SAVI centrally employs game-theoretic intuition, language and mathematics. We summarize the SAVI goals and philosophy, and report recent advances in testing composite hypotheses and estimating functionals in nonparametric settings.
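The betting interpretation of a test martingale can be made concrete with a small sketch. It assumes the simplest possible setting, which is not taken from the abstract: testing whether a coin is fair, with a bettor who stakes a fixed fraction `lam` of current wealth on heads each round. Under the null the wealth is a nonnegative martingale starting at one, so by Ville's inequality it reaches 1/alpha with probability at most alpha, whatever the stopping rule. The function names are hypothetical.

```python
def wealth_process(xs, lam=0.5):
    """Wealth path of a bettor staking fraction lam on heads (x = 1).

    Under H0: P(x = 1) = 0.5 each factor 1 + lam * (2x - 1) has
    expectation one, so the path is a test martingale.
    """
    if not -1.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between -1 and 1")
    wealth = 1.0
    path = []
    for x in xs:
        wealth *= 1.0 + lam * (2 * x - 1)  # win lam on heads, lose lam on tails
        path.append(wealth)
    return path


def first_rejection(xs, alpha=0.05, lam=0.5):
    """Return the first round (1-indexed) at which wealth reaches 1 / alpha,
    or None if that level is never reached on this data."""
    for t, wealth in enumerate(wealth_process(xs, lam), start=1):
        if wealth >= 1.0 / alpha:
            return t
    return None
```

For example, `wealth_process([1, 1, 0])` gives `[1.5, 2.25, 1.125]`, and eight heads in a row are needed before the wealth first exceeds 20 at `alpha=0.05`. A fixed `lam` is only the simplest choice; adaptive betting fractions also yield test martingales as long as each bet is chosen before the next outcome is seen.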
Stick-Breaking Policy Learning in Dec-POMDPs
Expectation maximization (EM) has recently been shown to be an efficient
algorithm for learning finite-state controllers (FSCs) in large decentralized
POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often
converge to maxima that are far from optimal. This paper considers a
variable-size FSC to represent the local policy of each agent. These
variable-size FSCs are constructed using a stick-breaking prior, leading to a
new framework called \emph{decentralized stick-breaking policy representation}
(Dec-SBPR). This approach learns the controller parameters with a variational
Bayesian algorithm without having to assume that the Dec-POMDP model is
available. The performance of Dec-SBPR is demonstrated on several benchmark
problems, showing that the algorithm scales to large problems while
outperforming other state-of-the-art methods.
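The stick-breaking construction mentioned above can be sketched generically. The abstract does not specify the prior's exact form, so this example assumes the common choice of Beta(1, alpha) break fractions; the function names and the `alpha` parameter are hypothetical. Each weight is the fraction broken off whatever remains of a unit-length stick, which lets the effective number of controller nodes vary instead of being fixed in advance.

```python
import random


def stick_breaking_weights(fractions):
    """Turn break fractions v_1, v_2, ... in [0, 1] into weights
    w_k = v_k * prod_{j < k} (1 - v_j).

    Returns the weights and the length of stick left unassigned.
    """
    weights = []
    remaining = 1.0
    for v in fractions:
        if not 0.0 <= v <= 1.0:
            raise ValueError("each break fraction must lie in [0, 1]")
        weights.append(remaining * v)  # break off a fraction v of what is left
        remaining *= 1.0 - v
    return weights, remaining


def sample_weights(num_nodes, alpha=1.0, rng=None):
    """Sample truncated stick-breaking weights with Beta(1, alpha) fractions
    (an assumed prior, chosen for illustration)."""
    rng = rng or random.Random()
    fractions = [rng.betavariate(1.0, alpha) for _ in range(num_nodes)]
    return stick_breaking_weights(fractions)
```

With fractions `[0.5, 0.5, 0.5]` the weights are `[0.5, 0.25, 0.125]` and `0.125` of the stick is left over. Smaller `alpha` concentrates mass on the first few weights, while larger `alpha` spreads it over more of them.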
Information-based complexity, feedback and dynamics in convex programming
We study the intrinsic limitations of sequential convex optimization through
the lens of feedback information theory. In the oracle model of optimization,
an algorithm queries an {\em oracle} for noisy information about the unknown
objective function, and the goal is to (approximately) minimize every function
in a given class using as few queries as possible. We show that, in order for a
function to be optimized, the algorithm must be able to accumulate enough
information about the objective. This, in turn, puts limits on the speed of
optimization under specific assumptions on the oracle and the type of feedback.
Our techniques are akin to the ones used in statistical literature to obtain
minimax lower bounds on the risks of estimation procedures; the notable
difference is that, unlike in the case of i.i.d. data, a sequential
optimization algorithm can gather observations in a {\em controlled} manner, so
that the amount of information at each step is allowed to change in time. In
particular, we show that optimization algorithms often obey the law of
diminishing returns: the signal-to-noise ratio drops as the optimization
algorithm approaches the optimum. To underscore the generality of the tools, we
use our approach to derive fundamental lower bounds for a certain active
learning problem. Overall, the present work connects the intuitive notions of
information in optimization, experimental design, estimation, and active
learning to the quantitative notion of Shannon information. Comment: final version; to appear in IEEE Transactions on Information Theory