Approximate Nearest-Neighbor Search for Line Segments
Approximate nearest-neighbor search is a fundamental algorithmic problem that
continues to inspire study due to its essential role in numerous contexts. In
contrast to most prior work, which has focused on point sets, we consider
nearest-neighbor queries against a set of line segments in ℝ^d, for
constant dimension d. Given a set S of n disjoint line segments in ℝ^d
and an error parameter ε > 0, the objective is to
build a data structure such that for any query point q, it is possible to
return a line segment whose Euclidean distance from q is at most
(1+ε) times the distance from q to its nearest line segment. We
present a data structure for this problem whose storage and query time are
bounded in terms of n, ε, and the spread Δ of the set of
segments S. Our approach is based on a covering of space by anisotropic
elements, which align themselves according to the orientations of nearby
segments.
Comment: 20 pages (including appendix), 5 figures
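As a concrete baseline, the exact query that such a structure approximates can be computed by brute force: project the query point onto each segment and take the minimum distance. The sketch below (helper names are ours, and this is not the paper's data structure) shows the point-to-segment distance and the quantity against which a (1+ε)-approximate answer is judged.

```python
import numpy as np

def point_segment_distance(q, a, b):
    """Euclidean distance from point q to the segment with endpoints a, b."""
    ab = b - a
    denom = float(ab @ ab)
    # Degenerate segment (a == b): fall back to the point distance.
    t = 0.0 if denom == 0.0 else float(np.clip((q - a) @ ab / denom, 0.0, 1.0))
    return float(np.linalg.norm(q - (a + t * ab)))

def nearest_segment(q, segments):
    """Brute-force exact nearest segment. An approximate structure is
    correct if it returns any segment within (1+eps) of this distance."""
    dists = [point_segment_distance(q, a, b) for a, b in segments]
    i = int(np.argmin(dists))
    return i, dists[i]
```

The projection parameter t is clamped to [0, 1] so that queries beyond a segment's endpoints are measured to the nearest endpoint rather than to the supporting line.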
Efficiently Sampling the PSD Cone with the Metric Dikin Walk
Semi-definite programs represent a frontier of efficient computation. While
there has been much progress on semi-definite optimization, with moderate-sized
instances currently solvable in practice by the interior-point method, the
basic problem of sampling semi-definite solutions remains a formidable
challenge. The direct application of known polynomial-time algorithms for
sampling general convex bodies to semi-definite sampling leads to a
prohibitively high running time. In addition, known general methods require an
expensive rounding phase as pre-processing. Here we analyze the Dikin walk, by
first adapting it to general metrics, then devising suitable metrics for the
PSD cone with affine constraints. The resulting mixing time and per-step
complexity are considerably smaller, and by an appropriate choice of the
metric, the dependence on the number of constraints can be made
polylogarithmic. We introduce a refined notion of self-concordant matrix
functions and give rules for combining different metrics. Along the way, we
further develop the theory of interior-point methods for sampling.
Comment: 54 pages
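A minimal version of the walk can be sketched for a polytope {x : Ax ≤ b} under the standard log-barrier metric (the step radius r and the plain Metropolis filter are simplifying assumptions; the paper's refined metrics for the PSD cone with affine constraints are more involved):

```python
import numpy as np

def dikin_walk(A, b, x0, steps=1000, r=0.4, rng=None):
    """Dikin walk over {x : Ax <= b}: propose from a Gaussian whose
    covariance is the inverse log-barrier Hessian at the current point,
    then apply a Metropolis filter. r = 0.4 is an illustrative choice."""
    rng = np.random.default_rng(rng)

    def hessian(x):
        s = b - A @ x                       # slacks, positive inside
        return A.T @ (A / s[:, None] ** 2)  # H(x) = A^T diag(s)^{-2} A

    def log_q(x, z, H):
        # log density (up to a constant) of N(x, r^2 H^{-1}) at z
        _, logdet = np.linalg.slogdet(H)
        d = z - x
        return 0.5 * logdet - d @ H @ d / (2 * r ** 2)

    x, H = np.asarray(x0, float).copy(), hessian(x0)
    out = []
    for _ in range(steps):
        L = np.linalg.cholesky(H)
        z = x + r * np.linalg.solve(L.T, rng.standard_normal(len(x)))
        if np.all(A @ z < b):               # reject infeasible proposals
            Hz = hessian(z)
            if np.log(rng.random()) < log_q(z, x, Hz) - log_q(x, z, H):
                x, H = z, Hz
        out.append(x.copy())
    return np.array(out)
```

The key point is that the proposal ellipsoid adapts to the local geometry through H(x), which is what removes the need for an expensive rounding phase.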
Importance is Important: A Guide to Informed Importance Tempering Methods
Informed importance tempering (IIT) is an easy-to-implement MCMC algorithm
that can be seen as an extension of the familiar Metropolis-Hastings algorithm
with the special feature that informed proposals are always accepted, and which
was shown in Zhou and Smith (2022) to converge much more quickly in some common
circumstances. This work develops a new, comprehensive guide to the use of IIT
in many situations. First, we propose two IIT schemes that run faster than
existing informed MCMC methods on discrete spaces by not requiring the
posterior evaluation of all neighboring states. Second, we integrate IIT with
other MCMC techniques, including simulated tempering, pseudo-marginal and
multiple-try methods (on general state spaces), which have been conventionally
implemented as Metropolis-Hastings schemes and can suffer from low acceptance
rates. The use of IIT allows us to always accept proposals and brings about new
opportunities for optimizing the sampler which are not possible under the
Metropolis-Hastings framework. Numerical examples illustrating our findings are
provided for each proposed algorithm, and a general theory on the complexity of
IIT methods is developed.
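The always-accept mechanism can be sketched on a discrete space: propose a neighbor with probability proportional to a balancing function h of the posterior ratio, always move, and correct the resulting bias with importance weights. This is a minimal sketch assuming the square-root balancing function; helper names are ours, and the schemes in the text avoid evaluating all neighbors, which this sketch does not.

```python
import numpy as np

def iit_sample(logp, neighbors, x0, steps, rng=None, h=np.sqrt):
    """Informed importance tempering: every informed proposal is
    accepted, and the chain's bias toward high-connectivity states is
    removed by the importance weight 1/Z_h(x), where Z_h(x) sums the
    balancing function h over the neighbors of x."""
    rng = np.random.default_rng(rng)
    x = x0
    states, weights = [], []
    for _ in range(steps):
        nbrs = neighbors(x)
        hv = np.array([h(np.exp(logp(y) - logp(x))) for y in nbrs])
        Z = hv.sum()
        states.append(x)
        weights.append(1.0 / Z)                    # corrects for Z_h(x)
        x = nbrs[rng.choice(len(nbrs), p=hv / Z)]  # always accept
    return np.array(states), np.array(weights)
```

Because h satisfies h(t) = t·h(1/t), the always-accept chain is reversible with respect to π(x)·Z_h(x), so self-normalized averages with weights 1/Z_h(x) target π.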
Condition-number-independent convergence rate of Riemannian Hamiltonian Monte Carlo with numerical integrators
We study the convergence rate of discretized Riemannian Hamiltonian Monte
Carlo (RHMC) on sampling from distributions of the form e^{-f(x)} on a convex
body M ⊂ ℝ^n. We show that for distributions of the
form e^{-α^T x} on a polytope with m constraints, the
convergence rate of a family of commonly-used integrators is independent of
‖α‖ and the geometry of the polytope. In
particular, the implicit midpoint method (IMM) and the generalized Leapfrog
method (LM) have a mixing time polynomial in the dimension and the number of
constraints to achieve ε total variation distance to the target distribution. These
guarantees are based on a general bound on the convergence rate for densities
of the form e^{-f(x)} in terms of parameters of the manifold and the
integrator. Our theoretical guarantee complements the empirical results of
[KLSV22], which show that RHMC with IMM can sample ill-conditioned, non-smooth
and constrained distributions in very high dimension efficiently in practice.
Comment: Improved writing & theory for arXiv:2202.0190
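The implicit midpoint scheme can be illustrated on the simplest case, a Euclidean Hamiltonian H(x, v) = f(x) + |v|²/2, with the implicit update solved by fixed-point iteration. The Euclidean kinetic energy is a deliberate simplification: RHMC replaces it with a position-dependent metric, which changes the vector field but not the integration scheme.

```python
import numpy as np

def implicit_midpoint_step(x, v, grad_f, eta, iters=20):
    """One implicit midpoint (IMM) step of size eta for
    H(x, v) = f(x) + |v|^2 / 2, with the implicit equations
    solved by fixed-point iteration."""
    x_new, v_new = x, v
    for _ in range(iters):
        xm, vm = (x + x_new) / 2, (v + v_new) / 2  # midpoint state
        x_new = x + eta * vm                       # dx/dt =  v
        v_new = v - eta * grad_f(xm)               # dv/dt = -grad f(x)
    return x_new, v_new
```

The midpoint rule preserves quadratic invariants, so for a Gaussian target the Hamiltonian is conserved to near machine precision over many steps, the kind of stability that condition-number-independent guarantees rely on.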
Randomized Control in Performance Analysis and Empirical Asset Pricing
The present article explores the application of randomized control techniques
in empirical asset pricing and performance evaluation. It introduces geometric
random walks, a class of Markov chain Monte Carlo methods, to construct
flexible control groups in the form of random portfolios adhering to investor
constraints. The sampling-based methods enable an exploration of the
relationship between academically studied factor premia and performance in a
practical setting. In an empirical application, the study assesses the
potential to capture premia associated with size, value, quality, and momentum
within a strongly constrained setup, exemplified by the investor guidelines of
the MSCI Diversified Multifactor index. Additionally, the article highlights
issues with the more traditional use case of random portfolios for drawing
inferences in performance evaluation, showcasing challenges related to the
intricacies of high-dimensional geometry.
Comment: 57 pages, 7 figures, 2 tables
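A geometric random walk of the kind used to generate such random portfolios can be sketched with hit-and-run on a simple long-only, fully invested constraint set. The bounds here are illustrative; real investor guidelines such as those of the MSCI index add many more linear constraints, which are handled the same way by shrinking the feasible chord.

```python
import numpy as np

def hit_and_run_portfolios(n, steps, ub=1.0, rng=None):
    """Hit-and-run (a geometric random walk) over the portfolio set
    {w : sum(w) = 1, 0 <= w_i <= ub}. Each step picks a random direction
    inside the budget hyperplane and jumps to a uniform point on the
    feasible chord through the current portfolio."""
    rng = np.random.default_rng(rng)
    w = np.full(n, 1.0 / n)        # feasible start: equal weights
    out = []
    for _ in range(steps):
        d = rng.standard_normal(n)
        d -= d.mean()              # project so that sum(d) = 0
        d /= np.linalg.norm(d)
        # chord limits from 0 <= w + t*d <= ub, coordinate by coordinate
        lo, hi = -np.inf, np.inf
        for wi, di in zip(w, d):
            if di > 0:
                lo, hi = max(lo, -wi / di), min(hi, (ub - wi) / di)
            elif di < 0:
                lo, hi = max(lo, (ub - wi) / di), min(hi, -wi / di)
        w = w + rng.uniform(lo, hi) * d
        out.append(w.copy())
    return np.array(out)
```

Each sampled portfolio respects the budget and box constraints by construction, which is what makes the sample usable as a randomized control group.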