    Approximate Nearest-Neighbor Search for Line Segments

    Approximate nearest-neighbor search is a fundamental algorithmic problem that continues to inspire study due to its essential role in numerous contexts. In contrast to most prior work, which has focused on point sets, we consider nearest-neighbor queries against a set of line segments in $\mathbb{R}^d$, for constant dimension $d$. Given a set $S$ of $n$ disjoint line segments in $\mathbb{R}^d$ and an error parameter $\varepsilon > 0$, the objective is to build a data structure such that for any query point $q$, it is possible to return a line segment whose Euclidean distance from $q$ is at most $(1+\varepsilon)$ times the distance from $q$ to its nearest line segment. We present a data structure for this problem with storage $O((n^2/\varepsilon^{d}) \log (\Delta/\varepsilon))$ and query time $O(\log (\max(n,\Delta)/\varepsilon))$, where $\Delta$ is the spread of the set of segments $S$. Our approach is based on a covering of space by anisotropic elements, which align themselves according to the orientations of nearby segments.

    Comment: 20 pages (including appendix), 5 figures
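    For context, the $(1+\varepsilon)$ guarantee is relative to the exact point-to-segment distance, which a brute-force baseline computes with an $O(n)$ scan per query. The sketch below (Python/NumPy, assuming non-degenerate segments) is that naive exact query, not the paper's data structure; it only shows the geometry that the anisotropic covering is built to approximate.

```python
# Naive exact nearest-segment query: the O(n)-per-query baseline that the
# paper's data structure replaces with an O(log(max(n, Delta)/eps)) lookup.
import numpy as np

def point_segment_distance(q, a, b):
    """Euclidean distance from point q to segment [a, b] in R^d (a != b)."""
    ab = b - a
    t = np.dot(q - a, ab) / np.dot(ab, ab)  # projection parameter
    t = np.clip(t, 0.0, 1.0)                # clamp to the segment
    return np.linalg.norm(q - (a + t * ab))

def nearest_segment(q, segments):
    """Return (index, distance) of the segment closest to q."""
    dists = [point_segment_distance(q, a, b) for a, b in segments]
    i = int(np.argmin(dists))
    return i, dists[i]

# Example: three disjoint segments in R^2.
segs = [(np.array([0.0, 0.0]), np.array([1.0, 0.0])),
        (np.array([2.0, 2.0]), np.array([2.0, 3.0])),
        (np.array([-1.0, 1.0]), np.array([0.0, 2.0]))]
print(nearest_segment(np.array([0.5, 0.4]), segs))
```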

    Efficiently Sampling the PSD Cone with the Metric Dikin Walk

    Semi-definite programs represent a frontier of efficient computation. While there has been much progress on semi-definite optimization, with moderate-sized instances currently solvable in practice by the interior-point method, the basic problem of sampling semi-definite solutions remains a formidable challenge. The direct application of known polynomial-time algorithms for sampling general convex bodies to semi-definite sampling leads to a prohibitively high running time. In addition, known general methods require an expensive rounding phase as pre-processing. Here we analyze the Dikin walk, first adapting it to general metrics and then devising suitable metrics for the PSD cone with affine constraints. The resulting mixing time and per-step complexity are considerably smaller, and by an appropriate choice of the metric, the dependence on the number of constraints can be made polylogarithmic. We introduce a refined notion of self-concordant matrix functions and give rules for combining different metrics. Along the way, we further develop the theory of interior-point methods for sampling.

    Comment: 54 pages
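    To make the starting point concrete, here is a minimal sketch of the classical Dikin walk on a polytope $\{x : Ax \le b\}$ with a uniform target (Python/NumPy). The paper's contribution, adapting the walk to general metrics and devising metrics for the PSD cone, goes well beyond this toy version; the step radius r is an illustrative choice.

```python
# Classical Dikin walk: propose uniformly from the Dikin ellipsoid defined
# by the log-barrier Hessian, then apply a Metropolis volume correction.
import numpy as np

def barrier_hessian(A, b, x):
    """Hessian of the log-barrier -sum_i log(b_i - a_i.x) at interior x."""
    s = b - A @ x                            # positive slacks
    return A.T @ np.diag(1.0 / s**2) @ A

def dikin_step(A, b, x, r=0.4, rng=np.random.default_rng()):
    n = x.size
    H = barrier_hessian(A, b, x)
    L = np.linalg.cholesky(H)
    # Uniform sample from the ellipsoid {z : (z-x)^T H (z-x) <= r^2}.
    u = rng.standard_normal(n)
    u *= rng.uniform() ** (1.0 / n) / np.linalg.norm(u)
    z = x + r * np.linalg.solve(L.T, u)
    if np.any(A @ z >= b):                   # left the polytope: stay put
        return x
    Hz = barrier_hessian(A, b, z)
    d = x - z
    if d @ Hz @ d > r * r:                   # reversibility check
        return x
    # Accept with prob min(1, sqrt(det Hz / det H)): ellipsoid volume ratio.
    ld_x, ld_z = np.linalg.slogdet(H)[1], np.linalg.slogdet(Hz)[1]
    return z if np.log(rng.uniform()) < 0.5 * (ld_z - ld_x) else x

# Example: uniform sampling from the unit square [0, 1]^2.
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([1.0, 1.0, 0.0, 0.0])
x = np.array([0.5, 0.5])
for _ in range(1000):
    x = dikin_step(A, b, x)
print(x)
```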

    Importance is Important: A Guide to Informed Importance Tempering Methods

    Informed importance tempering (IIT) is an easy-to-implement MCMC algorithm that can be seen as an extension of the familiar Metropolis-Hastings algorithm with the special feature that informed proposals are always accepted; it was shown in Zhou and Smith (2022) to converge much more quickly in some common circumstances. This work develops a new, comprehensive guide to the use of IIT in many situations. First, we propose two IIT schemes that run faster than existing informed MCMC methods on discrete spaces by not requiring the posterior evaluation of all neighboring states. Second, we integrate IIT with other MCMC techniques, including simulated tempering, pseudo-marginal and multiple-try methods (on general state spaces), which have conventionally been implemented as Metropolis-Hastings schemes and can suffer from low acceptance rates. The use of IIT allows us to always accept proposals and brings about new opportunities for optimizing the sampler which are not possible under the Metropolis-Hastings framework. Numerical examples illustrating our findings are provided for each proposed algorithm, and a general theory on the complexity of IIT methods is developed.
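    As a concrete reference point, the following is a minimal sketch of basic IIT on the hypercube $\{0,1\}^d$ with single-bit-flip neighborhoods and balancing function $h(t)=\sqrt{t}$ (Python/NumPy). Note that it evaluates the target at all $d$ neighbors in every step, exactly the cost the first two schemes proposed above are designed to avoid.

```python
# Basic informed importance tempering: informed proposals are always
# accepted, and each visited state carries importance weight 1/Z(x).
import numpy as np

def iit_sample(log_pi, d, n_steps, rng=np.random.default_rng()):
    x = rng.integers(0, 2, d)
    states, weights = [], []
    for _ in range(n_steps):
        lp_x = log_pi(x)
        nbrs = np.tile(x, (d, 1))
        nbrs[np.arange(d), np.arange(d)] ^= 1   # all single-bit flips
        # h(pi(y)/pi(x)) = exp(0.5 * (log pi(y) - log pi(x))).
        h = np.exp(0.5 * (np.array([log_pi(y) for y in nbrs]) - lp_x))
        Z = h.sum()
        states.append(x.copy())
        weights.append(1.0 / Z)                 # importance weight of x
        x = nbrs[rng.choice(d, p=h / Z)].copy() # always-accepted move
    return np.array(states), np.array(weights)

# Toy target pi(x) proportional to exp(-(sum(x) - 3)^2) on {0,1}^8;
# a weighted average estimates expectations under pi.
S, W = iit_sample(lambda y: -float((y.sum() - 3) ** 2), d=8, n_steps=2000)
print((W * S.sum(axis=1)).sum() / W.sum())      # estimate of E_pi[sum(x)]
```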

    Condition-number-independent convergence rate of Riemannian Hamiltonian Monte Carlo with numerical integrators

    We study the convergence rate of discretized Riemannian Hamiltonian Monte Carlo on sampling from distributions of the form $e^{-f(x)}$ on a convex body $\mathcal{M}\subset\mathbb{R}^{n}$. We show that for distributions of the form $e^{-\alpha^{\top}x}$ on a polytope with $m$ constraints, the convergence rate of a family of commonly used integrators is independent of $\left\Vert \alpha\right\Vert_{2}$ and the geometry of the polytope. In particular, the implicit midpoint method (IMM) and the generalized Leapfrog method (LM) have a mixing time of $\widetilde{O}(mn^{3})$ to achieve $\epsilon$ total variation distance to the target distribution. These guarantees are based on a general bound on the convergence rate for densities of the form $e^{-f(x)}$ in terms of parameters of the manifold and the integrator. Our theoretical guarantee complements the empirical results of [KLSV22], which show that RHMC with IMM can sample ill-conditioned, non-smooth, and constrained distributions in very high dimension efficiently in practice.

    Comment: Improved writing & theory for arXiv:2202.0190
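    For intuition about the integrators named above, here is a minimal sketch of one implicit midpoint (IMM) step for a generic Hamiltonian flow, resolved by fixed-point iteration (Python/NumPy). The interface `grad_H` and the iteration and tolerance settings are illustrative assumptions; the full RHMC sampler (Riemannian metric terms, momentum refreshment, Metropolis filter) is omitted.

```python
# One implicit midpoint step for x' = dH/dp, p' = -dH/dx: the update is
# defined implicitly at the midpoint and resolved by fixed-point iteration.
import numpy as np

def implicit_midpoint_step(x, p, grad_H, h, max_iter=50, tol=1e-12):
    x_new, p_new = x.copy(), p.copy()
    for _ in range(max_iter):
        xm, pm = 0.5 * (x + x_new), 0.5 * (p + p_new)
        dHdx, dHdp = grad_H(xm, pm)
        x_next, p_next = x + h * dHdp, p - h * dHdx
        done = (np.max(np.abs(x_next - x_new)) < tol and
                np.max(np.abs(p_next - p_new)) < tol)
        x_new, p_new = x_next, p_next
        if done:
            break
    return x_new, p_new

# Example: H = 0.5 * (x.x + p.p). IMM conserves quadratic invariants, so the
# energy stays constant (up to fixed-point tolerance) along the orbit.
x, p = np.array([1.0]), np.array([0.0])
for _ in range(100):
    x, p = implicit_midpoint_step(x, p, lambda a, b: (a, b), h=0.1)
print(x @ x + p @ p)   # stays ~1.0
```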

    Randomized Control in Performance Analysis and Empirical Asset Pricing

    The present article explores the application of randomized control techniques in empirical asset pricing and performance evaluation. It introduces geometric random walks, a class of Markov chain Monte Carlo methods, to construct flexible control groups in the form of random portfolios adhering to investor constraints. These sampling-based methods enable an exploration of the relationship between academically studied factor premia and performance in a practical setting. In an empirical application, the study assesses the potential to capture the premia associated with size, value, quality, and momentum within a strongly constrained setup, exemplified by the investor guidelines of the MSCI Diversified Multifactor index. Additionally, the article highlights issues with the more traditional use case of random portfolios for drawing inferences in performance evaluation, showcasing challenges related to the intricacies of high-dimensional geometry.

    Comment: 57 pages, 7 figures, 2 tables
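    As an illustration of the geometric-random-walk machinery, the sketch below runs hit-and-run over a deliberately simple constraint set: long-only, fully invested portfolios with a per-asset cap (Python/NumPy). Actual investor guidelines, such as the MSCI index rules above, add further linear constraints of the same form, which would enter the chord computation analogously; the 40% cap is an illustrative choice.

```python
# Hit-and-run on {w : w >= 0, sum(w) = 1, w <= ub}: pick a random direction
# inside the budget hyperplane, then move to a uniform point on the feasible
# chord through the current portfolio.
import numpy as np

def hit_and_run(w, ub, n_steps, rng=np.random.default_rng()):
    samples = []
    for _ in range(n_steps):
        d = rng.standard_normal(w.size)
        d -= d.mean()                       # keeps sum(w) = 1
        d /= np.linalg.norm(d)
        lo, hi = -np.inf, np.inf            # feasible range of step size t
        for di, wi, ui in zip(d, w, ub):
            if di > 0:
                lo, hi = max(lo, -wi / di), min(hi, (ui - wi) / di)
            elif di < 0:
                lo, hi = max(lo, (ui - wi) / di), min(hi, -wi / di)
        w = w + rng.uniform(lo, hi) * d
        samples.append(w)
    return np.array(samples)

# Example: 5 assets, 40% position cap, starting from equal weights.
ports = hit_and_run(np.full(5, 0.2), np.full(5, 0.4), 10_000)
print(ports.mean(axis=0), ports.max())      # means ~0.2, max <= 0.4
```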