
    Outlier detection using distributionally robust optimization under the Wasserstein metric

    We present a Distributionally Robust Optimization (DRO) approach to outlier detection in a linear regression setting, where the closeness of probability distributions is measured using the Wasserstein metric. Training samples contaminated with outliers skew the regression plane computed by least squares and thus impede outlier detection. Classical approaches, such as robust regression, remedy this problem by downweighting the contribution of atypical data points. In contrast, our Wasserstein DRO approach hedges against a family of distributions that are close to the empirical distribution. We show that the resulting formulation encompasses a class of models that includes the regularized Least Absolute Deviation (LAD) as a special case. We provide new insights into the regularization term and give guidance on selecting the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions: one concerns its out-of-sample behavior, and the other the discrepancy between the estimated and true regression planes. Extensive numerical results demonstrate the superiority of our approach over both robust regression and the regularized LAD in terms of estimation accuracy and outlier detection rates.
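    The regularized LAD special case mentioned above can be written as a linear program. The sketch below is our own illustration, not the authors' code: it uses an l1 penalty as a stand-in for whichever dual-norm regularizer the DRO derivation actually produces, and solves the resulting LP with SciPy.

```python
import numpy as np
from scipy.optimize import linprog

def regularized_lad(X, y, eps):
    """l1-penalized Least Absolute Deviation (a sketch of the
    regularized-LAD special case):

        min_beta  (1/n) * sum_i |y_i - x_i' beta| + eps * ||beta||_1

    Cast as an LP with decision vector [beta (d), t (n), s (d)],
    where t bounds the residuals and s bounds |beta|."""
    n, d = X.shape
    c = np.concatenate([np.zeros(d), np.ones(n) / n, eps * np.ones(d)])
    # |y - X beta| <= t   <=>   X beta - t <= y  and  -X beta - t <= -y
    A1 = np.hstack([X, -np.eye(n), np.zeros((n, d))])
    A2 = np.hstack([-X, -np.eye(n), np.zeros((n, d))])
    # |beta| <= s   <=>   beta - s <= 0  and  -beta - s <= 0
    A3 = np.hstack([np.eye(d), np.zeros((d, n)), -np.eye(d)])
    A4 = np.hstack([-np.eye(d), np.zeros((d, n)), -np.eye(d)])
    A_ub = np.vstack([A1, A2, A3, A4])
    b_ub = np.concatenate([y, -y, np.zeros(2 * d)])
    bounds = [(None, None)] * d + [(0, None)] * (n + d)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:d]
```

    With a single gross outlier in the response, the LAD fit recovers the clean slope where least squares would be pulled toward the outlier, which is the behavior the abstract contrasts.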

    Geometrical interpretation of fluctuating hydrodynamics in diffusive systems

    We discuss geometric formulations of hydrodynamic limits in diffusive systems. Specifically, we describe a geometrical construction in the space of density profiles --- the Wasserstein geometry --- which allows the deterministic hydrodynamic evolution of the systems to be related to steepest descent of the free energy, and we show how this formulation can be related to most probable paths of mesoscopic dissipative systems. The geometric viewpoint is also linked to the fluctuating hydrodynamics of these systems via a saddle-point argument. Comment: 19 pages
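    The steepest-descent picture can be checked numerically in the simplest case. The sketch below is our own illustration (not from the paper): for pure diffusion on a periodic interval, the hydrodynamic evolution is the heat equation, and the free energy F[rho] = ∫ rho log rho decreases monotonically along the flow, as the Wasserstein gradient-flow formulation predicts.

```python
import numpy as np

dx, dt = 0.05, 0.0005            # grid spacing and (stable) time step
x = np.arange(0.0, 1.0, dx)
rho = 1.0 + 0.5 * np.sin(2 * np.pi * x)   # positive periodic density profile

def free_energy(r):
    # discrete free energy F[rho] = \int rho log rho dx
    return float(np.sum(r * np.log(r)) * dx)

energies = [free_energy(rho)]
for _ in range(200):
    # explicit Euler step of d(rho)/dt = d^2(rho)/dx^2 (periodic Laplacian)
    lap = (np.roll(rho, -1) - 2 * rho + np.roll(rho, 1)) / dx**2
    rho = rho + dt * lap
    energies.append(free_energy(rho))
```

    The recorded free-energy values decrease toward the uniform-profile minimum, which is the deterministic steepest descent the abstract refers to.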

    Geometry Helps to Compare Persistence Diagrams

    Exploiting geometric structure to improve the asymptotic complexity of discrete assignment problems is a well-studied subject. In contrast, the practical advantages of using geometry for such problems have not been explored. We implement geometric variants of the Hopcroft--Karp algorithm for bottleneck matching (based on previous work by Efrat et al.) and of the auction algorithm by Bertsekas for Wasserstein distance computation. Both implementations use k-d trees to replace a linear scan with a geometric proximity query. Our interest in this problem stems from the desire to compute distances between persistence diagrams, a problem that comes up frequently in topological data analysis. We show that our geometric matching algorithms lead to a substantial performance gain, both in running time and in memory consumption, over their purely combinatorial counterparts. Moreover, our implementation significantly outperforms the only other implementation available for comparing persistence diagrams. Comment: 20 pages, 10 figures; extended version of paper published in ALENEX 201
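    For context, the purely combinatorial baseline that the geometric variants accelerate can be sketched compactly. The code below is our own sketch, not the paper's implementation: it computes the 1-Wasserstein distance between two persistence diagrams under the l-infinity ground metric via the Hungarian method, with the standard augmentation that lets each point be matched either to a point of the other diagram or to its projection onto the diagonal.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

BIG = 1e9  # forbids illegal pairings in the assignment

def diag_dist(p):
    # l-infinity distance from a point (birth, death) to the diagonal y = x
    return (p[1] - p[0]) / 2.0

def wasserstein_match(D1, D2):
    """1-Wasserstein distance between persistence diagrams D1, D2
    (arrays of (birth, death) pairs), l-infinity ground metric.
    Combinatorial O(k^3) baseline via the Hungarian method."""
    D1 = np.asarray(D1, float).reshape(-1, 2)
    D2 = np.asarray(D2, float).reshape(-1, 2)
    n, m = len(D1), len(D2)
    cost = np.full((n + m, n + m), BIG)
    # point-to-point costs
    cost[:n, :m] = np.max(np.abs(D1[:, None, :] - D2[None, :, :]), axis=2)
    # each point gets exactly one diagonal slot
    for i in range(n):
        cost[i, m + i] = diag_dist(D1[i])
    for j in range(m):
        cost[n + j, j] = diag_dist(D2[j])
    cost[n:, m:] = 0.0  # diagonal-to-diagonal pairings are free
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].sum())
```

    The paper's contribution is precisely to replace the dense scans implicit in such solvers with k-d tree proximity queries inside bottleneck and auction algorithms.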

    Optimal Switching Synthesis for Jump Linear Systems with Gaussian initial state uncertainty

    This paper provides a method to design an optimal switching sequence for jump linear systems with given Gaussian initial state uncertainty. From a practical perspective, the initial state contains uncertainties arising from measurement errors or sensor inaccuracies, and we assume this uncertainty is Gaussian. To cope with Gaussian initial state uncertainty and to measure the system performance, we use the Wasserstein metric, which defines a distance between probability density functions. Combined with the receding-horizon framework, an optimal switching sequence for jump linear systems is obtained by minimizing an objective function expressed in terms of the Wasserstein distance. The proposed optimal switching synthesis also guarantees mean square stability for jump linear systems. The proposed methods are validated with examples. Comment: ASME Dynamic Systems and Control Conference (DSCC), 201
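    In the Gaussian setting the Wasserstein objective is computable in closed form, which is what makes such formulations tractable. The sketch below is our own illustration (not the paper's algorithm): it evaluates the 2-Wasserstein distance between Gaussians via the standard identity, and greedily picks the jump-linear mode whose one-step pushforward N(Am, ASAᵀ) is closest to a target Gaussian; `best_mode` and its greedy rule are hypothetical simplifications of the receding-horizon optimization.

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2(m1, S1, m2, S2):
    """2-Wasserstein distance between N(m1, S1) and N(m2, S2), using
    the closed form  W2^2 = ||m1 - m2||^2
                          + tr(S1 + S2 - 2 (S2^{1/2} S1 S2^{1/2})^{1/2})."""
    m1, m2 = np.asarray(m1, float), np.asarray(m2, float)
    root = sqrtm(S2)
    cross = sqrtm(root @ S1 @ root)
    w2sq = np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * np.real(cross))
    return float(np.sqrt(max(w2sq, 0.0)))

def best_mode(modes, m, S, m_tgt, S_tgt):
    # hypothetical one-step greedy rule: a Gaussian state N(m, S) maps
    # under mode A to N(A m, A S A'); pick the mode W2-closest to the target
    return min(modes, key=lambda A: gaussian_w2(A @ m, A @ S @ A.T,
                                                m_tgt, S_tgt))
```

    In a receding-horizon scheme this one-step choice would be replaced by minimizing the accumulated Wasserstein cost over the horizon.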

    Bayes and maximum likelihood for $L^1$-Wasserstein deconvolution of Laplace mixtures

    We consider the problem of recovering a distribution function on the real line from observations additively contaminated with errors following the standard Laplace distribution. Assuming that the latent distribution is completely unknown leads to a nonparametric deconvolution problem. We begin by studying the rates of convergence relative to the $L^2$-norm and the Hellinger metric for the direct problem of estimating the sampling density, which is a mixture of Laplace densities with a possibly unbounded set of locations: the rate of convergence for the Bayes' density estimator corresponding to a Dirichlet process prior over the space of all mixing distributions on the real line matches, up to a logarithmic factor, the $n^{-3/8}\log^{1/8}n$ rate for the maximum likelihood estimator. Then, appealing to an inversion inequality that translates the $L^2$-norm and the Hellinger distance between general kernel mixtures, with a kernel density having polynomially decaying Fourier transform, into any $L^p$-Wasserstein distance, $p\geq1$, between the corresponding mixing distributions, provided their Laplace transforms are finite in some neighborhood of zero, we derive the rates of convergence in the $L^1$-Wasserstein metric for the Bayes' and maximum likelihood estimators of the mixing distribution. Merging in the $L^1$-Wasserstein distance between Bayes and maximum likelihood follows as a by-product, along with an assessment of the stochastic order of the discrepancy between the two estimation procedures.
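    On the real line the $L^1$-Wasserstein metric used above has a simple empirical form. The sketch below is our own illustration, not the paper's estimator: it computes the empirical $W_1$ distance between equal-size one-dimensional samples via the quantile coupling, and sets up the paper's observation model of latent draws contaminated with standard Laplace noise.

```python
import numpy as np

def w1_empirical(a, b):
    """Empirical L1-Wasserstein distance between two equal-size 1-D
    samples: under the quantile (monotone) coupling, W1 is the mean
    absolute difference of the sorted values."""
    a = np.sort(np.asarray(a, float))
    b = np.sort(np.asarray(b, float))
    return float(np.mean(np.abs(a - b)))

# the deconvolution observation model: latent draws plus standard
# Laplace errors (latent distribution chosen here only for illustration)
rng = np.random.default_rng(0)
latent = rng.normal(0.0, 1.0, size=2000)
observed = latent + rng.laplace(0.0, 1.0, size=2000)
```

    A deconvolution estimator's quality in the paper's sense is measured by the $W_1$ distance between the estimated mixing distribution and the latent one, not between the observed and latent samples.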