1,261 research outputs found

    Sliced Wasserstein Distance for Learning Gaussian Mixture Models

    Gaussian mixture models (GMM) are powerful parametric tools with many applications in machine learning and computer vision. Expectation maximization (EM) is the most popular algorithm for estimating the GMM parameters. However, EM guarantees only convergence to a stationary point of the log-likelihood function, which could be arbitrarily worse than the optimal solution. Inspired by the relationship between the negative log-likelihood function and the Kullback-Leibler (KL) divergence, we propose an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm. Specifically, we propose minimizing the sliced Wasserstein distance between the mixture model and the data distribution with respect to the GMM parameters. In contrast to the KL divergence, the energy landscape for the sliced Wasserstein distance is better behaved and therefore more suitable for a stochastic gradient descent scheme to obtain the optimal GMM parameters. We show that our formulation results in parameter estimates that are more robust to random initializations and demonstrate that it can estimate high-dimensional data distributions more faithfully than the EM algorithm.
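As a rough illustration of the quantity this abstract refers to, the sketch below estimates a squared sliced 2-Wasserstein distance between two equal-size point clouds by averaging one-dimensional transport costs over random projection directions. It is not the paper's fitting algorithm: the function name `sliced_wasserstein_sq`, the Monte Carlo approximation, and the equal-sample-size restriction are assumptions made only for this example.

```python
import math
import random


def sliced_wasserstein_sq(xs, ys, n_projections=200, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    xs, ys: equal-length lists of d-dimensional points (tuples or lists).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        # Draw a direction uniformly on the unit sphere.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in direction))
        direction = [c / norm for c in direction]
        # Project both clouds onto the direction and sort them: in one
        # dimension, optimal transport pairs the sorted samples.
        px = sorted(sum(a * b for a, b in zip(p, direction)) for p in xs)
        py = sorted(sum(a * b for a, b in zip(p, direction)) for p in ys)
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return total / n_projections
```

In a fitting loop one would presumably draw samples from the current mixture, evaluate a distance of this kind against the data, and update the mixture parameters by stochastic gradient steps; that loop is omitted here.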

    Fourier-Domain Inversion for the Modulo Radon Transform

    Inspired by the multiple-exposure fusion approach in computational photography, several practitioners have recently explored the idea of high dynamic range (HDR) X-ray imaging and tomography. While establishing promising results, these approaches inherit the limitations of the multiple-exposure fusion strategy. To overcome these disadvantages, the modulo Radon transform (MRT) has been proposed. The MRT is based on a co-design of hardware and algorithms. In the hardware step, Radon transform projections are folded using modulo non-linearities. Recovery is then performed by algorithmically inverting the folding, thus enabling a single-shot HDR approach to tomography. The first steps in this topic established a rigorous mathematical treatment of the problem of reconstruction from folded projections. This paper takes a step forward by proposing a new Fourier-domain recovery algorithm that is backed by mathematical guarantees. The advantages include recovery at lower sampling rates while being agnostic to the modulo threshold, lower computational complexity, and empirical robustness to system noise. Beyond numerical simulations, we use prototype modulo-ADC-based hardware experiments to validate our claims. In particular, we report image recovery based on hardware measurements up to 10 times larger than the sensor's dynamic range while benefiting from lower quantization noise ($\sim$12 dB). Comment: 12 pages, submitted for possible publication
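To make the folding step concrete, here is a minimal sketch of a centered modulo non-linearity applied to a one-dimensional signal, together with a simple first-difference unfolding. This is not the Fourier-domain algorithm the abstract proposes: the names `fold` and `unfold`, the threshold symbol `lam`, and the assumption that consecutive true samples differ by less than `lam` are all introduced here for illustration.

```python
import math


def fold(value, lam):
    """Centered modulo: map value into the interval [-lam, lam)."""
    return (value + lam) % (2 * lam) - lam


def unfold(folded, lam):
    """Undo the folding by accumulating wrapped first differences.

    Assumes (for this sketch only) that the first true sample lies in
    [-lam, lam) and that consecutive true samples differ by less than lam,
    so each wrapped difference equals the true difference.
    """
    out = [folded[0]]
    for prev, cur in zip(folded, folded[1:]):
        out.append(out[-1] + fold(cur - prev, lam))
    return out


# A slowly varying signal whose amplitude is three times the threshold.
signal = [3.0 * math.sin(0.1 * i) for i in range(100)]
folded = [fold(v, 1.0) for v in signal]
recovered = unfold(folded, 1.0)
```

The difference-based recovery needs dense sampling relative to the threshold, which is the kind of limitation that more refined recovery algorithms aim to relax.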

    Linear convergence of accelerated conditional gradient algorithms in spaces of measures

    A class of generalized conditional gradient algorithms for the solution of optimization problems in spaces of Radon measures is presented. The method iteratively inserts additional Dirac-delta functions and optimizes the corresponding coefficients. Under general assumptions, a sub-linear $\mathcal{O}(1/k)$ rate in the objective functional is obtained, which is sharp in most cases. To improve efficiency, one can fully resolve the finite-dimensional subproblems occurring in each iteration of the method. We provide an analysis for the resulting procedure: under a structural assumption on the optimal solution, a linear $\mathcal{O}(\zeta^k)$ convergence rate is obtained locally. Comment: 30 pages, 7 figures

    A function space framework for structural total variation regularization with applications in inverse problems

    In this work, we introduce a function space setting for a wide class of structural/weighted total variation (TV) regularization methods motivated by their applications in inverse problems. In particular, we consider a regularizer that is the appropriate lower semi-continuous envelope (relaxation) of a suitable total variation type functional initially defined for sufficiently smooth functions. We study examples where this relaxation can be expressed explicitly, and we also provide refinements for weighted total variation for a wide range of weights. Since an integral characterization of the relaxation in function space is, in general, not always available, we show that, for a rather general linear inverse problems setting, instead of the classical Tikhonov regularization problem, one can equivalently solve a saddle-point problem where no a priori knowledge of an explicit formulation of the structural TV functional is needed. In particular, motivated by concrete applications, we deduce corresponding results for linear inverse problems with norm and Poisson log-likelihood data discrepancy terms. Finally, we provide proof-of-concept numerical examples where we solve the saddle-point problem for weighted TV denoising as well as for MR guided PET image reconstruction.

    Thomas decompositions of parametric nonlinear control systems

    This paper presents an algorithmic method to study how structural properties of nonlinear control systems depend on parameters. The result consists of a description of parameter configurations which cause different control-theoretic behaviour of the system (in terms of observability, flatness, etc.). The constructive symbolic method is based on the differential Thomas decomposition into disjoint simple systems, in particular its elimination properties.

    Strong Asymptotic Assertions for Discrete MDL in Regression and Classification

    We study the properties of the MDL (or maximum penalized complexity) estimator for Regression and Classification, where the underlying model class is countable. We show in particular a finite bound on the Hellinger losses under the only assumption that there is a "true" model contained in the class. This implies almost sure convergence of the predictive distribution to the true one at a fast rate. It corresponds to Solomonoff's central theorem of universal induction, however with a bound that is exponentially larger. Comment: 6 two-column pages
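A toy version of discrete MDL model selection may help fix ideas. The sketch below picks, from a small finite class of Bernoulli models with prior weights, the model that minimizes the two-part codelength (model cost plus data cost). The setting is deliberately simplified: the abstract concerns regression and classification over a countable class, whereas the function `mdl_select`, the Bernoulli models, and the example weights are assumptions made only for this illustration.

```python
import math


def mdl_select(models, data):
    """Return the name of the model with the shortest two-part code.

    models: dict mapping a name to (prior_weight, p), where p in (0, 1)
            is the Bernoulli probability of observing a 1.
    data:   list of 0/1 observations.
    Codelength in bits: -log2(prior_weight) - log2(likelihood of data).
    """
    def codelength(weight, p):
        log_lik = sum(math.log2(p if x == 1 else 1.0 - p) for x in data)
        return -math.log2(weight) - log_lik

    return min(models, key=lambda name: codelength(*models[name]))


# Hypothetical model class: a fair coin and two biased coins.
models = {"fair": (0.5, 0.5), "biased": (0.25, 0.9), "rare": (0.25, 0.1)}
choice_skewed = mdl_select(models, [1] * 9 + [0])
choice_balanced = mdl_select(models, [1, 0])
```

With nine ones and a single zero the likelihood term outweighs the larger model cost of the biased coin, while with very little data the model with the larger prior weight is preferred.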

    Byzantine Approximate Agreement on Graphs

    Consider a distributed system with n processors out of which f can be Byzantine faulty. In the approximate agreement task, each processor i receives an input value x_i and has to decide on an output value y_i such that 1) the output values are in the convex hull of the non-faulty processors' input values, 2) the output values are within distance d of each other. Classically, the values are assumed to be from an m-dimensional Euclidean space, where m >= 1. In this work, we study the task in a discrete setting, where the input values have some structure expressible as a graph. Namely, the input values are vertices of a finite graph G and the goal is to output vertices that are within distance d of each other in G, but still remain in the graph-induced convex hull of the input values. For d=0, the task reduces to consensus and cannot be solved with a deterministic algorithm in an asynchronous system even with a single crash fault. For any d >= 1, we show that the task is solvable in asynchronous systems when G is chordal and n > (omega+1)f, where omega is the clique number of G. In addition, we give the first Byzantine-tolerant algorithm for a variant of lattice agreement. For synchronous systems, we show tight resilience bounds for the exact variants of these and related tasks over a large class of combinatorial structures.