
    Convergence Analysis and Improvements for Projection Algorithms and Splitting Methods

    Non-smooth convex optimization problems occur in all fields of engineering. A common approach to solving this class of problems is proximal algorithms, or splitting methods. These first-order optimization algorithms are often simple, well suited to solving large-scale problems, and have a low computational cost per iteration. Essentially, they encode the solution to an optimization problem as a fixed point of some operator, and iterating this operator eventually results in convergence to an optimal point. However, as for other first-order methods, the convergence rate is heavily dependent on the conditioning of the problem. Even though the per-iteration cost is usually low, the number of iterations can become prohibitively large for ill-conditioned problems, especially if a high-accuracy solution is sought. In this thesis, a few methods for alleviating this slow convergence are studied, which can be divided into two main approaches. The first is a set of heuristic methods that can be applied to a range of fixed-point algorithms; they are based on an understanding of the typical behavior of these algorithms. While these methods are shown to converge, they come with no guarantees of improved convergence rates. The other approach studies the theoretical rates of a class of projection methods that are used to solve convex feasibility problems. These are problems where the goal is to find a point in the intersection of two, or possibly more, convex sets. A study of how the parameters in the algorithm affect the theoretical convergence rate is presented, as well as how they can be chosen to optimize this rate.
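
    To make the fixed-point view concrete, consider the classical method of alternating projections for a two-set convex feasibility problem: a point in the intersection is a fixed point of the composition of the two projection operators. The sketch below is a minimal illustration of this idea; the particular sets, starting point, and tolerance are our choices, not taken from the thesis.

        import numpy as np

        def proj_ball(x, c, r):
            # Projection onto the closed ball of radius r centered at c.
            d = x - c
            n = np.linalg.norm(d)
            return x if n <= r else c + (r / n) * d

        def proj_halfspace(x, a, b):
            # Projection onto the half-space {z : <a, z> <= b}.
            viol = a @ x - b
            return x if viol <= 0 else x - (viol / (a @ a)) * a

        # Iterate T = P_halfspace o P_ball; a fixed point of T lies in the
        # intersection of the two sets (when it is nonempty).
        a, b = np.array([1.0, 1.0]), 1.0
        c, r = np.zeros(2), 2.0
        x = np.array([5.0, -3.0])
        for _ in range(200):
            x_new = proj_halfspace(proj_ball(x, c, r), a, b)
            if np.linalg.norm(x_new - x) < 1e-10:
                break  # numerically a fixed point, hence (close to) feasible
            x = x_new

    How fast such an iteration contracts depends on the geometry of the sets, e.g., the angle at which they intersect, which is the conditioning issue the thesis addresses.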

    Projected Statistical Methods for Distributional Data on the Real Line with the Wasserstein Metric

    We present a novel class of projected methods to perform statistical analysis on a data set of probability distributions on the real line, with the 2-Wasserstein metric. We focus in particular on Principal Component Analysis (PCA) and regression. To define these models, we exploit a representation of the Wasserstein space closely related to its weak Riemannian structure, by mapping the data to a suitable linear space and using a metric projection operator to constrain the results to lie in the Wasserstein space. By carefully choosing the tangent point, we are able to derive fast empirical methods, exploiting a constrained B-spline approximation. As a byproduct of our approach, we are also able to derive faster routines for previous work on PCA for distributions. By means of simulation studies, we compare our approaches to previously proposed methods, showing that our projected PCA has similar performance for a fraction of the computational cost and that the projected regression is extremely flexible even under misspecification. Several theoretical properties of the models are investigated and asymptotic consistency is proven. Two real-world applications to Covid-19 mortality in the US and wind speed forecasting are discussed.
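
    A fact underpinning such methods is that, for distributions on the real line, the 2-Wasserstein distance equals the L2 distance between quantile functions, so the geometry linearizes in "quantile space". The sketch below shows plain PCA on discretized quantile functions; the toy data, grid, and the crude sorting step at the end are our simplifications (the paper's metric projection is more careful, e.g., via its constrained B-spline machinery).

        import numpy as np

        rng = np.random.default_rng(0)
        grid = np.linspace(0.01, 0.99, 99)  # quantile levels in (0, 1)

        # Toy data set: 30 Gaussian samples with random locations and scales.
        data = [rng.normal(loc=m, scale=s, size=500)
                for m, s in zip(rng.normal(0, 1, 30), rng.uniform(0.5, 2, 30))]

        # Each distribution is represented by its quantile function on the grid;
        # W2 between two rows is just the (discretized) L2 distance.
        Q = np.array([np.quantile(d, grid) for d in data])

        # "Linearized" PCA: ordinary PCA on centered quantile functions.
        Qbar = Q.mean(axis=0)
        U, S, Vt = np.linalg.svd(Q - Qbar, full_matrices=False)
        pc1 = Vt[0]  # first principal direction in quantile space

        # A point along the principal direction must remain a valid
        # (nondecreasing) quantile function; sorting is a crude stand-in for
        # the exact metric projection, which is an isotonic regression.
        curve = np.sort(Qbar + 2.0 * pc1 * S[0] / np.sqrt(len(data)))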

    Network dependence

    I am grateful for funding from the Spanish Ministry of Economy and Competitiveness (MDM2014-0431 and ECO2017-86675-P) and the Community of Madrid (MadEco-CM S2015/HUM-3444). Doctoral Program in Economics, Universidad Carlos III de Madrid. Committee: Chair: Wenceslao González Manteiga; Secretary: Carlos Velasco Gómez; Member: Gábor Lugosi.

    The Geometry of Monotone Operator Splitting Methods

    We propose a geometric framework to describe and analyze a wide array of operator splitting methods for solving monotone inclusion problems. The initial inclusion problem, which typically involves several operators combined through monotonicity-preserving operations, is seldom solvable in its original form. We embed it in an auxiliary space, where it is associated with a surrogate monotone inclusion problem that has a more tractable structure and allows for easy recovery of solutions to the initial problem. The surrogate problem is solved by successive projections onto half-spaces containing its solution set. The outer approximation half-spaces are constructed by using the individual operators present in the model separately. This geometric framework is shown to encompass traditional methods as well as state-of-the-art asynchronous block-iterative algorithms, and its flexible structure provides a pattern for designing new ones.
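
    The geometric primitive here is cheap: projecting onto a half-space has a closed form, and for a firmly nonexpansive operator T the half-space {z : <z - Tx, x - Tx> <= 0} is a valid outer approximation of the fixed-point set of T at any point x. A minimal sketch of this construction follows; the operator T (projection onto the unit ball) and the starting point are our illustrative choices, not the paper's framework.

        import numpy as np

        def proj_halfspace(x, a, b):
            # Closed-form projection onto H = {z : <a, z> <= b}.
            viol = a @ x - b
            return x if viol <= 0 else x - (viol / (a @ a)) * a

        def T(x):
            # Projection onto the unit ball: a firmly nonexpansive operator
            # whose fixed-point set is the ball itself.
            n = np.linalg.norm(x)
            return x if n <= 1 else x / n

        # Successive projections onto the outer half-spaces
        # H_k = {z : <z - T(x_k), x_k - T(x_k)> <= 0}, each containing Fix(T).
        x = np.array([3.0, 4.0])
        for _ in range(50):
            tx = T(x)
            a = x - tx
            if np.linalg.norm(a) < 1e-12:
                break  # x is (numerically) a fixed point of T
            x = proj_halfspace(x, a, a @ tx)

    For this simple choice of T, projecting x_k onto H_k reproduces the classical iteration x_{k+1} = T(x_k), one instance of the abstract's claim that the framework encompasses traditional methods.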

    Non-Markovian Quantum Process Tomography

    Characterisation protocols have so far played a central role in the development of noisy intermediate-scale quantum (NISQ) computers capable of impressive quantum feats. This trajectory is expected to continue in building the next generation of devices: ones that can surpass classical computers for particular tasks -- but progress in characterisation must keep up with the complexities of intricate device noise. A missing piece in the zoo of characterisation procedures is a tomography protocol that can completely describe non-Markovian dynamics. Here, we formally introduce a generalisation of quantum process tomography, which we call process tensor tomography. We detail the experimental requirements, construct the necessary post-processing algorithms for maximum-likelihood estimation, outline the best-practice aspects for accurate results, and make the procedure efficient for low-memory processes. This characterisation is the pathway to diagnostics and informed control of correlated noise. As an example application of the technique, we improve multi-time circuit fidelities on IBM Quantum devices, both for standalone qubits and in the presence of crosstalk, to a level comparable with the fault-tolerant noise threshold under a variety of noise conditions. Our methods could form the core of carefully developed software that may help hardware consistently pass the fault-tolerant noise threshold.
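
    The maximum-likelihood post-processing mentioned above generalises the estimators used in ordinary state tomography. As a much simpler stand-in, the sketch below runs the classic R*rho*R fixed-point iteration for maximum-likelihood single-qubit state tomography; the measurement set, simulated counts, and iteration budget are our choices and merely hint at the structure of the paper's process-tensor estimator.

        import numpy as np

        def projector(v):
            v = v / np.linalg.norm(v)
            return np.outer(v, v.conj())

        # The six Pauli eigenprojectors: outcomes of X, Y and Z measurements.
        kets = [np.array([1, 1]), np.array([1, -1]),    # X eigenstates
                np.array([1, 1j]), np.array([1, -1j]),  # Y eigenstates
                np.array([1, 0]), np.array([0, 1])]     # Z eigenstates
        Pi = [projector(np.asarray(k, dtype=complex)) for k in kets]

        # Simulate measurement frequencies from a "true" state, N shots per basis.
        rho_true = projector(np.array([1.0, 0.4 + 0.2j]))
        rng = np.random.default_rng(1)
        N = 10_000
        freqs = []
        for j in (0, 2, 4):
            p = np.real(np.trace(Pi[j] @ rho_true))
            n_up = rng.binomial(N, p)
            freqs += [n_up / N, (N - n_up) / N]

        # R*rho*R iteration: the maximum-likelihood state is a fixed point.
        rho = np.eye(2, dtype=complex) / 2
        for _ in range(500):
            R = sum(f / max(np.real(np.trace(P @ rho)), 1e-12) * P
                    for f, P in zip(freqs, Pi))
            rho = R @ rho @ R
            rho /= np.trace(rho)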

    Metric and Representation Learning

    All data has some inherent mathematical structure. I am interested in understanding the intrinsic geometric and probabilistic structure of data in order to design effective algorithms and tools that can be applied to machine learning and across all branches of science. The focus of this thesis is to increase the effectiveness of machine learning techniques by developing a mathematical and algorithmic framework with which, given any type of data, we can learn an optimal representation. Representation learning is done for many reasons: to repair corrupted data, to learn a low-dimensional or simpler representation of high-dimensional or very complex data, or because the current representation of the data does not capture its important geometric features. One of the many challenges in representation learning is determining ways to judge the quality of the representation learned. In many cases, the consensus is that if d is the natural metric on the representation, then this metric should provide meaningful information about the data. Many examples of this can be seen in areas such as metric learning, manifold learning, and graph embedding. However, most algorithms that solve these problems learn a representation in a metric space first and then extract a metric. A large part of my research explores what happens if the order is switched, that is, if we learn the appropriate metric first and the embedding later. The philosophy behind this approach is that understanding the inherent geometry of the data is the most crucial part of representation learning. Often, studying the properties of the appropriate metric on the input data sets indicates the type of space we should be seeking for the representation, giving us more robust representations. Optimizing for the appropriate metric can also help overcome issues such as missing and noisy data. My projects fall into three different areas of representation learning: (1) geometric and probabilistic analysis of representation learning methods; (2) methods to learn optimal metrics on large datasets; (3) applications. In the first category we have three projects: designing optimal training data for denoising autoencoders; formulating a new optimal transport problem and understanding its geometric structure; and analyzing the robustness to perturbations of the solutions obtained from the classical multidimensional scaling algorithm versus that of the true solutions to the multidimensional scaling problem. For learning an optimal metric, we are given a dissimilarity matrix D̂, a function f, and a subset S of the space of all metrics, and we want to find D ∈ S that minimizes f(D, D̂). In this thesis, we consider the version of the problem in which S is the space of metrics defined on a fixed graph: given a graph G, we let S be the space of all metrics defined via G. For this S, we consider a sparse objective function as well as convex objective functions. We also study the problem of learning a tree, and we show how the ideas behind learning the optimal metric can be applied to dimensionality reduction in the presence of missing data. Finally, we look at an application to real-world data, specifically the reconstruction of ancient Greek text.
    PhD, Applied and Interdisciplinary Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/169738/1/rsonthal_1.pd
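
    One of the objects analyzed in the first project area, classical multidimensional scaling (MDS), is compact enough to state in full. The sketch below is the textbook algorithm (double centering followed by a truncated eigendecomposition), included for orientation; it is not the thesis's perturbation analysis.

        import numpy as np

        def classical_mds(D, k):
            # Embed n points in R^k given an n x n matrix of pairwise distances.
            n = D.shape[0]
            J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
            B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
            w, V = np.linalg.eigh(B)
            idx = np.argsort(w)[::-1][:k]         # top-k eigenpairs
            return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

        # Toy check: planar points are recovered up to rotation and translation.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(20, 2))
        D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
        Y = classical_mds(D, 2)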

    Path following in the exact penalty method of convex programming

    Classical penalty methods solve a sequence of unconstrained problems that put greater and greater stress on meeting the constraints. In the limit as the penalty constant tends to ∞, one recovers the constrained solution. In the exact penalty method, squared penalties are replaced by absolute value penalties, and the solution is recovered for a finite value of the penalty constant. In practice, the kinks in the penalty and the unknown magnitude of the penalty constant prevent wide application of the exact penalty method in nonlinear programming. In this article, we examine a strategy of path following consistent with the exact penalty method. Instead of performing optimization at a single penalty constant, we trace the solution as a continuous function of the penalty constant. Thus, path following starts at the unconstrained solution and follows the solution path as the penalty constant increases. In the process, the solution path hits, slides along, and exits from the various constraints. For quadratic programming, the solution path is piecewise linear and takes large jumps from constraint to constraint. For a general convex program, the solution path is piecewise smooth, and path following operates by numerically solving an ordinary differential equation segment by segment. Our diverse applications to (a) projection onto a convex set, (b) nonnegative least squares, (c) quadratically constrained quadratic programming, (d) geometric programming, and (e) semidefinite programming illustrate the mechanics and potential of path following. The final detour to image denoising demonstrates the relevance of path following to regularized estimation in inverse problems. In regularized estimation, one follows the solution path as the penalty constant decreases from a large value.
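
    To see why the exact penalty recovers the constrained solution at a finite penalty constant, consider the toy problem of minimizing 0.5*(x - a)^2 subject to x >= 0 with a < 0. The penalized objective is 0.5*(x - a)^2 + rho*max(0, -x), and elementary subgradient calculus gives the closed-form path x(rho) = min(a + rho, 0). The snippet below (our illustration, not the article's ODE-based path following) traces this path.

        import numpy as np

        a = -2.0  # unconstrained minimizer; violates x >= 0

        def x_of_rho(rho):
            # Solution path of 0.5*(x - a)^2 + rho*max(0, -x): linear in rho
            # until it hits the constraint at rho = -a, then it stays at the
            # constrained solution x = 0.
            return min(a + rho, 0.0)

        for rho in np.linspace(0.0, 3.0, 7):
            print(f"rho = {rho:4.1f}   x(rho) = {x_of_rho(rho):5.2f}")
        # x(rho) rises from -2.0 and locks at 0.0 once rho >= 2.0: the
        # constrained solution is attained at a finite penalty constant, and
        # the path is piecewise linear, as the article describes for
        # quadratic programs.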

    Statistical learning of random probability measures

    The study of random probability measures is a lively research topic that has attracted interest from different fields in recent years. In this thesis, we consider random probability measures in the context of Bayesian nonparametrics, where the law of a random probability measure is used as a prior distribution, and in the context of distributional data analysis, where the goal is to perform inference given a sample from the law of a random probability measure. The contributions contained in this thesis can be subdivided according to three different topics: (i) the use of almost surely discrete repulsive random measures (i.e., measures whose support points are well separated) for Bayesian model-based clustering, (ii) the proposal of new laws for collections of random probability measures for Bayesian density estimation of partially exchangeable data subdivided into different groups, and (iii) the study of principal component analysis and regression models for probability distributions seen as elements of the 2-Wasserstein space. Specifically, for point (i) we propose an efficient Markov chain Monte Carlo algorithm for posterior inference, which sidesteps the need for the split-merge reversible-jump moves typically associated with poor performance; we propose a model for clustering high-dimensional data by introducing a novel class of anisotropic determinantal point processes; and we study the distributional properties of the repulsive measures, shedding light on important theoretical results that enable more principled prior elicitation and more efficient posterior simulation algorithms. For point (ii), we consider several models suitable for clustering homogeneous populations, inducing spatial dependence across groups of data, and extracting the characteristic traits common to all the data groups, and we propose a novel vector autoregressive model to study the growth curves of Singaporean children. Finally, for point (iii), we propose a novel class of projected statistical methods for distributional data analysis for measures on the real line and on the unit circle.
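
    For the 2-Wasserstein setting of point (iii), one more one-dimensional convenience is worth recording: the Wasserstein barycenter of a set of distributions on the real line is the distribution whose quantile function is the (weighted) average of their quantile functions. The sketch below computes it for toy gamma samples; the data and grid are our choices, not the thesis's applications.

        import numpy as np

        rng = np.random.default_rng(0)
        grid = np.linspace(0.005, 0.995, 199)  # quantile levels in (0, 1)

        # Three toy distributions, each represented by an empirical sample.
        samples = [rng.gamma(shape=k, size=1000) for k in (1.0, 2.0, 4.0)]

        # Quantile functions evaluated on the grid, one row per distribution.
        Q = np.array([np.quantile(s, grid) for s in samples])

        # The 2-Wasserstein barycenter on the real line: average the quantile
        # functions pointwise (uniform weights here).
        bary_quantiles = Q.mean(axis=0)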