43 research outputs found

    Distributionally Robust Optimization: A Review

    The concepts of risk-aversion, chance-constrained optimization, and robust optimization have developed significantly over the last decade. The statistical learning community has also witnessed rapid theoretical and applied growth by relying on these concepts. A modeling framework called distributionally robust optimization (DRO) has recently received significant attention in both the operations research and statistical learning communities. This paper surveys the main concepts of and contributions to DRO, and its relationships with robust optimization, risk-aversion, chance-constrained optimization, and function regularization.

    Uncertainty Relations for Angular Momentum

    In this work we study various notions of uncertainty for angular momentum in the spin-s representation of SU(2). We characterize the "uncertainty regions" given by all vectors whose components are specified by the variances of the three angular momentum components. A basic feature of this set is a lower bound for the sum of the three variances. We give a method for obtaining optimal lower bounds for uncertainty regions for general operator triples, and evaluate these for small s. Further lower bounds are derived by generalizing the technique by which Robertson obtained his state-dependent lower bound. These are optimal for large s, since they are saturated by states taken from the Holstein-Primakoff approximation. We show that, for all s, all variances are consistent with the so-called vector model, i.e., they can also be realized by a classical probability measure on a sphere of radius sqrt(s(s+1)). Entropic uncertainty relations can be discussed similarly, but are minimized by different states than those minimizing the variances for small s. For large s the Maassen-Uffink bound becomes sharp and we explicitly describe the extremalizing states. Measurement uncertainty, as recently discussed by Busch, Lahti and Werner for position and momentum, is introduced, and a generalized observable (POVM) which minimizes the worst-case measurement uncertainty of all angular momentum components is explicitly determined, along with the minimal uncertainty. The output vectors for the optimal measurement all have the same length r(s), where r(s)/s goes to 1 as s tends to infinity. Comment: 30 pages, 22 figures, 1 cut-out paper model, video abstract available on https://youtu.be/h01pHekcwF
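
The lower bound on the sum of the three variances can be made concrete with a standard textbook identity. The following is a sketch of that argument, not a reproduction of the paper's derivation; the symbols J_x, J_y, J_z for the angular momentum components and the convention ħ = 1 are assumptions of this note.

```latex
\begin{align}
\Delta^2 J_x + \Delta^2 J_y + \Delta^2 J_z
  &= \langle J_x^2 + J_y^2 + J_z^2 \rangle - \sum_{k \in \{x,y,z\}} \langle J_k \rangle^2 \\
  &= s(s+1) - \lvert \langle \vec{J} \rangle \rvert^2 \\
  &\ge s(s+1) - s^2 = s .
\end{align}
```

The inequality uses the fact that the component of the angular momentum along any unit direction has largest eigenvalue s, so the length of the expectation vector is at most s.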

    Classical and quantum algorithms for scaling problems

    This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases. We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature. For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers. We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.
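
For readers unfamiliar with the (commutative) matrix scaling problem named in this abstract, the classical Sinkhorn iteration gives a feel for it: alternately normalize rows and columns of an entrywise positive matrix until both have unit sums. This is a minimal sketch of that well-known classical baseline, not one of the thesis's interior-point or quantum algorithms; the function name `sinkhorn_scale` and the fixed iteration count are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def sinkhorn_scale(a: Matrix, iterations: int = 200) -> Matrix:
    """Alternately normalize rows and columns of an entrywise positive square matrix."""
    n = len(a)
    b = [row[:] for row in a]  # work on a copy so the input is left untouched
    for _ in range(iterations):
        # Row step: divide every row by its sum.
        for i in range(n):
            row_sum = sum(b[i])
            b[i] = [v / row_sum for v in b[i]]
        # Column step: divide every column by its sum.
        for j in range(n):
            col_sum = sum(b[i][j] for i in range(n))
            for i in range(n):
                b[i][j] /= col_sum
    return b


scaled = sinkhorn_scale([[1.0, 2.0], [3.0, 4.0]])
row_sums = [sum(row) for row in scaled]
col_sums = [sum(scaled[i][j] for i in range(2)) for j in range(2)]
```

For a strictly positive input the iteration converges to a doubly stochastic matrix, so both `row_sums` and `col_sums` end up numerically close to 1.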

    Data-driven Inverse Optimization with Imperfect Information

    In data-driven inverse optimization an observer aims to learn the preferences of an agent who solves a parametric optimization problem depending on an exogenous signal. Thus, the observer seeks the agent's objective function that best explains a historical sequence of signals and corresponding optimal actions. We focus here on situations where the observer has imperfect information, that is, where the agent's true objective function is not contained in the search space of candidate objectives, where the agent suffers from bounded rationality or implementation errors, or where the observed signal-response pairs are corrupted by measurement noise. We formalize this inverse optimization problem as a distributionally robust program minimizing the worst-case risk that the predicted decision (i.e., the decision implied by a particular candidate objective) differs from the agent's actual response to a random signal. We show that our framework offers rigorous out-of-sample guarantees for different loss functions used to measure prediction errors and that the emerging inverse optimization problems can be exactly reformulated as (or safely approximated by) tractable convex programs when a new suboptimality loss function is used. We show through extensive numerical tests that the proposed distributionally robust approach to inverse optimization often attains better out-of-sample performance than the state-of-the-art approaches.

    Exchangeable equilibria

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 183-188). The main contribution of this thesis is a new solution concept for symmetric games (of complete information in strategic form), the exchangeable equilibrium. This is an intermediate notion between symmetric Nash and symmetric correlated equilibrium. While a variety of weaker solution concepts than correlated equilibrium and a variety of refinements of Nash equilibrium are known, there is little previous work on "interpolating" between Nash and correlated equilibrium. Several game-theoretic interpretations suggest that exchangeable equilibria are natural objects to study. Moreover, these show that the notion of symmetric correlated equilibrium is too weak and exchangeable equilibrium is a more natural analog of correlated equilibrium for symmetric games. The geometric properties of exchangeable equilibria are a mix of those of Nash and correlated equilibria. The set of exchangeable equilibria is convex, compact, and semi-algebraic, but not necessarily a polytope. A variety of examples illustrate how it relates to the Nash and correlated equilibria. The same ideas that lead to the notion of exchangeable equilibria can be used to construct tighter convex relaxations of the symmetric Nash equilibria as well as convex relaxations of the set of all Nash equilibria in asymmetric games. These have similar mathematical properties to the exchangeable equilibria. An example game reveals an algebraic obstruction to computing exact exchangeable equilibria, but these can be approximated to any degree of accuracy in polynomial time. On the other hand, optimizing a linear function over the exchangeable equilibria is NP-hard. There are practical linear and semidefinite programming heuristics for both problems.
A secondary contribution of this thesis is the computation of extreme points of the set of correlated equilibria in a simple family of games. These examples illustrate that in finite games there can be factorially many more extreme correlated equilibria than extreme Nash equilibria, so enumerating extreme correlated equilibria is not an effective method for enumerating extreme Nash equilibria. In the case of games with a continuum of strategies and polynomial utilities, the examples illustrate that while the set of Nash equilibria has a known finite-dimensional description in terms of moments, the set of correlated equilibria admits no such finite-dimensional characterization. by Noah D. Stein. Ph.D.

    Optimal Convex and Nonconvex Regularizers for a Data Source

    In optimization-based approaches to inverse problems and to statistical estimation, it is common to augment the objective with a regularizer to address challenges associated with ill-posedness. The choice of a suitable regularizer is typically driven by prior domain information and computational considerations. Convex regularizers are attractive as they are endowed with certificates of optimality as well as the toolkit of convex analysis, but exhibit a computational scaling that makes them ill-suited beyond moderate-sized problem instances. On the other hand, nonconvex regularizers can often be deployed at scale, but do not enjoy the certification properties associated with convex regularizers. In this paper, we seek a systematic understanding of the power and the limitations of convex regularization by investigating the following questions: Given a distribution, what are the optimal regularizers, both convex and nonconvex, for data drawn from the distribution? What properties of a data source govern whether it is amenable to convex regularization? We address these questions for the class of continuous and positively homogeneous regularizers for which convex and nonconvex regularizers correspond, respectively, to convex bodies and star bodies. By leveraging dual Brunn-Minkowski theory, we show that a radial function derived from a data distribution is the key quantity for identifying optimal regularizers and for assessing the amenability of a data source to convex regularization. Using tools such as Γ-convergence, we show that our results are robust in the sense that the optimal regularizers for a sample drawn from a distribution converge to their population counterparts as the sample size grows large. Finally, we give generalization guarantees that recover previous results for polyhedral regularizers (i.e., dictionary learning) and lead to new ones for semidefinite regularizers.

    International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book

    The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions. This book comprises the full conference program. It contains, in particular, the scientific program both in survey style and in full detail, and information on the social program, the venue, special meetings, and more.

    Metric and Representation Learning

    All data has some inherent mathematical structure. I am interested in understanding the intrinsic geometric and probabilistic structure of data to design effective algorithms and tools that can be applied to machine learning and across all branches of science. The focus of this thesis is to increase the effectiveness of machine learning techniques by developing a mathematical and algorithmic framework with which, given any type of data, we can learn an optimal representation. Representation learning is done for many reasons. It could be done to repair corrupted data, or to learn a low-dimensional or simpler representation given high-dimensional data or a very complex representation of the data. It could also be that the current representation of the data does not capture the important geometric features of the data. One of the many challenges in representation learning is determining ways to judge the quality of the representation learned. In many cases, the consensus is that if d is the natural metric on the representation, then this metric should provide meaningful information about the data. Many examples of this can be seen in areas such as metric learning, manifold learning, and graph embedding. However, most algorithms that solve these problems learn a representation in a metric space first and then extract a metric. A large part of my research is exploring what happens if the order is switched, that is, learn the appropriate metric first and the embedding later. The philosophy behind this approach is that understanding the inherent geometry of the data is the most crucial part of representation learning. Often, studying the properties of the appropriate metric on the input data sets indicates the type of space we should seek for the representation, hence giving us more robust representations. Optimizing for the appropriate metric can also help overcome issues such as missing and noisy data.
My projects fall into three different areas of representation learning: 1) geometric and probabilistic analysis of representation learning methods; 2) developing methods to learn optimal metrics on large datasets; 3) applications. For the category of geometric and probabilistic analysis of representation learning methods, we have three projects. First, designing optimal training data for denoising autoencoders. Second, formulating a new optimal transport problem and understanding the geometric structure. Third, analyzing the robustness to perturbations of the solutions obtained from the classical multidimensional scaling algorithm versus that of the true solutions to the multidimensional scaling problem. For learning optimal metrics, we are given a dissimilarity matrix $\hat{D}$, some function $f$, and some subset $S$ of the space of all metrics, and we want to find $D \in S$ that minimizes $f(D, \hat{D})$. In this thesis, we consider the version of the problem where $S$ is the space of metrics defined on a fixed graph. That is, given a graph $G$, we let $S$ be the space of all metrics defined via $G$. For this $S$, we consider the sparse objective function as well as convex objective functions. We also look at the problem where we want to learn a tree. We also show how the ideas behind learning the optimal metric can be applied to dimensionality reduction in the presence of missing data. Finally, we look at an application to real-world data, specifically trying to reconstruct ancient Greek text. PhD, Applied and Interdisciplinary Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/169738/1/rsonthal_1.pd
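
To illustrate what it means to turn a dissimilarity matrix into a metric, here is a minimal sketch of one simple baseline: replace each entry by the shortest-path distance in the complete weighted graph that the matrix defines. This is not the method of the thesis, which optimizes an objective over a chosen set of metrics; the function names `shortest_path_metric` and `satisfies_triangle_inequality` and the small example matrix are hypothetical and chosen only for illustration. The sketch assumes a symmetric, nonnegative input with a zero diagonal.

```python
from typing import List

Matrix = List[List[float]]


def shortest_path_metric(d_hat: Matrix) -> Matrix:
    """Floyd-Warshall closure of a symmetric, nonnegative dissimilarity matrix."""
    n = len(d_hat)
    d = [row[:] for row in d_hat]  # copy so the input is not modified
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Shorten i -> j whenever the detour through k is cheaper.
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def satisfies_triangle_inequality(d: Matrix, tol: float = 1e-12) -> bool:
    """Check d[i][j] <= d[i][k] + d[k][j] for every triple of indices."""
    n = len(d)
    return all(
        d[i][j] <= d[i][k] + d[k][j] + tol
        for i in range(n)
        for j in range(n)
        for k in range(n)
    )


# Hypothetical dissimilarities: the entry 5.0 violates the triangle inequality.
d_hat = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 1.0],
    [5.0, 1.0, 0.0],
]
d = shortest_path_metric(d_hat)
```

The closure never increases an entry and always satisfies the triangle inequality, so it yields the entrywise largest such matrix below the input. It makes no attempt to minimize a chosen objective $f(D, \hat{D})$, which is the harder problem the abstract describes.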