Distributionally Robust Optimization: A Review
The concepts of risk-aversion, chance-constrained optimization, and robust
optimization have developed significantly over the last decade. The statistical
learning community has also witnessed rapid theoretical and applied growth by
relying on these concepts. A modeling framework, called distributionally robust
optimization (DRO), has recently received significant attention in both the
operations research and statistical learning communities. This paper surveys
the main concepts and contributions to DRO, and its relationships with robust
optimization, risk-aversion, chance-constrained optimization, and function
regularization.
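In generic notation (not taken from the survey itself), a DRO problem minimizes the worst-case expected loss over an ambiguity set \(\mathcal{U}\) of distributions for the uncertain parameter \(\xi\):

```latex
\min_{x \in \mathcal{X}} \; \sup_{\mathbb{Q} \in \mathcal{U}} \; \mathbb{E}_{\mathbb{Q}}\!\left[\ell(x, \xi)\right]
```

Choosing \(\mathcal{U}\) as a singleton recovers classical stochastic programming, letting it contain all distributions on a given support set recovers robust optimization, and taking \(\mathcal{U}\) to be a ball around an empirical distribution yields the connections to regularization mentioned above.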
Uncertainty Relations for Angular Momentum
In this work we study various notions of uncertainty for angular momentum in
the spin-s representation of SU(2). We characterize the "uncertainty regions"
given by all vectors whose components are specified by the variances of the
three angular momentum components. A basic feature of this set is a lower bound
for the sum of the three variances. We give a method for obtaining optimal
lower bounds for uncertainty regions for general operator triples, and evaluate
these for small s. Further lower bounds are derived by generalizing the
technique by which Robertson obtained his state-dependent lower bound. These
are optimal for large s, since they are saturated by states taken from the
Holstein-Primakoff approximation. We show that, for all s, all variances are
consistent with the so-called vector model, i.e., they can also be realized by
a classical probability measure on a sphere of radius sqrt(s(s+1)). Entropic
uncertainty relations can be discussed similarly, but are minimized by
different states than those minimizing the variances for small s. For large s
the Maassen-Uffink bound becomes sharp and we explicitly describe the
extremalizing states. Measurement uncertainty, as recently discussed by Busch,
Lahti and Werner for position and momentum, is introduced and a generalized
observable (POVM) which minimizes the worst case measurement uncertainty of all
angular momentum components is explicitly determined, along with the minimal
uncertainty. The output vectors for the optimal measurement all have the same
length r(s), where r(s)/s goes to 1 as s tends to infinity.
Comment: 30 pages, 22 figures, 1 cut-out paper model, video abstract available
on https://youtu.be/h01pHekcwF
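A minimal numerical check of the variance bound (a sketch with standard conventions, hbar = 1; not code from the paper): the Casimir identity forces the three second moments to sum to s(s+1), so the variance sum equals s(s+1) - |<J>|^2 >= s(s+1) - s^2 = s for every spin-s state. For s = 1:

```python
import numpy as np

def spin_ops(s):
    """Standard spin-s angular momentum matrices (hbar = 1)."""
    m = np.arange(s, -s - 1, -1.0)              # m = s, s-1, ..., -s
    Jz = np.diag(m)
    # Ladder operator J+: |s,m> -> sqrt(s(s+1) - m(m+1)) |s,m+1>
    c = np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1))
    Jp = np.diag(c, 1)
    Jx = (Jp + Jp.T) / 2
    Jy = (Jp - Jp.T) / 2j
    return Jx, Jy, Jz

def variance_sum(psi, ops):
    """Sum of the variances of the three components in state psi."""
    psi = psi / np.linalg.norm(psi)
    total = 0.0
    for J in ops:
        mean = np.vdot(psi, J @ psi).real
        total += np.vdot(psi, J @ (J @ psi)).real - mean ** 2
    return total

s = 1
ops = spin_ops(s)
rng = np.random.default_rng(0)
for _ in range(100):                             # random pure states
    psi = rng.normal(size=2 * s + 1) + 1j * rng.normal(size=2 * s + 1)
    assert variance_sum(psi, ops) >= s - 1e-9    # elementary lower bound s
```

Equality in this elementary bound is attained by the eigenstates |s, ±s>; the paper's optimal bounds are obtained differently and this argument does not reproduce them in general.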
Classical and quantum algorithms for scaling problems
This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases.
We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature.
For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers.
We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.
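For orientation, the classical algorithm for the commutative matrix scaling problem is Sinkhorn iteration (a minimal sketch; the thesis's contributions concern quantum and interior-point methods, not this routine): alternately rescale rows and columns of a positive matrix until it is nearly doubly stochastic.

```python
import numpy as np

def sinkhorn(A, iters=500):
    """Alternately normalize rows and columns of a positive matrix;
    the iterates converge to a doubly stochastic scaling of A."""
    A = np.array(A, dtype=float)
    for _ in range(iters):
        A /= A.sum(axis=1, keepdims=True)   # make row sums 1
        A /= A.sum(axis=0, keepdims=True)   # make column sums 1
    return A

B = sinkhorn([[1.0, 2.0], [3.0, 4.0]])
# Row and column sums of B are now 1 up to numerical tolerance.
```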
Data-driven Inverse Optimization with Imperfect Information
In data-driven inverse optimization an observer aims to learn the preferences
of an agent who solves a parametric optimization problem depending on an
exogenous signal. Thus, the observer seeks the agent's objective function that
best explains a historical sequence of signals and corresponding optimal
actions. We focus here on situations where the observer has imperfect
information, that is, where the agent's true objective function is not
contained in the search space of candidate objectives, where the agent suffers
from bounded rationality or implementation errors, or where the observed
signal-response pairs are corrupted by measurement noise. We formalize this
inverse optimization problem as a distributionally robust program minimizing
the worst-case risk that the {\em predicted} decision ({\em i.e.}, the decision
implied by a particular candidate objective) differs from the agent's {\em
actual} response to a random signal. We show that our framework offers rigorous
out-of-sample guarantees for different loss functions used to measure
prediction errors and that the emerging inverse optimization problems can be
exactly reformulated as (or safely approximated by) tractable convex programs
when a new suboptimality loss function is used. We show through extensive
numerical tests that the proposed distributionally robust approach to inverse
optimization often attains better out-of-sample performance than
state-of-the-art approaches.
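The suboptimality loss can be illustrated in a toy numpy sketch (all numbers and names here are hypothetical, not from the paper): the agent minimizes an unknown linear cost over the vertices of the simplex, and the observer scores each candidate objective by how suboptimal the observed action would be if that candidate were the truth.

```python
import numpy as np

c_true = np.array([0.3, 0.1, 0.6])       # agent's hidden objective
vertices = np.eye(3)                     # feasible actions: simplex vertices
x_obs = vertices[np.argmin(vertices @ c_true)]   # observed optimal action

def suboptimality_loss(c, x):
    """c.x minus the optimal value min over feasible x' of c.x'."""
    return c @ x - (vertices @ c).min()

candidates = [np.array(v, float)
              for v in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0.3, 0.1, 0.6)]]
losses = [suboptimality_loss(c, x_obs) for c in candidates]
best = candidates[int(np.argmin(losses))]
```

Note that several candidates achieve zero loss on a single observation, illustrating why finite data need not pin down the agent's objective; the distributionally robust formulation is aimed at such imperfect information across many noisy signal-response pairs.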
Exchangeable equilibria
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 183-188).
The main contribution of this thesis is a new solution concept for symmetric games (of complete information in strategic form), the exchangeable equilibrium. This is an intermediate notion between symmetric Nash and symmetric correlated equilibrium. While a variety of weaker solution concepts than correlated equilibrium and a variety of refinements of Nash equilibrium are known, there is little previous work on "interpolating" between Nash and correlated equilibrium. Several game-theoretic interpretations suggest that exchangeable equilibria are natural objects to study. Moreover, these show that the notion of symmetric correlated equilibrium is too weak and that exchangeable equilibrium is a more natural analog of correlated equilibrium for symmetric games. The geometric properties of exchangeable equilibria are a mix of those of Nash and correlated equilibria. The set of exchangeable equilibria is convex, compact, and semi-algebraic, but not necessarily a polytope. A variety of examples illustrate how it relates to the Nash and correlated equilibria. The same ideas which lead to the notion of exchangeable equilibria can be used to construct tighter convex relaxations of the symmetric Nash equilibria, as well as convex relaxations of the set of all Nash equilibria in asymmetric games. These have similar mathematical properties to the exchangeable equilibria. An example game reveals an algebraic obstruction to computing exact exchangeable equilibria, but these can be approximated to any degree of accuracy in polynomial time. On the other hand, optimizing a linear function over the exchangeable equilibria is NP-hard. There are practical linear and semidefinite programming heuristics for both problems.
A secondary contribution of this thesis is the computation of extreme points of the set of correlated equilibria in a simple family of games. These examples illustrate that in finite games there can be factorially many more extreme correlated equilibria than extreme Nash equilibria, so enumerating extreme correlated equilibria is not an effective method for enumerating extreme Nash equilibria. In the case of games with a continuum of strategies and polynomial utilities, the examples illustrate that while the set of Nash equilibria has a known finite-dimensional description in terms of moments, the set of correlated equilibria admits no such finite-dimensional characterization.
by Noah D. Stein. Ph.D.
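For context, correlated equilibria are defined by finitely many linear incentive constraints, which is what makes LP-based methods applicable to them. A small numpy sketch (the game of Chicken with standard payoffs; a hypothetical illustration, not code from the thesis) checks these constraints for a candidate distribution:

```python
import numpy as np

# Game of Chicken, actions 0 = Dare, 1 = Chicken.
U1 = np.array([[0, 7], [2, 6]])   # row player's payoff U1[a1, a2]
U2 = U1.T                         # symmetric game: column player's payoff

# Candidate: probability 1/3 each on (D,C), (C,D), (C,C).
p = np.array([[0.0, 1 / 3], [1 / 3, 1 / 3]])

def is_correlated_eq(p, U1, U2, tol=1e-12):
    """Check the linear incentive constraints of correlated equilibrium:
    no player gains by deviating from any recommended action."""
    for a in range(2):
        for dev in range(2):
            # Row player recommended a, considering deviation dev.
            if p[a, :] @ (U1[dev, :] - U1[a, :]) > tol:
                return False
            # Column player recommended a, considering deviation dev.
            if p[:, a] @ (U2[:, dev] - U2[:, a]) > tol:
                return False
    return True

assert is_correlated_eq(p, U1, U2)
```

This distribution is a well-known correlated equilibrium that is not a mixture of Nash equilibria; the feasibility constraints above are the linear side of the LP and SDP heuristics mentioned in the abstract.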
Optimal Convex and Nonconvex Regularizers for a Data Source
In optimization-based approaches to inverse problems and to statistical
estimation, it is common to augment the objective with a regularizer to address
challenges associated with ill-posedness. The choice of a suitable regularizer
is typically driven by prior domain information and computational
considerations. Convex regularizers are attractive as they are endowed with
certificates of optimality as well as the toolkit of convex analysis, but
exhibit a computational scaling that makes them ill-suited beyond
moderate-sized problem instances. On the other hand, nonconvex regularizers can
often be deployed at scale, but do not enjoy the certification properties
associated with convex regularizers. In this paper, we seek a systematic
understanding of the power and the limitations of convex regularization by
investigating the following questions: Given a distribution, what are the
optimal regularizers, both convex and nonconvex, for data drawn from the
distribution? What properties of a data source govern whether it is amenable to
convex regularization? We address these questions for the class of continuous
and positively homogeneous regularizers, for which convex and nonconvex
regularizers correspond, respectively, to convex bodies and star bodies. By
leveraging dual Brunn-Minkowski theory, we show that a radial function derived
from a data distribution is the key quantity for identifying optimal
regularizers and for assessing the amenability of a data source to convex
regularization. Using tools such as Γ-convergence, we show that our
results are robust in the sense that the optimal regularizers for a sample
drawn from a distribution converge to their population counterparts as the
sample size grows large. Finally, we give generalization guarantees that
recover previous results for polyhedral regularizers (i.e., dictionary
learning) and lead to new ones for semidefinite regularizers.
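The correspondence between star bodies and regularizers can be made concrete through the gauge (Minkowski functional); a minimal sketch with assumed notation: for a star body K with radial function rho_K, the induced regularizer is ||x||_K = |x| / rho_K(x/|x|), and taking K to be the l1 ball recovers the l1 norm.

```python
import numpy as np

def gauge(x, radial, eps=1e-12):
    """Gauge (Minkowski functional) of a star body K from its radial
    function rho_K: ||x||_K = inf{t > 0 : x in tK} = |x| / rho_K(x/|x|)."""
    n = np.linalg.norm(x)
    if n < eps:
        return 0.0
    return n / radial(x / n)

# Radial function of the l1 unit ball: rho(u) = 1 / ||u||_1 for unit u,
# so the induced regularizer is the l1 norm.
rho_l1 = lambda u: 1.0 / np.abs(u).sum()

val = gauge(np.array([3.0, -4.0]), rho_l1)   # equals ||(3, -4)||_1 = 7
```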
International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book
The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions.
This book comprises the full conference program. It contains the scientific program, both in survey form and in full detail, as well as information on the social program, the venue, special meetings, and more.
Metric and Representation Learning
All data has some inherent mathematical structure. I am interested in understanding the intrinsic geometric and probabilistic structure of data to design effective algorithms and tools that can be applied to machine learning and across all branches of science.
The focus of this thesis is to increase the effectiveness of machine learning techniques by developing a mathematical and algorithmic framework with which, given any type of data, we can learn an optimal representation. Representation learning is done for many reasons: to repair corrupted data, to learn a low-dimensional or simpler representation of high-dimensional or very complex data, or because the current representation of the data does not capture its important geometric features.
One of the many challenges in representation learning is determining ways to judge the quality of the representation learned. In many cases, the consensus is that if d is the natural metric on the representation, then this metric should provide meaningful information about the data. Many examples of this can be seen in areas such as metric learning, manifold learning, and graph embedding. However, most algorithms that solve these problems learn a representation in a metric space first and then extract a metric.
A large part of my research explores what happens if the order is switched, that is, if we learn the appropriate metric first and the embedding later. The philosophy behind this approach is that understanding the inherent geometry of the data is the most crucial part of representation learning. Often, studying the properties of the appropriate metric on the input data indicates the type of space we should seek for the representation, giving us more robust representations. Optimizing for the appropriate metric can also help overcome issues such as missing and noisy data. My projects fall into three different areas of representation learning.
1) Geometric and probabilistic analysis of representation learning methods.
2) Developing methods to learn optimal metrics on large datasets.
3) Applications.
For the category of geometric and probabilistic analysis of representation learning methods, we have three projects. First, designing optimal training data for denoising autoencoders. Second, formulating a new optimal transport problem and understanding the geometric structure. Third, analyzing the robustness to perturbations of the solutions obtained from the classical multidimensional scaling algorithm versus that of the true solutions to the multidimensional scaling problem.
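Classical multidimensional scaling, referenced in the third project, admits a short standard implementation (a generic sketch, not code from the thesis): double-center the squared distance matrix and embed using its top eigenpairs.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical MDS: from a distance matrix D, form the Gram matrix
    B = -1/2 * J D^2 J via double centering, then embed with the
    top-dim eigenpairs of B."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    w, V = np.linalg.eigh(B)                 # ascending eigenvalues
    idx = np.argsort(w)[::-1][:dim]          # take the largest dim
    scale = np.sqrt(np.clip(w[idx], 0, None))
    return V[:, idx] * scale

# Recover planar points (up to rotation/reflection) from their distances.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
X = classical_mds(D)
D_rec = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
```

For exactly Euclidean distance data the reconstruction is exact; the robustness question studied in the thesis concerns what happens when D is perturbed.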
For learning an optimal metric, we are given a dissimilarity matrix D, a function f, and a subset M of the space of all metrics, and we want to find the metric d in M that minimizes f(D, d). In this thesis, we consider the version of the problem in which M is the space of metrics defined on a fixed graph. That is, given a graph G, we let M = M(G) be the space of all metrics defined via G. For this M, we consider the sparse objective function as well as convex objective functions. We also looked at the problem where we want to learn a tree. We also show how the ideas behind learning the optimal metric can be applied to dimensionality reduction in the presence of missing data.
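One concrete instance of a metric defined via a graph (a hypothetical example, not from the thesis) is the shortest-path metric induced by edge weights on a fixed graph, computable by Floyd-Warshall; varying the weights traces out one family of metrics on the graph.

```python
import numpy as np

# Weighted 4-cycle: edges 0-1 (w=1), 1-2 (w=2), 2-3 (w=1), 0-3 (w=4).
INF = float("inf")
W = np.array([[0.0, 1.0, INF, 4.0],
              [1.0, 0.0, 2.0, INF],
              [INF, 2.0, 0.0, 1.0],
              [4.0, INF, 1.0, 0.0]])

# Floyd-Warshall: shortest-path distances between all pairs of vertices.
d = W.copy()
n = len(d)
for k in range(n):
    for i in range(n):
        for j in range(n):
            d[i, j] = min(d[i, j], d[i, k] + d[k, j])

# d is now a metric on the vertex set: symmetric, zero diagonal,
# and satisfying the triangle inequality by construction.
```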
Finally, we look at an application to real-world data: specifically, trying to reconstruct ancient Greek text.
PhD, Applied and Interdisciplinary Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/169738/1/rsonthal_1.pd