118 research outputs found
Randomized Low-Memory Singular Value Projection
Affine rank minimization algorithms typically rely on calculating the
gradient of a data error followed by a singular value decomposition at every
iteration. Because these two steps are expensive, heuristic approximations are
often used to reduce computational burden. To this end, we propose a recovery
scheme that merges the two steps with randomized approximations, and as a
result, operates on space proportional to the degrees of freedom in the
problem. We theoretically establish the estimation guarantees of the algorithm
as a function of approximation tolerance. While the theoretical approximation
requirements are overly pessimistic, we demonstrate that in practice the
algorithm performs well on the quantum tomography recovery problem.
Comment: 13 pages. This version has a revised theorem and a new numerical experiment.
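As a hedged illustration of the merged step, the sketch below replaces the exact SVD in a singular value projection iteration with a randomized range-finder; the function name, step size mu, and oversampling amount are our own choices, not the authors' exact algorithm.

```python
# A minimal sketch of one singular value projection iteration with the
# exact SVD replaced by a randomized low-rank approximation.
import numpy as np

def randomized_svp_step(X, grad, r, mu=1.0, oversample=10):
    """Gradient step followed by a randomized rank-r projection.
    Returns factors (U, s, Vt), so storage stays O((m + n) * r)."""
    Y = X - mu * grad                       # gradient step on the data error
    m, n = Y.shape
    Omega = np.random.randn(n, r + oversample)
    Q, _ = np.linalg.qr(Y @ Omega)          # orthonormal basis for the sketch
    B = Q.T @ Y                             # small (r + p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ Ub[:, :r], s[:r], Vt[:r, :]  # rank-r factors of the new iterate
```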
On the quality of randomized approximations of Tukey's depth
Tukey's depth (or halfspace depth) is a widely used measure of centrality for
multivariate data. However, exact computation of Tukey's depth is known to be a
hard problem in high dimensions. As a remedy, randomized approximations of
Tukey's depth have been proposed. In this paper we explore when such randomized
algorithms return a good approximation of Tukey's depth. We study the case when
the data are sampled from a log-concave isotropic distribution. We prove that,
if one requires that the algorithm runs in polynomial time in the dimension,
the randomized algorithm correctly approximates the maximal depth and
depths close to zero. On the other hand, for any point of intermediate depth,
any good approximation requires exponential complexity.
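For concreteness, here is a minimal sketch of the random-direction scheme such randomized approximations typically use (our illustration, not the paper's specific algorithm; the direction count k is arbitrary).

```python
# Approximate Tukey depth by sampling random directions and taking the
# smallest one-sided empirical mass; this is an upper bound on the true
# depth, since only a subset of halfspaces is examined.
import numpy as np

def approx_tukey_depth(x, data, k=1000, rng=None):
    """Randomized upper bound on the Tukey depth of x w.r.t. rows of data."""
    rng = np.random.default_rng(rng)
    n, d = data.shape
    U = rng.standard_normal((k, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)  # directions on the sphere
    proj = (data - x) @ U.T                        # (n, k) projections
    mass = (proj >= 0).mean(axis=0)                # halfspace mass per direction
    return np.minimum(mass, 1 - mass).min()        # min over sampled halfspaces
```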
High-performance Kernel Machines with Implicit Distributed Optimization and Randomization
In order to fully utilize "big data", it is often required to use "big
models". Such models tend to grow with the complexity and size of the training
data, and do not make strong parametric assumptions upfront on the nature of
the underlying statistical dependencies. Kernel methods fit this need well, as
they constitute a versatile and principled statistical methodology for solving
a wide range of non-parametric modelling problems. However, their high
computational costs (in storage and time) pose a significant barrier to their
widespread adoption in big data applications.
We propose an algorithmic framework and high-performance implementation for
massive-scale training of kernel-based statistical models, based on combining
two key technical ingredients: (i) distributed general purpose convex
optimization, and (ii) the use of randomization to improve the scalability of
kernel methods. Our approach is based on a block-splitting variant of the
Alternating Directions Method of Multipliers, carefully reconfigured to handle
very large random feature matrices, while exploiting hybrid parallelism
typically found in modern clusters of multicore machines. Our implementation
supports a variety of statistical learning tasks by enabling several loss
functions, regularization schemes, kernels, and layers of randomized
approximations for both dense and sparse datasets, in a highly extensible
framework. We evaluate the ability of our framework to learn models on data
from applications, and provide a comparison against existing sequential and
parallel libraries.
Comment: Work presented at MMDS 2014 (June 2014) and JSM 201
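One standard instance of ingredient (ii) is the random Fourier feature map for the Gaussian kernel; the sketch below (our illustration, with arbitrary feature count D and bandwidth sigma) shows how a kernel machine reduces to a linear model on randomized features.

```python
# A minimal sketch of random Fourier features (Rahimi & Recht) for the
# Gaussian kernel, the kind of randomized kernel approximation such a
# framework builds on.
import numpy as np

def random_fourier_features(X, D=2048, sigma=1.0, rng=None):
    """Map X (n, d) to Z (n, D) so that Z @ Z.T approximates the kernel matrix."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    W = rng.standard_normal((d, D)) / sigma   # samples from the kernel spectrum
    b = rng.uniform(0, 2 * np.pi, size=D)     # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)
```

A linear model trained on these features approximates the corresponding kernel machine, and the resulting linear solve is the kind of problem a distributed ADMM solver can parallelize across a cluster.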
Simple Approximations of Semialgebraic Sets and their Applications to Control
Many uncertainty sets encountered in control systems analysis and design can
be expressed in terms of semialgebraic sets, that is, as the intersection of
sets described by means of polynomial inequalities. Important examples include
the solution sets of linear matrix inequalities and the Schur/Hurwitz stability
domains. These sets often have very complicated shapes (non-convex, and even
non-connected), which makes them very difficult to manipulate. It is
therefore of considerable importance to find simple-enough approximations of
these sets, able to capture their main characteristics while maintaining a low
level of complexity. For these reasons, in the past years several convex
approximations, based for instance on hyperrectangles, polytopes, or
ellipsoids have been proposed. In this work, we move a step further, and
propose possibly non-convex approximations, based on a small-volume polynomial
superlevel set of a single positive polynomial of given degree. We show how
these sets can be easily approximated by minimizing the L1 norm of the
polynomial over the semialgebraic set, subject to positivity constraints.
Intuitively, this corresponds to the trace minimization heuristic commonly
encountered in minimum volume ellipsoid problems. From a computational viewpoint,
we design a hierarchy of linear matrix inequality problems to generate these
approximations, and we provide theoretically rigorous convergence results, in
the sense that the hierarchy of outer approximations converges in volume (or,
equivalently, almost everywhere and almost uniformly) to the original set. Two
main applications of the proposed approach are considered. The first one aims
at reconstruction/approximation of sets from a finite number of samples. In the
second one, we show how the concept of polynomial superlevel set can be used to
generate samples uniformly distributed on a given semialgebraic set. The
efficiency of the proposed approach is demonstrated by different numerical
examples.
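In our paraphrase of the abstract, the optimization at the heart of the method can be stated as follows, with K the semialgebraic set, B a simple bounding set containing K (such as a box), and p ranging over polynomials of fixed degree:

```latex
% Our paraphrase of the L1-norm formulation sketched in the abstract.
\begin{aligned}
\min_{p}\quad & \int_{B} p(x)\,dx
  && \text{(the $L^1$ norm of $p$ on $B$, since $p \ge 0$ there)}\\
\text{s.t.}\quad & p(x) \ge 1 \quad \forall x \in K,\\
                 & p(x) \ge 0 \quad \forall x \in B.
\end{aligned}
```

The unit superlevel set U = {x in B : p(x) >= 1} then contains K, and enforcing the two positivity constraints through sum-of-squares certificates of growing degree is the standard route to a hierarchy of linear matrix inequality problems of the kind the abstract describes.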
An Adversarial Interpretation of Information-Theoretic Bounded Rationality
Recently, there has been a growing interest in modeling planning with
information constraints. Accordingly, an agent maximizes a regularized expected
utility known as the free energy, where the regularizer is given by the
information divergence from a prior to a posterior policy. While this approach
can be justified in various ways, including from statistical mechanics and
information theory, it is still unclear how it relates to decision-making
against adversarial environments. This connection has previously been suggested
in work relating the free energy to risk-sensitive control and to extensive
form games. Here, we show that a single-agent free energy optimization is
equivalent to a game between the agent and an imaginary adversary. The
adversary can, by paying an exponential penalty, generate costs that diminish
the decision maker's payoffs. It turns out that the optimal strategy of the
adversary consists in choosing costs so as to render the decision maker
indifferent among its choices, which is a defining property of a Nash
equilibrium, thus tightening the connection between free energy optimization
and game theory.
Comment: 7 pages, 4 figures. Proceedings of AAAI-1
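In standard notation (ours, consistent with the abstract), the free energy and its well-known optimizer are:

```latex
% Prior policy \pi_0, posterior policy \pi, utility U, and inverse
% temperature \beta > 0; the regularizer is the KL divergence from
% prior to posterior, as stated in the abstract.
F(\pi) = \mathbb{E}_{\pi}[U(a)] - \frac{1}{\beta}\, D_{\mathrm{KL}}(\pi \,\|\, \pi_0),
\qquad
\pi^{*}(a) = \frac{\pi_0(a)\, e^{\beta U(a)}}{\sum_{a'} \pi_0(a')\, e^{\beta U(a')}}.
```

In the adversarial reading, the adversary replaces U with U minus a chosen cost, paying an exponential penalty for doing so; at the optimum the adjusted payoffs leave the decision maker indifferent among its choices, as described above.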
Generalized Network Dismantling
Finding the set of nodes which, when removed or (de)activated, can stop the
spread of (dis)information, contain an epidemic, or disrupt the functioning of
a corrupt/criminal organization is still one of the key challenges in network
science. In this paper, we introduce the generalized network dismantling
problem, which aims to find the set of nodes that, when removed from a network,
results in a network fragmentation into subcritical network components at
minimum cost. For unit costs, our formulation becomes equivalent to the
standard network dismantling problem. Our non-unit cost generalization allows
for the inclusion of topological cost functions related to node centrality and
non-topological features such as the price, protection level or even social
value of a node. To solve this optimization problem, we propose a method based
on the spectral properties of a novel node-weighted
Laplacian operator. The proposed method is applicable to large-scale networks
with millions of nodes. It outperforms current state-of-the-art methods and
opens new directions in understanding the vulnerability and robustness of
complex systems.
Comment: 6 pages, 5 figures
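A toy sketch of the spectral step is below; we read "node-weighted Laplacian" as the Laplacian of an edge-weight matrix M with M_ij = A_ij (w_i + w_j), which is one plausible construction and not necessarily the paper's exact operator.

```python
# A toy sketch of spectral bisection with a cost-weighted Laplacian;
# the weighting scheme here is our assumption, not the paper's definition.
import numpy as np

def weighted_fiedler_partition(A, w):
    """Split a graph (adjacency A, node costs w) by the sign of the Fiedler
    vector of L = D - M, where M_ij = A_ij * (w_i + w_j)."""
    M = A * (w[:, None] + w[None, :])      # cost-weighted edge weights
    L = np.diag(M.sum(axis=1)) - M         # weighted graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                # second-smallest eigenvector
    return fiedler >= 0                    # boolean bipartition of the nodes
```

Removing low-cost nodes along the resulting cut, and recursing on any component that remains above the target size, would then drive the network toward subcritical components at low total cost.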