An Approximate Shapley-Folkman Theorem
The Shapley-Folkman theorem shows that Minkowski averages of uniformly
bounded sets tend to be convex when the number of terms in the sum becomes much
larger than the ambient dimension. In optimization, Aubin and Ekeland [1976]
show that this produces an a priori bound on the duality gap of separable
nonconvex optimization problems involving finite sums. This bound is highly
conservative and depends on unstable quantities, and we relax it in several
directions to show that nonconvexity can have a much milder impact on finite
sum minimization problems such as empirical risk minimization and multi-task
classification. As a byproduct, we show a new version of Maurey's classical
approximate Carath\'eodory lemma where we sample a significant fraction of the
coefficients, without replacement, as well as a result on sampling constraints
using an approximate Helly theorem, both of independent interest.
Comment: Added constraint sampling result, simplified sampling results, reformat, etc.
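The convexifying effect of Minkowski averaging that underlies the Shapley-Folkman theorem is easy to see numerically. The sketch below (an illustration of the phenomenon, not code from the paper) averages $n$ copies of the nonconvex set $\{0, 1\} \subset \mathbb{R}$ and measures how far the result is from its convex hull $[0, 1]$:

```python
# Illustration: the n-fold Minkowski average of the nonconvex set
# S = {0, 1} is {0, 1/n, 2/n, ..., 1}, so its Hausdorff distance to the
# convex hull [0, 1] shrinks like 1/(2n) as the number of terms grows.
def minkowski_average(points, n):
    """Return the n-fold Minkowski average of a finite set of reals."""
    sums = {0.0}
    for _ in range(n):
        sums = {s + p for s in sums for p in points}
    return sorted(s / n for s in sums)

for n in (1, 2, 10, 100):
    avg = minkowski_average([0.0, 1.0], n)
    # Distance to the convex hull: half the largest spacing between
    # consecutive points of the average.
    gap = max(b - a for a, b in zip(avg, avg[1:])) / 2
    print(n, gap)  # gap = 1/(2n): nonconvexity vanishes as n grows
```

The same picture holds in higher dimension: once the number of summands exceeds the ambient dimension, the average is close to convex, which is what produces the a priori duality-gap bound.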
Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas
We investigate the approximability of several classes of real-valued
functions by functions of a small number of variables ({\em juntas}). Our main
results are tight bounds on the number of variables required to approximate a
function $f:\{0,1\}^n \rightarrow [0,1]$ within $\ell_2$-error $\epsilon$ over
the uniform distribution: 1. If $f$ is submodular, then it is $\epsilon$-close
to a function of $O(\frac{1}{\epsilon^2}\log\frac{1}{\epsilon})$ variables.
This is an exponential improvement over previously known results. We note that
$\Omega(\frac{1}{\epsilon^2})$ variables are necessary even for linear
functions. 2. If $f$ is fractionally subadditive (XOS) it is $\epsilon$-close
to a function of $2^{O(1/\epsilon^2)}$ variables. This result holds for all
functions with low total $\ell_1$-influence and is a real-valued analogue of
Friedgut's theorem for boolean functions. We show that $2^{\Omega(1/\epsilon)}$
variables are necessary even for XOS functions.
As applications of these results, we provide learning algorithms over the
uniform distribution. For XOS functions, we give a PAC learning algorithm that
runs in time $2^{poly(1/\epsilon)} poly(n)$. For submodular functions we give
an algorithm in the more demanding PMAC learning model (Balcan and Harvey,
2011) which requires a multiplicative $(1+\gamma)$ factor approximation with
probability at least $1-\epsilon$ over the target distribution. Our uniform
distribution algorithm runs in time $2^{poly(1/(\gamma\epsilon))} poly(n)$.
This is the first algorithm in the PMAC model that over the uniform
distribution can achieve a constant approximation factor arbitrarily close to 1
for all submodular functions. As follows from the lower bounds in (Feldman et
al., 2013) both of these algorithms are close to optimal. We also give
applications for proper learning, testing and agnostic learning with value
queries of these classes.
Comment: Extended abstract appears in proceedings of FOCS 2013
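The idea of junta approximation can be illustrated concretely. The following sketch (my own illustration, not the paper's algorithm) computes the $\ell_2$-influence of each variable of a function on $\{0,1\}^n$, keeps the $k$ most influential ones, and averages the rest out under the uniform distribution:

```python
import itertools

# Illustrative sketch only (not the paper's algorithm): approximate a
# real-valued function on {0,1}^n by a junta on its most influential
# variables, where Inf_i(f) = E_x[(f(x) - f(x ^ e_i))^2] / 4, and the
# discarded variables are averaged out under the uniform distribution.
def influences(f, n):
    infs = [0.0] * n
    for x in itertools.product((0, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            y = list(x)
            y[i] ^= 1
            infs[i] += (fx - f(tuple(y))) ** 2 / 4
    return [v / 2 ** n for v in infs]

def junta_approx(f, n, k):
    """Restrict f to its k most influential variables, averaging the rest."""
    infs = influences(f, n)
    top = sorted(range(n), key=lambda i: -infs[i])[:k]
    rest = [i for i in range(n) if i not in top]

    def g(x):
        total = 0.0
        for bits in itertools.product((0, 1), repeat=len(rest)):
            y = list(x)
            for i, b in zip(rest, bits):
                y[i] = b
            total += f(tuple(y))
        return total / 2 ** len(rest)

    return g

# A linear (hence submodular) function dominated by two coordinates.
f = lambda x: 0.45 * x[0] + 0.45 * x[1] + 0.01 * x[2] + 0.01 * x[3]
g = junta_approx(f, 4, 2)
err = (sum((f(x) - g(x)) ** 2
           for x in itertools.product((0, 1), repeat=4)) / 16) ** 0.5
print(err)  # small L2 error: f is close to a 2-junta
```

The theorems in the abstract say how large $k$ must be for this kind of restriction to succeed on every submodular or XOS function, not just on favorable examples like this one.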
Explicit lower and upper bounds on the entangled value of multiplayer XOR games
XOR games are the simplest model in which the nonlocal properties of
entanglement manifest themselves. When there are two players, it is well known
that the bias --- the maximum advantage over random play --- of entangled
players can be at most a constant times greater than that of classical players.
Recently, P\'{e}rez-Garc\'{i}a et al. [Comm. Math. Phys. 279 (2), 2008] showed
that no such bound holds when there are three or more players: the advantage of
entangled players over classical players can become unbounded, and scale with
the number of questions in the game. Their proof relies on non-trivial results
from operator space theory, and gives a non-explicit existence proof, leading
to a game with a very large number of questions and only a loose control over
the local dimension of the players' shared entanglement.
We give a new, simple and explicit (though still probabilistic) construction
of a family of three-player XOR games which achieve a large quantum-classical
gap (QC-gap). This QC-gap is exponentially larger than the one given by
P\'{e}rez-Garc\'{i}a et al. in terms of the size of the game, achieving a
QC-gap of order $\sqrt{N}$ with $N^2$ questions per player. In terms of the
dimension of the entangled state required, we achieve the same (optimal) QC-gap
of $\sqrt{d}$ for a state of local dimension $d$ per player. Moreover, the
optimal entangled strategy is very simple, involving observables defined by
tensor products of the Pauli matrices.
Additionally, we give the first upper bound on the maximal QC-gap in terms of
the number of questions per player, showing that our construction is only
quadratically off in that respect. Our results rely on probabilistic estimates
on the norm of random matrices and higher-order tensors which may be of
independent interest.
Comment: Major improvements in presentation; results identical
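For two players, the classical bias of an XOR game is a maximum over deterministic $\pm 1$ strategies and can be computed by brute force on small games. The sketch below (an illustration of the definitions, not code from the paper) does this for the CHSH game, where entangled players achieve bias $1/\sqrt{2}$ versus the classical $1/2$, the bounded two-player advantage mentioned above:

```python
import itertools

# Illustration: classical bias of a two-player XOR game with cost
# matrix M, i.e. max over sign assignments a, b of sum M[x][y]*a[x]*b[y].
# For CHSH, M[x][y] = (-1)^(x*y) / 4, and the classical bias is 1/2;
# entangled players reach 1/sqrt(2) ~ 0.707, a constant-factor advantage.
def classical_bias(M):
    n = len(M)
    best = 0.0
    for a in itertools.product((-1, 1), repeat=n):
        for b in itertools.product((-1, 1), repeat=n):
            best = max(best, sum(M[x][y] * a[x] * b[y]
                                 for x in range(n) for y in range(n)))
    return best

chsh = [[0.25, 0.25], [0.25, -0.25]]
print(classical_bias(chsh))  # 0.5
```

The surprise in the three-player setting is precisely that no analogous constant bound on the entangled-to-classical ratio exists: the gap can grow with the number of questions.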
Nonparametric Bayes Modeling of Populations of Networks
Replicated network data are increasingly available in many research fields.
In connectomic applications, inter-connections among brain regions are
collected for each patient under study, motivating statistical models which can
flexibly characterize the probabilistic generative mechanism underlying these
network-valued data. Available models for a single network are not designed
specifically for inference on the entire probability mass function of a
network-valued random variable and therefore lack flexibility in characterizing
the distribution of relevant topological structures. We propose a flexible
Bayesian nonparametric approach for modeling the population distribution of
network-valued data. The joint distribution of the edges is defined via a
mixture model which reduces dimensionality and efficiently incorporates network
information within each mixture component by leveraging latent space
representations. The formulation leads to an efficient Gibbs sampler and
provides simple and coherent strategies for inference and goodness-of-fit
assessments. We provide theoretical results on the flexibility of our model and
illustrate improved performance --- compared to state-of-the-art models --- in
simulations and an application to human brain networks.
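The generative idea behind such mixture-of-latent-space models can be sketched in a few lines. The code below is a hypothetical illustration of that idea (component weights, latent dimensions, and the logistic link are my assumptions, not the paper's specification), sampling one network-valued observation from a two-component population:

```python
import numpy as np

# Hypothetical sketch of the generative idea (not the paper's sampler):
# each mixture component h has latent coordinates Z_h for the nodes, and
# edge (i, j) occurs with probability sigmoid(z_i . z_j + intercept).
rng = np.random.default_rng(0)

def sample_network(Z, intercept=-1.0):
    """Sample a symmetric binary adjacency matrix from a latent-space model."""
    logits = Z @ Z.T + intercept
    P = 1.0 / (1.0 + np.exp(-logits))
    V = len(Z)
    A = np.zeros((V, V), dtype=int)
    iu = np.triu_indices(V, k=1)       # sample each dyad once
    A[iu] = (rng.random(len(iu[0])) < P[iu]).astype(int)
    return A + A.T                     # symmetrize; zero diagonal

# Two components with different latent geometry give two "clusters"
# of networks in the population distribution.
weights = [0.6, 0.4]
components = [rng.normal(0.0, 1.0, size=(20, 2)),
              rng.normal(0.0, 0.2, size=(20, 2))]
h = rng.choice(2, p=weights)       # component assignment for one replicate
A = sample_network(components[h])  # one network-valued observation
print(A.shape, A.sum() // 2)       # 20 nodes, number of edges
```

Inference in the paper runs in the opposite direction: given many replicated adjacency matrices, a Gibbs sampler recovers the mixture weights and latent representations.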