7 research outputs found
Label optimal regret bounds for online local learning
We resolve an open question from (Christiano, 2014b) posed in COLT'14
regarding the optimal dependency of the regret achievable for online local
learning on the size of the label set. In this framework the algorithm is shown
a pair of items at each step, chosen from a set of items. The learner then
predicts a label for each item, from a label set of size and receives a
real valued payoff. This is a natural framework which captures many interesting
scenarios such as collaborative filtering, online gambling, and online max cut
among others. (Christiano, 2014a) designed an efficient online learning
algorithm for this problem achieving a regret of , where
is the number of rounds. Information theoretically, one can achieve a regret of
. One of the main open questions left in this framework
concerns closing the above gap.
In this work, we provide a complete answer to the question above via two main
results. We show, via a tighter analysis, that the semi-definite programming
based algorithm of (Christiano, 2014a), in fact achieves a regret of
. Second, we show a matching computational lower bound. Namely,
we show that a polynomial time algorithm for online local learning with lower
regret would imply a polynomial time algorithm for the planted clique problem
which is widely believed to be hard. We prove a similar hardness result under a
related conjecture concerning planted dense subgraphs that we put forth. Unlike
planted clique, the planted dense subgraph problem does not have any known
quasi-polynomial time algorithms.
Computational lower bounds for online learning are relatively rare, and we
hope that the ideas developed in this work will lead to lower bounds for other
online learning scenarios as well.Comment: 13 pages; Changes from previous version: small changes to proofs of
Theorems 1 & 2, a small rewrite of introduction as well (this version is the
same as camera-ready copy in COLT '15
Average-case Hardness of RIP Certification
The restricted isometry property (RIP) for design matrices gives guarantees
for optimal recovery in sparse linear models. It is of high interest in
compressed sensing and statistical learning. This property is particularly
important for computationally efficient recovery methods. As a consequence,
even though it is in general NP-hard to check that RIP holds, there have been
substantial efforts to find tractable proxies for it. These would allow the
construction of RIP matrices and the polynomial-time verification of RIP given
an arbitrary matrix. We consider the framework of average-case certifiers, that
never wrongly declare that a matrix is RIP, while being often correct for
random instances. While there are such functions which are tractable in a
suboptimal parameter regime, we show that this is a computationally hard task
in any better regime. Our results are based on a new, weaker assumption on the
problem of detecting dense subgraphs
Spectral methods and computational trade-offs in high-dimensional statistical inference
Spectral methods have become increasingly popular in designing fast algorithms for modern highdimensional datasets. This thesis looks at several problems in which spectral methods play a central role. In some cases, we also show that such procedures have essentially the best performance among all randomised polynomial time algorithms by exhibiting statistical and computational trade-offs in those problems. In the first chapter, we prove a useful variant of the well-known Davis{Kahan theorem, which is a spectral perturbation result that allows us to bound of the distance between population eigenspaces and their sample versions. We then propose a semi-definite programming algorithm for the sparse principal component analysis (PCA) problem, and analyse its theoretical performance using the perturbation bounds we derived earlier. It turns out that the parameter regime in which our estimator is consistent is strictly smaller than the consistency regime of a minimax optimal (yet computationally intractable) estimator. We show through reduction from a well-known hard problem in computational complexity theory that the difference in consistency regimes is unavoidable for any randomised polynomial time estimator, hence revealing subtle statistical and computational trade-offs in this problem. Such computational trade-offs also exist in the problem of restricted isometry certification. Certifiers for restricted isometry properties can be used to construct design matrices for sparse linear regression problems. Similar to the sparse PCA problem, we show that there is also an intrinsic gap between the class of matrices certifiable using unrestricted algorithms and using polynomial time algorithms. Finally, we consider the problem of high-dimensional changepoint estimation, where we estimate the time of change in the mean of a high-dimensional time series with piecewise constant mean structure. Motivated by real world applications, we assume that changes only occur in a sparse subset of all coordinates. We apply a variant of the semi-definite programming algorithm in sparse PCA to aggregate the signals across different coordinates in a near optimal way so as to estimate the changepoint location as accurately as possible. Our statistical procedure shows superior performance compared to existing methods in this problem.St John's College and Cambridge Overseas Trus