Detecting High Log-Densities -- an O(n^1/4) Approximation for Densest k-Subgraph
In the Densest k-Subgraph problem, given a graph G and a parameter k, one
needs to find a subgraph of G induced on k vertices that contains the largest
number of edges. There is a significant gap between the best known upper and
lower bounds for this problem. It is NP-hard, and does not have a PTAS unless
NP has subexponential time algorithms. On the other hand, the current best
known algorithm of Feige, Kortsarz and Peleg, gives an approximation ratio of
n^(1/3-epsilon) for some specific epsilon > 0 (estimated at around 1/60).
We present an algorithm that for every epsilon > 0 approximates the Densest
k-Subgraph problem within a ratio of n^(1/4+epsilon) in time n^(O(1/epsilon)). In
particular, our algorithm achieves an approximation ratio of O(n^(1/4)) in time
n^(O(log n)). Our algorithm is inspired by studying an average-case version of
the problem where the goal is to distinguish random graphs from graphs with
planted dense subgraphs. The approximation ratio we achieve for the general
case matches the distinguishing ratio we obtain for this planted problem.
At a high level, our algorithms involve cleverly counting appropriately
defined trees of constant size in G, and using these counts to identify the
vertices of the dense subgraph. Our algorithm is based on the following
principle. We say that a graph G(V,E) has log-density alpha if its average
degree is Theta(|V|^alpha). The algorithmic core of our result is a family of
algorithms that output k-subgraphs of nontrivial density whenever the
log-density of the densest k-subgraph is larger than the log-density of the
host graph.
Comment: 23 pages
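The log-density definition in the abstract above can be made concrete with a small numerical sketch. The abstract defines log-density asymptotically (average degree Theta(|V|^alpha)); the code below assumes the finite-graph reading alpha = log(average degree) / log(|V|), which is only an illustration. The helper names `log_density` and `induced_edges` and the toy graph (a 16-cycle with a 4-clique added on vertices 0..3) are hypothetical and do not come from the paper.

```python
import math
from itertools import combinations


def log_density(num_vertices, edges):
    """Finite-graph stand-in for log-density: log(avg degree) / log(|V|).

    Assumption: the paper's definition is asymptotic; this is a heuristic
    exponent for a single fixed graph.
    """
    if num_vertices < 2 or not edges:
        raise ValueError("need at least 2 vertices and 1 edge")
    avg_degree = 2 * len(edges) / num_vertices
    return math.log(avg_degree) / math.log(num_vertices)


def induced_edges(edges, subset):
    """Edges with both endpoints inside `subset`."""
    keep = set(subset)
    return {(u, v) for u, v in edges if u in keep and v in keep}


# Hypothetical toy host graph: a cycle on 16 vertices plus a clique on 0..3.
n = 16
cycle = {(min(i, (i + 1) % n), max(i, (i + 1) % n)) for i in range(n)}
clique = set(combinations(range(4), 2))
edges = cycle | clique

host_alpha = log_density(n, edges)
sub_alpha = log_density(4, induced_edges(edges, range(4)))

print(f"host log-density:     {host_alpha:.3f}")
print(f"subgraph log-density: {sub_alpha:.3f}")
```

In this toy instance the 4-vertex subgraph has a larger log-density than the host graph, which is the regime in which the abstract says the algorithms return k-subgraphs of nontrivial density. The sketch only evaluates the definition; it does not implement the tree-counting algorithm.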
Label optimal regret bounds for online local learning
We resolve an open question from (Christiano, 2014b) posed in COLT'14
regarding the optimal dependency of the regret achievable for online local
learning on the size of the label set. In this framework the algorithm is shown
a pair of items at each step, chosen from a set of n items. The learner then
predicts a label for each item, from a label set of size L, and receives a
real-valued payoff. This is a natural framework which captures many interesting
scenarios such as collaborative filtering, online gambling, and online max cut
among others. (Christiano, 2014a) designed an efficient online learning
algorithm for this problem achieving a regret of O(sqrt(n L^3 T)), where T
is the number of rounds. Information theoretically, one can achieve a regret of
O(sqrt(n log(L) T)). One of the main open questions left in this framework
concerns closing the above gap.
In this work, we provide a complete answer to the question above via two main
results. First, we show, via a tighter analysis, that the semi-definite programming
based algorithm of (Christiano, 2014a) in fact achieves a regret of
O(sqrt(n L T)). Second, we show a matching computational lower bound. Namely,
we show that a polynomial time algorithm for online local learning with lower
regret would imply a polynomial time algorithm for the planted clique problem
which is widely believed to be hard. We prove a similar hardness result under a
related conjecture concerning planted dense subgraphs that we put forth. Unlike
planted clique, the planted dense subgraph problem does not have any known
quasi-polynomial time algorithms.
Computational lower bounds for online learning are relatively rare, and we
hope that the ideas developed in this work will lead to lower bounds for other
online learning scenarios as well.
Comment: 13 pages; Changes from previous version: small changes to proofs of
Theorems 1 & 2, a small rewrite of introduction as well (this version is the
same as camera-ready copy in COLT '15)
A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem
Many graph mining applications rely on detecting subgraphs which are
near-cliques. There exists a dichotomy between the results in the existing work
related to this problem: on the one hand the densest subgraph problem (DSP)
which maximizes the average degree over all subgraphs is solvable in polynomial
time but for many networks fails to find subgraphs which are near-cliques. On
the other hand, formulations that are geared towards finding near-cliques are
NP-hard and frequently inapproximable due to connections with the Maximum
Clique problem.
In this work, we propose a formulation which combines the best of both
worlds: it is solvable in polynomial time and finds near-cliques when the DSP
fails. Surprisingly, our formulation is a simple variation of the DSP.
Specifically, we define the triangle densest subgraph problem (TDSP): given
a graph G(V,E), find a subset of vertices S* such that
t(S*)/|S*| = max_{S subseteq V} t(S)/|S|, where t(S) is the number of triangles
induced by the set S. We provide various exact and approximation algorithms which
solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to
the more general problem of maximizing the k-clique average density. Finally,
we provide empirical evidence that the TDSP should be used whenever the output
of the DSP fails to output a near-clique.
Comment: 42 pages
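The contrast drawn in the abstract above, between maximizing edges per vertex (DSP) and maximizing triangles per vertex (TDSP), can be seen on a tiny example. The sketch below is not one of the paper's algorithms: it is an exhaustive search over all vertex subsets, which takes exponential time and is used here only to evaluate both objectives exactly. The function names and the toy graph (a 4-clique disjoint from a complete bipartite graph with sides of size 3 and 4) are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_density(adj, subset):
    """DSP objective: induced edges divided by |S|."""
    s = sorted(subset)
    e = sum(1 for u, v in combinations(s, 2) if v in adj[u])
    return e / len(s)


def triangle_density(adj, subset):
    """TDSP objective: induced triangles divided by |S|."""
    s = sorted(subset)
    t = sum(1 for a, b, c in combinations(s, 3)
            if b in adj[a] and c in adj[a] and c in adj[b])
    return t / len(s)


def best_subset(adj, score):
    """Exhaustive search over nonempty subsets (exponential; toy sizes only)."""
    vertices = sorted(adj)
    candidates = (frozenset(c)
                  for r in range(1, len(vertices) + 1)
                  for c in combinations(vertices, r))
    return max(candidates, key=lambda s: score(adj, s))


# Hypothetical instance: clique on 0..3, complete bipartite graph on {4,5,6} x {7,...,10}.
clique = list(combinations(range(4), 2))
bipartite = [(u, v) for u in (4, 5, 6) for v in (7, 8, 9, 10)]
adj = build_adjacency(clique + bipartite)

dsp_best = best_subset(adj, edge_density)
tdsp_best = best_subset(adj, triangle_density)
print("DSP optimum: ", sorted(dsp_best))
print("TDSP optimum:", sorted(tdsp_best))
```

On this instance the edge-density optimum is the triangle-free bipartite part (12 edges on 7 vertices), while the triangle-density optimum is the 4-clique (4 triangles on 4 vertices), which matches the abstract's claim that the triangle objective can find near-cliques where the DSP does not.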
Machinery for Proving Sum-of-Squares Lower Bounds on Certification Problems
In this paper, we construct general machinery for proving Sum-of-Squares
lower bounds on certification problems by generalizing the techniques used by
Barak et al. [FOCS 2016] to prove Sum-of-Squares lower bounds for planted
clique. Using this machinery, we prove degree n^epsilon Sum-of-Squares
lower bounds for tensor PCA, the Wishart model of sparse PCA, and a variant of
planted clique which we call planted slightly denser subgraph.
Comment: 134 pages