Detecting High Log-Densities -- an O(n^1/4) Approximation for Densest k-Subgraph
In the Densest k-Subgraph problem, given a graph G and a parameter k, one
needs to find a subgraph of G induced on k vertices that contains the largest
number of edges. There is a significant gap between the best known upper and
lower bounds for this problem. It is NP-hard, and does not have a PTAS unless
NP has subexponential time algorithms. On the other hand, the current best
known algorithm of Feige, Kortsarz and Peleg, gives an approximation ratio of
n^(1/3-epsilon) for some specific epsilon > 0 (estimated at around 1/60).
We present an algorithm that for every epsilon > 0 approximates the Densest
k-Subgraph problem within a ratio of n^(1/4+epsilon) in time n^O(1/epsilon). In
particular, our algorithm achieves an approximation ratio of O(n^(1/4)) in time
n^O(log n). Our algorithm is inspired by studying an average-case version of
the problem where the goal is to distinguish random graphs from graphs with
planted dense subgraphs. The approximation ratio we achieve for the general
case matches the distinguishing ratio we obtain for this planted problem.
At a high level, our algorithms involve cleverly counting appropriately
defined trees of constant size in G, and using these counts to identify the
vertices of the dense subgraph. Our algorithm is based on the following
principle. We say that a graph G(V,E) has log-density alpha if its average
degree is Theta(|V|^alpha). The algorithmic core of our result is a family of
algorithms that output k-subgraphs of nontrivial density whenever the
log-density of the densest k-subgraph is larger than the log-density of the
host graph.
Comment: 23 pages
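The log-density definition above can be illustrated on a toy instance. The following is a minimal sketch, not the algorithm from the paper: it estimates log-density as log(average degree) / log(|V|), which ignores the constants hidden by the Theta notation, and it finds the densest k-subgraph by exhaustive search, which is only feasible for tiny graphs. The function names and the example graph (a 16-cycle with a planted 4-clique) are assumptions made for illustration.

```python
import math
from itertools import combinations


def log_density(num_vertices, edges):
    """Finite-size estimate of alpha, where average degree ~ num_vertices**alpha."""
    avg_degree = 2 * len(edges) / num_vertices
    return math.log(avg_degree) / math.log(num_vertices)


def count_induced_edges(edges, subset):
    """Number of edges with both endpoints inside `subset`."""
    chosen = set(subset)
    return sum(1 for u, v in edges if u in chosen and v in chosen)


def densest_k_subgraph_bruteforce(num_vertices, edges, k):
    """Exhaustive search over all k-subsets (exponential; illustration only)."""
    best = max(combinations(range(num_vertices), k),
               key=lambda s: count_induced_edges(edges, s))
    return best, count_induced_edges(edges, best)


# Hypothetical toy instance: a sparse 16-cycle with a planted clique on {0, 1, 2, 3}.
n = 16
cycle = {tuple(sorted((i, (i + 1) % n))) for i in range(n)}
clique = set(combinations(range(4), 2))
edges = sorted(cycle | clique)

best, best_edges = densest_k_subgraph_bruteforce(n, edges, 4)
host_alpha = log_density(n, edges)
block_alpha = log_density(4, [e for e in edges if set(e) <= set(best)])
print(best, best_edges, round(host_alpha, 3), round(block_alpha, 3))
```

On this instance the planted block has a higher estimated log-density than the host graph, which is the regime the abstract describes as the algorithmic core.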
Diversity Maximization in Doubling Metrics
Diversity maximization is an important geometric optimization problem with many applications in recommender systems, machine learning or search engines among others. A typical diversification problem is as follows: Given a finite metric space (X,d) and a parameter k in N, find a subset of k elements of X that has maximum diversity. There are many functions that measure diversity. One of the most popular measures, called remote-clique, is the sum of the pairwise distances of the chosen elements. In this paper, we present novel results on three widely used diversity measures: Remote-clique, remote-star and remote-bipartition.
Our main results are polynomial-time approximation schemes (PTASs) for these three diversification problems under the assumption that the metric space is doubling. This setting has been discussed in the recent literature; the existence of such a PTAS, however, was left open.
Our results also hold in the setting where the distances are raised to a fixed power q >= 1, giving rise to more variants of diversity functions, similar in spirit to the variations of clustering problems depending on the power applied to the pairwise distances. Finally, we provide a proof of NP-hardness for remote-clique with squared distances in doubling metric spaces.
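As a concrete reading of the three measures, here is a minimal sketch. Only remote-clique is defined in the abstract; the definitions used for remote-star (minimum over centers of the summed distances to the other chosen points) and remote-bipartition (minimum cut over balanced bipartitions of the chosen points) are the ones commonly used in the dispersion literature and are assumptions here. The parameter `q` is the fixed power applied to each distance. The brute-force selector is for illustration only and is not the PTAS from the paper.

```python
from itertools import combinations


def remote_clique(points, dist, q=1):
    """Sum of pairwise distances, each raised to the power q."""
    return sum(dist(a, b) ** q for a, b in combinations(points, 2))


def remote_star(points, dist, q=1):
    """Cheapest star: min over centers of summed distances to all other points."""
    n = len(points)
    return min(sum(dist(points[i], points[j]) ** q for j in range(n) if j != i)
               for i in range(n))


def remote_bipartition(points, dist, q=1):
    """Cheapest balanced cut: min over parts of size n // 2 of the crossing distances."""
    n = len(points)
    best = None
    for left in combinations(range(n), n // 2):
        left_set = set(left)
        cut = sum(dist(points[i], points[j]) ** q
                  for i in left for j in range(n) if j not in left_set)
        best = cut if best is None else min(best, cut)
    return best


def best_subset(points, k, measure, dist):
    """Exhaustive search for the most diverse k-subset (exponential in k)."""
    return max(combinations(points, k), key=lambda s: measure(list(s), dist))


line = [0, 1, 3, 6]                      # four points on the real line
d = lambda a, b: abs(a - b)
print(remote_clique(line, d), remote_star(line, d), remote_bipartition(line, d))
print(best_subset(line, 2, remote_clique, d))
```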
Approximating k-Forest with Resource Augmentation: A Primal-Dual Approach
In this paper, we study the k-forest problem in the model of resource
augmentation. In the k-forest problem, given an edge-weighted graph,
a parameter k, and a set of demand pairs, the
objective is to construct a minimum-cost subgraph that connects at least
k demands. The problem is hard to approximate; the best-known approximation
ratio is . Furthermore, k-forest is as hard to
approximate as the notoriously hard densest k-subgraph problem.
While the k-forest problem is hard to approximate in the worst case, we
show that with the use of resource augmentation, we can efficiently approximate
it up to a constant factor.
First, we restate the problem in terms of the number of demands that are not
connected. In particular, the objective of the k-forest problem can be
viewed as to remove at most demands and find a minimum-cost subgraph that
connects the remaining demands. We use this perspective of the problem to
explain the performance of our algorithm (in terms of the augmentation) in a
more intuitive way.
Specifically, we present a polynomial-time algorithm for the k-forest
problem that, for every , removes at most demands and has
cost no more than times the cost of an optimal algorithm
that removes at most demands
Weakly Submodular Functions
Submodular functions are well-studied in combinatorial optimization, game
theory and economics. The natural diminishing returns property makes them
suitable for many applications. We study an extension of monotone submodular
functions, which we call weakly submodular functions. Our extension
includes some (mildly) supermodular functions. We show that several natural
functions belong to this class and relate our class to some other recent
submodular function extensions.
We consider the optimization problem of maximizing a weakly submodular
function subject to uniform and general matroid constraints. For a uniform
matroid constraint, the "standard greedy algorithm" achieves a constant
approximation ratio where the constant (experimentally) converges to 5.95 as
the cardinality constraint increases. For a general matroid constraint, a
simple local search algorithm achieves a constant approximation ratio where the
constant (analytically) converges to 10.22 as the rank of the matroid
increases.
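The "standard greedy algorithm" for a cardinality (uniform matroid) constraint can be sketched as follows. This is a generic illustration rather than code from the paper: the objective `f` is passed in as a black box, and the example objective is a set-coverage function, which is monotone submodular and therefore only a special case of the class studied above. The names `greedy_max`, `coverage`, and the sample data are hypothetical.

```python
def greedy_max(ground_set, f, k):
    """Pick k elements, each time adding the one with the largest marginal gain."""
    chosen = []
    remaining = list(ground_set)
    for _ in range(min(k, len(remaining))):
        base = f(chosen)
        # Marginal gain of x given the current selection.
        best = max(remaining, key=lambda x: f(chosen + [x]) - base)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical example objective: number of items covered by the chosen sets.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(selection):
    return len(set().union(*(sets[name] for name in selection)))


picked = greedy_max(sets, coverage, 2)
print(picked, coverage(picked))
```

The oracle-style interface keeps the sketch independent of any particular objective, so a different set function can be swapped in without touching the selection loop.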
Max-sum diversity via convex programming
Diversity maximization is an important concept in information retrieval,
computational geometry and operations research. Usually, it is a variant of the
following problem: Given a ground set, constraints, and a function
that measures diversity of a subset, the task is to select a feasible subset
whose diversity is maximized. The sum-dispersion function, which is the sum
of the pairwise distances in the subset, is
in this context a prominent diversification measure. The corresponding
diversity maximization is the max-sum or sum-sum diversification.
Many recent results deal with the design of constant-factor approximation
algorithms for diversification problems involving the sum-dispersion function under
a matroid constraint. In this paper, we present a PTAS for the max-sum
diversification problem under a matroid constraint for distances
of negative type. Distances of negative type are, for
example, metric distances stemming from the and norm, as well
as the cosine or spherical, or Jaccard distance, which are popular similarity
metrics in web and image search.
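To make the objective concrete, the sketch below evaluates sum-dispersion under the Jaccard distance and selects the best feasible subset by exhaustive search. It is not the convex-programming PTAS from the paper. The matroid is assumed to be a partition matroid (at most `capacity[c]` items from each category `c`), and the tagged documents are hypothetical sample data.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets; 0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def sum_dispersion(subset, dist):
    """Sum of the pairwise distances within the subset."""
    return sum(dist(x, y) for x, y in combinations(subset, 2))


def is_independent(subset, category, capacity):
    """Partition-matroid check: respect the per-category capacity."""
    counts = {}
    for item in subset:
        c = category[item]
        counts[c] = counts.get(c, 0) + 1
        if counts[c] > capacity[c]:
            return False
    return True


def best_feasible_subset(items, size, dist, category, capacity):
    """Exhaustive search over feasible subsets of the given size."""
    feasible = (s for s in combinations(items, size)
                if is_independent(s, category, capacity))
    return max(feasible, key=lambda s: sum_dispersion(s, dist))


# Hypothetical documents described by tag sets.
docs = {"d1": {"graph", "approx"}, "d2": {"graph", "online"},
        "d3": {"metric", "ptas"}, "d4": {"metric", "graph"}}
category = {"d1": "algo", "d2": "algo", "d3": "geom", "d4": "geom"}
capacity = {"algo": 1, "geom": 1}
dist = lambda x, y: jaccard_distance(docs[x], docs[y])

best = best_feasible_subset(list(docs), 2, dist, category, capacity)
print(best, sum_dispersion(best, dist))
```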
Efficient Approximations for the Online Dispersion Problem
The dispersion problem has been widely studied in computational geometry and facility location, and is closely related to the packing problem. The goal is to locate n points (e.g., facilities or persons) in a k-dimensional polytope, so that they are far away from each other and from the boundary of the polytope. In many real-world scenarios however, the points arrive and depart at different times, and decisions must be made without knowing future events. Therefore we study, for the first time in the literature, the online dispersion problem in Euclidean space.
There are two natural objectives when time is involved: the all-time worst-case (ATWC) problem tries to maximize the minimum distance that ever appears at any time; and the cumulative distance (CD) problem tries to maximize the integral of the minimum distance throughout the whole time interval. Interestingly, the online problems are highly non-trivial even on a segment. For cumulative distance, this remains the case even when the problem is time-dependent but offline, with all the arriving and departure times given in advance.
For the online ATWC problem on a segment, we construct a deterministic polynomial-time algorithm which is (2ln2+epsilon)-competitive, where epsilon>0 can be arbitrarily small and the algorithm's running time is polynomial in 1/epsilon. We show this algorithm is actually optimal. For the same problem in a square, we provide a 1.591-competitive algorithm and a 1.183 lower bound. Furthermore, for arbitrary k-dimensional polytopes with k>=2, we provide a 2/(1-epsilon)-competitive algorithm and a 7/6 lower bound. All our lower bounds come from the structure of the online problems and hold even when computational complexity is not a concern. Interestingly, for the offline CD problem in arbitrary k-dimensional polytopes, we provide a polynomial-time black-box reduction to the online ATWC problem, and the resulting competitive ratio increases by a factor of at most 2. Our techniques also apply to online dispersion problems with different boundary conditions.
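The two objectives can be evaluated for a fixed placement on a segment with a short script. This is a minimal sketch for scoring a given schedule, not one of the algorithms from the paper. It assumes the segment [0, 1], counts the distance to the nearer endpoint as part of the minimum distance (one possible boundary condition), and skips time intervals in which no point is present. The schedule format `(position, arrival, departure)` and the function names are hypothetical.

```python
from itertools import combinations


def min_distance(positions, lo=0.0, hi=1.0):
    """Smallest gap between two points, or between a point and the boundary."""
    gaps = [abs(a - b) for a, b in combinations(positions, 2)]
    gaps += [min(p - lo, hi - p) for p in positions]
    return min(gaps)


def objectives(schedule, lo=0.0, hi=1.0):
    """Return (ATWC value, CD value) for a list of (position, arrival, departure)."""
    events = sorted({t for _, a, d in schedule for t in (a, d)})
    atwc = float("inf")
    cd = 0.0
    # The set of present points is constant between consecutive events.
    for start, end in zip(events, events[1:]):
        present = [p for p, a, d in schedule if a <= start and end <= d]
        if not present:
            continue  # assumption: empty intervals contribute nothing
        gap = min_distance(present, lo, hi)
        atwc = min(atwc, gap)          # all-time worst case
        cd += gap * (end - start)      # integral of the minimum distance
    return atwc, cd


schedule = [(0.5, 0, 4), (0.25, 1, 3), (0.75, 2, 4)]
print(objectives(schedule))
```

In this schedule the first point sits alone at the midpoint for one time unit, after which the minimum distance drops to 0.25 for the remaining three units, so the two objectives differ in how much weight they give to that early interval.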