22 research outputs found
Set covering with our eyes closed
Given a universe of elements and a weighted collection of subsets of , the universal set cover problem is to a priori map each element to a set containing such that any set is covered by S(X)=\cup_{u\in XS(u). The aim is to find a mapping such that the cost of is as close as possible to the optimal set cover cost for . (Such problems are also called oblivious or a priori optimization problems.) Unfortunately, for every universal mapping, the cost of can be times larger than optimal if the set is adversarially chosen. In this paper we study the performance on average, when is a set of randomly chosen elements from the universe: we show how to efficiently find a universal map whose expected cost is times the expected optimal cost. In fact, we give a slightly improved analysis and show that this is the best possible. We generalize these ideas to weighted set cover and show similar guarantees to (nonmetric) facility location, where we have to balance the facility opening cost with the cost of connecting clients to the facilities. We show applications of our results to universal multicut and disc-covering problems and show how all these universal mappings give us algorithms for the stochastic online variants of the problems with the same competitive factors
Designing cost-sharing methods for Bayesian games
We study the design of cost-sharing protocols for two fundamental resource allocation problems, the Set Cover and the Steiner Tree Problem, under environments of incomplete information (Bayesian model). Our objective is to design protocols where the worst-case Bayesian Nash equilibria, have low cost, i.e. the Bayesian Price of Anarchy (PoA) is minimized. Although budget balance is a very natural requirement, it puts considerable restrictions on the design space, resulting in high PoA. We propose an alternative, relaxed requirement called budget balance in the equilibrium (BBiE).We show an interesting connection between algorithms for Oblivious Stochastic optimization problems and cost-sharing design with low PoA. We exploit this connection for both problems and we enforce approximate solutions of the stochastic problem, as Bayesian Nash equilibria, with the same guarantees on the PoA. More interestingly, we show how to obtain the same bounds on the PoA, by using anonymous posted prices which are desirable because they are easy to implement and, as we show, induce dominant strategies for the players
Robust hierarchical k-center clustering
One of the most popular and widely used methods for data clustering is hierarchical clustering. This clustering technique has proved useful to reveal interesting structure in the data in several applications ranging from computational biology to computer vision. Robustness is an important feature of a clustering technique if we require the clustering to be stable against small perturbations in the input data. In most applications, getting a clustering output that is robust against adversarial outliers or stochastic noise is a necessary condition for the applicability and effectiveness of the clustering technique. This is even more critical in hierarchical clustering where a small change at the bottom of the hierarchy may propagate all the way through to the top. Despite all the previous work [2, 3, 6, 8], our theoretical understanding of robust hierarchical clustering is still limited and several hierarchical clustering algorithms are not known to satisfy such robustness properties. In this paper, we study the limits of robust hierarchical k-center clustering by introducing the concept of universal hierarchical clustering and provide (almost) tight lower and upper bounds for the robust hierarchical k-center clustering problem with outliers and variants of the stochastic clustering problem. Most importantly we present a constant-factor approximation for optimal hierarchical k-center with at most z outliers using a universal set of at most O(z2) set of outliers and show that this result is tight. Moreover we show the necessity of using a universal set of outliers in order to compute an approximately optimal hierarchical k-center with a diffierent set of outliers for each k
When the Optimum is also Blind: a New Perspective on Universal Optimization
Consider the following variant of the set cover problem. We are given a universe U={1,...,n} and a collection of subsets C = {S_1,...,S_m} where each S_i is a subset of U. For every element u from U we need to find a set phi(u) from collection C such that u belongs to phi(u). Once we construct and fix the mapping phi from U to C a subset X from the universe U is revealed, and we need to cover all elements from X with exactly phi(X), that is {phi(u)}_{all u from X}. The goal is to find a mapping such that the cover phi(X) is as cheap as possible.
This is an example of a universal problem where the solution has to be created before the actual instance to deal with is revealed. Such problems appear naturally in some settings when we need to optimize under uncertainty and it may be actually too expensive to begin finding a good solution once the input starts being revealed. A rich body of work was devoted to investigate such problems under the regime of worst case analysis, i.e., when we measure how good the solution is by looking at the worst-case ratio: universal solution for a given instance vs optimum solution for the same instance.
As the universal solution is significantly more constrained, it is typical that such a worst-case ratio is actually quite big. One way to give a viewpoint on the problem that would be less vulnerable to such extreme worst-cases is to assume that the instance, for which we will have to create a solution, will be drawn randomly from some probability distribution. In this case one wants to minimize the expected value of the ratio: universal solution vs optimum solution. Here the bounds obtained are indeed smaller than when we compare to the worst-case ratio.
But even in this case we still compare apples to oranges as no universal solution is able to construct the optimum solution for every possible instance. What if we would compare our approximate universal solution against an optimal universal solution that obeys the same rules as we do? We show that under this viewpoint, but still in the stochastic variant, we can indeed obtain better bounds than in the expected ratio model. For example, for the set cover problem we obtain approximation which matches the approximation ratio from the classic deterministic setup. Moreover, we show this for all possible probability distributions over that have a polynomially large carrier, while all previous results pertained to a model in which elements were sampled independently. Our result is based on rounding a proper configuration IP that captures the optimal universal solution, and using tools from submodular optimization.
The same basic approach leads to improved approximation algorithms for other related problems, including Vertex Cover, Edge Cover, Directed Steiner Tree, Multicut, and Facility Location
Semi-Streaming Set Cover
This paper studies the set cover problem under the semi-streaming model. The
underlying set system is formalized in terms of a hypergraph whose
edges arrive one-by-one and the goal is to construct an edge cover with the objective of minimizing the cardinality (or cost in the weighted
case) of . We consider a parameterized relaxation of this problem, where
given some , the goal is to construct an edge -cover, namely, a subset of edges incident to all but an
-fraction of the vertices (or their benefit in the weighted case).
The key limitation imposed on the algorithm is that its space is limited to
(poly)logarithmically many bits per vertex.
Our main result is an asymptotically tight trade-off between and
the approximation ratio: We design a semi-streaming algorithm that on input
graph , constructs a succinct data structure such that for
every , an edge -cover that approximates
the optimal edge \mbox{(-)cover} within a factor of can be
extracted from (efficiently and with no additional space
requirements), where In particular for the traditional
set cover problem we obtain an -approximation. This algorithm is
proved to be best possible by establishing a family (parameterized by
) of matching lower bounds.Comment: Full version of the extended abstract that will appear in Proceedings
of ICALP 2014 track
The Query-commit Problem
In the query-commit problem we are given a graph where edges have distinct
probabilities of existing. It is possible to query the edges of the graph, and
if the queried edge exists then its endpoints are irrevocably matched. The goal
is to find a querying strategy which maximizes the expected size of the
matching obtained. This stochastic matching setup is motivated by applications
in kidney exchanges and online dating.
In this paper we address the query-commit problem from both theoretical and
experimental perspectives. First, we show that a simple class of edges can be
queried without compromising the optimality of the strategy. This property is
then used to obtain in polynomial time an optimal querying strategy when the
input graph is sparse. Next we turn our attentions to the kidney exchange
application, focusing on instances modeled over real data from existing
exchange programs. We prove that, as the number of nodes grows, almost every
instance admits a strategy which matches almost all nodes. This result supports
the intuition that more exchanges are possible on a larger pool of
patient/donors and gives theoretical justification for unifying the existing
exchange programs. Finally, we evaluate experimentally different querying
strategies over kidney exchange instances. We show that even very simple
heuristics perform fairly well, being within 1.5% of an optimal clairvoyant
strategy, that knows in advance the edges in the graph. In such a
time-sensitive application, this result motivates the use of committing
strategies
Algorithmic Complexity of Power Law Networks
It was experimentally observed that the majority of real-world networks
follow power law degree distribution. The aim of this paper is to study the
algorithmic complexity of such "typical" networks. The contribution of this
work is twofold.
First, we define a deterministic condition for checking whether a graph has a
power law degree distribution and experimentally validate it on real-world
networks. This definition allows us to derive interesting properties of power
law networks. We observe that for exponents of the degree distribution in the
range such networks exhibit double power law phenomenon that was
observed for several real-world networks. Our observation indicates that this
phenomenon could be explained by just pure graph theoretical properties.
The second aim of our work is to give a novel theoretical explanation why
many algorithms run faster on real-world data than what is predicted by
algorithmic worst-case analysis. We show how to exploit the power law degree
distribution to design faster algorithms for a number of classical P-time
problems including transitive closure, maximum matching, determinant, PageRank
and matrix inverse. Moreover, we deal with the problems of counting triangles
and finding maximum clique. Previously, it has been only shown that these
problems can be solved very efficiently on power law graphs when these graphs
are random, e.g., drawn at random from some distribution. However, it is
unclear how to relate such a theoretical analysis to real-world graphs, which
are fixed. Instead of that, we show that the randomness assumption can be
replaced with a simple condition on the degrees of adjacent vertices, which can
be used to obtain similar results. As a result, in some range of power law
exponents, we are able to solve the maximum clique problem in polynomial time,
although in general power law networks the problem is NP-complete
Greedy Algorithms for Online Survivable Network Design
In an instance of the network design problem, we are given a graph G=(V,E), an edge-cost function c:E -> R^{>= 0}, and a connectivity criterion. The goal is to find a minimum-cost subgraph H of G that meets the connectivity requirements. An important family of this class is the survivable network design problem (SNDP): given non-negative integers r_{u v} for each pair u,v in V, the solution subgraph H should contain r_{u v} edge-disjoint paths for each pair u and v.
While this problem is known to admit good approximation algorithms in the offline case, the problem is much harder in the online setting. Gupta, Krishnaswamy, and Ravi [Gupta et al., 2012] (STOC\u2709) are the first to consider the online survivable network design problem. They demonstrate an algorithm with competitive ratio of O(k log^3 n), where k=max_{u,v} r_{u v}. Note that the competitive ratio of the algorithm by Gupta et al. grows linearly in k. Since then, an important open problem in the online community [Naor et al., 2011; Gupta et al., 2012] is whether the linear dependence on k can be reduced to a logarithmic dependency.
Consider an online greedy algorithm that connects every demand by adding a minimum cost set of edges to H. Surprisingly, we show that this greedy algorithm significantly improves the competitive ratio when a congestion of 2 is allowed on the edges or when the model is stochastic. While our algorithm is fairly simple, our analysis requires a deep understanding of k-connected graphs. In particular, we prove that the greedy algorithm is O(log^2 n log k)-competitive if one satisfies every demand between u and v by r_{uv}/2 edge-disjoint paths. The spirit of our result is similar to the work of Chuzhoy and Li [Chuzhoy and Li, 2012] (FOCS\u2712), in which the authors give a polylogarithmic approximation algorithm for edge-disjoint paths with congestion 2.
Moreover, we study the greedy algorithm in the online stochastic setting. We consider the i.i.d. model, where each online demand is drawn from a single probability distribution, the unknown i.i.d. model, where every demand is drawn from a single but unknown probability distribution, and the prophet model in which online demands are drawn from (possibly) different probability distributions. Through a different analysis, we prove that a similar greedy algorithm is constant competitive for the i.i.d. and the prophet models. Also, the greedy algorithm is O(log n)-competitive for the unknown i.i.d. model, which is almost tight due to the lower bound of [Garg et al., 2008] for single connectivity