
    Set covering with our eyes closed

    Given a universe U of n elements and a weighted collection \mathscr{S} of m subsets of U, the universal set cover problem is to a priori map each element u \in U to a set S(u) \in \mathscr{S} containing u, such that any set X \subseteq U is covered by S(X) = \bigcup_{u \in X} S(u). The aim is to find a mapping such that the cost of S(X) is as close as possible to the optimal set cover cost for X. (Such problems are also called oblivious or a priori optimization problems.) Unfortunately, for every universal mapping, the cost of S(X) can be \Omega(\sqrt{n}) times larger than optimal if the set X is adversarially chosen. In this paper we study the performance on average, when X is a set of randomly chosen elements from the universe: we show how to efficiently find a universal map whose expected cost is O(\log mn) times the expected optimal cost. In fact, we give a slightly improved analysis and show that this is the best possible. We generalize these ideas to weighted set cover and show similar guarantees for (nonmetric) facility location, where we have to balance the facility opening cost with the cost of connecting clients to the facilities. We show applications of our results to universal multicut and disc-covering problems, and show how all these universal mappings give us algorithms for the stochastic online variants of the problems with the same competitive factors.
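
    To make the notion of a universal mapping concrete, here is a small hedged sketch (not the paper's construction): it builds a mapping phi by running a greedy set cover over the whole universe and assigning each element to the set that first covered it, then prices the a-priori cover S(X) for a random query X. The names universal_mapping and cover_cost, and the toy instance, are illustrative only.

```python
import random

def universal_mapping(universe, sets, cost):
    """Greedy sketch of a universal map phi: element -> index of a covering set.
    `sets` is a list of frozensets over `universe`; `cost[i]` is the weight of sets[i].
    (Illustrative only; the paper's mapping and analysis are more refined.)"""
    uncovered = set(universe)
    phi = {}
    while uncovered:
        # pick the set with best cost per newly covered element
        i = min(range(len(sets)),
                key=lambda j: cost[j] / len(sets[j] & uncovered)
                if sets[j] & uncovered else float("inf"))
        for u in sets[i] & uncovered:
            phi[u] = i          # u is mapped, once and for all, to set i
        uncovered -= sets[i]
    return phi

def cover_cost(phi, X, cost):
    """Cost of the a-priori cover S(X) = union of phi(u) over u in X."""
    return sum(cost[i] for i in {phi[u] for u in X})

# toy usage: a random subset X of the universe is revealed after phi is fixed
universe = range(6)
sets = [frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({3, 4, 5}), frozenset({0, 5})]
cost = [3.0, 1.0, 2.0, 1.5]
phi = universal_mapping(universe, sets, cost)
X = {u for u in universe if random.random() < 0.5}
print(sorted(X), cover_cost(phi, X, cost))
```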

    Designing cost-sharing methods for Bayesian games

    We study the design of cost-sharing protocols for two fundamental resource allocation problems, the Set Cover and the Steiner Tree Problem, under environments of incomplete information (Bayesian model). Our objective is to design protocols where the worst-case Bayesian Nash equilibria have low cost, i.e. the Bayesian Price of Anarchy (PoA) is minimized. Although budget balance is a very natural requirement, it puts considerable restrictions on the design space, resulting in high PoA. We propose an alternative, relaxed requirement called budget balance in the equilibrium (BBiE). We show an interesting connection between algorithms for oblivious stochastic optimization problems and cost-sharing design with low PoA. We exploit this connection for both problems and enforce approximate solutions of the stochastic problem as Bayesian Nash equilibria, with the same guarantees on the PoA. More interestingly, we show how to obtain the same bounds on the PoA by using anonymous posted prices, which are desirable because they are easy to implement and, as we show, induce dominant strategies for the players.

    Robust hierarchical k-center clustering

    One of the most popular and widely used methods for data clustering is hierarchical clustering. This clustering technique has proved useful for revealing interesting structure in the data in several applications ranging from computational biology to computer vision. Robustness is an important feature of a clustering technique if we require the clustering to be stable against small perturbations in the input data. In most applications, getting a clustering output that is robust against adversarial outliers or stochastic noise is a necessary condition for the applicability and effectiveness of the clustering technique. This is even more critical in hierarchical clustering, where a small change at the bottom of the hierarchy may propagate all the way through to the top. Despite all the previous work [2, 3, 6, 8], our theoretical understanding of robust hierarchical clustering is still limited, and several hierarchical clustering algorithms are not known to satisfy such robustness properties. In this paper, we study the limits of robust hierarchical k-center clustering by introducing the concept of universal hierarchical clustering and provide (almost) tight lower and upper bounds for the robust hierarchical k-center clustering problem with outliers and for variants of the stochastic clustering problem. Most importantly, we present a constant-factor approximation for optimal hierarchical k-center with at most z outliers using a universal set of at most O(z^2) outliers, and show that this result is tight. Moreover, we show the necessity of using a universal set of outliers in order to compute an approximately optimal hierarchical k-center with a different set of outliers for each k.
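
    For background on what a nested family of k-center solutions looks like, the sketch below uses Gonzalez's farthest-first traversal, whose prefixes give a 2-approximate k-center solution for every k; it omits outliers entirely, which are the actual subject of the paper, and the function name farthest_first and the toy points are illustrative.

```python
import math

def farthest_first(points, dist):
    """Gonzalez's farthest-first traversal: returns the points ordered so that,
    for every k, the first k points are a 2-approximate k-center solution.
    Because the centers are prefixes of one ordering, they are nested and
    therefore form a hierarchy."""
    order = [points[0]]
    d = {p: dist(p, points[0]) for p in points}   # distance to nearest chosen center
    while len(order) < len(points):
        nxt = max(points, key=lambda p: d[p])     # farthest point from current centers
        order.append(nxt)
        for p in points:
            d[p] = min(d[p], dist(p, nxt))
    return order

points = [(0, 0), (1, 0), (0, 1), (10, 10), (10, 11), (-5, 4)]
euclid = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
centers = farthest_first(points, euclid)
for k in range(1, 4):
    print(f"k={k}: centers {centers[:k]}")
```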

    When the Optimum is also Blind: a New Perspective on Universal Optimization

    Consider the following variant of the set cover problem. We are given a universe U = {1,...,n} and a collection of subsets C = {S_1,...,S_m}, where each S_i is a subset of U. For every element u from U we need to find a set phi(u) from the collection C such that u belongs to phi(u). Once we construct and fix the mapping phi from U to C, a subset X of the universe U is revealed, and we need to cover all elements of X with exactly phi(X), that is, {phi(u)}_{u in X}. The goal is to find a mapping such that the cover phi(X) is as cheap as possible. This is an example of a universal problem, where the solution has to be created before the actual instance to deal with is revealed. Such problems appear naturally in settings where we need to optimize under uncertainty and it may actually be too expensive to start constructing a good solution only once the input is revealed. A rich body of work has been devoted to investigating such problems under the regime of worst-case analysis, i.e., when we measure how good the solution is by looking at the worst-case ratio: universal solution for a given instance vs. optimum solution for the same instance. As the universal solution is significantly more constrained, it is typical that such a worst-case ratio is actually quite big. One way to give a viewpoint on the problem that is less vulnerable to such extreme worst cases is to assume that the instance for which we will have to create a solution is drawn randomly from some probability distribution. In this case one wants to minimize the expected value of the ratio: universal solution vs. optimum solution. Here the bounds obtained are indeed smaller than in the worst-case ratio. But even in this case we still compare apples to oranges, as no universal solution is able to construct the optimum solution for every possible instance. What if we compared our approximate universal solution against an optimal universal solution that obeys the same rules as we do? We show that under this viewpoint, but still in the stochastic variant, we can indeed obtain better bounds than in the expected-ratio model. For example, for the set cover problem we obtain an H_n approximation, which matches the approximation ratio from the classic deterministic setting. Moreover, we show this for all possible probability distributions over U that have a polynomially large carrier, while all previous results pertained to a model in which elements were sampled independently. Our result is based on rounding a proper configuration IP that captures the optimal universal solution, and on using tools from submodular optimization. The same basic approach leads to improved approximation algorithms for other related problems, including Vertex Cover, Edge Cover, Directed Steiner Tree, Multicut, and Facility Location.
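
    As a point of reference for the benchmark discussed above, the toy sketch below brute-forces the optimal universal mapping for a distribution given explicitly as a small list of scenarios; it only illustrates what "optimal universal solution" means and is in no way the paper's configuration-IP-based algorithm. All instance data and function names here are made up.

```python
from itertools import product

def expected_cost(phi, scenarios, cost):
    """Expected cost of the cover induced by mapping phi over a distribution
    given explicitly as (probability, subset-of-universe) scenarios."""
    return sum(p * sum(cost[i] for i in {phi[u] for u in X}) for p, X in scenarios)

def optimal_universal_mapping(universe, sets, cost, scenarios):
    """Brute force over all mappings u -> index of a set containing u.
    Only feasible for toy instances, but it is exactly the benchmark that
    the approximate universal solution is compared against."""
    choices = {u: [i for i, S in enumerate(sets) if u in S] for u in universe}
    elems = list(universe)
    best, best_phi = float("inf"), None
    for combo in product(*(choices[u] for u in elems)):
        phi = dict(zip(elems, combo))
        c = expected_cost(phi, scenarios, cost)
        if c < best:
            best, best_phi = c, phi
    return best_phi, best

universe = [1, 2, 3, 4]
sets = [frozenset({1, 2}), frozenset({2, 3, 4}), frozenset({1, 4})]
cost = [1.0, 2.5, 1.2]
# a distribution with a small explicit carrier of scenarios
scenarios = [(0.5, {1, 2}), (0.3, {3}), (0.2, {1, 3, 4})]
phi, val = optimal_universal_mapping(universe, sets, cost, scenarios)
print(phi, val)
```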

    Semi-Streaming Set Cover

    This paper studies the set cover problem under the semi-streaming model. The underlying set system is formalized in terms of a hypergraph G = (V, E) whose edges arrive one by one, and the goal is to construct an edge cover F \subseteq E with the objective of minimizing the cardinality (or cost, in the weighted case) of F. We consider a parameterized relaxation of this problem, where given some 0 \leq \epsilon < 1, the goal is to construct an edge (1 - \epsilon)-cover, namely, a subset of edges incident to all but an \epsilon-fraction of the vertices (or of their benefit, in the weighted case). The key limitation imposed on the algorithm is that its space is limited to (poly)logarithmically many bits per vertex. Our main result is an asymptotically tight trade-off between \epsilon and the approximation ratio: we design a semi-streaming algorithm that, on input graph G, constructs a succinct data structure \mathcal{D} such that for every 0 \leq \epsilon < 1, an edge (1 - \epsilon)-cover that approximates the optimal edge (1 - \epsilon)-cover within a factor of f(\epsilon, n) can be extracted from \mathcal{D} (efficiently and with no additional space requirements), where f(\epsilon, n) = O(1/\epsilon) if \epsilon > 1/\sqrt{n}, and f(\epsilon, n) = O(\sqrt{n}) otherwise. In particular, for the traditional set cover problem we obtain an O(\sqrt{n})-approximation. This algorithm is proved to be best possible by establishing a family (parameterized by \epsilon) of matching lower bounds. This is the full version of the extended abstract that will appear in the Proceedings of ICALP 2014.
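
    As a baseline illustration of the semi-streaming regime (and not the paper's algorithm or its trade-off), the sketch below makes a single pass over the edge stream, keeps an arriving edge only if it covers at least a threshold number of still-uncovered vertices, and reports the fraction of vertices left uncovered. The function one_pass_partial_cover and its parameters are hypothetical.

```python
def one_pass_partial_cover(n, edge_stream, threshold=1):
    """One-pass heuristic in the spirit of semi-streaming set cover:
    keep (the id of) an arriving edge only if it covers at least `threshold`
    vertices that are still uncovered. Stores O(1) words per vertex: a covered
    flag plus the ids of kept edges. Illustrative baseline only; it does not
    give the paper's epsilon-vs-approximation guarantee."""
    covered = [False] * n
    kept = []
    for edge_id, vertices in edge_stream:
        new = [v for v in vertices if not covered[v]]
        if len(new) >= threshold:
            kept.append(edge_id)
            for v in new:
                covered[v] = True
    eps = 1 - sum(covered) / n      # fraction of vertices left uncovered
    return kept, eps

stream = [(0, [0, 1, 2]), (1, [2, 3]), (2, [3]), (3, [4, 5, 6, 7]), (4, [7, 8])]
print(one_pass_partial_cover(9, iter(stream), threshold=2))
```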

    The Query-commit Problem

    In the query-commit problem we are given a graph whose edges have distinct probabilities of existing. It is possible to query the edges of the graph, and if a queried edge exists then its endpoints are irrevocably matched. The goal is to find a querying strategy that maximizes the expected size of the matching obtained. This stochastic matching setup is motivated by applications in kidney exchanges and online dating. In this paper we address the query-commit problem from both theoretical and experimental perspectives. First, we show that a simple class of edges can be queried without compromising the optimality of the strategy. This property is then used to obtain, in polynomial time, an optimal querying strategy when the input graph is sparse. Next we turn our attention to the kidney exchange application, focusing on instances modeled over real data from existing exchange programs. We prove that, as the number of nodes grows, almost every instance admits a strategy that matches almost all nodes. This result supports the intuition that more exchanges are possible in a larger pool of patients/donors and gives theoretical justification for unifying the existing exchange programs. Finally, we experimentally evaluate different querying strategies over kidney exchange instances. We show that even very simple heuristics perform fairly well, being within 1.5% of an optimal clairvoyant strategy that knows in advance which edges are present in the graph. In such a time-sensitive application, this result motivates the use of committing strategies.
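
    A simple committing heuristic of the kind one might evaluate in such experiments (not necessarily one of the strategies used in the paper) is sketched below: query edges in decreasing order of existence probability, and whenever a queried edge turns out to exist, match its endpoints irrevocably. The names and instance are illustrative.

```python
import random

def greedy_query_commit(edges, prob):
    """Query edges in decreasing probability; if a queried edge exists, its
    endpoints are matched irrevocably. Simulates one run of this committing
    heuristic; the existence of an edge is realized at query time."""
    matched = set()
    matching = []
    for u, v in sorted(edges, key=lambda e: -prob[e]):
        if u in matched or v in matched:
            continue                          # an endpoint is already committed
        if random.random() < prob[(u, v)]:    # queried edge turns out to exist
            matching.append((u, v))
            matched.update((u, v))
    return matching

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
prob = {("a", "b"): 0.9, ("b", "c"): 0.4, ("c", "d"): 0.7, ("a", "d"): 0.2}
print(greedy_query_commit(edges, prob))
```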

    Algorithmic Complexity of Power Law Networks

    It has been experimentally observed that the majority of real-world networks follow a power law degree distribution. The aim of this paper is to study the algorithmic complexity of such "typical" networks. The contribution of this work is twofold. First, we define a deterministic condition for checking whether a graph has a power law degree distribution and experimentally validate it on real-world networks. This definition allows us to derive interesting properties of power law networks. We observe that for exponents of the degree distribution in the range [1,2], such networks exhibit a double power law phenomenon that has been observed for several real-world networks. Our observation indicates that this phenomenon could be explained by purely graph-theoretic properties. The second aim of our work is to give a novel theoretical explanation of why many algorithms run faster on real-world data than predicted by algorithmic worst-case analysis. We show how to exploit the power law degree distribution to design faster algorithms for a number of classical polynomial-time problems, including transitive closure, maximum matching, determinant, PageRank, and matrix inverse. Moreover, we deal with the problems of counting triangles and finding a maximum clique. Previously, it had only been shown that these problems can be solved very efficiently on power law graphs when these graphs are random, e.g., drawn at random from some distribution. However, it is unclear how to relate such a theoretical analysis to real-world graphs, which are fixed. Instead, we show that the randomness assumption can be replaced with a simple condition on the degrees of adjacent vertices, which can be used to obtain similar results. As a result, in some range of power law exponents, we are able to solve the maximum clique problem in polynomial time, although in general power law networks the problem is NP-complete.
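
    One standard way to exploit a skewed degree sequence for triangle counting (one of the problems mentioned above) is to orient every edge from its lower-degree to its higher-degree endpoint and intersect out-neighborhoods; on power-law-like graphs the resulting out-degrees stay small. The sketch below shows this generic technique and is not claimed to be the paper's algorithm.

```python
from collections import defaultdict

def count_triangles(edges):
    """Count triangles by orienting every edge from its lower-degree endpoint
    to its higher-degree endpoint and intersecting out-neighborhoods. On graphs
    with skewed (e.g., power-law-like) degree sequences the out-degrees stay
    small, which keeps the set intersections cheap."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    rank = lambda x: (deg[x], x)          # break degree ties by vertex id
    out = {x: set() for x in deg}
    for u, v in edges:
        a, b = (u, v) if rank(u) < rank(v) else (v, u)
        out[a].add(b)
    return sum(len(out[u] & out[v]) for u in out for v in out[u])

edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 5)]
print(count_triangles(edges))   # triangles {1,2,3} and {2,3,4} -> 2
```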

    Greedy Algorithms for Online Survivable Network Design

    In an instance of the network design problem, we are given a graph G=(V,E), an edge-cost function c: E -> R^{>= 0}, and a connectivity criterion. The goal is to find a minimum-cost subgraph H of G that meets the connectivity requirements. An important member of this class is the survivable network design problem (SNDP): given non-negative integers r_{uv} for each pair u,v in V, the solution subgraph H should contain r_{uv} edge-disjoint paths for each pair u and v. While this problem is known to admit good approximation algorithms in the offline case, it is much harder in the online setting. Gupta, Krishnaswamy, and Ravi [Gupta et al., 2012] (STOC'09) were the first to consider the online survivable network design problem. They give an algorithm with competitive ratio O(k log^3 n), where k = max_{u,v} r_{uv}. Note that the competitive ratio of the algorithm by Gupta et al. grows linearly in k. Since then, an important open problem in the online community [Naor et al., 2011; Gupta et al., 2012] has been whether the linear dependence on k can be reduced to a logarithmic dependence. Consider an online greedy algorithm that connects every demand by adding a minimum-cost set of edges to H. Surprisingly, we show that this greedy algorithm significantly improves the competitive ratio when a congestion of 2 is allowed on the edges or when the model is stochastic. While our algorithm is fairly simple, our analysis requires a deep understanding of k-connected graphs. In particular, we prove that the greedy algorithm is O(log^2 n log k)-competitive if one satisfies every demand between u and v by r_{uv}/2 edge-disjoint paths. The spirit of our result is similar to the work of Chuzhoy and Li [Chuzhoy and Li, 2012] (FOCS'12), in which the authors give a polylogarithmic approximation algorithm for edge-disjoint paths with congestion 2. Moreover, we study the greedy algorithm in the online stochastic setting. We consider the i.i.d. model, where each online demand is drawn from a single probability distribution; the unknown i.i.d. model, where every demand is drawn from a single but unknown probability distribution; and the prophet model, in which online demands are drawn from (possibly) different probability distributions. Through a different analysis, we prove that a similar greedy algorithm is constant-competitive for the i.i.d. and the prophet models. Also, the greedy algorithm is O(log n)-competitive for the unknown i.i.d. model, which is almost tight due to the lower bound of [Garg et al., 2008] for single connectivity.
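
    A stripped-down sketch of the online greedy described above, specialized to the single-connectivity case r_{uv} = 1: edges that have already been purchased become free, and each arriving demand is routed along a cheapest path under those reduced costs. The paper's actual setting (higher connectivity, congestion 2, stochastic arrivals) is not captured here, and all names in the code are illustrative.

```python
import heapq

def online_greedy_network_design(n, edges, demands):
    """Online greedy for the r_uv = 1 case: for each arriving demand (u, v),
    buy a cheapest u-v path in which already-purchased edges cost nothing.
    Simplified sketch only; the analyzed algorithm handles higher connectivity
    requirements with congestion 2."""
    adj = [[] for _ in range(n)]
    for i, (u, v, c) in enumerate(edges):
        adj[u].append((v, i, c))
        adj[v].append((u, i, c))
    bought, total = set(), 0.0
    for u, v in demands:                      # demands arrive online
        dist = [float("inf")] * n
        parent = [None] * n                   # (previous vertex, edge index)
        dist[u] = 0.0
        pq = [(0.0, u)]
        while pq:                             # Dijkstra with bought edges free
            d, x = heapq.heappop(pq)
            if d > dist[x]:
                continue
            for y, i, c in adj[x]:
                w = 0.0 if i in bought else c
                if d + w < dist[y]:
                    dist[y] = d + w
                    parent[y] = (x, i)
                    heapq.heappush(pq, (dist[y], y))
        x = v
        while x != u:                         # buy the new edges on the path
            px, i = parent[x]
            if i not in bought:
                bought.add(i)
                total += edges[i][2]
            x = px
        print(f"demand ({u},{v}): cumulative cost {total}")
    return bought, total

edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.5), (2, 3, 1.0), (1, 3, 3.0)]
online_greedy_network_design(4, edges, demands=[(0, 2), (0, 3), (1, 3)])
```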