    Online Disjoint Set Cover Without Prior Knowledge

    The disjoint set cover (DSC) problem is a fundamental combinatorial optimization problem concerned with partitioning the (hyper)edges of a hypergraph into (pairwise disjoint) clusters so that the number of clusters that cover all nodes is maximized. In its online version, the edges arrive one by one and must be assigned to clusters irrevocably, without knowledge of future edges. This paper investigates the competitiveness of online DSC algorithms. Specifically, we develop the first (randomized) online DSC algorithm that guarantees a poly-logarithmic, O(log^2 n), competitive ratio without prior knowledge of the hypergraph's minimum degree. On the negative side, we prove that the competitive ratio of any randomized online DSC algorithm must be at least Omega((log n)/(log log n)), even if the online algorithm knows the minimum degree in advance, thus establishing the first lower bound on the competitive ratio of randomized online DSC algorithms.
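
    The online setting described above can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assigns each arriving hyperedge to a uniformly random cluster, a naive baseline chosen only to show the irrevocable assignment step and the objective (the number of clusters that cover all nodes). The names `assign_online`, `count_covering_clusters`, `num_clusters`, and `seed` are hypothetical.

```python
import random
from typing import FrozenSet, Hashable, Iterable, List


def assign_online(
    edges: Iterable[Iterable[Hashable]], num_clusters: int, seed: int = 0
) -> List[List[FrozenSet[Hashable]]]:
    """Assign each arriving hyperedge to a random cluster (naive baseline)."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    clusters: List[List[FrozenSet[Hashable]]] = [[] for _ in range(num_clusters)]
    for edge in edges:  # hyperedges arrive one by one
        idx = rng.randrange(num_clusters)  # the choice is never revisited
        clusters[idx].append(frozenset(edge))
    return clusters


def count_covering_clusters(
    nodes: Iterable[Hashable], clusters: List[List[FrozenSet[Hashable]]]
) -> int:
    """Count the clusters whose hyperedges together contain every node."""
    node_set = set(nodes)
    return sum(1 for c in clusters if c and set().union(*c) >= node_set)


if __name__ == "__main__":
    nodes = {1, 2, 3}
    edges = [{1, 2}, {3}, {1}, {2, 3}]
    clusters = assign_online(edges, num_clusters=2, seed=0)
    print(count_covering_clusters(nodes, clusters))
```

    The objective value depends on the random choices, so a single run says little; a guarantee such as the one quoted in the abstract concerns the competitive ratio over all inputs.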

    The Online Disjoint Set Cover Problem and its Applications

    Given a universe $U$ of $n$ elements and a collection $\mathcal{S}$ of subsets of $U$, the maximum disjoint set cover problem (DSCP) is to partition $\mathcal{S}$ into as many set covers as possible, where a set cover is defined as a collection of subsets whose union is $U$. We consider the online DSCP, in which the subsets arrive one by one (possibly in an order chosen by an adversary), and must be irrevocably assigned to some partition on arrival with the objective of minimizing the competitive ratio. The competitive ratio of an online DSCP algorithm $A$ is defined as the maximum ratio of the number of disjoint set covers obtained by the optimal offline algorithm to the number of disjoint set covers obtained by $A$ across all inputs. We propose an online algorithm for solving the DSCP with competitive ratio $\ln n$. We then show a lower bound of $\Omega(\sqrt{\ln n})$ on the competitive ratio for any online DSCP algorithm. The online disjoint set cover problem has wide-ranging applications in practice, including the online crowd-sourcing problem, the online coverage lifetime maximization problem in wireless sensor networks, and online resource allocation problems. Comment: To appear in IEEE INFOCOM 201
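
    The ratio in this definition can be made concrete on a single tiny instance. The sketch below is only an illustration under stated assumptions: it finds the offline optimum by brute force (exponential time, usable only for a handful of subsets) and compares it with one fixed online assignment. It is not the algorithm proposed in the paper, and the names `num_covers`, `offline_optimum`, and `instance_ratio` are hypothetical. The competitive ratio itself is the maximum of this per-instance ratio over all inputs.

```python
from itertools import product
from typing import Hashable, Sequence, Set


def num_covers(
    universe: Set[Hashable],
    subsets: Sequence[Set[Hashable]],
    labels: Sequence[int],
    k: int,
) -> int:
    """Count the parts (0..k-1) whose assigned subsets have union U."""
    covered = [set() for _ in range(k)]
    for subset, label in zip(subsets, labels):
        covered[label] |= subset
    return sum(1 for c in covered if c >= universe)


def offline_optimum(universe: Set[Hashable], subsets: Sequence[Set[Hashable]]) -> int:
    """Brute-force the maximum number of disjoint set covers (tiny inputs only)."""
    m = len(subsets)
    k = max(m, 1)  # more parts than subsets can never help
    return max(
        num_covers(universe, subsets, labels, k)
        for labels in product(range(k), repeat=m)
    )


def instance_ratio(
    universe: Set[Hashable], subsets: Sequence[Set[Hashable]], online_labels: Sequence[int]
) -> float:
    """Ratio of offline optimum to the covers achieved by the online labels."""
    opt = offline_optimum(universe, subsets)
    alg = num_covers(universe, subsets, online_labels, max(len(subsets), 1))
    if alg == 0:
        return float("inf") if opt > 0 else 1.0
    return opt / alg


if __name__ == "__main__":
    U = {1, 2, 3}
    S = [{1, 2}, {3}, {1}, {2, 3}]
    # An online algorithm that places every subset in part 0 gets one cover,
    # while the offline optimum splits S into {1,2},{3} and {1},{2,3}.
    print(instance_ratio(U, S, [0, 0, 0, 0]))
```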

    Explain3D: Explaining Disagreements in Disjoint Datasets

    Data plays an important role in applications, analytic processes, and many aspects of human activity. As data grows in size and complexity, we are met with an imperative need for tools that promote understanding and explanations over data-related operations. Data management research on explanations has focused on the assumption that data resides in a single dataset, under one common schema. But the reality of today's data is that it is frequently un-integrated, coming from different sources with different schemas. When different datasets provide different answers to semantically similar questions, understanding the reasons for the discrepancies is challenging and cannot be handled by the existing single-dataset solutions. In this paper, we propose Explain3D, a framework for explaining the disagreements across disjoint datasets (3D). Explain3D focuses on identifying the reasons for the differences in the results of two semantically similar queries operating on two datasets with potentially different schemas. Our framework leverages the queries to perform a semantic mapping across the relevant parts of their provenance; discrepancies in this mapping point to causes of the queries' differences. Exploiting the queries gives Explain3D an edge over traditional schema matching and record linkage techniques, which are query-agnostic. Our work makes the following contributions: (1) We formalize the problem of deriving optimal explanations for the differences of the results of semantically similar queries over disjoint datasets. (2) We design a 3-stage framework for solving the optimal explanation problem. (3) We develop a smart-partitioning optimizer that improves the efficiency of the framework by orders of magnitude. (4) We experiment with real-world and synthetic data to demonstrate that Explain3D can derive precise explanations efficiently.