Online Disjoint Set Cover Without Prior Knowledge
The disjoint set cover (DSC) problem is a fundamental combinatorial optimization problem concerned with partitioning the (hyper)edges of a hypergraph into (pairwise disjoint) clusters so that the number of clusters that cover all nodes is maximized. In its online version, the edges arrive one-by-one and should be assigned to clusters in an irrevocable fashion without knowing the future edges. This paper investigates the competitiveness of online DSC algorithms. Specifically, we develop the first (randomized) online DSC algorithm that guarantees a poly-logarithmic (O(log^{2} n)) competitive ratio without prior knowledge of the hypergraph's minimum degree. On the negative side, we prove that the competitive ratio of any randomized online DSC algorithm must be at least Omega((log n)/(log log n)) (even if the online algorithm does know the minimum degree in advance), thus establishing the first lower bound on the competitive ratio of randomized online DSC algorithms.
The Online Disjoint Set Cover Problem and its Applications
Given a universe of elements and a collection of subsets of that universe,
the maximum disjoint set cover problem (DSCP) is to partition the collection
into as many set covers as possible, where a set cover is defined as a
collection of subsets whose union is the whole universe. We consider the
online DSCP, in which the subsets arrive one by one (possibly in an order
chosen by an adversary), and must be irrevocably assigned to some partition on
arrival with the objective of minimizing the competitive ratio. The competitive
ratio of an online DSCP algorithm is defined as the maximum ratio of the
number of disjoint set covers obtained by the optimal offline algorithm to the
number of disjoint set covers obtained by the online algorithm across all inputs. We propose an
online algorithm for solving the DSCP with competitive ratio . We then
show a lower bound of on the competitive ratio for any
online DSCP algorithm. The online disjoint set cover problem has wide ranging
applications in practice, including the online crowd-sourcing problem, the
online coverage lifetime maximization problem in wireless sensor networks, and
in online resource allocation problems. Comment: To appear in IEEE INFOCOM 201
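To make the problem statement above concrete, the following minimal Python sketch counts how many clusters of a partition are set covers, and runs a naive first-fit online assignment. The function names `count_set_covers` and `first_fit_online`, and the first-fit rule itself, are illustrative assumptions; this is not the algorithm proposed in either paper and carries no competitive-ratio guarantee.

```python
from typing import Hashable, Iterable, List, Set


def count_set_covers(universe: Iterable[Hashable],
                     clusters: List[List[Set[Hashable]]]) -> int:
    """Count the clusters whose subsets jointly cover the whole universe."""
    target = set(universe)
    # set().union(*cluster) is the union of all subsets in the cluster
    # (and the empty set when the cluster is empty).
    return sum(1 for cluster in clusters if set().union(*cluster) >= target)


def first_fit_online(arrivals: Iterable[Set[Hashable]],
                     k: int) -> List[List[Set[Hashable]]]:
    """Hypothetical heuristic: irrevocably place each arriving subset into
    the first of k clusters where it covers at least one new element;
    if it is useless everywhere, place it in the last cluster."""
    covered: List[Set[Hashable]] = [set() for _ in range(k)]
    clusters: List[List[Set[Hashable]]] = [[] for _ in range(k)]
    for subset in arrivals:
        subset = set(subset)
        chosen = next((i for i in range(k) if subset - covered[i]), k - 1)
        clusters[chosen].append(subset)
        covered[chosen] |= subset
    return clusters


if __name__ == "__main__":
    universe = {1, 2, 3}
    arrivals = [{1, 2}, {3}, {1}, {2, 3}]
    clusters = first_fit_online(arrivals, k=2)
    print(count_set_covers(universe, clusters))
```

The assignment is made one subset at a time and is never revised, which mirrors the irrevocability constraint of the online setting; an adversarial arrival order can make such a simple rule perform poorly relative to the offline optimum.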
Explain3D: Explaining Disagreements in Disjoint Datasets
Data plays an important role in applications, analytic processes, and many
aspects of human activity. As data grows in size and complexity, we are met
with an imperative need for tools that promote understanding and explanations
over data-related operations. Data management research on explanations has
focused on the assumption that data resides in a single dataset, under one
common schema. But the reality of today's data is that it is frequently
un-integrated, coming from different sources with different schemas. When
different datasets provide different answers to semantically similar questions,
understanding the reasons for the discrepancies is challenging and cannot be
handled by the existing single-dataset solutions.
In this paper, we propose Explain3D, a framework for explaining the
disagreements across disjoint datasets (3D). Explain3D focuses on identifying
the reasons for the differences in the results of two semantically similar
queries operating on two datasets with potentially different schemas. Our
framework leverages the queries to perform a semantic mapping across the
relevant parts of their provenance; discrepancies in this mapping point to
causes of the queries' differences. Exploiting the queries gives Explain3D an
edge over traditional schema matching and record linkage techniques, which are
query-agnostic. Our work makes the following contributions: (1) We formalize
the problem of deriving optimal explanations for the differences of the results
of semantically similar queries over disjoint datasets. (2) We design a 3-stage
framework for solving the optimal explanation problem. (3) We develop a
smart-partitioning optimizer that improves the efficiency of the framework by
orders of magnitude. (4) We experiment with real-world and synthetic data to
demonstrate that Explain3D can derive precise explanations efficiently.