
    The Query-commit Problem

    In the query-commit problem we are given a graph where edges have distinct probabilities of existing. It is possible to query the edges of the graph, and if the queried edge exists then its endpoints are irrevocably matched. The goal is to find a querying strategy which maximizes the expected size of the matching obtained. This stochastic matching setup is motivated by applications in kidney exchanges and online dating. In this paper we address the query-commit problem from both theoretical and experimental perspectives. First, we show that a simple class of edges can be queried without compromising the optimality of the strategy. This property is then used to obtain an optimal querying strategy in polynomial time when the input graph is sparse. Next we turn our attention to the kidney exchange application, focusing on instances modeled over real data from existing exchange programs. We prove that, as the number of nodes grows, almost every instance admits a strategy which matches almost all nodes. This result supports the intuition that more exchanges are possible on a larger pool of patients and donors, and gives theoretical justification for unifying the existing exchange programs. Finally, we experimentally evaluate different querying strategies over kidney exchange instances. We show that even very simple heuristics perform fairly well, being within 1.5% of an optimal clairvoyant strategy that knows in advance which edges exist in the graph. In such a time-sensitive application, this result motivates the use of committing strategies.
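The query-commit setting described above can be illustrated with a small simulation. The sketch below is not the strategy from the paper; it assumes one simple heuristic (query edges in decreasing order of existence probability and commit to every edge found), and the names `greedy_query_commit` and `estimate_expected_size` are hypothetical.

```python
import random


def greedy_query_commit(edges, rng):
    """Run one simulated pass of a greedy query-commit heuristic.

    edges: list of (u, v, p), where p is the probability that edge (u, v) exists.
    rng:   a random.Random instance used to simulate query outcomes.
    Returns the list of committed (matched) pairs.
    """
    matched = set()
    matching = []
    # Assumed heuristic: most likely edges first (ties keep input order).
    for u, v, p in sorted(edges, key=lambda e: e[2], reverse=True):
        if u in matched or v in matched:
            continue  # an endpoint is already committed, so the query is useless
        if rng.random() < p:  # simulated query: the edge turns out to exist
            matched.update((u, v))  # commitment is irrevocable
            matching.append((u, v))
    return matching


def estimate_expected_size(edges, trials=10000, seed=0):
    """Monte Carlo estimate of the expected matching size of the heuristic."""
    rng = random.Random(seed)
    total = sum(len(greedy_query_commit(edges, rng)) for _ in range(trials))
    return total / trials
```

Calling `estimate_expected_size` on a small instance gives a rough baseline against which other querying orders could be compared.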

    Explain3D: Explaining Disagreements in Disjoint Datasets

    Data plays an important role in applications, analytic processes, and many aspects of human activity. As data grows in size and complexity, we are met with an imperative need for tools that promote understanding and explanations over data-related operations. Data management research on explanations has focused on the assumption that data resides in a single dataset, under one common schema. But the reality of today's data is that it is frequently un-integrated, coming from different sources with different schemas. When different datasets provide different answers to semantically similar questions, understanding the reasons for the discrepancies is challenging and cannot be handled by the existing single-dataset solutions. In this paper, we propose Explain3D, a framework for explaining the disagreements across disjoint datasets (3D). Explain3D focuses on identifying the reasons for the differences in the results of two semantically similar queries operating on two datasets with potentially different schemas. Our framework leverages the queries to perform a semantic mapping across the relevant parts of their provenance; discrepancies in this mapping point to causes of the queries' differences. Exploiting the queries gives Explain3D an edge over traditional schema matching and record linkage techniques, which are query-agnostic. Our work makes the following contributions: (1) We formalize the problem of deriving optimal explanations for the differences of the results of semantically similar queries over disjoint datasets. (2) We design a 3-stage framework for solving the optimal explanation problem. (3) We develop a smart-partitioning optimizer that improves the efficiency of the framework by orders of magnitude. (4) We experiment with real-world and synthetic data to demonstrate that Explain3D can derive precise explanations efficiently.

    Bi-Criteria and Approximation Algorithms for Restricted Matchings

    In this work we study approximation algorithms for the \textit{Bounded Color Matching} problem (a.k.a. Restricted Matching problem), which is defined as follows: given a graph in which each edge $e$ has a color $c_e$ and a profit $p_e \in \mathbb{Q}^+$, we want to compute a maximum (cardinality or profit) matching in which no more than $w_j \in \mathbb{Z}^+$ edges of color $c_j$ are present. Problems of this kind, besides being of theoretical interest in their own right, emerge in multi-fiber optical networking systems, where we interpret each unique wavelength that can travel through the fiber as a color class and we would like to establish communication between pairs of systems. We study approximation and bi-criteria algorithms for this problem which are based on linear programming techniques and, in particular, on polyhedral characterizations of the natural linear formulation of the problem. In our setting, we allow violations of the bounds $w_j$ and we model our problem as a bi-criteria problem: we have two objectives to optimize, namely (a) to maximize the profit (maximum matching) while (b) minimizing the violation of the color bounds. We show how to "beat" the integrality gap of the natural linear programming formulation of the problem by allowing only a slight violation of the color bounds. In particular, our main result consists of \textit{constant} approximation bounds for both criteria of the corresponding bi-criteria optimization problem.
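For orientation, the "natural linear formulation" mentioned above is presumably the standard matching relaxation with one extra constraint per color class. The block below is a sketch under that assumption; the variables $x_e$, the edge set $\delta(v)$ incident to vertex $v$, and the color class $E_j = \{e \in E : c_e = c_j\}$ are notation introduced here, not taken from the abstract.

```latex
\begin{align*}
\max \quad & \sum_{e \in E} p_e \, x_e \\
\text{s.t.} \quad & \sum_{e \in \delta(v)} x_e \le 1 && \forall v \in V
    && \text{(each vertex is matched at most once)} \\
& \sum_{e \in E_j} x_e \le w_j && \forall j
    && \text{(at most } w_j \text{ edges of color } c_j\text{)} \\
& 0 \le x_e \le 1 && \forall e \in E
\end{align*}
```

In the bi-criteria setting, a solution is allowed to exceed the right-hand side $w_j$ of the color constraints by a bounded amount in exchange for a better profit guarantee.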

    Ignorance is Almost Bliss: Near-Optimal Stochastic Matching With Few Queries

    The stochastic matching problem deals with finding a maximum matching in a graph whose edges are unknown but can be accessed via queries. This is a special case of stochastic $k$-set packing, where the problem is to find a maximum packing of sets, each of which exists with some probability. In this paper, we provide edge and set query algorithms for these two problems, respectively, that provably achieve some fraction of the omniscient optimal solution. Our main theoretical result for the stochastic matching (i.e., $2$-set packing) problem is the design of an \emph{adaptive} algorithm that queries only a constant number of edges per vertex and achieves a $(1-\epsilon)$ fraction of the omniscient optimal solution, for an arbitrarily small $\epsilon>0$. Moreover, this adaptive algorithm performs the queries in only a constant number of rounds. We complement this result with a \emph{non-adaptive} (i.e., one round of queries) algorithm that achieves a $(0.5-\epsilon)$ fraction of the omniscient optimum. We also extend both our results to stochastic $k$-set packing by designing an adaptive algorithm that achieves a $(\frac{2}{k}-\epsilon)$ fraction of the omniscient optimal solution, again with only $O(1)$ queries per element. This guarantee is close to the best known polynomial-time approximation ratio of $\frac{3}{k+1}-\epsilon$ for the \emph{deterministic} $k$-set packing problem [Furer and Yu, 2013]. We empirically explore the application of (adaptations of) these algorithms to the kidney exchange problem, where patients with end-stage renal failure swap willing but incompatible donors. We show on both generated data and on real data from the first 169 match runs of the UNOS nationwide kidney exchange that even a very small number of non-adaptive edge queries per vertex results in large gains in expected successful matches.
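The idea of adaptive querying in a constant number of rounds can be sketched as follows. This is an illustration under stated assumptions, not the algorithm as specified in the paper: it assumes that each round computes a maximum matching over edges not yet known to be absent and queries only that matching's edges, and it restricts the input to bipartite graphs so that a simple augmenting-path routine suffices. The names `max_bipartite_matching` and `adaptive_query` are hypothetical.

```python
def max_bipartite_matching(left, adj):
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm).

    left: iterable of left-side vertices.
    adj:  dict mapping a left vertex to the set of its right-side neighbours.
    Returns a set of (left_vertex, right_vertex) pairs.
    """
    match_right = {}  # right vertex -> left vertex currently matched to it

    def try_augment(u, seen):
        for v in adj.get(u, ()):
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current partner can be re-matched elsewhere
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    for u in left:
        try_augment(u, set())
    return {(u, v) for v, u in match_right.items()}


def adaptive_query(left, edges, exists, rounds):
    """Round-based adaptive querying (assumed scheme, bipartite graphs only).

    edges:  set of (u, v) candidate edges, u on the left side.
    exists: callable (u, v) -> bool that simulates querying one edge.
    Returns (matching over realized edges, per-vertex query counts).
    """
    candidates = set(edges)  # edges not yet known to be absent
    realized = set()         # edges confirmed by a query
    queries = {}             # vertex -> number of incident edges queried
    for _ in range(rounds):
        adj = {}
        for u, v in candidates:
            adj.setdefault(u, set()).add(v)
        for u, v in max_bipartite_matching(left, adj):
            if (u, v) in realized:
                continue  # outcome already known, no new query needed
            queries[u] = queries.get(u, 0) + 1
            queries[v] = queries.get(v, 0) + 1
            if exists(u, v):
                realized.add((u, v))
            else:
                candidates.discard((u, v))
    realized_adj = {}
    for u, v in realized:
        realized_adj.setdefault(u, set()).add(v)
    return max_bipartite_matching(left, realized_adj), queries
```

Because each round queries the edges of a single matching, every vertex has at most one incident edge queried per round, so the per-vertex query count never exceeds `rounds`.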

    Almost Optimal Stochastic Weighted Matching With Few Queries

    We consider the {\em stochastic matching} problem. An edge-weighted general (i.e., not necessarily bipartite) graph $G(V, E)$ is given in the input, where each edge in $E$ is {\em realized} independently with probability $p$; the realization is initially unknown, but we are able to {\em query} the edges to determine whether they are realized. The goal is to query only a small number of edges to find a {\em realized matching} that is sufficiently close to the maximum matching among all realized edges. This problem has received considerable attention during the past decade due to its numerous real-world applications in kidney exchange, matchmaking services, online labor markets, and advertisements. Our main result is an {\em adaptive} algorithm that, for any arbitrarily small $\epsilon > 0$, finds a $(1-\epsilon)$-approximation in expectation by querying only $O(1)$ edges per vertex. We further show that our approach leads to a $(1/2-\epsilon)$-approximate {\em non-adaptive} algorithm that also queries only $O(1)$ edges per vertex. Prior to our work, no nontrivial approximation was known for weighted graphs using a constant per-vertex budget. The state-of-the-art adaptive (resp. non-adaptive) algorithm of Maehara and Yamaguchi [SODA 2018] achieves a $(1-\epsilon)$-approximation (resp. $(1/2-\epsilon)$-approximation) by querying up to $O(w\log{n})$ edges per vertex, where $w$ denotes the maximum integer edge-weight. Our result is a substantial improvement over this bound and has an appealing message: no matter what the structure of the input graph is, one can get arbitrarily close to the optimum solution by querying only a constant number of edges per vertex. To obtain our results, we introduce novel properties of a generalization of {\em augmenting paths} to weighted matchings that may be of independent interest.