The Query-commit Problem
In the query-commit problem we are given a graph where edges have distinct
probabilities of existing. It is possible to query the edges of the graph, and
if the queried edge exists then its endpoints are irrevocably matched. The goal
is to find a querying strategy which maximizes the expected size of the
matching obtained. This stochastic matching setup is motivated by applications
in kidney exchanges and online dating.
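The setup above can be illustrated with a minimal simulation sketch. It uses a simple greedy heuristic (query edges in decreasing order of existence probability), which is an assumption made for illustration and not the strategy analyzed in the paper; the function names and the `(u, v, p)` edge format are likewise hypothetical.

```python
import random


def greedy_query_commit(edges, rng):
    """Run one simulated pass of a greedy query-commit heuristic.

    edges: list of (u, v, p), where p is the probability that edge (u, v) exists.
    Edges are queried in decreasing order of p. An edge with an already
    matched endpoint is skipped; a queried edge that exists is committed.
    """
    matched = set()
    matching = []
    for u, v, p in sorted(edges, key=lambda e: -e[2]):
        if u in matched or v in matched:
            continue  # querying would be pointless: an endpoint is taken
        if rng.random() < p:  # the query reveals whether the edge exists
            matched.update((u, v))  # irrevocable commitment
            matching.append((u, v))
    return matching


def estimate_expected_size(edges, trials=10000, seed=0):
    """Monte Carlo estimate of the expected matching size of the heuristic."""
    rng = random.Random(seed)
    total = sum(len(greedy_query_commit(edges, rng)) for _ in range(trials))
    return total / trials
```

Calling `estimate_expected_size([("a", "b", 0.9), ("b", "c", 0.5)])` estimates the expected number of matched pairs on a three-node path under this heuristic.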
In this paper we address the query-commit problem from both theoretical and
experimental perspectives. First, we show that a simple class of edges can be
queried without compromising the optimality of the strategy. This property is
then used to obtain in polynomial time an optimal querying strategy when the
input graph is sparse. Next we turn our attention to the kidney exchange
application, focusing on instances modeled over real data from existing
exchange programs. We prove that, as the number of nodes grows, almost every
instance admits a strategy which matches almost all nodes. This result supports
the intuition that more exchanges are possible on a larger pool of
patient/donors and gives theoretical justification for unifying the existing
exchange programs. Finally, we evaluate experimentally different querying
strategies over kidney exchange instances. We show that even very simple
heuristics perform fairly well, being within 1.5% of an optimal clairvoyant
strategy that knows in advance the edges in the graph. In such a
time-sensitive application, this result motivates the use of committing
strategies.
Explain3D: Explaining Disagreements in Disjoint Datasets
Data plays an important role in applications, analytic processes, and many
aspects of human activity. As data grows in size and complexity, we are met
with an imperative need for tools that promote understanding and explanations
over data-related operations. Data management research on explanations has
focused on the assumption that data resides in a single dataset, under one
common schema. But the reality of today's data is that it is frequently
un-integrated, coming from different sources with different schemas. When
different datasets provide different answers to semantically similar questions,
understanding the reasons for the discrepancies is challenging and cannot be
handled by the existing single-dataset solutions.
In this paper, we propose Explain3D, a framework for explaining the
disagreements across disjoint datasets (3D). Explain3D focuses on identifying
the reasons for the differences in the results of two semantically similar
queries operating on two datasets with potentially different schemas. Our
framework leverages the queries to perform a semantic mapping across the
relevant parts of their provenance; discrepancies in this mapping point to
causes of the queries' differences. Exploiting the queries gives Explain3D an
edge over traditional schema matching and record linkage techniques, which are
query-agnostic. Our work makes the following contributions: (1) We formalize
the problem of deriving optimal explanations for the differences of the results
of semantically similar queries over disjoint datasets. (2) We design a 3-stage
framework for solving the optimal explanation problem. (3) We develop a
smart-partitioning optimizer that improves the efficiency of the framework by
orders of magnitude. (4) We experiment with real-world and synthetic data to
demonstrate that Explain3D can derive precise explanations efficiently.
Bi-Criteria and Approximation Algorithms for Restricted Matchings
In this work we study approximation algorithms for the \textit{Bounded Color
Matching} problem (a.k.a. Restricted Matching problem), which is defined as
follows: given a graph in which each edge $e$ has a color $c_e$ and a profit
$p_e \in \mathbb{Q}^+$, we want to compute a maximum (cardinality or profit)
matching in which no more than $w_j \in \mathbb{Z}^+$ edges of color $c_j$ are
present. This kind of problem, besides being of theoretical interest in its own
right, emerges in multi-fiber optical networking systems, where we interpret
each unique wavelength that can travel through the fiber as a color class and
we would like to establish communication between pairs of systems. We study
approximation and bi-criteria algorithms for this problem which are based on
linear programming techniques and, in particular, on polyhedral
characterizations of the natural linear formulation of the problem. In our
setting, we allow violations of the bounds and we model our problem as a
bi-criteria problem: we have two objectives to optimize namely (a) to maximize
the profit (maximum matching) while (b) minimizing the violation of the color
bounds. We prove how we can "beat" the integrality gap of the natural linear
programming formulation of the problem by allowing only a slight violation of
the color bounds. In particular, our main result is \textit{constant}
approximation bounds for both criteria of the corresponding bi-criteria
optimization problem.
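For orientation, one standard way to write a natural linear relaxation of a color-bounded matching problem is sketched below. The notation is an assumption introduced here: $x_e$ is a fractional indicator for edge $e$, $p_e$ its profit, $\delta(v)$ the set of edges incident to vertex $v$, $E_j$ the set of edges of color $j$, and $w_j$ the bound for that color. The exact formulation used in the paper may differ.

```latex
\begin{align*}
\max \quad & \sum_{e \in E} p_e x_e \\
\text{s.t.} \quad & \sum_{e \in \delta(v)} x_e \le 1 && \forall v \in V \\
& \sum_{e \in E_j} x_e \le w_j && \forall \text{ colors } j \\
& 0 \le x_e \le 1 && \forall e \in E
\end{align*}
```

In a bi-criteria relaxation of this kind, the second family of constraints is the one that is allowed to be violated slightly in exchange for a better profit guarantee.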
Ignorance is Almost Bliss: Near-Optimal Stochastic Matching With Few Queries
The stochastic matching problem deals with finding a maximum matching in a
graph whose edges are unknown but can be accessed via queries. This is a
special case of stochastic $k$-set packing, where the problem is to find a
maximum packing of sets, each of which exists with some probability. In this
paper, we provide edge and set query algorithms for these two problems,
respectively, that provably achieve some fraction of the omniscient optimal
solution.
Our main theoretical result for the stochastic matching (i.e., $2$-set
packing) problem is the design of an \emph{adaptive} algorithm that queries
only a constant number of edges per vertex and achieves a $(1-\epsilon)$
fraction of the omniscient optimal solution, for an arbitrarily small
$\epsilon > 0$. Moreover, this adaptive algorithm performs the queries in only a
constant number of rounds. We complement this result with a \emph{non-adaptive}
(i.e., one round of queries) algorithm that achieves a $(0.5-\epsilon)$
fraction of the omniscient optimum. We also extend both our results to
stochastic $k$-set packing by designing an adaptive algorithm that achieves a
$(\frac{2}{k}-\epsilon)$ fraction of the omniscient optimal solution, again
with only $O(1)$ queries per element. This guarantee is close to the best known
polynomial-time approximation ratio of $\frac{3}{k+1}-\epsilon$ for the
\emph{deterministic} $k$-set packing problem [Furer and Yu, 2013].
We empirically explore the application of (adaptations of) these algorithms
to the kidney exchange problem, where patients with end-stage renal failure
swap willing but incompatible donors. We show on both generated data and on
real data from the first 169 match runs of the UNOS nationwide kidney exchange
that even a very small number of non-adaptive edge queries per vertex results
in large gains in expected successful matches.
A review of portfolio planning: Models and systems
In this chapter, we first provide an overview of a number of portfolio planning models
which have been proposed and investigated over the last forty years. We revisit the
mean-variance (M-V) model of Markowitz and the construction of the risk-return
efficient frontier. A piecewise linear approximation of the problem through a
reformulation involving diagonalisation of the quadratic form into a variable
separable function is also considered. A few other models, such as the Mean
Absolute Deviation (MAD), the Weighted Goal Programming (WGP) and the
Minimax (MM) models, which use alternative metrics for risk, are also introduced,
compared and contrasted. Recently asymmetric measures of risk have gained in
importance; we consider a generic representation and a number of alternative
symmetric and asymmetric measures of risk which find use in the evaluation of
portfolios. There are a number of modelling and computational considerations which
have been introduced into practical portfolio planning problems. These include: (a)
buy-in thresholds for assets, (b) restriction on the number of assets (cardinality
constraints), (c) transaction roundlot restrictions. Practical portfolio models may also
include (d) dedication of cashflow streams, and (e) immunization, which involves
duration matching and convexity constraints. The modelling issues in respect of these
features are discussed. Many of these features lead to discrete restrictions involving
zero-one and general integer variables which make the resulting model a quadratic
mixed-integer programming model (QMIP). The QMIP is an NP-hard problem; the
algorithms and solution methods for this class of problems are also discussed. The
issues of preparing the analytic data (financial datamarts) for this family of portfolio
planning problems are examined. We finally present computational results which
provide some indication of the state-of-the-art in the solution of portfolio optimisation
problems.
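As a small illustration of the mean-variance trade-off mentioned above, the following sketch traces the risk-return curve for a portfolio of just two assets. The two-asset restriction, the function names, and the input numbers are assumptions chosen for brevity; practical models in the chapter's sense involve many assets and the discrete constraints listed in (a) to (e).

```python
def portfolio_stats(w1, mu1, mu2, var1, var2, cov):
    """Expected return and variance of a two-asset portfolio.

    w1 is the weight of asset 1; asset 2 receives weight 1 - w1.
    """
    w2 = 1.0 - w1
    expected_return = w1 * mu1 + w2 * mu2
    variance = w1 * w1 * var1 + w2 * w2 * var2 + 2.0 * w1 * w2 * cov
    return expected_return, variance


def min_variance_weight(var1, var2, cov):
    """Weight of asset 1 in the minimum-variance two-asset portfolio.

    The result may lie outside [0, 1], which corresponds to short selling.
    """
    denom = var1 + var2 - 2.0 * cov
    if denom <= 0.0:
        raise ValueError("degenerate case: the two assets move identically")
    return (var2 - cov) / denom


def risk_return_curve(mu1, mu2, var1, var2, cov, steps=10):
    """(return, variance) pairs for weights 0, 1/steps, ..., 1 on asset 1."""
    return [
        portfolio_stats(i / steps, mu1, mu2, var1, var2, cov)
        for i in range(steps + 1)
    ]
```

The efficient frontier is the part of this curve whose expected return is at least that of the minimum-variance portfolio; points below it offer less return for the same or greater risk.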
Almost Optimal Stochastic Weighted Matching With Few Queries
We consider the {\em stochastic matching} problem. An edge-weighted general
(i.e., not necessarily bipartite) graph $G=(V,E)$ is given in the input, where
each edge in $E$ is {\em realized} independently with probability $p$; the
realization is initially unknown, however, we are able to {\em query} the edges
to determine whether they are realized. The goal is to query only a small
number of edges to find a {\em realized matching} that is sufficiently close to
the maximum matching among all realized edges. This problem has received
considerable attention during the past decade due to its numerous real-world
applications in kidney-exchange, matchmaking services, online labor markets,
and advertisements.
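The benchmark in this setting, the maximum-weight matching among all realized edges, can be estimated on tiny instances with the sketch below. It is only an illustration of the problem definition, not of the paper's algorithm: the brute-force search is exponential in the number of edges, and the function names, the `(u, v, w)` edge format, and the name `p` for the common realization probability are assumptions.

```python
import random


def sample_realization(edges, p, rng):
    """Keep each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]


def max_weight_matching(edges):
    """Brute-force maximum matching weight; edges is a list of (u, v, w).

    Exponential time, so suitable only for very small graphs.
    """
    def best(i, used):
        if i == len(edges):
            return 0.0
        u, v, w = edges[i]
        skip = best(i + 1, used)  # option 1: leave edge i out
        if u in used or v in used:
            return skip  # edge i conflicts with an edge already chosen
        take = w + best(i + 1, used | {u, v})  # option 2: take edge i
        return max(skip, take)

    return best(0, frozenset())


def expected_omniscient_optimum(edges, p, trials=2000, seed=0):
    """Monte Carlo estimate of the expected weight of the best realized matching."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max_weight_matching(sample_realization(edges, p, rng))
    return total / trials
```

A query algorithm is then judged by how close the matching it finds comes to this omniscient value while inspecting only a few edges per vertex.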
Our main result is an {\em adaptive} algorithm that for any arbitrarily small
$\epsilon > 0$, finds a $(1-\epsilon)$-approximation in expectation, by
querying only $O(1)$ edges per vertex. We further show that our approach leads
to a $(1/2-\epsilon)$-approximate {\em non-adaptive} algorithm that also
queries only $O(1)$ edges per vertex. Prior to our work, no nontrivial
approximation was known for weighted graphs using a constant per-vertex budget.
The state-of-the-art adaptive (resp. non-adaptive) algorithm of Maehara and
Yamaguchi [SODA 2018] achieves a $(1-\epsilon)$-approximation (resp.
$(1/2-\epsilon)$-approximation) by querying up to $O(w \log n)$ edges per
vertex where $w$ denotes the maximum integer edge-weight. Our result is a
substantial improvement over this bound and has an appealing message: No matter
what the structure of the input graph is, one can get arbitrarily close to the
optimum solution by querying only a constant number of edges per vertex.
To obtain our results, we introduce novel properties of a generalization of
{\em augmenting paths} to weighted matchings that may be of independent
interest.