Almost Optimal Stochastic Weighted Matching With Few Queries
We consider the {\em stochastic matching} problem. An edge-weighted general
(i.e., not necessarily bipartite) graph $G(V, E)$ is given in the input, where
each edge in $E$ is {\em realized} independently with probability $p$; the
realization is initially unknown, however, we are able to {\em query} the edges
to determine whether they are realized. The goal is to query only a small
number of edges to find a {\em realized matching} that is sufficiently close to
the maximum matching among all realized edges. This problem has received
considerable attention during the past decade due to its numerous real-world
applications in kidney exchange, matchmaking services, online labor markets,
and advertisements.
Our main result is an {\em adaptive} algorithm that for any arbitrarily small
$\epsilon > 0$, finds a $(1-\epsilon)$-approximation in expectation, by
querying only $O(1)$ edges per vertex. We further show that our approach leads
to a $(\frac{1}{2}-\epsilon)$-approximate {\em non-adaptive} algorithm that also
queries only $O(1)$ edges per vertex. Prior to our work, no nontrivial
approximation was known for weighted graphs using a constant per-vertex budget.
The state-of-the-art adaptive (resp. non-adaptive) algorithm of Maehara and
Yamaguchi [SODA 2018] achieves a $(1-\epsilon)$-approximation (resp.
$(\frac{1}{2}-\epsilon)$-approximation) by querying up to $O(w \log n)$ edges
per vertex, where $w$ denotes the maximum integer edge-weight. Our result is a
substantial improvement over this bound and has an appealing message: no matter
what the structure of the input graph is, one can get arbitrarily close to the
optimum solution by querying only a constant number of edges per vertex.
To obtain our results, we introduce novel properties of a generalization of
{\em augmenting paths} to weighted matchings that may be of independent
interest.
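To make the query model concrete, here is a minimal Python sketch (not the paper's algorithm) of the generic adaptive template such results build on: in each of a constant number of rounds, query the edges of a maximum-weight matching over edges not yet ruled out, so each vertex is queried at most once per round. The uniform probability `p` and the round budget `rounds` are illustrative assumptions.

```python
# Minimal sketch of an adaptive stochastic-matching strategy (illustrative,
# not the paper's algorithm): per round, query one matching's worth of edges.
import random
import networkx as nx

def adaptive_stochastic_matching(G, p, rounds, seed=0):
    rng = random.Random(seed)
    truth = {frozenset(e): rng.random() < p for e in G.edges()}  # hidden realization
    known = {}                                                   # query answers so far
    for _ in range(rounds):
        H = nx.Graph()
        H.add_nodes_from(G)
        for u, v, d in G.edges(data=True):
            if known.get(frozenset((u, v))) is not False:        # not ruled out yet
                H.add_edge(u, v, weight=d.get("weight", 1.0))
        for u, v in nx.max_weight_matching(H):                   # query this matching
            known[frozenset((u, v))] = truth[frozenset((u, v))]
    R = nx.Graph()
    R.add_nodes_from(G)
    for u, v, d in G.edges(data=True):
        if known.get(frozenset((u, v))) is True:                 # queried and realized
            R.add_edge(u, v, weight=d.get("weight", 1.0))
    return nx.max_weight_matching(R)
```

Since each round queries one matching, the per-vertex query count is at most `rounds`, i.e., the constant per-vertex budget discussed above.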
The Query-commit Problem
In the query-commit problem we are given a graph where edges have distinct
probabilities of existing. It is possible to query the edges of the graph, and
if the queried edge exists then its endpoints are irrevocably matched. The goal
is to find a querying strategy which maximizes the expected size of the
matching obtained. This stochastic matching setup is motivated by applications
in kidney exchanges and online dating.
In this paper we address the query-commit problem from both theoretical and
experimental perspectives. First, we show that a simple class of edges can be
queried without compromising the optimality of the strategy. This property is
then used to obtain in polynomial time an optimal querying strategy when the
input graph is sparse. Next we turn our attention to the kidney exchange
application, focusing on instances modeled over real data from existing
exchange programs. We prove that, as the number of nodes grows, almost every
instance admits a strategy which matches almost all nodes. This result supports
the intuition that more exchanges are possible on a larger pool of
patient/donor pairs and gives theoretical justification for unifying the existing
exchange programs. Finally, we evaluate experimentally different querying
strategies over kidney exchange instances. We show that even very simple
heuristics perform fairly well, being within 1.5% of an optimal clairvoyant
strategy, that knows in advance the edges in the graph. In such a
time-sensitive application, this result motivates the use of committing
strategies.
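The commit rule is easy to simulate. Below is a minimal Python sketch of a simple heuristic in this spirit (querying likelier edges first and skipping committed vertices); it is illustrative and not the paper's optimal strategy.

```python
# Minimal query-commit simulation: a queried edge that exists irrevocably
# matches its endpoints. Greedy descending-probability order is a heuristic.
import random

def greedy_query_commit(edges, prob, seed=0):
    """edges: list of (u, v) pairs; prob: dict (u, v) -> existence probability."""
    rng = random.Random(seed)
    matched, matching = set(), []
    for u, v in sorted(edges, key=lambda e: prob[e], reverse=True):
        if u in matched or v in matched:          # endpoint already committed
            continue
        if rng.random() < prob[(u, v)]:           # edge turns out to exist
            matching.append((u, v))               # commit irrevocably
            matched.update((u, v))
    return matching
```

Averaging the matching size over many simulated runs is how such a heuristic can be compared against the clairvoyant optimum in experiments like those described above.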
Ignorance is Almost Bliss: Near-Optimal Stochastic Matching With Few Queries
The stochastic matching problem deals with finding a maximum matching in a
graph whose edges are unknown but can be accessed via queries. This is a
special case of stochastic $k$-set packing, where the problem is to find a
maximum packing of sets, each of which exists with some probability. In this
paper, we provide edge and set query algorithms for these two problems,
respectively, that provably achieve some fraction of the omniscient optimal
solution.
Our main theoretical result for the stochastic matching (i.e., $2$-set
packing) problem is the design of an \emph{adaptive} algorithm that queries
only a constant number of edges per vertex and achieves a $(1-\epsilon)$
fraction of the omniscient optimal solution, for an arbitrarily small
$\epsilon > 0$. Moreover, this adaptive algorithm performs the queries in only a
constant number of rounds. We complement this result with a \emph{non-adaptive}
(i.e., one round of queries) algorithm that achieves a $(\frac{1}{2}-\epsilon)$
fraction of the omniscient optimum. We also extend both our results to
stochastic $k$-set packing by designing an adaptive algorithm that achieves a
$(\frac{2}{k}-\epsilon)$ fraction of the omniscient optimal solution, again
with only $O(1)$ queries per element. This guarantee is close to the best known
polynomial-time approximation ratio of $\frac{3}{k+1}-\epsilon$ for the
\emph{deterministic} $k$-set packing problem [Fürer and Yu, 2013].
We empirically explore the application of (adaptations of) these algorithms
to the kidney exchange problem, where patients with end-stage renal failure
swap willing but incompatible donors. We show on both generated data and on
real data from the first 169 match runs of the UNOS nationwide kidney exchange
that even a very small number of non-adaptive edge queries per vertex results
in large gains in expected successful matches.
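For contrast with the adaptive sketch earlier, a one-round strategy in this spirit can be sketched as follows: pick a small constant number `R` of edge-disjoint matchings up front (so each vertex is queried at most `R` times), issue all queries at once, and return a maximum matching among the realized edges. `R` and the uniform probability `p` are illustrative assumptions.

```python
# Minimal non-adaptive (one-round) sketch: choose R edge-disjoint matchings
# in advance, query them all simultaneously, keep realized edges, and match.
import random
import networkx as nx

def non_adaptive_matching(G, p, R=5, seed=0):
    rng = random.Random(seed)
    pool = nx.Graph(G)
    to_query = []
    for _ in range(R):                            # R edge-disjoint matchings
        M = nx.maximal_matching(pool)
        if not M:
            break
        to_query.extend(M)
        pool.remove_edges_from(M)
    realized = [e for e in to_query if rng.random() < p]   # one query round
    return nx.max_weight_matching(nx.Graph(realized), maxcardinality=True)
```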
Neural Vector Spaces for Unsupervised Information Retrieval
We propose the Neural Vector Space Model (NVSM), a method that learns
representations of documents in an unsupervised manner for news article
retrieval. In the NVSM paradigm, we learn low-dimensional representations of
words and documents from scratch using gradient descent and rank documents
according to their similarity with query representations that are composed from
word representations. We show that NVSM performs better at document ranking
than existing latent semantic vector space methods. The addition of NVSM to a
mixture of lexical language models and a state-of-the-art baseline vector space
model yields a statistically significant increase in retrieval effectiveness.
Consequently, NVSM adds a complementary relevance signal. Beyond semantic
matching, we find that NVSM also performs well in cases where lexical matching
is needed.
NVSM learns a notion of term specificity directly from the document
collection without feature engineering. We also show that NVSM learns
regularities related to Luhn significance. Finally, we give advice on how to
deploy NVSM in situations where model selection (e.g., cross-validation) is
infeasible. We find that an unsupervised ensemble of multiple models trained
with different hyperparameter values performs better than a single
cross-validated model. Therefore, NVSM can safely be used for ranking documents
without supervised relevance judgments.
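The ranking step can be illustrated with a small sketch: a query representation is composed by averaging word vectors and projecting into the document space with a learned transformation, and documents are ranked by cosine similarity. The embeddings and projection matrix `W` below are placeholders for NVSM's learned parameters.

```python
# Minimal NVSM-style ranking sketch: compose the query from word vectors,
# project into document space, rank documents by cosine similarity. All
# parameters stand in for what NVSM would learn with gradient descent.
import numpy as np

def rank_documents(query_terms, word_vecs, W, doc_vecs):
    """word_vecs: term -> (d_w,) array; W: (d_d, d_w); doc_vecs: (n, d_d)."""
    q = np.mean([word_vecs[t] for t in query_terms if t in word_vecs], axis=0)
    q = W @ q                                     # compose, then project
    sims = (doc_vecs @ q) / (np.linalg.norm(doc_vecs, axis=1)
                             * np.linalg.norm(q) + 1e-9)
    return np.argsort(-sims)                      # best-matching documents first
```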
Term-Specific Eigenvector-Centrality in Multi-Relation Networks
Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem. This article investigates how eigenvector-centrality can be used for approximate matching in multi-relation graphs, that is, graphs where connections of many different types may exist. Based on an extension of the PageRank matrix, eigenvectors representing the distribution of a term after propagating term weights between related data items are computed. The result is an index which takes the document structure into account and can be used with standard document retrieval techniques. As the scheme takes the shape of an index transformation, all necessary calculations are performed at index time.
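A hedged sketch of that computation: starting from a term's weight distribution over data items, weights are propagated along links in PageRank fashion until they converge, and the resulting eigenvector is what gets stored in the index for that term. Collapsing the multi-relation structure into one stochastic matrix `A` is a simplifying assumption here.

```python
# Minimal power-iteration sketch of term-specific centrality: propagate a
# term's weights over linked data items with damping, PageRank-style.
import numpy as np

def term_centrality(A, term_weights, damping=0.85, iters=50):
    """A: (n, n) column-stochastic link matrix; term_weights: (n,) >= 0."""
    t = term_weights / term_weights.sum()         # personalization vector
    v = t.copy()
    for _ in range(iters):
        v = damping * (A @ v) + (1 - damping) * t # propagate, then teleport
    return v                                      # index this vector per term
```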
Born to trade: a genetically evolved keyword bidder for sponsored search
In sponsored search auctions, advertisers choose a set of keywords based on products they wish to market. They bid for advertising slots that will be displayed on the search results page when a user submits a query containing the keywords that the advertiser selected. Deciding how much to bid is a real challenge: if the bid is too low with respect to the bids of other advertisers, the ad might not get displayed in a favorable position; a bid that is too high on the other hand might not be profitable either, since the attracted number of conversions might not be enough to compensate for the high cost per click.
In this paper we propose a genetically evolved keyword bidding strategy that decides how much to bid for each query based on historical data such as the position obtained on the previous day. Even though our approach does not encode any particular expert knowledge on keyword auctions, it did remarkably well in the Trading Agent Competition at IJCAI 2009.
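A minimal sketch of such a bidder, under assumed names: each genome is a vector of per-keyword bid multipliers, fitness is the profit from replaying historical auction data (the hypothetical `simulate_profit` below), and new generations arise from selection, crossover, and mutation.

```python
# Minimal genetic-algorithm sketch for evolving keyword bid multipliers.
# `simulate_profit(genome) -> float` is a hypothetical fitness function that
# replays historical data (positions, clicks, conversions) under those bids.
import random

def evolve_bidder(n_keywords, simulate_profit, pop=30, gens=50, seed=0):
    rng = random.Random(seed)
    popn = [[rng.uniform(0.1, 2.0) for _ in range(n_keywords)]
            for _ in range(pop)]
    for _ in range(gens):
        ranked = sorted(popn, key=simulate_profit, reverse=True)
        parents = ranked[: pop // 2]              # truncation selection
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_keywords) if n_keywords > 1 else 0
            child = a[:cut] + b[cut:]             # one-point crossover
            i = rng.randrange(n_keywords)
            child[i] *= rng.uniform(0.8, 1.25)    # multiplicative mutation
            children.append(child)
        popn = parents + children
    return max(popn, key=simulate_profit)         # best genome found
```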