    Almost Optimal Stochastic Weighted Matching With Few Queries

    We consider the {\em stochastic matching} problem. An edge-weighted general (i.e., not necessarily bipartite) graph $G(V, E)$ is given in the input, where each edge in $E$ is {\em realized} independently with probability $p$; the realization is initially unknown, but we are able to {\em query} the edges to determine whether they are realized. The goal is to query only a small number of edges to find a {\em realized matching} that is sufficiently close to the maximum matching among all realized edges. This problem has received considerable attention during the past decade due to its numerous real-world applications in kidney exchange, matchmaking services, online labor markets, and advertisements. Our main result is an {\em adaptive} algorithm that, for any arbitrarily small $\epsilon > 0$, finds a $(1-\epsilon)$-approximation in expectation by querying only $O(1)$ edges per vertex. We further show that our approach leads to a $(1/2-\epsilon)$-approximate {\em non-adaptive} algorithm that also queries only $O(1)$ edges per vertex. Prior to our work, no nontrivial approximation was known for weighted graphs using a constant per-vertex budget. The state-of-the-art adaptive (resp. non-adaptive) algorithm of Maehara and Yamaguchi [SODA 2018] achieves a $(1-\epsilon)$-approximation (resp. $(1/2-\epsilon)$-approximation) by querying up to $O(w\log{n})$ edges per vertex, where $w$ denotes the maximum integer edge-weight. Our result is a substantial improvement over this bound and has an appealing message: no matter what the structure of the input graph is, one can get arbitrarily close to the optimum solution by querying only a constant number of edges per vertex. To obtain our results, we introduce novel properties of a generalization of {\em augmenting paths} to weighted matchings that may be of independent interest.

    The Query-commit Problem

    In the query-commit problem we are given a graph where edges have distinct probabilities of existing. It is possible to query the edges of the graph, and if the queried edge exists then its endpoints are irrevocably matched. The goal is to find a querying strategy which maximizes the expected size of the matching obtained. This stochastic matching setup is motivated by applications in kidney exchanges and online dating. In this paper we address the query-commit problem from both theoretical and experimental perspectives. First, we show that a simple class of edges can be queried without compromising the optimality of the strategy. This property is then used to obtain in polynomial time an optimal querying strategy when the input graph is sparse. Next we turn our attention to the kidney exchange application, focusing on instances modeled over real data from existing exchange programs. We prove that, as the number of nodes grows, almost every instance admits a strategy which matches almost all nodes. This result supports the intuition that more exchanges are possible on a larger pool of patients and donors and gives theoretical justification for unifying the existing exchange programs. Finally, we experimentally evaluate different querying strategies over kidney exchange instances. We show that even very simple heuristics perform fairly well, being within 1.5% of an optimal clairvoyant strategy that knows in advance the edges in the graph. In such a time-sensitive application, this result motivates the use of committing strategies.
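The query-commit rule described in this abstract (a successful query irrevocably matches both endpoints) can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the function names `query_commit` and `estimate_expected_size` are hypothetical, and the "most likely edge first" ordering is a simple illustrative heuristic, not necessarily one of the strategies the paper evaluates.

```python
import random


def query_commit(edges, rng):
    """Run one query-commit pass.

    edges: dict mapping (u, v) to the probability that the edge exists.
    Returns the list of edges that were queried, found to exist, and committed.
    """
    matched = set()
    matching = []
    # Assumed heuristic: query edges in order of decreasing probability.
    for (u, v), p in sorted(edges.items(), key=lambda kv: -kv[1]):
        if u in matched or v in matched:
            continue  # an endpoint is already committed, so the edge is useless
        if rng.random() < p:  # the query reveals whether the edge exists
            matched.update((u, v))  # commit: both endpoints are now matched
            matching.append((u, v))
    return matching


def estimate_expected_size(edges, trials=10_000, seed=0):
    """Monte Carlo estimate of the expected matching size of the heuristic."""
    rng = random.Random(seed)
    return sum(len(query_commit(edges, rng)) for _ in range(trials)) / trials
```

Calling `estimate_expected_size({("a", "b"): 0.5, ("b", "c"): 0.4})` gives a sampled estimate for a three-node path; comparing such estimates across orderings is one way to contrast querying strategies on small instances.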

    Ignorance is Almost Bliss: Near-Optimal Stochastic Matching With Few Queries

    The stochastic matching problem deals with finding a maximum matching in a graph whose edges are unknown but can be accessed via queries. This is a special case of stochastic $k$-set packing, where the problem is to find a maximum packing of sets, each of which exists with some probability. In this paper, we provide edge and set query algorithms for these two problems, respectively, that provably achieve some fraction of the omniscient optimal solution. Our main theoretical result for the stochastic matching (i.e., $2$-set packing) problem is the design of an \emph{adaptive} algorithm that queries only a constant number of edges per vertex and achieves a $(1-\epsilon)$ fraction of the omniscient optimal solution, for an arbitrarily small $\epsilon>0$. Moreover, this adaptive algorithm performs the queries in only a constant number of rounds. We complement this result with a \emph{non-adaptive} (i.e., one round of queries) algorithm that achieves a $(0.5 - \epsilon)$ fraction of the omniscient optimum. We also extend both our results to stochastic $k$-set packing by designing an adaptive algorithm that achieves a $(\frac{2}{k} - \epsilon)$ fraction of the omniscient optimal solution, again with only $O(1)$ queries per element. This guarantee is close to the best known polynomial-time approximation ratio of $\frac{3}{k+1} - \epsilon$ for the \emph{deterministic} $k$-set packing problem [Furer and Yu, 2013]. We empirically explore the application of (adaptations of) these algorithms to the kidney exchange problem, where patients with end-stage renal failure swap willing but incompatible donors. We show on both generated data and on real data from the first 169 match runs of the UNOS nationwide kidney exchange that even a very small number of non-adaptive edge queries per vertex results in large gains in expected successful matches.
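The round-based adaptive idea in this abstract can be sketched as follows: in each round, compute a maximum matching over the edges not yet known to have failed, query its edges, and finally return a maximum matching among the edges found to be realized. The sketch below is an illustration, not the paper's implementation. It assumes a bipartite graph so that a short augmenting-path routine (Kuhn's algorithm) suffices, whereas the paper treats general graphs; it also assumes a single realization probability `p`, and all function names are hypothetical.

```python
import random


def max_bipartite_matching(left, adj):
    """Kuhn's augmenting-path algorithm; adj maps a left vertex to its right neighbours."""
    match_right = {}  # right vertex -> left vertex currently matched to it

    def try_augment(u, seen):
        for v in adj.get(u, ()):
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current partner can be re-matched.
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    for u in left:
        try_augment(u, set())
    return [(u, v) for v, u in match_right.items()]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    return adj


def adaptive_matching(left, edges, p, rounds, rng):
    """Query at most one edge per vertex per round, for a fixed number of rounds."""
    realized, failed = set(), set()
    for _ in range(rounds):
        # Candidate graph: every edge that has not been observed to fail.
        candidates = [e for e in edges if e not in failed]
        for e in max_bipartite_matching(left, build_adjacency(candidates)):
            if e in realized:
                continue  # already known to exist, no query needed
            (realized if rng.random() < p else failed).add(e)
    # Output a maximum matching that uses only edges known to be realized.
    return max_bipartite_matching(left, build_adjacency(sorted(realized)))
```

Because each round queries only the edges of one matching, no vertex is queried more than once per round, so the per-vertex query count is bounded by the number of rounds.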

    Neural Vector Spaces for Unsupervised Information Retrieval

    We propose the Neural Vector Space Model (NVSM), a method that learns representations of documents in an unsupervised manner for news article retrieval. In the NVSM paradigm, we learn low-dimensional representations of words and documents from scratch using gradient descent and rank documents according to their similarity with query representations that are composed from word representations. We show that NVSM performs better at document ranking than existing latent semantic vector space methods. The addition of NVSM to a mixture of lexical language models and a state-of-the-art baseline vector space model yields a statistically significant increase in retrieval effectiveness. Consequently, NVSM adds a complementary relevance signal. Next to semantic matching, we find that NVSM performs well in cases where lexical matching is needed. NVSM learns a notion of term specificity directly from the document collection without feature engineering. We also show that NVSM learns regularities related to Luhn significance. Finally, we give advice on how to deploy NVSM in situations where model selection (e.g., cross-validation) is infeasible. We find that an unsupervised ensemble of multiple models trained with different hyperparameter values performs better than a single cross-validated model. Therefore, NVSM can safely be used for ranking documents without supervised relevance judgments. Comment: TOIS 201
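The ranking step mentioned in this abstract (compose a query representation from word representations, then rank documents by similarity) can be sketched with toy vectors. Everything here is an assumption made for illustration: the vectors are hand-written rather than learned, the query is composed by simple averaging, words and documents are placed in the same space, and similarity is cosine. The abstract does not specify these details, and the function names are hypothetical.

```python
import math


def compose_query(query_terms, word_vecs):
    """Average the vectors of the known query terms (assumed composition rule)."""
    vecs = [word_vecs[t] for t in query_terms if t in word_vecs]
    if not vecs:
        dim = len(next(iter(word_vecs.values())))
        return [0.0] * dim  # no known terms: fall back to the zero vector
    return [sum(component) / len(vecs) for component in zip(*vecs)]


def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def rank_documents(query_terms, word_vecs, doc_vecs):
    """Return document ids sorted by decreasing similarity to the query."""
    query = compose_query(query_terms, word_vecs)
    return sorted(doc_vecs, key=lambda d: cosine(query, doc_vecs[d]), reverse=True)


# Toy data, not learned representations.
word_vecs = {"kidney": [1.0, 0.0], "auction": [0.0, 1.0]}
doc_vecs = {"d1": [0.9, 0.1], "d2": [0.1, 0.9]}
ranking = rank_documents(["kidney"], word_vecs, doc_vecs)
```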

    Term-Specific Eigenvector-Centrality in Multi-Relation Networks

    Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem. This article investigates how eigenvector-centrality can be used for approximate matching in multi-relation graphs, that is, graphs where connections of many different types may exist. Based on an extension of the PageRank matrix, eigenvectors representing the distribution of a term after propagating term weights between related data items are computed. The result is an index which takes the document structure into account and can be used with standard document retrieval techniques. As the scheme takes the shape of an index transformation, all necessary calculations are performed during index tim
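Propagating a term's weight between related items can be illustrated with a personalized-PageRank style power iteration, in which the teleport distribution is the term's initial weighting. This is a stand-in chosen for brevity, not the article's extended PageRank matrix: it ignores relation types, and the function name, the damping factor of 0.85, and the dangling-node handling are all assumptions.

```python
def propagate_term(links, term_weights, damping=0.85, iterations=50):
    """Spread one term's weight over a directed graph by power iteration.

    links: dict mapping an item to the list of items it links to.
    term_weights: dict mapping an item to the term's initial weight in it.
    Returns a dict of propagated scores that sum to 1.
    """
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts} | set(term_weights))
    total = sum(term_weights.values())
    if total <= 0:
        raise ValueError("the term must occur in at least one item")
    # Teleport distribution: the normalised initial term weights.
    teleport = {n: term_weights.get(n, 0.0) / total for n in nodes}
    score = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            targets = links.get(n, [])
            if targets:
                share = damping * score[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling item: hand its mass back to the teleport distribution.
                for m in nodes:
                    nxt[m] += damping * score[n] * teleport[m]
        score = nxt
    return score


scores = propagate_term({"a": ["b"], "b": ["a"], "c": []}, {"a": 1.0})
```

In this toy graph the term occurs only in item `a`, yet item `b` receives a nonzero score through its link from `a`, which is the kind of structure-aware index entry the abstract describes.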

    Born to trade: a genetically evolved keyword bidder for sponsored search

    In sponsored search auctions, advertisers choose a set of keywords based on products they wish to market. They bid for advertising slots that will be displayed on the search results page when a user submits a query containing the keywords that the advertiser selected. Deciding how much to bid is a real challenge: if the bid is too low with respect to the bids of other advertisers, the ad might not get displayed in a favorable position; a bid that is too high, on the other hand, might not be profitable either, since the attracted number of conversions might not be enough to compensate for the high cost per click. In this paper we propose a genetically evolved keyword bidding strategy that decides how much to bid for each query based on historical data such as the position obtained on the previous day. In light of the fact that our approach does not implement any particular expert knowledge on keyword auctions, it did remarkably well in the Trading Agent Competition at IJCAI2009.