2,398 research outputs found
Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising
Sponsored search represents a major source of revenue for web search engines.
This popular advertising model brings a unique possibility for advertisers to
target users' immediate intent communicated through a search query, usually by
displaying their ads alongside organic search results for queries deemed
relevant to their products or services. However, due to a large number of
unique queries it is challenging for advertisers to identify all such relevant
queries. For this reason search engines often provide a service of advanced
matching, which automatically finds additional relevant queries for advertisers
to bid on. We present a novel advanced matching approach based on the idea of
semantic embeddings of queries and ads. The embeddings were learned using a
large data set of user search sessions, consisting of search queries, clicked
ads and search links, while utilizing contextual information such as dwell time
and skipped ads. To address the large-scale nature of our problem, both in
terms of data and vocabulary size, we propose a novel distributed algorithm for
training of the embeddings. Finally, we present an approach for overcoming a
cold-start problem associated with new ads and queries. We report results of
editorial evaluation and online tests on actual search traffic. The results
show that our approach significantly outperforms baselines in terms of
relevance, coverage, and incremental revenue. Lastly, we open-source learned
query embeddings to be used by researchers in computational advertising and
related fields.Comment: 10 pages, 4 figures, 39th International ACM SIGIR Conference on
Research and Development in Information Retrieval, SIGIR 2016, Pisa, Ital
A Graph-Based Semantics Workbench for Concurrent Asynchronous Programs
A number of novel programming languages and libraries have been proposed that
offer simpler-to-use models of concurrency than threads. It is challenging,
however, to devise execution models that successfully realise their
abstractions without forfeiting performance or introducing unintended
behaviours. This is exemplified by SCOOP---a concurrent object-oriented
message-passing language---which has seen multiple semantics proposed and
implemented over its evolution. We propose a "semantics workbench" with fully
and semi-automatic tools for SCOOP, that can be used to analyse and compare
programs with respect to different execution models. We demonstrate its use in
checking the consistency of semantics by applying it to a set of representative
programs, and highlighting a deadlock-related discrepancy between the principal
execution models of the language. Our workbench is based on a modular and
parameterisable graph transformation semantics implemented in the GROOVE tool.
We discuss how graph transformations are leveraged to atomically model
intricate language abstractions, and how the visual yet algebraic nature of the
model can be used to ascertain soundness.Comment: Accepted for publication in the proceedings of FASE 2016 (to appear
Term graph rewriting and garbage collection using opfibrations
AbstractThe categorical semantics of (an abstract version of) the general term graph rewriting language DACTL is investigated. The operational semantics is reformulated in order to reveal its universal properties. The technical dissonance between the matchings of left-hand sides of rules to redexes, and the properties of rewrite rules themselves, is taken as the impetus for expressing the core of the model as a Grothendieck opfibration of a category of general rewrites over a base of general rewrite rules. Garbage collection is examined in this framework in order to reconcile the treatment with earlier approaches. It is shown that term rewriting has particularly good garbage-theoretic properties that do not generalise to all cases of graph rewriting and that this has been a stumbling block for aspects of some earlier models for graph rewriting
Query2doc: Query Expansion with Large Language Models
This paper introduces a simple yet effective query expansion approach,
denoted as query2doc, to improve both sparse and dense retrieval systems. The
proposed method first generates pseudo-documents by few-shot prompting large
language models (LLMs), and then expands the query with generated
pseudo-documents. LLMs are trained on web-scale text corpora and are adept at
knowledge memorization. The pseudo-documents from LLMs often contain highly
relevant information that can aid in query disambiguation and guide the
retrievers. Experimental results demonstrate that query2doc boosts the
performance of BM25 by 3% to 15% on ad-hoc IR datasets, such as MS-MARCO and
TREC DL, without any model fine-tuning. Furthermore, our method also benefits
state-of-the-art dense retrievers in terms of both in-domain and out-of-domain
results.Comment: 9 page
Constructive Tensor Field Theory: The Model
We build constructively the simplest tensor field theory which requires some
renormalization, namely the rank three tensor theory with quartic interactions
and propagator inverse of the Laplacian on . This superrenormalizable
tensor field theory has a power counting almost similar to ordinary .
Our construction uses the multiscale loop vertex expansion (MLVE) recently
introduced in the context of an analogous vector model. However to prove
analyticity and Borel summability of this model requires new estimates on the
intermediate field integration, which is now of matrix rather than of scalar
type.Comment: 24 pages, 5 figures. Substantially improved version. Version v1 is
correct but treats a model which is simplified at the level of the two point
function. This version treats the full model, without any simplificatio
- …