11,778 research outputs found
Efficient & Effective Selective Query Rewriting with Efficiency Predictions
To enhance effectiveness, a user's query can be rewritten internally by the search engine in many ways, for example by applying proximity, or by expanding the query with related terms. However, approaches that benefit effectiveness often have a negative impact on efficiency, which has impacts upon the user satisfaction, if the query is excessively slow. In this paper, we propose a novel framework for using the predicted execution time of various query rewritings to select between alternatives on a per-query basis, in a manner that ensures both effectiveness and efficiency. In particular, we propose the prediction of the execution time of ephemeral (e.g., proximity) posting lists generated from uni-gram inverted index posting lists, which are used in establishing the permissible query rewriting alternatives that may execute in the allowed time. Experiments examining both the effectiveness and efficiency of the proposed approach demonstrate that a 49% decrease in mean response time (and 62% decrease in 95th-percentile response time) can be attained without significantly hindering the effectiveness of the search engine
Beyond Keywords and Relevance: A Personalized Ad Retrieval Framework in E-Commerce Sponsored Search
On most sponsored search platforms, advertisers bid on some keywords for
their advertisements (ads). Given a search request, ad retrieval module
rewrites the query into bidding keywords, and uses these keywords as keys to
select Top N ads through inverted indexes. In this way, an ad will not be
retrieved even if queries are related when the advertiser does not bid on
corresponding keywords. Moreover, most ad retrieval approaches regard rewriting
and ad-selecting as two separated tasks, and focus on boosting relevance
between search queries and ads. Recently, in e-commerce sponsored search more
and more personalized information has been introduced, such as user profiles,
long-time and real-time clicks. Personalized information makes ad retrieval
able to employ more elements (e.g. real-time clicks) as search signals and
retrieval keys, however it makes ad retrieval more difficult to measure ads
retrieved through different signals. To address these problems, we propose a
novel ad retrieval framework beyond keywords and relevance in e-commerce
sponsored search. Firstly, we employ historical ad click data to initialize a
hierarchical network representing signals, keys and ads, in which personalized
information is introduced. Then we train a model on top of the hierarchical
network by learning the weights of edges. Finally we select the best edges
according to the model, boosting RPM/CTR. Experimental results on our
e-commerce platform demonstrate that our ad retrieval framework achieves good
performance
Simulation Subsumption or DĂ©jĂ vu on the Web
Simulation unification is a special kind of unification adapted to retrieving semi-structured data on the Web. This article introduces simulation subsumption, or containment, that is, query subsumption under simulation unification. Simulation subsumption is crucial in general for query optimization, in particular for optimizing pattern-based search engines, and for the termination of recursive rule-based web languages such as the XML and RDF query language Xcerpt. This paper first motivates and formalizes simulation subsumption. Then, it establishes decidability of simulation subsumption for advanced query patterns featuring descendant constructs, regular expressions, negative subterms (or subterm exclusions), and multiple variable occurrences. Finally, we show that subsumption between two query terms can be decided in O(n!n) where n is the sum of the sizes of both query terms
Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising
Sponsored search represents a major source of revenue for web search engines.
This popular advertising model brings a unique possibility for advertisers to
target users' immediate intent communicated through a search query, usually by
displaying their ads alongside organic search results for queries deemed
relevant to their products or services. However, due to a large number of
unique queries it is challenging for advertisers to identify all such relevant
queries. For this reason search engines often provide a service of advanced
matching, which automatically finds additional relevant queries for advertisers
to bid on. We present a novel advanced matching approach based on the idea of
semantic embeddings of queries and ads. The embeddings were learned using a
large data set of user search sessions, consisting of search queries, clicked
ads and search links, while utilizing contextual information such as dwell time
and skipped ads. To address the large-scale nature of our problem, both in
terms of data and vocabulary size, we propose a novel distributed algorithm for
training of the embeddings. Finally, we present an approach for overcoming a
cold-start problem associated with new ads and queries. We report results of
editorial evaluation and online tests on actual search traffic. The results
show that our approach significantly outperforms baselines in terms of
relevance, coverage, and incremental revenue. Lastly, we open-source learned
query embeddings to be used by researchers in computational advertising and
related fields.Comment: 10 pages, 4 figures, 39th International ACM SIGIR Conference on
Research and Development in Information Retrieval, SIGIR 2016, Pisa, Ital
View Selection in Semantic Web Databases
We consider the setting of a Semantic Web database, containing both explicit
data encoded in RDF triples, and implicit data, implied by the RDF semantics.
Based on a query workload, we address the problem of selecting a set of views
to be materialized in the database, minimizing a combination of query
processing, view storage, and view maintenance costs. Starting from an existing
relational view selection method, we devise new algorithms for recommending
view sets, and show that they scale significantly beyond the existing
relational ones when adapted to the RDF context. To account for implicit
triples in query answers, we propose a novel RDF query reformulation algorithm
and an innovative way of incorporating it into view selection in order to avoid
a combinatorial explosion in the complexity of the selection process. The
interest of our techniques is demonstrated through a set of experiments.Comment: VLDB201
Semantic Query Reformulation in Social PDMS
We consider social peer-to-peer data management systems (PDMS), where each
peer maintains both semantic mappings between its schema and some
acquaintances, and social links with peer friends. In this context,
reformulating a query from a peer's schema into other peer's schemas is a hard
problem, as it may generate as many rewritings as the set of mappings from that
peer to the outside and transitively on, by eventually traversing the entire
network. However, not all the obtained rewritings are relevant to a given
query. In this paper, we address this problem by inspecting semantic mappings
and social links to find only relevant rewritings. We propose a new notion of
'relevance' of a query with respect to a mapping, and, based on this notion, a
new semantic query reformulation approach for social PDMS, which achieves great
accuracy and flexibility. To find rapidly the most interesting mappings, we
combine several techniques: (i) social links are expressed as FOAF (Friend of a
Friend) links to characterize peer's friendship and compact mapping summaries
are used to obtain mapping descriptions; (ii) local semantic views are special
views that contain information about external mappings; and (iii) gossiping
techniques improve the search of relevant mappings. Our experimental
evaluation, based on a prototype on top of PeerSim and a simulated network
demonstrate that our solution yields greater recall, compared to traditional
query translation approaches proposed in the literature.Comment: 29 pages, 8 figures, query rewriting in PDM
- …