11,778 research outputs found

    Efficient & Effective Selective Query Rewriting with Efficiency Predictions

    Get PDF
    To enhance effectiveness, a user's query can be rewritten internally by the search engine in many ways, for example by applying proximity, or by expanding the query with related terms. However, approaches that benefit effectiveness often have a negative impact on efficiency, which has impacts upon the user satisfaction, if the query is excessively slow. In this paper, we propose a novel framework for using the predicted execution time of various query rewritings to select between alternatives on a per-query basis, in a manner that ensures both effectiveness and efficiency. In particular, we propose the prediction of the execution time of ephemeral (e.g., proximity) posting lists generated from uni-gram inverted index posting lists, which are used in establishing the permissible query rewriting alternatives that may execute in the allowed time. Experiments examining both the effectiveness and efficiency of the proposed approach demonstrate that a 49% decrease in mean response time (and 62% decrease in 95th-percentile response time) can be attained without significantly hindering the effectiveness of the search engine

    Beyond Keywords and Relevance: A Personalized Ad Retrieval Framework in E-Commerce Sponsored Search

    Full text link
    On most sponsored search platforms, advertisers bid on some keywords for their advertisements (ads). Given a search request, ad retrieval module rewrites the query into bidding keywords, and uses these keywords as keys to select Top N ads through inverted indexes. In this way, an ad will not be retrieved even if queries are related when the advertiser does not bid on corresponding keywords. Moreover, most ad retrieval approaches regard rewriting and ad-selecting as two separated tasks, and focus on boosting relevance between search queries and ads. Recently, in e-commerce sponsored search more and more personalized information has been introduced, such as user profiles, long-time and real-time clicks. Personalized information makes ad retrieval able to employ more elements (e.g. real-time clicks) as search signals and retrieval keys, however it makes ad retrieval more difficult to measure ads retrieved through different signals. To address these problems, we propose a novel ad retrieval framework beyond keywords and relevance in e-commerce sponsored search. Firstly, we employ historical ad click data to initialize a hierarchical network representing signals, keys and ads, in which personalized information is introduced. Then we train a model on top of the hierarchical network by learning the weights of edges. Finally we select the best edges according to the model, boosting RPM/CTR. Experimental results on our e-commerce platform demonstrate that our ad retrieval framework achieves good performance

    Simulation Subsumption or DĂ©jĂ  vu on the Web

    Get PDF
    Simulation unification is a special kind of unification adapted to retrieving semi-structured data on the Web. This article introduces simulation subsumption, or containment, that is, query subsumption under simulation unification. Simulation subsumption is crucial in general for query optimization, in particular for optimizing pattern-based search engines, and for the termination of recursive rule-based web languages such as the XML and RDF query language Xcerpt. This paper first motivates and formalizes simulation subsumption. Then, it establishes decidability of simulation subsumption for advanced query patterns featuring descendant constructs, regular expressions, negative subterms (or subterm exclusions), and multiple variable occurrences. Finally, we show that subsumption between two query terms can be decided in O(n!n) where n is the sum of the sizes of both query terms

    Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising

    Full text link
    Sponsored search represents a major source of revenue for web search engines. This popular advertising model brings a unique possibility for advertisers to target users' immediate intent communicated through a search query, usually by displaying their ads alongside organic search results for queries deemed relevant to their products or services. However, due to a large number of unique queries it is challenging for advertisers to identify all such relevant queries. For this reason search engines often provide a service of advanced matching, which automatically finds additional relevant queries for advertisers to bid on. We present a novel advanced matching approach based on the idea of semantic embeddings of queries and ads. The embeddings were learned using a large data set of user search sessions, consisting of search queries, clicked ads and search links, while utilizing contextual information such as dwell time and skipped ads. To address the large-scale nature of our problem, both in terms of data and vocabulary size, we propose a novel distributed algorithm for training of the embeddings. Finally, we present an approach for overcoming a cold-start problem associated with new ads and queries. We report results of editorial evaluation and online tests on actual search traffic. The results show that our approach significantly outperforms baselines in terms of relevance, coverage, and incremental revenue. Lastly, we open-source learned query embeddings to be used by researchers in computational advertising and related fields.Comment: 10 pages, 4 figures, 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016, Pisa, Ital

    View Selection in Semantic Web Databases

    Get PDF
    We consider the setting of a Semantic Web database, containing both explicit data encoded in RDF triples, and implicit data, implied by the RDF semantics. Based on a query workload, we address the problem of selecting a set of views to be materialized in the database, minimizing a combination of query processing, view storage, and view maintenance costs. Starting from an existing relational view selection method, we devise new algorithms for recommending view sets, and show that they scale significantly beyond the existing relational ones when adapted to the RDF context. To account for implicit triples in query answers, we propose a novel RDF query reformulation algorithm and an innovative way of incorporating it into view selection in order to avoid a combinatorial explosion in the complexity of the selection process. The interest of our techniques is demonstrated through a set of experiments.Comment: VLDB201

    Semantic Query Reformulation in Social PDMS

    Full text link
    We consider social peer-to-peer data management systems (PDMS), where each peer maintains both semantic mappings between its schema and some acquaintances, and social links with peer friends. In this context, reformulating a query from a peer's schema into other peer's schemas is a hard problem, as it may generate as many rewritings as the set of mappings from that peer to the outside and transitively on, by eventually traversing the entire network. However, not all the obtained rewritings are relevant to a given query. In this paper, we address this problem by inspecting semantic mappings and social links to find only relevant rewritings. We propose a new notion of 'relevance' of a query with respect to a mapping, and, based on this notion, a new semantic query reformulation approach for social PDMS, which achieves great accuracy and flexibility. To find rapidly the most interesting mappings, we combine several techniques: (i) social links are expressed as FOAF (Friend of a Friend) links to characterize peer's friendship and compact mapping summaries are used to obtain mapping descriptions; (ii) local semantic views are special views that contain information about external mappings; and (iii) gossiping techniques improve the search of relevant mappings. Our experimental evaluation, based on a prototype on top of PeerSim and a simulated network demonstrate that our solution yields greater recall, compared to traditional query translation approaches proposed in the literature.Comment: 29 pages, 8 figures, query rewriting in PDM
    • …
    corecore