3 research outputs found

    Large-scale Generative Query Autocompletion

    Query Autocompletion (QAC) systems are interactive tools that assist a searcher in entering a query given a partial query prefix. Existing QAC research -- with a number of notable exceptions -- relies upon large existing query logs from which to extract historical queries. These queries are then ordered by some ranking algorithm as candidate completions, given the query prefix. Given the numerous search environments (e.g. enterprises, personal or secured data repositories) in which large query logs are unavailable, the need for synthetic -- or generative -- QAC systems will become increasingly important. Generative QAC systems may be used to augment traditional query-based approaches, and/or entirely replace them in certain privacy-sensitive applications. Even in commercial Web search engines, a significant proportion (up to 15%) of queries issued daily have never been seen previously, meaning there will always be an opportunity to assist users in formulating queries that have not occurred historically. In this paper, we describe a system that can construct generative QAC suggestions within a user-acceptable timeframe (~58 ms), and report on a series of experiments over three publicly available, large-scale question sets that investigate different aspects of the system's performance.

    Efficient Neural Query Auto Completion

    Query Auto Completion (QAC), as the starting point of information retrieval tasks, is critical to user experience. Generally it has two steps: generating completed query candidates according to query prefixes, and ranking them based on extracted features. Three major challenges are observed for a query auto completion system: (1) QAC has a strict online latency requirement. For each keystroke, results must be returned within tens of milliseconds, which poses a significant challenge in designing sophisticated language models for it. (2) For unseen queries, generated candidates are of poor quality as contextual information is not fully utilized. (3) Traditional QAC systems rely heavily on handcrafted features such as the query candidate frequency in search logs, lacking sufficient semantic understanding of the candidate. In this paper, we propose an efficient neural QAC system with effective context modeling to overcome these challenges. On the candidate generation side, this system uses as much information as possible in unseen prefixes to generate relevant candidates, increasing the recall by a large margin. On the candidate ranking side, an unnormalized language model is proposed, which effectively captures deep semantics of queries. This approach achieves better ranking performance than state-of-the-art neural ranking methods and reduces latency by ~95% compared to neural language modeling methods. The empirical results on public datasets show that our model achieves a good balance between accuracy and efficiency. This system is served in LinkedIn job search with significant product impact observed. Comment: Accepted at CIKM 202

    Modeling concepts and their relationships for corpus-based query auto-completion

    Query auto-completion helps users to formulate their information needs by providing suggestion lists at every keystroke. This task is commonly addressed by exploiting query logs, and the approaches proposed in the literature fit well in web-scale scenarios, where huge amounts of past user queries can usually be analyzed to provide reliable suggestions. However, when query logs are not available, e.g. in enterprise or desktop search engines, these methods are not applicable at all. To face these challenging scenarios, we present a novel corpus-based approach which exploits the textual content of an indexed document collection in order to dynamically generate query completions. Our method extracts informative text fragments from the corpus and combines them using a probabilistic graphical model in order to capture the relationships between the extracted concepts. Using this approach, it is possible to automatically complete partial queries with significant suggestions related to the keywords already entered by the user, without requiring the analysis of past queries. We evaluate our system through a user study on two different real-world document collections. The experiments show that our method is able to provide meaningful completions, outperforming the state-of-the-art approach.