2,926 research outputs found

    An Intent Taxonomy of Legal Case Retrieval

    Full text link
    Legal case retrieval is a special Information Retrieval~(IR) task focusing on legal case documents. Depending on the downstream tasks of the retrieved case documents, users' information needs in legal case retrieval could be significantly different from those in Web search and traditional ad-hoc retrieval tasks. While there are several studies that retrieve legal cases based on text similarity, the underlying search intents of legal retrieval users, as shown in this paper, are more complicated than that yet mostly unexplored. To this end, we present a novel hierarchical intent taxonomy of legal case retrieval. It consists of five intent types categorized by three criteria, i.e., search for Particular Case(s), Characterization, Penalty, Procedure, and Interest. The taxonomy was constructed transparently and evaluated extensively through interviews, editorial user studies, and query log analysis. Through a laboratory user study, we reveal significant differences in user behavior and satisfaction under different search intents in legal case retrieval. Furthermore, we apply the proposed taxonomy to various downstream legal retrieval tasks, e.g., result ranking and satisfaction prediction, and demonstrate its effectiveness. Our work provides important insights into the understanding of user intents in legal case retrieval and potentially leads to better retrieval techniques in the legal domain, such as intent-aware ranking strategies and evaluation methodologies.Comment: 28 pages, work in proces

    Intent-aware search result diversification

    Full text link
    Search result diversification has gained momentum as a way to tackle ambiguous queries. An effective approach to this problem is to explicitly model the possible aspects underlying a query, in order to maximise the estimated relevance of the retrieved documents with respect to the different aspects. However, such aspects themselves may represent information needs with rather distinct intents (e.g., informational or navigational). Hence, a diverse ranking could benefit from applying intent-aware retrieval models when estimating the relevance of documents to different aspects. In this paper, we propose to diversify the results retrieved for a given query, by learning the appropriateness of different retrieval models for each of the aspects underlying this query. Thorough experiments within the evaluation framework provided by the diversity task of the TREC 2009 and 2010 Web tracks show that the proposed approach can significantly improve state-of-the-art diversification approaches

    Intent Models for Contextualising and Diversifying Query Suggestions

    Full text link
    The query suggestion or auto-completion mechanisms help users to type less while interacting with a search engine. A basic approach that ranks suggestions according to their frequency in the query logs is suboptimal. Firstly, many candidate queries with the same prefix can be removed as redundant. Secondly, the suggestions can also be personalised based on the user's context. These two directions to improve the aforementioned mechanisms' quality can be in opposition: while the latter aims to promote suggestions that address search intents that a user is likely to have, the former aims to diversify the suggestions to cover as many intents as possible. We introduce a contextualisation framework that utilises a short-term context using the user's behaviour within the current search session, such as the previous query, the documents examined, and the candidate query suggestions that the user has discarded. This short-term context is used to contextualise and diversify the ranking of query suggestions, by modelling the user's information need as a mixture of intent-specific user models. The evaluation is performed offline on a set of approximately 1.0M test user sessions. Our results suggest that the proposed approach significantly improves query suggestions compared to the baseline approach.Comment: A short version of this paper was presented at CIKM 201

    Why People Search for Images using Web Search Engines

    Get PDF
    What are the intents or goals behind human interactions with image search engines? Knowing why people search for images is of major concern to Web image search engines because user satisfaction may vary as intent varies. Previous analyses of image search behavior have mostly been query-based, focusing on what images people search for, rather than intent-based, that is, why people search for images. To date, there is no thorough investigation of how different image search intents affect users' search behavior. In this paper, we address the following questions: (1)Why do people search for images in text-based Web image search systems? (2)How does image search behavior change with user intent? (3)Can we predict user intent effectively from interactions during the early stages of a search session? To this end, we conduct both a lab-based user study and a commercial search log analysis. We show that user intents in image search can be grouped into three classes: Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals different user behavior patterns under these three intents, such as first click time, query reformulation, dwell time and mouse movement on the result page. Based on user interaction features during the early stages of an image search session, that is, before mouse scroll, we develop an intent classifier that is able to achieve promising results for classifying intents into our three intent classes. Given that all features can be obtained online and unobtrusively, the predicted intents can provide guidance for choosing ranking methods immediately after scrolling

    Query Understanding in the Age of Large Language Models

    Full text link
    Querying, conversing, and controlling search and information-seeking interfaces using natural language are fast becoming ubiquitous with the rise and adoption of large-language models (LLM). In this position paper, we describe a generic framework for interactive query-rewriting using LLMs. Our proposal aims to unfold new opportunities for improved and transparent intent understanding while building high-performance retrieval systems using LLMs. A key aspect of our framework is the ability of the rewriter to fully specify the machine intent by the search engine in natural language that can be further refined, controlled, and edited before the final retrieval phase. The ability to present, interact, and reason over the underlying machine intent in natural language has profound implications on transparency, ranking performance, and a departure from the traditional way in which supervised signals were collected for understanding intents. We detail the concept, backed by initial experiments, along with open questions for this interactive query understanding framework.Comment: Accepted to GENIR(SIGIR'23

    Stochastic Query Covering for Fast Approximate Document Retrieval

    Get PDF
    We design algorithms that, given a collection of documents and a distribution over user queries, return a small subset of the document collection in such a way that we can efficiently provide high-quality answers to user queries using only the selected subset. This approach has applications when space is a constraint or when the query-processing time increases significantly with the size of the collection. We study our algorithms through the lens of stochastic analysis and prove that even though they use only a small fraction of the entire collection, they can provide answers to most user queries, achieving a performance close to the optimal. To complement our theoretical findings, we experimentally show the versatility of our approach by considering two important cases in the context of Web search. In the first case, we favor the retrieval of documents that are relevant to the query, whereas in the second case we aim for document diversification. Both the theoretical and the experimental analysis provide strong evidence of the potential value of query covering in diverse application scenarios

    Evaluating the retrieval effectiveness of Web search engines using a representative query sample

    Full text link
    Search engine retrieval effectiveness studies are usually small-scale, using only limited query samples. Furthermore, queries are selected by the researchers. We address these issues by taking a random representative sample of 1,000 informational and 1,000 navigational queries from a major German search engine and comparing Google's and Bing's results based on this sample. Jurors were found through crowdsourcing, data was collected using specialised software, the Relevance Assessment Tool (RAT). We found that while Google outperforms Bing in both query types, the difference in the performance for informational queries was rather low. However, for navigational queries, Google found the correct answer in 95.3 per cent of cases whereas Bing only found the correct answer 76.6 per cent of the time. We conclude that search engine performance on navigational queries is of great importance, as users in this case can clearly identify queries that have returned correct results. So, performance on this query type may contribute to explaining user satisfaction with search engines
    • …
    corecore