
    An Efficient Top-k Query Scheme Based on Multilayer Grouping

    The top-k query finds the k items with the highest scores in a candidate dataset. Sorting is a common way to obtain top-k results, but most existing methods are not efficient enough. To address this issue, we propose an efficient top-k query scheme based on multilayer grouping. First, we find the reference item by computing the average score of the candidate dataset. Second, we partition the candidate dataset into three sets based on the reference item: a winner set, a middle set, and a loser set. Third, we further partition the winner set into a second layer of three sets according to the value of k, and so on, until the size of the winner set is close to k. If k is larger than the size of the winner set, we return the entire winner set to the user as part of the top-k results almost without sorting, and we likewise return the highest-scoring items from the middle set almost without sorting. These innovations reduce the amount of sorting to nearly the minimum. Experimental results show that our scheme significantly outperforms the current classical method in both memory consumption and top-k query performance.
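
    The steps above translate almost directly into code. The following is a minimal Python sketch of the grouping idea, under one assumption the abstract does not spell out: that the three-way split places items strictly above the average score in the winner set, items equal to it in the middle set, and the rest in the loser set.

```python
def top_k(items, k, score=lambda x: x):
    """Top-k by multilayer grouping: recursively partition around the
    mean score instead of fully sorting (sketch; split rule assumed)."""
    if k <= 0:
        return []
    if len(items) <= k:
        return list(items)

    avg = sum(score(x) for x in items) / len(items)  # the reference item
    winners = [x for x in items if score(x) > avg]   # first-layer winner set
    middle = [x for x in items if score(x) == avg]
    losers = [x for x in items if score(x) < avg]

    if len(winners) >= k:
        # Winner set still too large: group it again (the next layer).
        return top_k(winners, k, score)

    # k exceeds the winner set: return it whole (no sorting), then top up
    # from the middle set (all tied at avg) and, if needed, the losers.
    result = winners + middle[: k - len(winners)]
    if len(result) < k:
        result += top_k(losers, k - len(result), score)
    return result
```

    Because no item can score strictly above the mean when all scores are equal, the winner set is always a strict subset of its layer, so the recursion terminates.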

    Offline Evaluation via Human Preference Judgments: A Dueling Bandits Problem

    The dramatic improvements in core information retrieval tasks engendered by neural rankers create a need for novel evaluation methods. If every ranker returns highly relevant items in the top ranks, it becomes difficult to recognize meaningful differences between them and to build reusable test collections. Several recent papers explore pairwise preference judgments as an alternative to traditional graded relevance assessments. Rather than viewing items one at a time, assessors view items side by side and indicate which one provides the better response to a query, allowing fine-grained distinctions. If we employ preference judgments to identify the probably best items for each query, we can measure rankers by their ability to place these items as high as possible. I frame the problem of finding the best items as a dueling bandits problem. While many papers explore dueling bandits for online ranker evaluation via interleaving, they have not been considered as a framework for offline evaluation via human preference judgments. I review the literature for possible solutions. For human preference judgments, any usable algorithm must tolerate ties, since two items may appear nearly equal to assessors, and it must minimize the number of judgments required for any specific pair, since each such comparison requires an independent assessor. Since the theoretical guarantees provided by most algorithms depend on assumptions that are not satisfied by human preference judgments, I simulate selected algorithms on representative test cases to provide insight into their practical utility. Compared with the earlier paper presented at SIGIR 2022 [87], this work includes more theoretical analysis and experimental results. Based on the simulations, two algorithms stand out for their potential. I proceed with the method of Clarke et al. [20], and the simulations suggest modifications that further improve its performance. Using the modified algorithm, over 10,000 preference judgments were collected for pools derived from submissions to the TREC 2021 Deep Learning Track, confirming its suitability. I test the idea of best-item evaluation and suggest directions for further theoretical and practical progress.
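
    The two constraints called out above (tolerating ties and spending at most one judgment per pair) are easy to see in miniature. The sketch below is not the method of Clarke et al. [20] used in this work; it is a generic knockout-style procedure, with a hypothetical judge(a, b) standing in for an independent human assessor.

```python
import random

def best_items(items, judge):
    """Tie-tolerant knockout sketch: one judgment per pair per round;
    a tie keeps both items alive.  `judge` is a hypothetical stand-in
    for a human assessor and returns "a", "b", or "tie"."""
    pool = list(items)
    while len(pool) > 1:
        random.shuffle(pool)                   # fresh pairings each round
        survivors = []
        for i in range(0, len(pool) - 1, 2):
            a, b = pool[i], pool[i + 1]
            verdict = judge(a, b)
            if verdict == "tie":
                survivors.extend([a, b])       # tolerate ties: keep both
            else:
                survivors.append(a if verdict == "a" else b)
        if len(pool) % 2:
            survivors.append(pool[-1])         # odd item out gets a bye
        if len(survivors) == len(pool):        # no progress (all ties):
            break                              # return the near-best set
        pool = survivors
    return pool
```

    Returning a set rather than a single item matches the goal of identifying the probably best items when assessors cannot separate the front-runners.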

    Engineering Crowdsourced Stream Processing Systems

    A crowdsourced stream processing (CSP) system is a system that incorporates crowdsourced tasks in the processing of a data stream. This can be seen as enabling crowdsourcing work to be applied to a sample of large-scale data at high speed, or equivalently, enabling stream processing to employ human intelligence. It also leads to a substantial expansion of the capabilities of data processing systems. Engineering a CSP system requires combining human and machine computation elements. From a general systems theory perspective, this means taking into account inherited as well as emerging properties from both of these elements. In this paper, we position CSP systems within a broader taxonomy, outline a series of design principles and evaluation metrics, present an extensible framework for their design, and describe several design patterns. We showcase the capabilities of CSP systems through a case study that applies our proposed framework to the design and analysis of a real system (AIDR) that classifies social media messages during time-critical crisis events. Results show that, compared to a pure stream processing system, AIDR achieves higher data classification accuracy, while compared to a pure crowdsourcing solution, it makes better use of human workers by requiring considerably less manual effort.
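
    One design pattern such a system can follow is machine-first classification with crowd escalation, which is broadly how a system like AIDR bounds manual effort. The Python sketch below illustrates that pattern only; the names, the confidence threshold, and the classifier interface are assumptions, not the paper's framework.

```python
from queue import Queue

CONFIDENCE_THRESHOLD = 0.8  # assumed tunable design parameter

def process_stream(messages, classifier, crowd_queue: Queue):
    """Hybrid CSP sketch: the machine classifier labels every streamed
    message; only low-confidence items become crowdsourced tasks."""
    for msg in messages:
        label, confidence = classifier(msg)   # assumed (label, score) API
        if confidence >= CONFIDENCE_THRESHOLD:
            yield msg, label                  # machine path: fast, cheap
        else:
            crowd_queue.put(msg)              # human path: a crowd worker
                                              # labels it; the answer can
                                              # also retrain the classifier
```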

    Spatial Keyword Querying: Ranking Evaluation and Efficient Query Processing


    Going Beyond Relevance: Role of Effort in Information Retrieval

    Get PDF
    The primary focus of Information Retrieval (IR) systems has been to optimize for relevance. Existing approaches to ranking documents or evaluating IR systems do not account for "user effort": judges currently only determine whether the information in a given document would satisfy the information need underlying a query. This mechanism of obtaining relevance judgments ignores the time and effort an end user must put forth to consume a document's content. While a judge may spend a long time assessing a document, an impatient user may not devote the same time and effort. The problem is exacerbated on smaller devices: on mobile phones or tablets, where interaction is limited, users may not put much effort into finding information. This thesis characterizes and incorporates effort in Information Retrieval. A comparison of explicit and implicit relevance judgments across several datasets reveals that certain documents are marked relevant by judges but are of low utility to an end user. Experiments indicate that document-level effort features can reliably predict the mismatch between the dwell time and judging time of documents. Explicit and preference-based judgments were collected to determine which factors associated with effort agreed most with user satisfaction; the ability to locate relevant information, or findability, showed the highest agreement with preference judgments. Findability judgments were also gathered to study how annotator, query, and document properties relate to effort judgments. We also investigate how existing systems can be optimized for both relevance and effort. Finally, we investigate the role of effort on smaller devices with the help of cost-benefit models.
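
    As a concrete illustration of the cost-benefit framing mentioned at the end, a ranker can trade relevance against predicted effort with a single penalty term. The sketch below is illustrative only and is not the model from the thesis; relevance, effort, and the weight lam are hypothetical inputs.

```python
def rerank_by_utility(docs, relevance, effort, lam=0.5):
    """Cost-benefit re-ranking sketch: score = benefit - effort penalty,
    so a hard-to-consume document can fall below a slightly less
    relevant but easier one.  All inputs are hypothetical."""
    return sorted(docs,
                  key=lambda d: relevance[d] - lam * effort[d],
                  reverse=True)
```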