19 research outputs found

    DCU-TCD@LogCLEF 2010: re-ranking document collections and query performance estimation

    Get PDF
    This paper describes the collaborative participation of Dublin City University and Trinity College Dublin in LogCLEF 2010. Two sets of experiments were conducted. First, different aspects of the TEL query logs were analysed after extracting user sessions of consecutive queries on a topic. The relation between query length (number of terms) and position within a session (first query or further reformulation) and query performance estimators such as query scope, IDF-based measures, simplified query clarity score, and average inverse document collection frequency was examined. Results of this analysis suggest that only some estimator values correlate with query length or position in the TEL logs (e.g. the similarity score between collection and query). Second, the relation between three attributes was investigated: the user's country (detected from the IP address), the query language, and the interface language. The investigation aimed to explore the influence of these three attributes on the user's collection selection. Moreover, it involved assigning different weights to the three attributes in a scoring function used to re-rank the collections displayed to the user according to language and country. The results of the collection re-ranking show a significant improvement in Mean Average Precision (MAP) over TEL's original collection ranking. The results also indicate that the query language and interface language have more influence than the user's country on the collections selected by users.
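    To make the estimators concrete, here is a minimal sketch of the simplified query clarity score named above, following its common definition as the KL divergence between the query's term distribution and the collection's; the function and argument names (collection_tf, collection_size) are illustrative, not from the paper.

    ```python
    import math
    from collections import Counter

    def simplified_clarity_score(query_terms, collection_tf, collection_size):
        """Simplified query clarity score: KL divergence between the
        maximum-likelihood term distribution of the query and the
        background term distribution of the collection. Higher scores
        suggest a more focused, likely better-performing query."""
        q_counts = Counter(query_terms)
        q_len = len(query_terms)
        score = 0.0
        for term, count in q_counts.items():
            p_q = count / q_len                                 # P(term | query)
            p_c = collection_tf.get(term, 1) / collection_size  # P(term | collection), floor of 1 as crude smoothing
            score += p_q * math.log2(p_q / p_c)
        return score

    # Example: a short, specific query tends to score higher than a generic one.
    score = simplified_clarity_score(
        ["irish", "folk", "music"],
        {"irish": 120, "folk": 300, "music": 5000},
        1_000_000,
    )
    ```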

    Restructuring Sparse High Dimensional Data for Effective Retrieval

    Get PDF
    The task in text retrieval is to find the subset of a collection of documents relevant to a user's information request, usually expressed as a set of words. Classically, documents and queries are represented as vectors of word counts. In its simplest form, relevance is defined as the dot product between a document and a query vector: a measure of the number of common terms. A central difficulty in text retrieval is that the presence or absence of a word is not sufficient to determine relevance to a query. Linear dimensionality reduction has been proposed as a technique for extracting underlying structure from the document collection. In some domains (such as vision) dimensionality reduction reduces computational complexity; in text retrieval it is more often used to improve retrieval performance. We propose an alternative and novel technique that produces sparse representations constructed from sets of highly related words. Documents and queries are represented by their distance to these sets, and relevance is measured by the number of common clusters. This technique significantly improves retrieval performance, is efficient to compute, and shares properties with the optimal linear projection operator and the independent components of documents.
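    As a rough illustration of the contrast drawn above, the sketch below implements classical dot-product relevance alongside a simplified cluster-overlap relevance. The set-based clusters and the min_hits membership threshold are assumptions made for the example; the paper itself represents texts by their distance to word sets.

    ```python
    import numpy as np

    def dot_product_relevance(doc_vec: np.ndarray, query_vec: np.ndarray) -> float:
        """Classical relevance: dot product of word-count vectors,
        i.e., a count of shared terms weighted by their frequencies."""
        return float(np.dot(doc_vec, query_vec))

    def cluster_overlap_relevance(doc_terms, query_terms, clusters, min_hits=1):
        """Simplified cluster-based relevance: map each text to the set
        of word clusters it touches, then count the clusters in common."""
        def active(terms):
            terms = set(terms)
            return {i for i, cluster in enumerate(clusters)
                    if len(cluster & terms) >= min_hits}
        return len(active(doc_terms) & active(query_terms))

    # Example: two word clusters; document and query share the "finance" cluster.
    clusters = [{"bank", "loan", "credit"}, {"river", "shore", "stream"}]
    overlap = cluster_overlap_relevance(
        ["bank", "loan", "approved"], ["credit", "application"], clusters)
    ```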

    A Study of Collection-Based Features for Adapting the Balance Parameter in Pseudo Relevance Feedback.

    Get PDF
    Pseudo-relevance feedback (PRF) is an effective technique for improving ad-hoc retrieval performance. For PRF methods, how to optimize the balance parameter between the original query model and the feedback model is an important but difficult problem. Traditionally, the balance parameter is manually tuned and set to a fixed value across collections and queries. However, owing to differences among collections and individual queries, this parameter should be tuned per query. Recent research has studied various query-based and feedback-document-based features to predict the optimal balance parameter for each query on a specific collection, through a learning approach based on logistic regression. In this paper, we hypothesize that characteristics of collections are also important for the prediction. We propose and systematically investigate a series of collection-based features for queries, feedback documents, and candidate expansion terms. The experiments show that our method is competitive in improving retrieval performance, particularly for cross-collection prediction, in comparison with state-of-the-art approaches.
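    The balance parameter in question is the interpolation weight in the standard PRF query-model update; the sketch below shows that interpolation together with a hypothetical logistic-regression predictor of the kind the abstract describes (the feature names and weights are illustrative, not the paper's).

    ```python
    import math

    def interpolate_query_model(original_model, feedback_model, alpha):
        """Standard PRF update: mix the original query language model
        with the feedback model using balance parameter alpha in [0, 1]."""
        terms = set(original_model) | set(feedback_model)
        return {t: (1 - alpha) * original_model.get(t, 0.0)
                   + alpha * feedback_model.get(t, 0.0)
                for t in terms}

    def predict_alpha(features, weights, bias=0.0):
        """Hypothetical per-query alpha prediction via logistic regression
        over query, feedback-document, and collection features."""
        z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
        return 1.0 / (1.0 + math.exp(-z))

    # Example: predict alpha from illustrative features, then interpolate.
    alpha = predict_alpha({"query_len": 3, "feedback_clarity": 0.7},
                          {"query_len": -0.1, "feedback_clarity": 1.2})
    mixed = interpolate_query_model({"jaguar": 0.6, "car": 0.4},
                                    {"car": 0.5, "engine": 0.5}, alpha)
    ```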

    A statistical significance testing approach for measuring term burstiness with applications to domain-specific terminology extraction

    Full text link
    A term in a corpus is said to be "bursty" (or overdispersed) when its occurrences are concentrated in few out of many documents. In this paper, we propose Residual Inverse Collection Frequency (RICF), a heuristic inspired by statistical significance testing for quantifying term burstiness. The chi-squared test is, to our knowledge, the sole test of statistical significance among existing term burstiness measures. Chi-squared term burstiness scores are computed from the collection frequency statistic (i.e., the proportion that a specified term constitutes in relation to all terms within a corpus). However, the document frequency of a term (i.e., the proportion of documents within a corpus in which the term occurs) is exploited by certain other widely used term burstiness measures. RICF addresses this shortcoming of the chi-squared test by systematically incorporating both the collection frequency and document frequency statistics into its term burstiness scores. We evaluate the RICF measure on a domain-specific technical terminology extraction task using the GENIA Term corpus benchmark, which comprises 2,000 annotated biomedical article abstracts. RICF generally outperformed the chi-squared test in terms of precision at k (P@k), with percentage improvements of 0.00% (P@10), 6.38% (P@50), 6.38% (P@100), 2.27% (P@500), 2.61% (P@1000), and 1.90% (P@5000). Furthermore, RICF's performance was competitive with that of other well-established measures of term burstiness. Based on these findings, we consider our contributions a promising starting point for future exploration of statistical significance testing in text analysis.
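    The abstract does not give the RICF formula itself, so the sketch below only computes the two statistics it contrasts, plus Residual IDF, a well-established burstiness measure that likewise combines them; this is context, not the paper's method. Here docs is assumed to be a list of token lists.

    ```python
    import math

    def burstiness_inputs(term, docs):
        """The two statistics the abstract contrasts: collection frequency
        (the term's share of all tokens) and document frequency (the share
        of documents containing the term)."""
        total_tokens = sum(len(d) for d in docs)
        cf = sum(d.count(term) for d in docs) / total_tokens
        df = sum(1 for d in docs if term in d) / len(docs)
        return cf, df

    def residual_idf(term, docs):
        """Residual IDF: observed IDF minus the IDF expected if the term's
        occurrences followed a Poisson (non-bursty) model. Larger values
        indicate burstier terms."""
        n = len(docs)
        df = sum(1 for d in docs if term in d)
        cf = sum(d.count(term) for d in docs)
        expected_df = 1.0 - math.exp(-cf / n)   # Poisson P(at least one occurrence)
        return -math.log2(df / n) + math.log2(expected_df)
    ```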

    The VisOR System: Testing the Utility of User Interface Components for Feature-based Searching in Video Retrieval Software

    Get PDF
    This study uses a test video retrieval system, VisOR, to assess the value of user interface components that provide feature-based searching over automatically extracted visual and auditory features. In particular, the study attempts to find out a) whether sliders that allow users to adjust the relative weights of individual features improve performance on search tasks, b) which features prove the most useful in conducting normal search tasks, c) whether feature-based searching is difficult for the typical user, and d) whether color- and brightness-based searching enables users to find exact-match shots especially quickly. Seventeen subjects completed 14 search tasks each. For a), it was discovered that the weight sliders had no significant effect on performance. For b), it was found that the keywords, Indoors/Outdoors, and Cityscape/Landscape features proved most useful. For c), user questionnaires indicated no special difficulty or frustration. For d), it was found that users who regularly used the color and brightness components for searching consistently found exact-match shots more quickly than others.
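    The weight sliders tested in a) amount to a user-tunable linear combination of per-feature scores; a minimal sketch of that ranking rule follows, with illustrative feature names (not VisOR's actual interface code).

    ```python
    def weighted_shot_score(shot_features, slider_weights):
        """Rank a shot by the sum of its automatically extracted feature
        scores, each scaled by the user's slider setting for that feature."""
        return sum(slider_weights.get(name, 0.0) * score
                   for name, score in shot_features.items())

    # Example: a user who cares mostly about outdoor cityscape shots.
    score = weighted_shot_score(
        {"outdoors": 0.9, "cityscape": 0.7, "brightness": 0.4},
        {"outdoors": 1.0, "cityscape": 0.8, "brightness": 0.2},
    )
    ```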

    A Comparative Study and Analysis of Query Performance Prediction Algorithms to Improve their Reproducibility

    Get PDF
    One of the primary challenges in Information Retrieval evaluation is the cost of carrying out either online or offline evaluation. Therefore, in recent years several endeavors have been devoted to the Query Performance Prediction (QPP) task. QPP aims to estimate the quality of a system when it is used to retrieve documents in response to a given query, relying on different sources of information such as the query, the documents, or the similarity scores provided by the Information Retrieval system. In recent years several pre- and post-retrieval QPP models have been designed, but they have rarely been tested under the same experimental conditions. The objective of our work is twofold: we develop a unifying framework that includes several state-of-the-art QPP approaches, and we use that framework to assess the reproducibility of those approaches. Our findings show that we are able to achieve a high degree of reproducibility, with fourteen different methods correctly reproduced and performance results comparable to the original ones.
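    As an example of the kind of post-retrieval predictor such a framework would typically include, here is a sketch of Normalized Query Commitment (NQC), a standard QPP baseline: the dispersion of the top-k retrieval scores, normalized by the score of the whole corpus treated as one document. The abstract does not list the fourteen reproduced methods, so this is context rather than the paper's code.

    ```python
    import statistics

    def nqc(top_k_scores, corpus_score):
        """Normalized Query Commitment: the (population) standard
        deviation of the top-k retrieval scores divided by the corpus
        score. Higher dispersion tends to signal a better-performing
        query."""
        return statistics.pstdev(top_k_scores) / corpus_score

    # Example: widely spread top-10 scores yield a higher NQC estimate.
    estimate = nqc([12.1, 11.4, 9.8, 8.0, 7.2, 6.9, 6.1, 5.5, 5.2, 5.0], 4.3)
    ```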