17 research outputs found

    Query performance prediction for information retrieval based on covering topic score

    We present a statistical method called Covering Topic Score (CTS) to predict query performance for information retrieval. Estimation is based on how well the topic of a user's query is covered by documents retrieved from a certain retrieval system. Our approach is conceptually simple and intuitive, and can be easily extended to incorporate features beyond bag-of-words such as phrases and proximity of terms. Experiments demonstrate that CTS significantly correlates with query performance in a variety of TREC test collections, and in particular CTS gains more prediction power benefiting from features of phrases and proximity of terms. We compare CTS with previous state-of-the-art methods for query performance prediction including clarity score and robustness score. Our experimental results show that CTS consistently performs better than, or at least as well as, these other methods. In addition to its high effectiveness, CTS is also shown to have very low computational complexity, meaning that it can be practical for real applications

    Time-Sensitive User Profile for Optimizing Search Personlization

    Thanks to social Web services, Web search engines have the opportunity to offer personalized search results that better fit the user’s information needs and interests. To achieve this goal, many personalized search approaches explore the user’s social Web interactions to extract his preferences and interests, and use them to model his profile. In our approach, the user profile is implicitly represented as a vector of weighted terms which correspond to the user’s interests extracted from his online social activities. As the user’s interests may change over time, we propose to weight profile terms not only according to the content of these activities but also according to their freshness. More precisely, the weights are adjusted with a temporal feature. In order to evaluate our approach, we model the user profile according to data collected from Twitter. Then, we rerank the initial search results according to the user profile. Moreover, we proved the significance of adding a temporal feature by comparing our method with baseline models that do not consider the user profile dynamics.

    Precision prediction based on ranked list coherence

    We introduce a statistical measure of the coherence of a list of documents called the clarity score. Starting with a document list ranked by the query-likelihood retrieval model, we demonstrate the score's relationship to query ambiguity with respect to the collection. We also show that the clarity score is correlated with the average precision of a query and lay the groundwork for useful predictions by discussing a method of setting decision thresholds automatically. We then show that passage-based clarity scores correlate with average-precision measures of ranked lists of passages, where a passage is judged relevant if it contains correct answer text, which extends the basic method to passage-based systems. Next, we introduce variants of document-based clarity scores to improve the robustness, applicability, and predictive ability of clarity scores. In particular, we introduce the ranked list clarity score that can be computed with only a ranked list of documents, and the weighted clarity score where query terms contribute more than other terms. Finally, we show an approach to predicting queries that perform poorly on query expansion that uses techniques expanding on the ideas presented earlier

    Performance prediction in recommender systems

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-22362-4_37 (Proceedings of the 19th International Conference, UMAP 2011, Girona, Spain, July 11-15, 2011). Research on Recommender Systems has barely explored the issue of adapting a recommendation strategy to the user’s information available at a certain time. In this thesis, we introduce a component that allows building dynamic recommendation strategies, by reformulating the performance prediction problem in the area of Information Retrieval to that of recommender systems. More specifically, we investigate a number of adaptations of the query clarity predictor in order to infer the ambiguity in user and item profiles. The properties of each predictor are empirically studied by, first, checking the correlation of the predictor output with a performance measure, and second, by incorporating a performance predictor into a recommender system to produce a dynamic strategy. Depending on how the predictor is integrated with the system, we explore two different applications: dynamic user neighbour weighting and hybrid recommendation. The performance of such dynamic strategies is examined and compared with that of static ones. This work was supported by the Spanish Ministry of Science and Innovation (TIN2008-06566-C04-02) and Dirección General de Universidades e Investigación de la Comunidad de Madrid and Universidad Autónoma de Madrid (CCG10-UAM/TIC-5877).

    Predicting Query Performance by Query-Drift Estimation

    Predicting query performance, that is, the effectiveness of a search performed in response to a query, is a highly important and challenging problem. Our novel approach to addressing this challenge is based on estimating the potential amount of query drift in the result list, i.e., the presence (and dominance) of aspects or topics not related to the query in top-retrieved documents. We argue that query-drift can potentially be estimated by measuring the diversity (e.g., standard deviation) of the retrieval scores of these documents. Empirical evaluation demonstrates the prediction effectiveness of our approach for several retrieval models. Specifically, the prediction success is better, over most tested TREC corpora, than that of state-of-the-art prediction methods
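
    As a rough illustration of the idea in this abstract, the sketch below computes the standard deviation of the retrieval scores of the top-ranked documents. It is only a minimal sketch: the function name `score_dispersion`, the cutoff `k`, and the use of the population standard deviation are assumptions made here, and any normalization the paper may apply is not reproduced.

```python
import statistics


def score_dispersion(scores, k=100):
    """Population standard deviation of the k highest retrieval scores.

    `scores` is any iterable of numeric retrieval scores for one query.
    The cutoff `k` is a hypothetical parameter chosen for illustration.
    """
    top = sorted(scores, reverse=True)[:k]
    if len(top) < 2:
        # Dispersion is not meaningful for fewer than two scores.
        return 0.0
    return statistics.pstdev(top)


if __name__ == "__main__":
    flat = [1.0, 1.0, 1.0, 1.0]
    spread = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    print(score_dispersion(flat), score_dispersion(spread))
```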

    An Empirical Study of Query Specificity

    We analyse the statistical behavior of query-associated quantities in query-logs, namely, the sum and mean of IDF of query terms, otherwise known as query specificity and query mean specificity. We narrow down the possibilities for modeling their distributions to gamma, log-normal, or log-logistic, depending on query length and on whether the sum or the mean is considered. The results have applications in query performance prediction and artificial query generation.
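
    The two quantities named in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions: IDF is taken to be log(N / df), one common definition that the abstract itself does not specify, and the names `query_specificity`, `doc_freq`, and `num_docs` are hypothetical.

```python
import math


def query_specificity(query_terms, doc_freq, num_docs):
    """Return (sum of IDF, mean of IDF) over the query terms.

    Assumes idf(t) = log(num_docs / df(t)); terms with no document
    frequency entry are skipped (an illustrative choice, not the paper's).
    """
    idfs = [
        math.log(num_docs / doc_freq[t])
        for t in query_terms
        if doc_freq.get(t, 0) > 0
    ]
    total = sum(idfs)
    mean = total / len(idfs) if idfs else 0.0
    return total, mean


if __name__ == "__main__":
    df = {"query": 10, "the": 1000}
    print(query_specificity(["query", "the"], df, num_docs=1000))
```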

    Predicting the Performance of Recommender Systems: An Information Theoretic Approach

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-23318-0_5 (Proceedings of the Third International Conference, ICTIR 2011, Bertinoro, Italy, September 12-14, 2011). Performance prediction is an appealing problem in Recommender Systems, as it enables an array of strategies for deciding when to deliver or hold back recommendations based on their foreseen accuracy. The problem, however, has been barely addressed explicitly in the area. In this paper, we propose adaptations of query clarity techniques from ad-hoc Information Retrieval to define performance predictors in the context of Recommender Systems, which we refer to as user clarity. Our experiments show positive results with different user clarity models in terms of the correlation with a single recommender’s performance. Empirical results show significant dependency between this correlation and the recommendation method at hand, as well as competitive results in terms of average correlation. This work was supported by the Spanish Ministry of Science and Innovation (TIN2008-06566-C04-02), University Autónoma de Madrid and the Community of Madrid (CCG10-UAM/TIC-5877).

    A Performance Prediction Approach to Enhance Collaborative Filtering Performance

    Performance prediction has gained increasing attention in the IR field since the middle of the past decade and has become an established research topic in the field. The present work restates the problem in the area of Collaborative Filtering (CF), where it has barely been researched so far. We investigate the adaptation of clarity-based query performance predictors to predict neighbor performance in CF. A predictor is proposed and introduced in a kNN CF algorithm to produce a dynamic variant where neighbor ratings are weighted based on their predicted performance. The properties of the predictor are empirically studied by, first, checking the correlation of the predictor output with a proposed measure of neighbor performance. Then, the performance of the dynamic kNN variant is examined under different sparsity and neighborhood size conditions, where the variant consistently outperforms the baseline algorithm, with increasing difference on small neighborhoods.

    Predicting Neighbor Goodness in Collaborative Filtering

    Performance prediction has gained increasing attention in the IR field since the middle of the past decade and has become an established research topic in the field. The present work restates the problem in the subarea of Collaborative Filtering (CF), where it has barely been researched so far. We investigate the adaptation of clarity-based query performance predictors to define predictors of neighbor performance in CF. The proposed predictors are introduced in a memory-based CF algorithm to produce a dynamic variant where neighbor ratings are weighted based on their predicted performance. The approach is tested with encouraging empirical results, as the dynamic variants consistently outperform the baseline algorithms, with increasing difference on small neighborhoods.