Predicting SPARQL Query Performance

Author: Gandon Fabien
Hasan Rakebul
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

We address the problem of predicting SPARQL query performance. We use machine learning techniques to learn SPARQL query performance from previously executed queries. We show how to model SPARQL queries as feature vectors, and use k-nearest neighbors regression and Support Vector Machine with the nu-SVR kernel to accurately (R^2 value of 0.98526) predict SPARQL query execution time.
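
The k-nearest neighbors regression idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the feature layout (triple patterns, joins, FILTER clauses), the sample timings, and the function name `knn_predict` are all assumptions made for illustration.

```python
import math

def knn_predict(train, query_vec, k=3):
    """Predict a target as the mean target of the k nearest training examples."""
    # Sort past examples by Euclidean distance to the new feature vector.
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], query_vec))[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical history of executed queries:
# (triple patterns, joins, FILTER clauses) -> execution time in seconds.
history = [
    ((1, 0, 0), 0.10),
    ((2, 1, 0), 0.30),
    ((3, 2, 1), 1.20),
    ((5, 4, 2), 4.00),
    ((6, 5, 2), 5.00),
]

# Estimate the execution time of an unseen query from its 2 nearest neighbors.
predicted = knn_predict(history, (3, 2, 0), k=2)
print(f"predicted execution time: {predicted:.2f} s")
```

In practice the features would likely need scaling so that no single dimension dominates the distance.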

CiteSeerX

Crossref

HAL-UNICE

INRIA a CCSD electronic archive server

HAL Descartes

HAL-Rennes 1

IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models

Author: Gong Yu
Wang Benyou
Wang Jun
Xu Yinghui
Yu Lantao
Zhang Dell
Zhang Peng
Zhang Weinan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/08/2017

This paper provides a unified account of two schools of thinking in information retrieval modelling: the generative retrieval focusing on predicting relevant documents given a query, and the discriminative retrieval focusing on predicting relevancy given a query-document pair. We propose a game theoretical minimax game to iteratively optimise both models. On one hand, the discriminative model, aiming to mine signals from labelled and unlabelled data, provides guidance to train the generative model towards fitting the underlying relevance distribution over documents given the query. On the other hand, the generative model, acting as an attacker to the current discriminative model, generates difficult examples for the discriminative model in an adversarial way by minimising its discrimination objective. With the competition between these two models, we show that the unified framework takes advantage of both schools of thinking: (i) the generative model learns to fit the relevance distribution over documents via the signals from the discriminative model, and (ii) the discriminative model is able to exploit the unlabelled data selected by the generative model to achieve a better estimation for document ranking. Our experimental results have demonstrated significant performance gains as much as 23.96% on Precision@5 and 15.50% on MAP over strong baselines in a variety of applications including web search, item recommendation, and question answering. Comment: 12 pages; appendix adde

arXiv.org e-Print Archive

Crossref

UCL Discovery

Birkbeck Institutional Research Online

Voting for candidates: adapting data fusion techniques for an expert search task

Author: MacDonald C.
Ounis I.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2006

In an expert search task, the users' need is to identify people who have relevant expertise to a topic of interest. An expert search system predicts and ranks the expertise of a set of candidate persons with respect to the users' query. In this paper, we propose a novel approach for predicting and ranking candidate expertise with respect to a query. We see the problem of ranking experts as a voting problem, which we model by adapting eleven data fusion techniques. We investigate the effectiveness of the voting approach and the associated data fusion techniques across a range of document weighting models, in the context of the TREC 2005 Enterprise track. The evaluation results show that the voting paradigm is very effective, without using any collection-specific heuristics. Moreover, we show that improving the quality of the underlying document representation can significantly improve the retrieval performance of the data fusion techniques on an expert search task. In particular, we demonstrate that applying field-based weighting models improves the ranking of candidates. Finally, we demonstrate that the relative performance of the adapted data fusion techniques for the proposed approach is stable regardless of the weighting models used.
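
The voting view of expert search described in the abstract above can be sketched as follows: each retrieved document counts as a vote for the candidates associated with it, and a data fusion rule aggregates the votes. The three rules shown (vote count, CombSUM, CombMNZ) are classic data fusion rules chosen here for illustration; the abstract does not list the eleven techniques, and the document scores, candidate names, and `rank_candidates` function are hypothetical.

```python
from collections import defaultdict

def rank_candidates(doc_scores, doc_candidates, method="combsum"):
    """Rank candidates by aggregating the scores of documents linked to them."""
    votes = defaultdict(list)
    for doc, score in doc_scores.items():
        # Each retrieved document votes for every candidate associated with it.
        for candidate in doc_candidates.get(doc, ()):
            votes[candidate].append(score)

    if method == "votes":        # number of retrieved documents per candidate
        agg = {c: float(len(s)) for c, s in votes.items()}
    elif method == "combsum":    # sum of document scores
        agg = {c: sum(s) for c, s in votes.items()}
    elif method == "combmnz":    # sum of scores times number of documents
        agg = {c: len(s) * sum(s) for c, s in votes.items()}
    else:
        raise ValueError(f"unknown method: {method}")

    # Highest aggregate first; ties are broken alphabetically for determinism.
    return sorted(agg, key=lambda c: (-agg[c], c))

# Hypothetical retrieval scores and document-to-candidate associations.
doc_scores = {"d1": 1.2, "d2": 0.6, "d3": 0.4}
doc_candidates = {"d1": ["alice"], "d2": ["bob"], "d3": ["bob", "carol"]}

combsum_ranking = rank_candidates(doc_scores, doc_candidates, "combsum")
combmnz_ranking = rank_candidates(doc_scores, doc_candidates, "combmnz")
print(combsum_ranking, combmnz_ranking)
```

With this toy data the two rules disagree on the top candidate, which shows why the choice of fusion technique matters.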

Enlighten

Predicting IR Personalization Performance using Pre-retrieval Query Predictors

Author: Campos Ibáñez Luis Miguel
Fernández Luna Juan Manuel
Huete Juan F.
Vicente-López Eduardo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/01/2018

Personalization generally improves the performance of queries, but in a few cases it may also harm it. If we are able to predict and therefore to disable personalization for those situations, the overall performance will be higher and users will be more satisfied with personalized systems. We use some state-of-the-art pre-retrieval query performance predictors and propose some others including the user profile information for the previous purpose. We study the correlations among these predictors and the difference between the personalized and the original queries. We also use classification and regression techniques to improve the results and finally reach a bit more than one third of the maximum ideal performance. We think this is a good starting point within this research line, which certainly needs more effort and improvements. This work has been supported by the Spanish Andalusian “Consejería de Innovación, Ciencia y Empresa” postdoctoral phase of project P09-TIC-4526, the Spanish “Ministerio de Economía y Competitividad” projects TIN2013-42741-P and TIN2016-77902-C3-2-P, and the European Regional Development Fund (ERDF-FEDER).

Crossref

Repositorio Institucional Universidad de Granada

Predicting IR Personalization Performance using Pre-retrieval Query Predictors

Author: de Campos Luis M.
Fernández-Luna Juan M.
Huete Juan F.
Vicente-López Eduardo
Publication date: 24/01/2024

arXiv.org e-Print Archive

Using a Medical Thesaurus to Predict Query Difficulty

Author: Boudin Florian
Dawes Martin
Nie Jian-Yun
Publication venue: HAL CCSD
Publication date: 01/04/2012

Estimating query performance is the task of predicting the quality of results returned by a search engine in response to a query. In this paper, we focus on pre-retrieval prediction methods for the medical domain. We propose a novel predictor that exploits a thesaurus to ascertain how difficult queries are. In our experiments, we show that our predictor outperforms the state-of-the-art methods that do not use a thesaurus.