Evaluation of information retrieval systems using structural equation modeling
The interpretation of the experimental data collected by testing systems across input datasets and model parameters is of strategic importance for system design and implementation. In particular, finding relationships between variables and detecting the latent variables affecting retrieval performance can provide designers, engineers and experimenters with useful, if not necessary, information about how a system is performing. This paper discusses the use of Structural Equation Modeling (SEM) to provide an in-depth explanation of evaluation results and of the failures and successes of a system; in particular, we focus on the evaluation of Information Retrieval systems.
Using Learning to Rank Approach to Promoting Diversity for Biomedical Information Retrieval with Wikipedia
Most traditional information retrieval (IR) models adopt the independent
relevance assumption, under which the relevance of a document is
independent of other documents. The pitfall of this assumption is high redundancy
and low diversity in the retrieval results. This has been observed in many scenarios, especially
in biomedical IR, where the information need behind a single query may cover different
aspects. Promoting diversity in IR takes the relationships between documents into
account. Unlike previous studies, we tackle this problem from the learning-to-rank
perspective. The main challenges are how to find salient features for biomedical data
and how to integrate dynamic features into the ranking model. To address these
challenges, Wikipedia is used to detect the topics of documents in order to generate
diversity-biased features. A combined model is proposed and studied to learn a diversified
ranking result. Experimental results show that the proposed method outperforms the baseline
models.
Automatic text summarization using pathfinder network scaling
Contains an erratum. Master's thesis. Artificial Intelligence and Intelligent Systems. Faculdade de Engenharia, Universidade do Porto; Faculdade de Economia, Universidade do Porto. 200
ExpFinder: An Ensemble Expert Finding Model Integrating N-gram Vector Space Model and μCO-HITS
Finding an expert plays a crucial role in driving successful collaborations
and speeding up high-quality research development and innovations. However, the
rapid growth of scientific publications and digital expertise data makes
identifying the right experts a challenging problem. Existing approaches for
finding experts given a topic can be categorised into information retrieval
techniques based on vector space models, document language models, and
graph-based models. In this paper, we propose ExpFinder, a new
ensemble model for expert finding that integrates a novel N-gram vector
space model, denoted as nVSM, and a graph-based model, denoted as
μCO-HITS, a proposed variation of the CO-HITS algorithm.
The key idea of nVSM is to exploit a recent inverse document frequency weighting
method for N-gram words; ExpFinder incorporates nVSM into
μCO-HITS to achieve expert finding. We comprehensively evaluate
ExpFinder on four different datasets from academic domains in
comparison with six different expert finding models. The evaluation results
show that ExpFinder is a highly effective model for expert finding,
substantially outperforming all the compared models by 19% to 160.2%.
Comment: 15 pages, 18 figures. For source code on GitHub, see
https://github.com/Yongbinkang/ExpFinder. Submitted to IEEE Transactions on
Knowledge and Data Engineering.