From Evaluating to Forecasting Performance: How to Turn Information Retrieval, Natural Language Processing and Recommender Systems into Predictive Sciences
We describe the state-of-the-art in performance modeling and prediction for Information Retrieval
(IR), Natural Language Processing (NLP) and Recommender Systems (RecSys) along with its
shortcomings and strengths. We present a framework for further research, identifying five major
problem areas: understanding measures, performance analysis, making underlying assumptions
explicit, identifying application features determining performance, and the development of prediction
models describing the relationship between assumptions, features and resulting performance.
Relevance Assessments for Web Search Evaluation: Should We Randomise or Prioritise the Pooled Documents? (CORRECTED VERSION)
In the context of depth-k pooling for constructing web search test
collections, we compare two approaches to ordering pooled documents for
relevance assessors: the prioritisation strategy (PRI) used widely at NTCIR,
and the simple randomisation strategy (RND). In order to address research
questions regarding PRI and RND, we have constructed and released the WWW3E8
data set, which contains eight independent relevance labels for 32,375
topic-document pairs, i.e., a total of 259,000 labels. Four of the eight
relevance labels were obtained from PRI-based pools; the other four were
obtained from RND-based pools. Using WWW3E8, we compare PRI and RND in terms of
inter-assessor agreement, system ranking agreement, and robustness to new
systems that did not contribute to the pools. We also utilise an assessor
activity log we obtained as a byproduct of WWW3E8 to compare the two strategies
in terms of assessment efficiency.
Comment: 30 pages. This is a corrected version of an open-access TOIS paper (https://dl.acm.org/doi/pdf/10.1145/3494833)
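The pooling setup described in this abstract can be illustrated with a minimal sketch. It builds a depth-k pool from several runs and orders it either randomly (RND) or by a simple prioritisation heuristic. The abstract does not state the exact PRI rule, so the sort key used here (number of contributing runs, then sum of ranks) is an assumption, and the function names and toy runs are hypothetical.

```python
import random
from collections import defaultdict


def depth_k_pool(runs, k):
    """Union of the top-k documents of every run, with simple rank statistics."""
    stats = defaultdict(lambda: [0, 0])  # doc -> [number of runs, sum of ranks]
    for ranking in runs:
        for rank, doc in enumerate(ranking[:k], start=1):
            stats[doc][0] += 1
            stats[doc][1] += rank
    return dict(stats)


def order_rnd(pool, seed=0):
    """RND: present pooled documents to assessors in random order."""
    docs = sorted(pool)  # fixed starting order so the shuffle is reproducible
    random.Random(seed).shuffle(docs)
    return docs


def order_pri(pool):
    """PRI (assumed heuristic): documents returned by more runs come first;
    ties are broken by the smaller sum of ranks, then by document id."""
    return sorted(pool, key=lambda d: (-pool[d][0], pool[d][1], d))


# Hypothetical toy runs for a single topic.
runs = [["d1", "d2", "d3"], ["d2", "d4", "d1"], ["d2", "d5", "d6"]]
pool = depth_k_pool(runs, k=2)
pri_order = order_pri(pool)
rnd_order = order_rnd(pool)
```

With these toy runs, d2 is returned in the top two by all three runs and is therefore shown first under the assumed PRI rule, while RND ignores that signal entirely.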
Current Challenges and Visions in Music Recommender Systems Research
Music recommender systems (MRS) have experienced a boom in recent years,
thanks to the emergence and success of online streaming services, which
nowadays make available almost all music in the world at the user's fingertips.
While today's MRS considerably help users to find interesting music in these
huge catalogs, MRS research is still facing substantial challenges. In
particular, when it comes to building, incorporating, and evaluating recommendation
strategies that integrate information beyond simple user-item interactions or
content-based descriptors and dig deep into the very essence of listener
needs, preferences, and intentions, MRS research becomes a big endeavor and
related publications are quite sparse.
The purpose of this trends and survey article is twofold. We first identify
and shed light on what we believe are the most pressing challenges MRS research
is facing, from both academic and industry perspectives. We review the state of
the art towards solving these challenges and discuss its limitations. Second,
we detail possible future directions and visions we contemplate for the further
evolution of the field. The article should therefore serve two purposes: giving
the interested reader an overview of current challenges in MRS research and
providing guidance for young researchers by identifying interesting, yet
under-researched, directions in the field.
"Why Should I Review This Paper?" Unifying Semantic, Topic, and Citation Factors for Paper-Reviewer Matching
As many academic conferences are overwhelmed by a rapidly increasing number
of paper submissions, automatically finding appropriate reviewers for each
submission becomes a more urgent need than ever. Various factors have been
considered by previous attempts on this task to measure the expertise relevance
between a paper and a reviewer, including whether the paper is semantically
close to, shares topics with, and cites previous papers of the reviewer.
However, the majority of previous studies take only one of these factors into
account, leading to an incomplete evaluation of paper-reviewer relevance.
To bridge this gap, in this paper, we propose a unified model for
paper-reviewer matching that jointly captures semantic, topic, and citation
factors. In the unified model, a contextualized language model backbone is
shared by all factors to learn common knowledge, while instruction tuning is
introduced to characterize the uniqueness of each factor by producing
factor-aware paper embeddings. Experiments on four datasets (one of which is
newly contributed by us) across different fields, including machine learning,
computer vision, information retrieval, and data mining, consistently validate
the effectiveness of our proposed UniPR model in comparison with
state-of-the-art paper-reviewer matching methods and scientific pre-trained
language models.
Neural Vector Spaces for Unsupervised Information Retrieval
We propose the Neural Vector Space Model (NVSM), a method that learns
representations of documents in an unsupervised manner for news article
retrieval. In the NVSM paradigm, we learn low-dimensional representations of
words and documents from scratch using gradient descent and rank documents
according to their similarity with query representations that are composed from
word representations. We show that NVSM performs better at document ranking
than existing latent semantic vector space methods. The addition of NVSM to a
mixture of lexical language models and a state-of-the-art baseline vector space
model yields a statistically significant increase in retrieval effectiveness.
Consequently, NVSM adds a complementary relevance signal. Next to semantic
matching, we find that NVSM performs well in cases where lexical matching is
needed.
NVSM learns a notion of term specificity directly from the document
collection without feature engineering. We also show that NVSM learns
regularities related to Luhn significance. Finally, we give advice on how to
deploy NVSM in situations where model selection (e.g., cross-validation) is
infeasible. We find that an unsupervised ensemble of multiple models trained
with different hyperparameter values performs better than a single
cross-validated model. Therefore, NVSM can safely be used for ranking documents
without supervised relevance judgments.
Comment: TOIS 201
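The ranking idea in this abstract (compose a query representation from word representations, then rank documents by similarity) can be sketched as follows. This is an illustration only, not the NVSM implementation: averaging word vectors and using cosine similarity are assumptions, the vectors are hand-picked toy values rather than learned ones, and all names are hypothetical.

```python
import math


def compose_query(query_terms, word_vecs):
    """Average the vectors of the known query terms (assumed composition)."""
    dim = len(next(iter(word_vecs.values())))
    vecs = [word_vecs[t] for t in query_terms if t in word_vecs]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(query_terms, word_vecs, doc_vecs):
    """Return document ids sorted by decreasing similarity to the query."""
    q = compose_query(query_terms, word_vecs)
    return sorted(doc_vecs, key=lambda d: cosine(q, doc_vecs[d]), reverse=True)


# Toy 2-dimensional vectors standing in for learned representations.
word_vecs = {"stock": [1.0, 0.0], "market": [0.8, 0.2], "football": [0.0, 1.0]}
doc_vecs = {"finance_news": [0.9, 0.1], "sports_news": [0.1, 0.9]}

finance_ranking = rank_documents(["stock", "market"], word_vecs, doc_vecs)
sports_ranking = rank_documents(["football"], word_vecs, doc_vecs)
```

In the model described by the abstract, the word and document vectors are learned from the collection by gradient descent; here they are fixed by hand only so the ranking step can be shown in isolation.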
Bias-Aware Design for Informed Decisions: Raising Awareness of Self-Selection Bias in User Ratings and Reviews
People often take user ratings and reviews into consideration when shopping
for products or services online. However, such user-generated data contains
self-selection bias that could affect people's decisions, and it is hard to
resolve this issue completely by algorithms. In this work, we propose to raise
the awareness of the self-selection bias by making three types of information
concerning user ratings and reviews transparent. We distill these three pieces
of information (reviewers' experience, the extremity of emotion, and reported
aspects) from the definition of self-selection bias and exploration of related
literature. We further conduct an online survey to assess the perceptions of
the usefulness of such information and identify the exact facets people care
about in their decision process. Then, we propose a visual design to make such
details behind user reviews transparent and integrate the design into an
experimental website for evaluation. The results of a between-subjects study
demonstrate that our bias-aware design significantly increases users' awareness of
bias and their satisfaction with decision-making. We further offer a series of
design implications for improving information transparency and awareness of
bias in user-generated content.