Critically Examining the "Neural Hype": Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models
Is neural IR mostly hype? In a recent SIGIR Forum article, Lin expressed
skepticism that neural ranking models were actually improving ad hoc retrieval
effectiveness in limited data scenarios. He provided anecdotal evidence that
authors of neural IR papers demonstrate "wins" by comparing against weak
baselines. This paper provides a rigorous evaluation of those claims in two
ways: First, we conducted a meta-analysis of papers that have reported
experimental results on the TREC Robust04 test collection. We do not find
evidence of an upward trend in effectiveness over time. In fact, the best
reported results are from a decade ago and no recent neural approach comes
close. Second, we applied five recent neural models to rerank the strong
baselines that Lin used to make his arguments. A significant improvement was
observed for one of the models, demonstrating additivity in gains. While there
appears to be merit to neural IR approaches, at least some of the gains
reported in the literature appear illusory.
Comment: Published in the Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019)
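The meta-analysis amounts to extracting effectiveness scores (e.g., average precision on Robust04) and publication years from the surveyed papers, then testing whether reported scores rise over time. A minimal sketch of such a trend test, assuming the (year, score) pairs have already been harvested (illustrative only, not the authors' actual analysis code):

```python
# Illustrative trend test over reported scores; not the paper's analysis code.
from scipy import stats

def effectiveness_trend(results):
    """Fit a least-squares line of reported effectiveness against year.

    `results` is a list of (publication_year, reported_score) pairs, e.g.
    average precision values on Robust04 harvested from surveyed papers.
    A positive slope with a small p-value would indicate an upward trend;
    the paper reports finding no such evidence.
    """
    years, scores = zip(*results)
    slope, intercept, r_value, p_value, stderr = stats.linregress(years, scores)
    return slope, p_value
```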
Training Curricula for Open Domain Answer Re-Ranking
In precision-oriented tasks like answer ranking, it is more important to rank
many relevant answers highly than to retrieve all relevant answers. It follows
that a good ranking strategy would be to learn how to identify the easiest
correct answers first (i.e., assign a high ranking score to answers that have
characteristics that usually indicate relevance, and a low ranking score to
those with characteristics that do not), before incorporating more complex
logic to handle difficult cases (e.g., semantic matching or reasoning). In this
work, we apply this idea to the training of neural answer rankers using
curriculum learning. We propose several heuristics to estimate the difficulty
of a given training sample. We show that the proposed heuristics can be used to
build a training curriculum that down-weights difficult samples early in the
training process. As the training process progresses, our approach gradually
shifts to weighting all samples equally, regardless of difficulty. We present a
comprehensive evaluation of our proposed idea on three answer ranking datasets.
Results show that our approach leads to superior performance of two leading
neural ranking architectures, namely BERT and ConvKNRM, using both pointwise
and pairwise losses. When applied to a BERT-based ranker, our method yields up
to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the model
trained without a curriculum). This results in models that can achieve
comparable performance to more expensive state-of-the-art techniques.
Comment: Accepted at SIGIR 2020 (long paper)
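A minimal sketch of the weighting scheme the abstract describes, assuming each training sample carries a heuristic difficulty estimate in [0, 1] (the function names and the linear interpolation schedule are illustrative, not the paper's exact formulation):

```python
def curriculum_weight(difficulty, step, curriculum_steps):
    """Down-weight difficult samples early, fading to uniform weights.

    `difficulty` is a heuristic estimate in [0, 1] (1 = hardest), e.g.
    derived from how an unsupervised ranker scores the sample.
    """
    progress = min(step / curriculum_steps, 1.0)  # 0 -> 1 over the curriculum
    easy_weight = 1.0 - difficulty                # favor easy samples at first
    # Linear interpolation: heuristic weights early, equal weights (1.0) later.
    return (1.0 - progress) * easy_weight + progress * 1.0

def batch_loss(per_sample_losses, difficulties, step, curriculum_steps):
    """Average per-sample losses under the current curriculum weights."""
    weighted = [curriculum_weight(d, step, curriculum_steps) * loss
                for loss, d in zip(per_sample_losses, difficulties)]
    return sum(weighted) / len(weighted)
```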
Critically Examining the Claimed Value of Convolutions over User-Item Embedding Maps for Recommender Systems
In recent years, algorithm research in the area of recommender systems has
shifted from matrix factorization techniques and their latent factor models to
neural approaches. However, given the proven power of latent factor models,
some newer neural approaches incorporate them within more complex network
architectures. One specific idea, recently put forward by several researchers,
is to consider potential correlations between the latent factors, i.e.,
embeddings, by applying convolutions over the user-item interaction map.
However, contrary to what is claimed in these articles, such interaction maps
do not share the properties of images where Convolutional Neural Networks
(CNNs) are particularly useful. In this work, we show through analytical
considerations and empirical evaluations that the claimed gains reported in the
literature cannot be attributed to the ability of CNNs to model embedding
correlations, as argued in the original papers. Moreover, additional
performance evaluations show that all of the examined recent CNN-based models
are outperformed by existing non-neural machine learning techniques or
traditional nearest-neighbor approaches. On a more general level, our work
points to major methodological issues in recommender systems research.
Comment: Source code available here: https://github.com/MaurizioFD/RecSys2019_DeepLearning_Evaluation
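The idea under examination can be summarized in a few lines: form a d x d "interaction map" as the outer product of a user embedding and an item embedding, then run a CNN over it as if it were an image. A minimal sketch of that architecture (in the spirit of the criticized models, not a reimplementation of any specific one):

```python
import torch
import torch.nn as nn

class OuterProductCNN(nn.Module):
    """Sketch of a CNN over user-item embedding interaction maps."""

    def __init__(self, num_users, num_items, d=64):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, d)
        self.item_emb = nn.Embedding(num_items, d)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=2, stride=2),  # treats the map as an image
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.score = nn.Linear(32, 1)

    def forward(self, user_ids, item_ids):
        u = self.user_emb(user_ids)                         # (batch, d)
        v = self.item_emb(item_ids)                         # (batch, d)
        interaction_map = torch.einsum('bi,bj->bij', u, v)  # outer product
        features = self.cnn(interaction_map.unsqueeze(1))   # add channel dim
        return self.score(features.flatten(1)).squeeze(-1)  # (batch,) scores
```

The paper's critique hinges on the observation that, unlike pixels, the ordering of embedding dimensions is arbitrary, so the "local" patterns a convolution extracts from such a map carry no inherent meaning.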
Replication of collaborative filtering generative adversarial networks on recommender systems
CFGAN and its family of models (TagRec, MTPR, and CRGAN) learn to generate personalized and fake-but-realistic preferences for top-N recommendations using only previous interactions. This work discusses the impact of certain differences between the CFGAN framework and the model used in its original evaluation. The absence of random noise and the use of real user profiles as condition vectors leave the generator prone to learning a degenerate solution in which the output vector is identical to the input vector, thereby behaving essentially as a simple auto-encoder. The work further expands the experimental analysis, comparing CFGAN against a selection of simple, well-known, and properly optimized baselines, and observes that CFGAN is not consistently competitive against them despite its high computational cost.
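The degenerate solution described above is easy to see in code: with no random noise input and the real user profile doubling as the condition vector, nothing stops the generator from converging to the identity mapping. A minimal sketch (illustrative shapes and layers, not the paper's code):

```python
import torch
import torch.nn as nn

n_items = 1000
generator = nn.Sequential(          # G: condition vector -> preference vector
    nn.Linear(n_items, 256), nn.ReLU(),
    nn.Linear(256, n_items), nn.Sigmoid(),
)

real_profile = (torch.rand(32, n_items) < 0.05).float()  # hypothetical batch
condition = real_profile    # the problematic setup: condition == target, no noise z
fake_profile = generator(condition)
# G can fool any discriminator simply by copying its input, i.e. by behaving
# as a plain auto-encoder rather than generating novel preferences.
```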
Replication of recommender systems with impressions
Impressions are a novel data type in Recommender Systems containing the previously exposed items, i.e., what was shown on-screen. Due to their novelty, the current literature lacks both a characterization of impressions and replications of previous experiments. Moreover, prior work has mainly used impressions in industrial contexts or recommender systems competitions, such as the ACM RecSys Challenges. This work is part of an ongoing study about impressions in recommender systems. It presents an evaluation of impressions recommenders on current open datasets, not only comparing their recommendation quality against strong baselines, but also determining whether previous progress claims can be replicated.