
    Critically Examining the "Neural Hype": Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models

    Is neural IR mostly hype? In a recent SIGIR Forum article, Lin expressed skepticism that neural ranking models were actually improving ad hoc retrieval effectiveness in limited data scenarios. He provided anecdotal evidence that authors of neural IR papers demonstrate "wins" by comparing against weak baselines. This paper provides a rigorous evaluation of those claims in two ways: First, we conducted a meta-analysis of papers that have reported experimental results on the TREC Robust04 test collection. We do not find evidence of an upward trend in effectiveness over time. In fact, the best reported results are from a decade ago and no recent neural approach comes close. Second, we applied five recent neural models to rerank the strong baselines that Lin used to make his arguments. A significant improvement was observed for one of the models, demonstrating additivity in gains. While there appears to be merit to neural IR approaches, at least some of the gains reported in the literature appear illusory.
    Comment: Published in the Proceedings of the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019)

    Training Curricula for Open Domain Answer Re-Ranking

    In precision-oriented tasks like answer ranking, it is more important to rank many relevant answers highly than to retrieve all relevant answers. It follows that a good ranking strategy would be to learn how to identify the easiest correct answers first (i.e., assign a high ranking score to answers that have characteristics that usually indicate relevance, and a low ranking score to those with characteristics that do not), before incorporating more complex logic to handle difficult cases (e.g., semantic matching or reasoning). In this work, we apply this idea to the training of neural answer rankers using curriculum learning. We propose several heuristics to estimate the difficulty of a given training sample. We show that the proposed heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process. As the training process progresses, our approach gradually shifts to weighting all samples equally, regardless of difficulty. We present a comprehensive evaluation of our proposed idea on three answer ranking datasets. Results show that our approach leads to superior performance of two leading neural ranking architectures, namely BERT and ConvKNRM, using both pointwise and pairwise losses. When applied to a BERT-based ranker, our method yields up to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the model trained without a curriculum). This results in models that can achieve comparable performance to more expensive state-of-the-art techniques.
    Comment: Accepted at SIGIR 2020 (long)
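
    The weighting schedule described in this abstract can be illustrated with a minimal sketch. The function name `curriculum_weight`, the `ease` score in [0, 1] (higher means easier), and the linear ramp over `warmup_epochs` are all assumptions made for illustration; the abstract does not state the authors' exact heuristics or formula.

```python
def curriculum_weight(ease: float, epoch: int, warmup_epochs: int) -> float:
    """Return a loss weight for one training sample (hypothetical schedule).

    ease: assumed difficulty heuristic in [0, 1]; 1.0 means "very easy".
    epoch: current training epoch, starting at 0.
    warmup_epochs: number of epochs over which the curriculum is phased out.
    """
    if not 0.0 <= ease <= 1.0:
        raise ValueError("ease must lie in [0, 1]")
    # After the warm-up period every sample is weighted equally.
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return 1.0
    # Early on, difficult samples (low ease) receive a small weight;
    # the weight is interpolated linearly toward 1.0 as training progresses.
    progress = epoch / warmup_epochs
    return ease + progress * (1.0 - ease)


if __name__ == "__main__":
    for epoch in (0, 5, 10):
        print(epoch, curriculum_weight(0.2, epoch, warmup_epochs=10))
```

    In a training loop, such a weight would multiply the per-sample pointwise or pairwise loss before averaging over the batch.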

    Separating the dynamical effects of climate change and ozone depletion. Part I: Southern Hemisphere stratosphere

    A version of the Canadian Middle Atmosphere Model that is coupled to an ocean is used to investigate the separate effects of climate change and ozone depletion on the dynamics of the Southern Hemisphere (SH) stratosphere. This is achieved by performing three sets of simulations extending from 1960 to 2099: 1) greenhouse gases (GHGs) fixed at 1960 levels and ozone depleting substances (ODSs) varying in time, 2) ODSs fixed at 1960 levels and GHGs varying in time, and 3) both GHGs and ODSs varying in time. The response of various dynamical quantities to the GHG and ODS forcings is shown to be additive; that is, trends computed from the sum of the first two simulations are equal to trends from the third. Additivity is shown to hold for the zonal mean zonal wind and temperature, the mass flux into and out of the stratosphere, and the latitudinally averaged wave drag in SH spring and summer, as well as for final warming dates. Ozone depletion and recovery cause seasonal changes in lower-stratosphere mass flux, with reduced polar downwelling in the past followed by increased downwelling in the future in SH spring, and the reverse in SH summer. These seasonal changes are attributed to changes in wave drag caused by ozone-induced changes in the zonal mean zonal winds. Climate change, on the other hand, causes a steady decrease in wave drag during SH spring, which delays the breakdown of the vortex, resulting in increased wave drag in summer

    Replication of collaborative filtering generative adversarial networks on recommender systems

    CFGAN and its family of models (TagRec, MTPR, and CRGAN) learn to generate personalized and fake-but-realistic preferences for top-N recommendations by solely using previous interactions. The work discusses the impact of certain differences between the CFGAN framework and the model used in the original evaluation. The absence of random noise and the use of real user profiles as condition vectors leave the generator prone to learning a degenerate solution in which the output vector is identical to the input vector, so that it behaves essentially as a simple auto-encoder. This work further expands the experimental analysis by comparing CFGAN against a selection of simple, well-known, and properly optimized baselines, observing that CFGAN is not consistently competitive against them despite its high computational cost.

    Critically Examining the Claimed Value of Convolutions over User-Item Embedding Maps for Recommender Systems

    In recent years, algorithm research in the area of recommender systems has shifted from matrix factorization techniques and their latent factor models to neural approaches. However, given the proven power of latent factor models, some newer neural approaches incorporate them within more complex network architectures. One specific idea, recently put forward by several researchers, is to consider potential correlations between the latent factors, i.e., embeddings, by applying convolutions over the user-item interaction map. However, contrary to what is claimed in these articles, such interaction maps do not share the properties of images where Convolutional Neural Networks (CNNs) are particularly useful. In this work, we show through analytical considerations and empirical evaluations that the claimed gains reported in the literature cannot be attributed to the ability of CNNs to model embedding correlations, as argued in the original papers. Moreover, additional performance evaluations show that all of the examined recent CNN-based models are outperformed by existing non-neural machine learning techniques or traditional nearest-neighbor approaches. On a more general level, our work points to major methodological issues in recommender systems research.
    Comment: Source code available here: https://github.com/MaurizioFD/RecSys2019_DeepLearning_Evaluatio

    Replication of recommender systems with impressions

    Impressions are a novel data type in Recommender Systems containing the previously-exposed items, i.e., what was shown on-screen. Because impressions are so new, the current literature lacks both a characterization of them and replications of previous experiments. Also, previous research works have mainly used impressions in industrial contexts or recommender systems competitions, such as the ACM RecSys Challenges. This work is part of an ongoing study about impressions in recommender systems. It presents an evaluation of impressions recommenders on current open datasets, comparing not only the recommendation quality of impressions recommenders against strong baselines, but also determining if previous progress claims can be replicated.

    Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity

    Recent work has revealed many intriguing empirical phenomena in neural network training, despite the poorly understood and highly complex loss landscapes and training dynamics. One of these phenomena, Linear Mode Connectivity (LMC), has gained considerable attention due to the intriguing observation that different solutions can be connected by a linear path in the parameter space while maintaining near-constant training and test losses. In this work, we introduce a stronger notion of linear connectivity, Layerwise Linear Feature Connectivity (LLFC), which says that the feature maps of every layer in different trained networks are also linearly connected. We provide comprehensive empirical evidence for LLFC across a wide range of settings, demonstrating that whenever two trained networks satisfy LMC (via either spawning or permutation methods), they also satisfy LLFC in nearly all the layers. Furthermore, we delve deeper into the underlying factors contributing to LLFC, which reveal new insights into the spawning and permutation approaches. The study of LLFC transcends and advances our understanding of LMC by adopting a feature-learning perspective.
    Comment: 25 pages, 23 figures
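
    The notion of a linear path with near-constant loss, as described in this abstract, can be made concrete with a small sketch. The helper names `interpolate` and `loss_barrier`, the flat-list parameter representation, and the toy loss functions are assumptions for illustration only; they are not taken from the paper, and the barrier definition used here (path loss minus the linearly interpolated endpoint losses) is one common choice among several.

```python
from typing import Callable, List

Params = List[float]


def interpolate(theta_a: Params, theta_b: Params, alpha: float) -> Params:
    """Point on the straight line between two parameter vectors."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(theta_a, theta_b)]


def loss_barrier(loss_fn: Callable[[Params], float],
                 theta_a: Params, theta_b: Params, steps: int = 11) -> float:
    """Largest excess of the path loss over the interpolated endpoint losses.

    A barrier close to zero suggests the two solutions are linearly connected.
    """
    loss_a, loss_b = loss_fn(theta_a), loss_fn(theta_b)
    barrier = 0.0
    for k in range(steps):
        alpha = k / (steps - 1)
        path_loss = loss_fn(interpolate(theta_a, theta_b, alpha))
        baseline = (1.0 - alpha) * loss_a + alpha * loss_b
        barrier = max(barrier, path_loss - baseline)
    return barrier


def bowl(theta: Params) -> float:
    """Toy convex loss: no barrier between any two points."""
    return sum(x * x for x in theta)


def double_well(theta: Params) -> float:
    """Toy non-convex loss with minima at -1 and +1 and a bump at 0."""
    return sum((x * x - 1.0) ** 2 for x in theta)


if __name__ == "__main__":
    print(loss_barrier(bowl, [-1.0, 2.0], [3.0, 0.5]))
    print(loss_barrier(double_well, [-1.0], [1.0]))
```

    For real networks the same loop would run over full weight tensors and evaluate the loss on held-out data; checking LLFC would additionally require comparing intermediate feature maps, which this sketch does not attempt.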