DE-PACRR: Exploring Layers Inside the PACRR Model
Recent neural IR models have demonstrated deep learning's utility in ad-hoc
information retrieval. However, deep models have a reputation for being black
boxes, and the roles of a neural IR model's components may not be obvious at
first glance. In this work, we attempt to shed light on the inner workings of a
recently proposed neural IR model, namely the PACRR model, by visualizing the
output of intermediate layers and by investigating the relationship between
intermediate weights and the ultimate relevance score produced. We highlight
several insights, hoping that such insights will be generally applicable.
Comment: Neu-IR 2017 SIGIR Workshop on Neural Information Retrieval
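The layer-by-layer inspection the abstract describes can be sketched with a toy scorer whose intermediate outputs are recorded alongside the final score. This is a minimal illustration only: PACRR's actual layers are 2-D convolutions over a query-document similarity matrix followed by k-max pooling, and the weights here are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer relevance scorer standing in for a neural IR model
# (illustrative weights; not PACRR's real architecture).
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(4,))

def score_with_trace(x):
    # Record each intermediate output so it can be visualized and
    # related back to the final relevance score.
    trace = {}
    trace["hidden"] = np.maximum(x @ W1, 0.0)   # intermediate layer output
    trace["score"] = float(trace["hidden"] @ W2)  # final relevance score
    return trace

trace = score_with_trace(rng.normal(size=8))
```

Plotting `trace["hidden"]` for many query-document pairs, and correlating each unit with `trace["score"]`, is the kind of analysis the paper performs on PACRR's real layers.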
End-to-End Neural Ad-hoc Ranking with Kernel Pooling
This paper proposes K-NRM, a kernel based neural model for document ranking.
Given a query and a set of documents, K-NRM uses a translation matrix that
models word-level similarities via word embeddings, a new kernel-pooling
technique that uses kernels to extract multi-level soft match features, and a
learning-to-rank layer that combines those features into the final ranking
score. The whole model is trained end-to-end. The ranking layer learns desired
feature patterns from the pairwise ranking loss. The kernels transfer the
feature patterns into soft-match targets at each similarity level and enforce
them on the translation matrix. The word embeddings are tuned accordingly so
that they can produce the desired soft matches. Experiments on a commercial
search engine's query log demonstrate the improvements of K-NRM over prior
feature-based and neural-based state-of-the-art methods, and explain the source
of K-NRM's advantage: its kernel-guided embedding encodes a similarity metric
tailored for matching query words to document words, and provides effective
multi-level soft matches.
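The kernel-pooling step described above can be sketched in a few lines: a translation matrix of word-word cosine similarities is summarized by a bank of RBF kernels, one per similarity level. This is a hedged sketch, not the paper's implementation; the kernel centers `mus` and width `sigma` are illustrative values, and `log1p` stands in for the paper's clipped logarithm.

```python
import numpy as np

def knrm_features(query_emb, doc_emb, mus, sigma=0.1):
    # Cosine-normalize embeddings so the translation matrix holds
    # word-word cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    M = q @ d.T  # translation matrix, shape (|query|, |doc|)
    feats = []
    for mu in mus:
        # RBF kernel: soft-counts document words whose similarity to a
        # query word lies near the level mu.
        K = np.exp(-((M - mu) ** 2) / (2 * sigma ** 2))
        # Pool over document words, take log, sum over query words.
        feats.append(np.log1p(K.sum(axis=1)).sum())
    return np.array(feats)

rng = np.random.default_rng(0)
phi = knrm_features(rng.normal(size=(3, 50)),   # 3 query words
                    rng.normal(size=(10, 50)),  # 10 document words
                    mus=[-0.5, 0.0, 0.5, 1.0])
```

In the full model, a learning-to-rank layer maps the feature vector `phi` to a relevance score, and the pairwise ranking loss is backpropagated through the kernels into the embeddings.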
The Deep Weight Prior
Bayesian inference is known to provide a general framework for incorporating
prior knowledge or specific properties into machine learning models via
carefully choosing a prior distribution. In this work, we propose a new type of
prior distribution for convolutional neural networks, the deep weight prior
(DWP), which exploits generative models to encourage a specific structure in
trained convolutional filters, e.g., spatial correlations of weights. We define
DWP in the form of an implicit distribution and propose a method for
variational inference with such implicit priors. In experiments, we show that DWP
improves the performance of Bayesian neural networks when training data are
limited, and initialization of weights with samples from DWP accelerates
training of conventional convolutional neural networks.
Comment: TL;DR: The deep weight prior learns a generative model for kernels of
convolutional neural networks that acts as a prior distribution while
training on a new dataset.
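The initialization use case from the abstract can be sketched as follows: fit a density model to a bank of filters from previously trained networks, then sample fresh kernels from it to initialize a new layer. This is only an analogue under simplifying assumptions: the paper learns an *implicit* prior with a deep generative model, whereas here a full-covariance Gaussian stands in, and the filter bank is random data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bank of flattened 3x3 filters collected from previously
# trained CNNs (random here; real filters would show spatial structure).
source_filters = rng.normal(size=(500, 9))

# Stand-in "learned prior": a full-covariance Gaussian fit to the bank.
mu = source_filters.mean(axis=0)
cov = np.cov(source_filters, rowvar=False)

def sample_prior_kernels(n):
    # Draw fresh 3x3 kernels from the prior, e.g. to initialize a new
    # convolutional layer when training data are limited.
    flat = rng.multivariate_normal(mu, cov, size=n)
    return flat.reshape(n, 3, 3)

kernels = sample_prior_kernels(16)  # init for a 16-filter conv layer
```

Because the sampled kernels inherit the correlation structure of the source filters, training from this initialization starts closer to plausible filter shapes than training from isotropic noise.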
Learning Convolutional Text Representations for Visual Question Answering
Visual question answering is a recently proposed artificial intelligence task
that requires a deep understanding of both images and texts. In deep learning,
images are typically modeled through convolutional neural networks, and texts
are typically modeled through recurrent neural networks. While the requirement
for modeling images is similar to traditional computer vision tasks, such as
object recognition and image classification, visual question answering raises a
different need for textual representation as compared to other natural language
processing tasks. In this work, we perform a detailed analysis on natural
language questions in visual question answering. Based on the analysis, we
propose to rely on convolutional neural networks for learning textual
representations. By exploring the various properties of convolutional neural
networks specialized for text data, such as width and depth, we present our
"CNN Inception + Gate" model. We show that our model improves question
representations and thus the overall accuracy of visual question answering
models. We also show that the text representation requirement in visual
question answering is more complicated and comprehensive than that in
conventional natural language processing tasks, making it a better task to
evaluate textual representation methods. Shallow models like fastText, which
can achieve results comparable to deep learning models on tasks like text
classification, are not suitable for visual question answering.
Comment: Conference paper at SDM 2018. https://github.com/divelab/sva
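A convolutional question encoder with gating can be sketched as below. This loosely mirrors the "CNN Inception + Gate" idea, i.e. parallel convolutions of several widths, each modulated by a sigmoid gate (GLU-style) and max-pooled over time, then concatenated; the widths, filter count, and random weights are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def gated_conv_encode(x, widths=(2, 3), n_filters=4):
    # x: (seq_len, emb_dim) word embeddings of a question.
    seq_len, emb = x.shape
    pooled = []
    for w in widths:
        W_f = rng.normal(size=(w * emb, n_filters))  # feature weights
        W_g = rng.normal(size=(w * emb, n_filters))  # gate weights
        outs = []
        for i in range(seq_len - w + 1):
            window = x[i:i + w].ravel()
            f = window @ W_f
            g = 1.0 / (1.0 + np.exp(-(window @ W_g)))  # sigmoid gate
            outs.append(f * g)
        pooled.append(np.max(outs, axis=0))  # max-pool over time
    # Concatenate per-width vectors into one question representation.
    return np.concatenate(pooled)

q = rng.normal(size=(7, 16))   # a 7-word question, 16-dim embeddings
vec = gated_conv_encode(q)     # fixed-size question vector
```

The fixed-size vector `vec` would then be fused with the image features in a downstream VQA model.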