181,020 research outputs found
Hybrid Collaborative Filtering with Autoencoders
Collaborative Filtering aims at exploiting the feedback of users to provide
personalised recommendations. Such algorithms look for latent variables in a
large sparse matrix of ratings. They can be enhanced by adding side information
to tackle the well-known cold start problem. While Neu-ral Networks have
tremendous success in image and speech recognition, they have received less
attention in Collaborative Filtering. This is all the more surprising that
Neural Networks are able to discover latent variables in large and
heterogeneous datasets. In this paper, we introduce a Collaborative Filtering
Neural network architecture aka CFN which computes a non-linear Matrix
Factorization from sparse rating inputs and side information. We show
experimentally on the MovieLens and Douban dataset that CFN outper-forms the
state of the art and benefits from side information. We provide an
implementation of the algorithm as a reusable plugin for Torch, a popular
Neural Network framework
Dual Attention Networks for Visual Reference Resolution in Visual Dialog
Visual dialog (VisDial) is a task which requires an AI agent to answer a
series of questions grounded in an image. Unlike in visual question answering
(VQA), the series of questions should be able to capture a temporal context
from a dialog history and exploit visually-grounded information. A problem
called visual reference resolution involves these challenges, requiring the
agent to resolve ambiguous references in a given question and find the
references in a given image. In this paper, we propose Dual Attention Networks
(DAN) for visual reference resolution. DAN consists of two kinds of attention
networks, REFER and FIND. Specifically, REFER module learns latent
relationships between a given question and a dialog history by employing a
self-attention mechanism. FIND module takes image features and reference-aware
representations (i.e., the output of REFER module) as input, and performs
visual grounding via bottom-up attention mechanism. We qualitatively and
quantitatively evaluate our model on VisDial v1.0 and v0.9 datasets, showing
that DAN outperforms the previous state-of-the-art model by a significant
margin.Comment: EMNLP 201
- …