An Adaptive Locally Connected Neuron Model: Focusing Neuron
This paper presents a new artificial neuron model capable of learning its
receptive field in the topological domain of inputs. The model provides
adaptive and differentiable local connectivity (plasticity) applicable to any
domain. It requires no other tool than the backpropagation algorithm to learn
its parameters which control the receptive field locations and apertures. This
research explores whether this ability makes the neuron focus on informative
inputs and yields any advantage over fully connected neurons. The experiments
include tests of focusing neuron networks of one or two hidden layers on
synthetic and well-known image recognition data sets. The results demonstrated
that the focusing neurons can move their receptive fields towards more
informative inputs. In the simple two-hidden layer networks, the focusing
layers outperformed the dense layers in the classification of the 2D spatial
data sets. Moreover, the focusing networks performed better than the dense
networks even when 70% of the weights were pruned. The tests on
convolutional networks revealed that using focusing layers instead of dense
layers for the classification of convolutional features may work better in some
data sets.
Comment: 45 pages, a national patent filed, submitted to Turkish Patent
Office, No: -2017/17601, Date: 09.11.201
Learning to Skim Text
Recurrent Neural Networks are showing much promise in many sub-areas of
natural language processing, ranging from document classification to machine
translation to automatic question answering. Despite their promise, many
recurrent models have to read the whole text word by word, making it slow to
handle long documents. For example, it is difficult to use a recurrent network
to read a book and answer questions about it. In this paper, we present an
approach of reading text while skipping irrelevant information if needed. The
underlying model is a recurrent network that learns how far to jump after
reading a few words of the input text. We employ a standard policy gradient
method to train the model to make discrete jumping decisions. In our benchmarks
on four different tasks, including number prediction, sentiment analysis, news
article classification and automatic Q&A, our proposed model, a modified LSTM
with jumping, is up to 6 times faster than the standard sequential LSTM, while
maintaining the same or even better accuracy.
Describing Videos by Exploiting Temporal Structure
Recent progress in using recurrent neural networks (RNNs) for image
description has motivated the exploration of their application for video
description. However, while images are static, working with videos requires
modeling their dynamic temporal structure and then properly integrating that
information into a natural language description. In this context, we propose an
approach that successfully takes into account both the local and global
temporal structure of videos to produce descriptions. First, our approach
incorporates a spatial temporal 3-D convolutional neural network (3-D CNN)
representation of the short temporal dynamics. The 3-D CNN representation is
trained on video action recognition tasks, so as to produce a representation
that is tuned to human motion and behavior. Second, we propose a temporal
attention mechanism that goes beyond local temporal modeling and learns
to automatically select the most relevant temporal segments given the
text-generating RNN. Our approach exceeds the current state of the art for both
BLEU and METEOR metrics on the Youtube2Text dataset. We also present results on
a new, larger and more challenging dataset of paired video and natural language
descriptions.
Comment: Accepted to ICCV15. This version comes with code release and
supplementary material.