
    An Adaptive Locally Connected Neuron Model: Focusing Neuron

    This paper presents a new artificial neuron model capable of learning its receptive field in the topological domain of inputs. The model provides adaptive and differentiable local connectivity (plasticity) applicable to any domain. It requires no tool other than the backpropagation algorithm to learn its parameters, which control the receptive field locations and apertures. This research explores whether this ability makes the neuron focus on informative inputs and yields any advantage over fully connected neurons. The experiments include tests of focusing neuron networks of one or two hidden layers on synthetic and well-known image recognition data sets. The results demonstrated that the focusing neurons can move their receptive fields towards more informative inputs. In the simple two-hidden-layer networks, the focusing layers outperformed the dense layers in the classification of the 2D spatial data sets. Moreover, the focusing networks performed better than the dense networks even when 70% of the weights were pruned. The tests on convolutional networks revealed that using focusing layers instead of dense layers for the classification of convolutional features may work better in some data sets. Comment: 45 pages, a national patent filed, submitted to Turkish Patent Office, No: -2017/17601, Date: 09.11.201

    Learning to Skim Text

    Recurrent Neural Networks are showing much promise in many sub-areas of natural language processing, ranging from document classification to machine translation to automatic question answering. Despite their promise, many recurrent models have to read the whole text word by word, which makes them slow at handling long documents. For example, it is difficult to use a recurrent network to read a book and answer questions about it. In this paper, we present an approach to reading text while skipping irrelevant information if needed. The underlying model is a recurrent network that learns how far to jump after reading a few words of the input text. We employ a standard policy gradient method to train the model to make discrete jumping decisions. In our benchmarks on four different tasks, including number prediction, sentiment analysis, news article classification and automatic Q&A, our proposed model, a modified LSTM with jumping, is up to 6 times faster than the standard sequential LSTM, while maintaining the same or even better accuracy.

    Describing Videos by Exploiting Temporal Structure

    Recent progress in using recurrent neural networks (RNNs) for image description has motivated the exploration of their application for video description. However, while images are static, working with videos requires modeling their dynamic temporal structure and then properly integrating that information into a natural language description. In this context, we propose an approach that successfully takes into account both the local and global temporal structure of videos to produce descriptions. First, our approach incorporates a spatio-temporal 3-D convolutional neural network (3-D CNN) representation of the short temporal dynamics. The 3-D CNN representation is trained on video action recognition tasks, so as to produce a representation that is tuned to human motion and behavior. Second, we propose a temporal attention mechanism that goes beyond local temporal modeling and learns to automatically select the most relevant temporal segments given the text-generating RNN. Our approach exceeds the current state of the art for both BLEU and METEOR metrics on the Youtube2Text dataset. We also present results on a new, larger and more challenging dataset of paired video and natural language descriptions. Comment: Accepted to ICCV15. This version comes with code release and supplementary material

    Annotated Bibliography: Anticipation
