Learning recurrent representations for hierarchical behavior modeling

Authors: Kristin Branson, Eyrun Eyjolfsdottir, Pietro Perona, Yisong Yue
Publication date: 15/11/2016

We propose a framework for detecting action patterns from motion sequences and modeling the sensory-motor relationship of animals, using a generative recurrent neural network. The network has a discriminative part (classifying actions) and a generative part (predicting motion), whose recurrent cells are laterally connected, allowing higher levels of the network to represent high-level phenomena. We test our framework on two types of data: fruit fly behavior and online handwriting. Our results show that 1) taking advantage of unlabeled sequences, by predicting future motion, significantly improves action detection performance when training labels are scarce, 2) the network learns to represent high-level phenomena such as writer identity and fly gender, without supervision, and 3) simulated motion trajectories, generated by treating motion prediction as input to the network, look realistic and may be used to qualitatively evaluate whether the model has learnt generative control rules.

arXiv.org e-Print Archive

Caltech Authors

Revisiting the Hierarchical Multiscale LSTM

Authors: Afra Alishahi, Grzegorz Chrupała, Marc-Alexandre Côté, Ákos Kádár
Publication date: 01/01/2018

Hierarchical Multiscale LSTM (Chung et al., 2016a) is a state-of-the-art language model that learns interpretable structure from character-level input. Such models can provide fertile ground for (cognitive) computational linguistics studies. However, the high complexity of the architecture, training procedure and implementations might hinder its applicability. We provide a detailed reproduction and ablation study of the architecture, shedding light on some of the potential caveats of re-purposing complex deep-learning architectures. We further show that simplifying certain aspects of the architecture can in fact improve its performance. We also investigate the linguistic units (segments) learned by various levels of the model, and argue that their quality does not correlate with the overall performance of the model on language modeling. Comment: To appear in COLING 2018 (reproduction track).

arXiv.org e-Print Archive

Tilburg University Repository