Visual Imitation Learning with Recurrent Siamese Networks
It would be desirable for a reinforcement learning (RL) based agent to learn
behaviour by merely watching a demonstration. However, defining rewards that
facilitate this goal within the RL paradigm remains a challenge. Here we
address this problem with Siamese networks, trained to compute distances
between observed behaviours and the agent's behaviours. Given a desired motion,
such Siamese networks can be used to provide a reward signal to an RL agent via
the distance between the desired motion and the agent's motion. We experiment
with an RNN-based comparator model that can compute distances in space and time
between motion clips while training an RL policy to minimize this distance.
Through experimentation, we have also found that the inclusion of
multi-task data and an additional image encoding loss helps enforce the
temporal consistency. These two components appear to balance reward for
matching a specific instance of behaviour versus that behaviour in general.
Furthermore, we focus here on a particularly challenging form of this problem
where only a single demonstration is provided for a given task -- the one-shot
learning setting. We demonstrate our approach on humanoid agents in both 2D
with degrees of freedom (DoF) and 3D with DoF.
Comment: PrePrint
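The distance-based reward described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the encoder here is a hypothetical fixed random projection standing in for the learned Siamese network, the comparison is per frame rather than RNN-based, and the names `make_encoder` and `distance_reward` are assumptions introduced only for illustration.

```python
import numpy as np

def make_encoder(input_dim, embed_dim, seed=0):
    """Hypothetical stand-in for a trained Siamese encoder (assumption):
    a fixed random linear projection followed by tanh."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(scale=1.0 / np.sqrt(input_dim), size=(input_dim, embed_dim))

    def encode(frames):
        # frames: array of shape (time, input_dim)
        return np.tanh(np.asarray(frames, dtype=float) @ weights)

    return encode

def distance_reward(encode, demo_frames, agent_frames):
    """Per-frame reward: negative Euclidean distance between the embeddings
    of the demonstration and of the agent's motion (both branches share
    the same encoder, as in a Siamese setup)."""
    demo_emb = encode(demo_frames)
    agent_emb = encode(agent_frames)
    return -np.linalg.norm(demo_emb - agent_emb, axis=-1)

encode = make_encoder(input_dim=16, embed_dim=4)
rng = np.random.default_rng(1)
demo = rng.normal(size=(5, 16))                  # toy "demonstration" clip
close = demo + 0.01 * rng.normal(size=(5, 16))   # agent motion near the demo
far = demo + 1.0 * rng.normal(size=(5, 16))      # agent motion far from the demo

reward_same = distance_reward(encode, demo, demo)
reward_close = distance_reward(encode, demo, close)
reward_far = distance_reward(encode, demo, far)
```

In this toy setup an RL policy that maximizes the reward is pushed towards motions whose embeddings lie close to those of the demonstration; an identical motion receives the maximum reward of zero.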
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
A lot of the recent success in natural language processing (NLP) has been
driven by distributed vector representations of words trained on large amounts
of text in an unsupervised manner. These representations are typically used as
general purpose features for words across a range of NLP problems. However,
extending this success to learning representations of sequences of words, such
as sentences, remains an open problem. Recent work has explored unsupervised as
well as supervised learning techniques with different training objectives to
learn general purpose fixed-length sentence representations. In this work, we
present a simple, effective multi-task learning framework for sentence
representations that combines the inductive biases of diverse training
objectives in a single model. We train this model on several data sources with
multiple training objectives on over 100 million sentences. Extensive
experiments demonstrate that sharing a single recurrent sentence encoder across
weakly related tasks leads to consistent improvements over previous methods. We
present substantial improvements in the context of transfer learning and
low-resource settings using our learned general-purpose representations.
Comment: Accepted at ICLR 201
Structure Learning for Neural Module Networks
Neural Module Networks, originally proposed for the task of visual question
answering, are a class of neural network architectures that involve
human-specified neural modules, each designed for a specific form of reasoning.
In current formulations of such networks only the parameters of the neural
modules and/or the order of their execution are learned. In this work, we
further expand this approach and also learn the underlying internal structure
of modules in terms of the ordering and combination of simple and elementary
arithmetic operators. Our results show that one is indeed able to
simultaneously learn both internal module structure and module sequencing
without extra supervisory signals for module execution sequencing. With this
approach, we report performance comparable to models using hand-designed
modules.
Renormalization view on resonance proliferation between many-body localized phases
Topology and many-body localization (MBL) have opened new avenues for
preserving quantum information at finite energy density. Resonant
delocalization plays a crucial role in destabilizing these phenomena. In this
work, we study the statistical properties of many-body resonances in a
disordered interacting Ising model, which can host symmetry-protected
topological order, using a Clifford circuit encoding of the real-space
renormalization group, which allows the resonant properties of the wave
functions to be efficiently characterized. Our findings show that both the
trivial and topologically ordered MBL phases remain stable to the resonances,
but in the vicinity of the transition between them localization is destabilized
by resonance proliferation. Diverging susceptibility towards the development of
an avalanche instability suggests an intervening ergodic phase. We are also
able to access the local integrals of motion in the MBL phases and identify the
topological edge-mode operators in the ordered phase. Our results have
important implications for the stability of MBL and phase transitions between
distinct MBL phases with and without symmetries.
Comment: 13 pages, 11 figures
Deep Complex Networks
At present, the vast majority of building blocks, techniques, and
architectures for deep learning are based on real-valued operations and
representations. However, recent work on recurrent neural networks and older
fundamental theoretical analysis suggests that complex numbers could have a
richer representational capacity and could also facilitate noise-robust memory
retrieval mechanisms. Despite their attractive properties and potential for
opening up entirely new neural architectures, complex-valued deep neural
networks have been marginalized due to the absence of the building blocks
required to design such models. In this work, we provide the key atomic
components for complex-valued deep neural networks and apply them to
convolutional feed-forward networks and convolutional LSTMs. More precisely, we
rely on complex convolutions and present algorithms for complex
batch-normalization and complex weight initialization strategies for
complex-valued neural nets, and we use them in experiments with end-to-end
training schemes. We demonstrate that such complex-valued models are
competitive with their real-valued counterparts. We test deep complex models on
several computer vision tasks, on music transcription using the MusicNet
dataset and on Speech Spectrum Prediction using the TIMIT dataset. We achieve
state-of-the-art performance on these audio-related tasks.
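As a rough illustration of the complex convolutions mentioned in this abstract, a complex convolution can be written with real-valued operations only: for a kernel W = A + iB and an input h = x + iy, W * h = (A * x - B * y) + i(B * x + A * y). The sketch below is an assumption-laden toy in one dimension using NumPy, not the paper's code; `complex_conv1d` is a hypothetical helper name, and `np.convolve` performs true (kernel-flipped) convolution rather than the cross-correlation common in deep learning libraries.

```python
import numpy as np

def complex_conv1d(kernel_real, kernel_imag, signal_real, signal_imag):
    """Complex 1D convolution built from four real convolutions.

    (A + iB) * (x + iy) = (A*x - B*y) + i(B*x + A*y)
    """
    def conv(signal, kernel):
        return np.convolve(signal, kernel, mode="valid")

    real_part = conv(signal_real, kernel_real) - conv(signal_imag, kernel_imag)
    imag_part = conv(signal_real, kernel_imag) + conv(signal_imag, kernel_real)
    return real_part, imag_part

rng = np.random.default_rng(0)
a, b = rng.normal(size=3), rng.normal(size=3)   # real and imaginary kernel parts
x, y = rng.normal(size=8), rng.normal(size=8)   # real and imaginary signal parts

out_real, out_imag = complex_conv1d(a, b, x, y)

# Cross-check against NumPy's native complex convolution.
reference = np.convolve(x + 1j * y, a + 1j * b, mode="valid")
```

Keeping the real and imaginary parts as separate real arrays is one way such a layer can be expressed with standard real-valued building blocks.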