The Neural Representation Benchmark and its Evaluation on Brain and Machine
A key requirement for the development of effective learning representations
is their evaluation and comparison to representations we know to be effective.
In natural sensory domains, the community has viewed the brain as a source of
inspiration and as an implicit benchmark for success. However, it has not been
possible to test representational learning algorithms directly against
the representations contained in neural systems. Here, we propose a new
benchmark for visual representations on which we have directly tested the
neural representation in multiple visual cortical areas in macaque (utilizing
data from [Majaj et al., 2012]), and on which any computer vision algorithm
that produces a feature space can be tested. The benchmark measures the
effectiveness of the neural or machine representation by computing the
classification loss on the ordered eigendecomposition of a kernel matrix
[Montavon et al., 2011]. In our analysis we find that the neural representation
in visual area IT is superior to visual area V4. In our analysis of
representational learning algorithms, we find that three-layer models approach
the representational performance of V4 and the algorithm in [Le et al., 2012]
surpasses the performance of V4. Impressively, we find that a recent supervised
algorithm [Krizhevsky et al., 2012] achieves performance comparable to that of
IT for an intermediate level of image variation difficulty, and surpasses IT at
a higher difficulty level. We believe this result represents a major milestone:
it is the first learning algorithm we have found that exceeds our current
estimate of IT representation performance. We hope that this benchmark will
assist the community in matching the representational performance of visual
cortex and will serve as an initial rallying point for further correspondence
between representations derived in brains and machines.
Comment: The v1 version contained incorrectly computed kernel analysis curves
and KA-AUC values for V4, IT, and the HT-L3 models. They have been corrected
in this version.
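
As a rough illustration of the idea of scoring a representation through the ordered eigendecomposition of a kernel matrix, the sketch below projects the labels onto the leading eigenvectors of a Gaussian kernel and reports the residual error as more components are added. It is a minimal sketch under stated assumptions, not the benchmark's implementation: the function name `kernel_analysis_curve`, the fixed kernel width `sigma`, and the use of a squared-error residual in place of the classification loss are all choices made here for illustration.

```python
import numpy as np

def kernel_analysis_curve(features, labels_onehot, sigma):
    """Residual label error when using the top-d kernel eigenvectors, d = 1..n.

    Hypothetical helper for illustration; squared error stands in for the
    classification loss, and sigma is a free parameter of this sketch.
    """
    n = features.shape[0]
    # Pairwise squared Euclidean distances between feature vectors.
    sq = np.sum(features ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * features @ features.T, 0.0)
    # Gaussian (RBF) kernel matrix.
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(K)
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]
    # Label energy captured by each eigenvector, accumulated over the top d.
    coeffs = eigvecs.T @ labels_onehot
    captured = np.cumsum(np.sum(coeffs ** 2, axis=1))
    total = np.sum(labels_onehot ** 2)
    # Mean residual squared error per sample for each d.
    return (total - captured) / n
```

A curve that falls quickly means the labels are explained by a few leading kernel components, which is the sense in which one representation can be called more effective than another under this kind of measure.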
SchNet: A continuous-filter convolutional neural network for modeling quantum interactions
Deep learning has the potential to revolutionize quantum chemistry as it is
ideally suited to learn representations for structured data and speed up the
exploration of chemical space. While convolutional neural networks have proven
to be the first choice for images, audio and video data, the atoms in molecules
are not restricted to a grid. Instead, their precise locations contain
essential physical information that would be lost if discretized. Thus, we
propose to use continuous-filter convolutional layers to be able to model local
correlations without requiring the data to lie on a grid. We apply those layers
in SchNet: a novel deep learning architecture modeling quantum interactions in
molecules. We obtain a joint model for the total energy and interatomic forces
that follows fundamental quantum-chemical principles. This includes
rotationally invariant energy predictions and a smooth, differentiable
potential energy surface. Our architecture achieves state-of-the-art
performance for benchmarks of equilibrium molecules and molecular dynamics
trajectories. Finally, we introduce a more challenging benchmark with chemical
and structural variations that suggests the path for further work.
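
The idea of a convolution whose filter is a continuous function of atom positions, rather than a table indexed by grid offsets, can be sketched as follows. This is a minimal sketch under stated assumptions, not the SchNet architecture itself: the names `rbf_expand` and `cfconv`, the single linear layer with a softplus that generates the filters, the radial-basis centers, and the exclusion of self-interaction are all illustrative choices. Because the filters depend only on interatomic distances, the output does not change when the molecule is rotated.

```python
import numpy as np

def rbf_expand(dist, centers, gamma):
    """Expand each distance on a set of Gaussian radial basis functions."""
    return np.exp(-gamma * (dist[..., None] - centers) ** 2)

def cfconv(x, pos, W, centers, gamma=10.0):
    """Continuous-filter convolution over atoms (illustrative sketch).

    x:       (n_atoms, n_feat) atom-wise features
    pos:     (n_atoms, 3) atom positions, not tied to any grid
    W:       (n_rbf, n_feat) weights of a one-layer filter generator (assumption)
    centers: (n_rbf,) radial basis centers (assumption)
    """
    # Interatomic distances: the only geometric input to the filters.
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    # Filter values for every atom pair, passed through a softplus.
    filters = np.log1p(np.exp(rbf_expand(dist, centers, gamma) @ W))
    # Exclude each atom's interaction with itself (assumption of this sketch).
    mask = 1.0 - np.eye(pos.shape[0])
    # Element-wise product of neighbor features and filters, summed over neighbors.
    return np.sum(filters * x[None, :, :] * mask[:, :, None], axis=1)
```

Applying `cfconv` to the same features with rotated positions gives the same result up to floating-point error, which is one way to check the invariance property the abstract describes.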
Rapid Visual Categorization is not Guided by Early Salience-Based Selection
The current dominant visual processing paradigm in both human and machine
research is the feedforward, layered hierarchy of neural-like processing
elements. Within this paradigm, visual saliency is seen by many to have a
specific role, namely that of early selection. Early selection is thought to
enable very fast visual performance by limiting processing to only the most
salient candidate portions of an image. This strategy has led to a plethora of
saliency algorithms that have indeed improved processing time efficiency in
machine algorithms, which in turn have strengthened the suggestion that human
vision also employs a similar early selection strategy. However, at least one
set of critical tests of this idea has never been performed with respect to the
role of early selection in human vision. How would the best of the current
saliency models perform on the stimuli used by experimentalists who first
provided evidence for this visual processing paradigm? Would the algorithms
really provide correct candidate sub-images to enable fast categorization on
those same images? Do humans really need this early selection for their
impressive performance? Here, we report on a new series of tests of these
questions whose results suggest that it is quite unlikely that such an early
selection process has any role in human rapid visual categorization.
Comment: 22 pages, 9 figures
Lifelong Learning of Spatiotemporal Representations with Dual-Memory Recurrent Self-Organization
Artificial autonomous agents and robots interacting in complex environments
are required to continually acquire and fine-tune knowledge over sustained
periods of time. The ability to learn from continuous streams of information is
referred to as lifelong learning and represents a long-standing challenge for
neural network models due to catastrophic forgetting. Computational models of
lifelong learning typically alleviate catastrophic forgetting in experimental
scenarios with given datasets of static images and limited complexity, thereby
differing significantly from the conditions artificial agents are exposed to.
In more natural settings, sequential information may become progressively
available over time and access to previous experience may be restricted. In
this paper, we propose a dual-memory self-organizing architecture for lifelong
learning scenarios. The architecture comprises two growing recurrent networks
with the complementary tasks of learning object instances (episodic memory) and
categories (semantic memory). Both growing networks can expand in response to
novel sensory experience: the episodic memory learns fine-grained
spatiotemporal representations of object instances in an unsupervised fashion
while the semantic memory uses task-relevant signals to regulate structural
plasticity levels and develop more compact representations from episodic
experience. For the consolidation of knowledge in the absence of external
sensory input, the episodic memory periodically replays trajectories of neural
reactivations. We evaluate the proposed model on the CORe50 benchmark dataset
for continuous object recognition, showing that we significantly outperform
current methods of lifelong learning in three different incremental learning
scenarios.