Complex-Valued Autoencoders
Autoencoders are unsupervised machine learning circuits whose learning goal
is to minimize a distortion measure between inputs and outputs. Linear
autoencoders can be defined over any field, yet only real-valued linear
autoencoders have been studied so far. Here we study complex-valued linear
autoencoders where the components of the training vectors and adjustable
matrices are defined over the complex field with the L2 norm. We provide
simpler and more general proofs that unify the real-valued and complex-valued
cases, showing that in both cases the landscape of the error function is
invariant under certain groups of transformations. The landscape has no local
minima, a family of global minima associated with Principal Component Analysis,
and many families of saddle points associated with orthogonal projections onto
subspaces spanned by sub-optimal subsets of eigenvectors of the covariance
matrix. The theory yields several iterative, convergent, learning algorithms, a
clear understanding of the generalization properties of the trained
autoencoders, and can equally be applied to the hetero-associative case when
external targets are provided. Partial results on deep architectures as well as
the differential geometry of autoencoders are also presented. The general
framework described here is useful to classify autoencoders and identify
general common properties that ought to be investigated for each class,
illuminating some of the connections between information theory, unsupervised
learning, clustering, Hebbian learning, and autoencoders.
Comment: Final version, journal ref added.
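A minimal numerical sketch of the central result, assuming the standard rank-k linear autoencoder objective (this demo is my illustration, not code from the paper): every global minimum acts as the orthogonal projection onto the span of the top-k eigenvectors of the covariance matrix, i.e. linear autoencoding recovers PCA, over the complex field just as over the reals.

```python
import numpy as np

# Illustrative toy (names and dimensions are mine): compare the
# reconstruction error of the PCA projection P = U U^H against the
# projection onto an arbitrary rank-k subspace.
rng = np.random.default_rng(0)
d, n, k = 8, 500, 3

# Complex-valued training vectors as columns, with anisotropic variance
# so that the top-k principal subspace is well separated.
scales = np.array([4.0, 3.0, 2.5, 0.3, 0.2, 0.1, 0.1, 0.05])
X = scales[:, None] * (rng.standard_normal((d, n)) + 1j * rng.standard_normal((d, n)))

# Hermitian covariance and its top-k eigenvectors.
C = X @ X.conj().T / n
_, V = np.linalg.eigh(C)          # eigenvalues in ascending order
U = V[:, -k:]                     # top-k principal directions

# Optimal rank-k reconstruction: the orthogonal projection P = U U^H.
P = U @ U.conj().T
pca_err = np.mean(np.abs(X - P @ X) ** 2)

# Projection onto any other rank-k subspace reconstructs no better.
Q, _ = np.linalg.qr(rng.standard_normal((d, k)) + 1j * rng.standard_normal((d, k)))
rand_err = np.mean(np.abs(X - (Q @ Q.conj().T) @ X) ** 2)
assert pca_err <= rand_err
```

The same comparison works for any competing rank-k subspace, consistent with the claim that sub-optimal eigenvector subsets yield only saddle points, not local minima.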
Complex-Valued Autoencoders for Object Discovery
Object-centric representations form the basis of human perception and enable
us to reason about the world and to systematically generalize to new settings.
Currently, most machine learning work on unsupervised object discovery focuses
on slot-based approaches, which explicitly separate the latent representations
of individual objects. While the result is easily interpretable, it usually
requires the design of involved architectures. In contrast to this, we propose
a distributed approach to object-centric representations: the Complex
AutoEncoder. Following a coding scheme theorized to underlie object
representations in biological neurons, its complex-valued activations represent
two messages: their magnitudes express the presence of a feature, while the
relative phase differences between neurons express which features should be
bound together to create joint object representations. We show that this simple
and efficient approach achieves better reconstruction performance than an
equivalent real-valued autoencoder on simple multi-object datasets.
Additionally, we show that it achieves unsupervised object discovery
performance competitive with a SlotAttention model on two datasets, and manages
to disentangle objects in a third dataset where SlotAttention fails - all while
being 7-70 times faster to train.
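The coding scheme described above can be sketched in a few lines (a toy of my own construction, not the Complex AutoEncoder itself): each unit emits a complex activation whose magnitude signals feature presence and whose phase signals object membership, so active units can be grouped by phase proximity.

```python
import numpy as np

# Hypothetical setup: two objects with 5 active features each, plus
# 5 inactive background units. Active units of the same object share
# a phase (up to small noise); inactive units have tiny magnitude.
rng = np.random.default_rng(1)
phase_obj1, phase_obj2 = 0.3, 2.4
z = np.concatenate([
    1.0 * np.exp(1j * (phase_obj1 + 0.05 * rng.standard_normal(5))),  # object 1
    1.0 * np.exp(1j * (phase_obj2 + 0.05 * rng.standard_normal(5))),  # object 2
    0.01 * np.exp(1j * rng.uniform(0, 2 * np.pi, 5)),                 # inactive
])

present = np.abs(z) > 0.5          # magnitude: which features are present
phases = np.angle(z[present])      # phase: which object each feature belongs to

# Group active units by phase proximity to recover object membership.
same_as_first = np.abs(phases - phases[0]) < 0.5
n_obj1 = int(same_as_first.sum())  # units bound to the same object as unit 0
```

Real models recover such groupings with a clustering step over phases; the point here is only that magnitude and phase carry two separable messages.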
Contrastive Training of Complex-Valued Autoencoders for Object Discovery
Current state-of-the-art object-centric models use slots and attention-based
routing for binding. However, this class of models has several conceptual
limitations: the number of slots is hardwired; all slots have equal capacity;
training has high computational cost; there are no object-level relational
factors within slots. Synchrony-based models in principle can address these
limitations by using complex-valued activations which store binding information
in their phase components. However, working examples of such synchrony-based
models have been developed only very recently, and are still limited to toy
grayscale datasets and the simultaneous storage of fewer than three objects in
practice. Here we introduce architectural modifications and a novel contrastive
learning method that greatly improve the state-of-the-art synchrony-based
model. For the first time, we obtain a class of synchrony-based models capable
of discovering objects in an unsupervised manner in multi-object color datasets
and simultaneously representing more than three objects.
Comment: 26 pages, 14 figures.
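The contrastive idea can be sketched abstractly (a toy formulation of my own, not the paper's actual loss): pull the phases of units on the same object together and push the phases of units on different objects apart, scoring pairs by the cosine of their phase difference.

```python
import numpy as np

def phase_contrastive_loss(phases, labels):
    """Toy phase-contrastive objective: lower when same-object units
    share a phase and different-object units do not."""
    z = np.exp(1j * phases)
    sim = np.real(z[:, None] * z[None, :].conj())   # cos(phi_i - phi_j)
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)                   # ignore self-pairs
    diff = labels[:, None] != labels[None, :]
    # attract same-object pairs, repel different-object pairs
    return -sim[same].mean() + sim[diff].mean()

labels = np.array([0, 0, 0, 1, 1, 1])
# Well-separated phases (two objects half a cycle apart)...
separated = np.array([0.0, 0.1, -0.1, np.pi, np.pi + 0.1, np.pi - 0.1])
# ...versus a degenerate solution where all phases collapse.
collapsed = np.zeros(6)

good = phase_contrastive_loss(separated, labels)
bad = phase_contrastive_loss(collapsed, labels)
assert good < bad
```

The collapsed solution scores exactly zero (every pair has cosine 1), while the separated one is rewarded on both terms, which is the failure mode a contrastive term is meant to rule out.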
Gated networks: an inventory
Gated networks are networks that contain gating connections, in which the
outputs of at least two neurons are multiplied. Initially, gated networks were
used to learn relationships between two input sources, such as pixels from two
images. More recently, they have been applied to learning activity recognition
or multi-modal representations. The aims of this paper are threefold: (1) to
explain the basic computations in gated networks to the non-expert, while
adopting a standpoint that insists on their symmetric nature; (2) to serve as a
quick reference guide to the recent literature, by providing an inventory of
applications of these networks, as well as of recent extensions to the basic
architecture; and (3) to suggest future research directions and applications.
Comment: Unpublished manuscript, 17 pages.
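The basic computation the inventory describes can be written down directly (a minimal sketch with names of my choosing): a factored gated unit multiplies, elementwise, linear projections of its two input sources, so the output is bilinear - linear in each source when the other is held fixed.

```python
import numpy as np

rng = np.random.default_rng(2)
dx, dh, f = 6, 4, 5          # dims of the two sources and the factor layer

W = rng.standard_normal((f, dx))   # filters on source x
U = rng.standard_normal((f, dh))   # filters on source h

def gated_unit(x, h):
    # Gating connection: outputs of the two filter banks are multiplied,
    # giving a multiplicative interaction between the sources.
    return (W @ x) * (U @ h)

x = rng.standard_normal(dx)
h = rng.standard_normal(dh)
y = gated_unit(x, h)

# Symmetric/bilinear structure: scaling either source scales the output.
assert np.allclose(gated_unit(2 * x, h), 2 * y)
assert np.allclose(gated_unit(x, 3 * h), 3 * y)
```

This symmetry between the two sources is exactly the standpoint the paper insists on: either input can be read as gating the other.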