25 research outputs found
Zero-bias autoencoders and the benefits of co-adapting features
Regularized training of an autoencoder typically results in hidden unit
biases that take on large negative values. We show that negative biases are a
natural result of using a hidden layer whose responsibility is to both
represent the input data and act as a selection mechanism that ensures sparsity
of the representation. We then show that negative biases impede the learning of
data distributions whose intrinsic dimensionality is high. We also propose a
new activation function that decouples the two roles of the hidden layer and
that allows us to learn representations on data with very high intrinsic
dimensionality, where standard autoencoders typically fail. Since the decoupled
activation function acts like an implicit regularizer, the model can be trained
by minimizing the reconstruction error of training data, without requiring any
additional regularization
"Mental Rotation" by Optimizing Transforming Distance
The human visual system is able to recognize objects despite transformations
that can drastically alter their appearance. To this end, much effort has been
devoted to the invariance properties of recognition systems. Invariance can be
engineered (e.g. convolutional nets), or learned from data explicitly (e.g.
temporal coherence) or implicitly (e.g. by data augmentation). One idea that
has not, to date, been explored is the integration of latent variables which
permit a search over a learned space of transformations. Motivated by evidence
that people mentally simulate transformations in space while comparing
examples, so-called "mental rotation", we propose a transforming distance.
Here, a trained relational model actively transforms pairs of examples so that
they are maximally similar in some feature space yet respect the learned
transformational constraints. We apply our method to nearest-neighbour problems
on the Toronto Face Database and NORB
Predictive Encoding of Contextual Relationships for Perceptual Inference, Interpolation and Prediction
We propose a new neurally-inspired model that can learn to encode the global
relationship context of visual events across time and space and to use the
contextual information to modulate the analysis by synthesis process in a
predictive coding framework. The model learns latent contextual representations
by maximizing the predictability of visual events based on local and global
contextual information through both top-down and bottom-up processes. In
contrast to standard predictive coding models, the prediction error in this
model is used to update the contextual representation but does not alter the
feedforward input for the next layer, and is thus more consistent with
neurophysiological observations. We establish the computational feasibility of
this model by demonstrating its ability in several aspects. We show that our
model can outperform state-of-art performances of gated Boltzmann machines
(GBM) in estimation of contextual information. Our model can also interpolate
missing events or predict future events in image sequences while simultaneously
estimating contextual information. We show it achieves state-of-art
performances in terms of prediction accuracy in a variety of tasks and
possesses the ability to interpolate missing frames, a function that is lacking
in GBM
Exemplar-Centered Supervised Shallow Parametric Data Embedding
Metric learning methods for dimensionality reduction in combination with
k-Nearest Neighbors (kNN) have been extensively deployed in many
classification, data embedding, and information retrieval applications.
However, most of these approaches involve pairwise training data comparisons,
and thus have quadratic computational complexity with respect to the size of
training set, preventing them from scaling to fairly big datasets. Moreover,
during testing, comparing test data against all the training data points is
also expensive in terms of both computational cost and resources required.
Furthermore, previous metrics are either too constrained or too expressive to
be well learned. To effectively solve these issues, we present an
exemplar-centered supervised shallow parametric data embedding model, using a
Maximally Collapsing Metric Learning (MCML) objective. Our strategy learns a
shallow high-order parametric embedding function and compares training/test
data only with learned or precomputed exemplars, resulting in a cost function
with linear computational complexity for both training and testing. We also
empirically demonstrate, using several benchmark datasets, that for
classification in two-dimensional embedding space, our approach not only gains
speedup of kNN by hundreds of times, but also outperforms state-of-the-art
supervised embedding approaches.Comment: accepted to IJCAI201