Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation, and manifold learning.
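Among the families the review covers, the auto-encoder is perhaps the simplest to make concrete. Below is a toy linear auto-encoder trained by gradient descent on reconstruction error; it is an illustrative sketch of unsupervised representation learning in general (all names and dimensions are invented here), not any specific model from the review.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 100 samples in 5 dimensions; we learn a 2-d representation.
X = rng.standard_normal((100, 5))
W_enc = rng.standard_normal((5, 2)) * 0.1  # encoder weights
W_dec = rng.standard_normal((2, 5)) * 0.1  # decoder weights

def recon_error(X, W_enc, W_dec):
    code = X @ W_enc       # encoder: data -> representation
    X_hat = code @ W_dec   # decoder: representation -> reconstruction
    return np.mean((X - X_hat) ** 2)

err0 = recon_error(X, W_enc, W_dec)
lr = 0.05
for _ in range(200):
    code = X @ W_enc
    X_hat = code @ W_dec
    grad_out = 2.0 * (X_hat - X) / X.size   # d(loss)/d(X_hat)
    g_dec = code.T @ grad_out               # chain rule through decoder
    g_enc = X.T @ (grad_out @ W_dec.T)      # chain rule through encoder
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

err1 = recon_error(X, W_enc, W_dec)         # should be below err0
```

The learned code is a (here purely linear) representation chosen by a generic prior: whatever preserves enough information to reconstruct the input.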
Contrastive Multimodal Learning for Emergence of Graphical Sensory-Motor Communication
In this paper, we investigate whether artificial agents can develop a shared
language in an ecological setting where communication relies on a sensory-motor
channel. To this end, we introduce the Graphical Referential Game (GREG) where
a speaker must produce a graphical utterance to name a visual referent object
while a listener has to select the corresponding object among distractor
referents, given the delivered message. The utterances are drawing images
produced using dynamical motor primitives combined with a sketching library. To
tackle GREG we present CURVES: a multimodal contrastive deep learning mechanism
that represents the energy (alignment) between named referents and utterances
generated through gradient ascent on the learned energy landscape. We
demonstrate that CURVES not only succeeds at solving the GREG but also enables
agents to self-organize a language that generalizes to feature compositions
never seen during training. In addition to evaluating the communication
performance of our approach, we also explore the structure of the emerging
language. Specifically, we show that the resulting language forms a coherent
lexicon shared between agents and that basic compositional rules on the
graphical productions cannot explain the compositional generalization.
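The core listener-side mechanics can be sketched with a toy energy/alignment model: the listener scores each candidate referent against the utterance and picks the best match, while a contrastive (InfoNCE-style) loss raises alignment with the named referent relative to distractors. This is a minimal stand-in with invented embeddings, not the CURVES architecture itself.

```python
import numpy as np

def alignment(utterance_emb, referent_embs):
    # Alignment score (negative energy) as a dot product between the
    # utterance embedding and each candidate referent embedding.
    return referent_embs @ utterance_emb

def listener_choice(utterance_emb, referent_embs):
    # The listener selects the referent with the highest alignment.
    return int(np.argmax(alignment(utterance_emb, referent_embs)))

def contrastive_loss(utterance_emb, referent_embs, target_idx):
    # InfoNCE-style objective: push up alignment with the named
    # referent relative to the distractor referents.
    scores = alignment(utterance_emb, referent_embs)
    scores = scores - scores.max()                 # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())
    return -log_probs[target_idx]

referents = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
utterance = np.array([0.9, 0.1])   # an utterance naming referent 0
choice = listener_choice(utterance, referents)
loss = contrastive_loss(utterance, referents, target_idx=0)
```

In the paper, utterances are additionally *generated* by gradient ascent on the learned energy landscape; here the utterance embedding is simply given.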
Improving neural networks by preventing co-adaptation of feature detectors
When a large feedforward neural network is trained on a small training set,
it typically performs poorly on held-out test data. This "overfitting" is
greatly reduced by randomly omitting half of the feature detectors on each
training case. This prevents complex co-adaptations in which a feature detector
is only helpful in the context of several other specific feature detectors.
Instead, each neuron learns to detect a feature that is generally helpful for
producing the correct answer given the combinatorially large variety of
internal contexts in which it must operate. Random "dropout" gives big
improvements on many benchmark tasks and sets new records for speech and object
recognition.
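The dropout procedure described above is easy to sketch. The version below zeroes units at train time and scales activations by the keep probability at test time; this matches the expected-value reasoning in the abstract, though the paper's own scheme halves outgoing weights at test time rather than scaling activations.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, train=True, rng=rng):
    """Randomly omit each unit with probability p_drop during training.

    At test time, activations are scaled by (1 - p_drop) so the expected
    input to the next layer matches what it saw during training.
    """
    if train:
        mask = rng.random(activations.shape) >= p_drop  # keep mask
        return activations * mask
    return activations * (1.0 - p_drop)

h = np.ones((4, 8))                        # a batch of hidden activations
h_train = dropout(h, 0.5, train=True)      # roughly half the units zeroed
h_test = dropout(h, 0.5, train=False)      # deterministic, scaled by 0.5
```

Because each unit must be useful without relying on a fixed set of co-adapted partners, the learned features generalize better.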
Learning generative texture models with extended Fields-of-Experts
We evaluate the ability of the popular Fields-of-Experts (FoE) model to capture structure in images. As a test case we focus on modeling synthetic and natural textures. We find that even for modeling single textures, the FoE provides insufficient flexibility to learn good generative models – it does not perform any better than the much simpler Gaussian FoE. We propose an extended version of the FoE (allowing for bimodal potentials) and demonstrate that this novel formulation, when trained with a better approximation of the likelihood gradient, gives rise to a more powerful generative model of specific visual structure that produces significantly better results for the texture task.
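An FoE assigns an image an energy by summing expert potentials over linear filter responses at every image position. The sketch below uses heavy-tailed Student-t-style potentials, a common FoE choice; the filters, weights, and image here are invented for illustration, and the paper's extension would replace these potentials with bimodal ones.

```python
import numpy as np

def foe_energy(image, filters, alphas):
    # Fields-of-Experts energy: for each expert filter, sum a potential
    # of the filter response over all valid positions (cliques).
    # A Gaussian FoE would instead use 0.5 * r**2 as the potential.
    H, W = image.shape
    energy = 0.0
    for f, alpha in zip(filters, alphas):
        fh, fw = f.shape
        for i in range(H - fh + 1):
            for j in range(W - fw + 1):
                r = np.sum(f * image[i:i+fh, j:j+fw])    # filter response
                energy += alpha * np.log1p(0.5 * r * r)  # Student-t expert
    return energy

rng = np.random.default_rng(1)
filters = [rng.standard_normal((3, 3)) for _ in range(2)]
alphas = [1.0, 1.0]
flat = np.zeros((8, 8))                  # constant image: zero responses
noisy = rng.standard_normal((8, 8))      # noise: large filter responses
```

Lower energy means higher (unnormalized) probability, so a flat image is much more probable under this toy model than noise.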
PrAGMATiC: a Probabilistic and Generative Model of Areas Tiling the Cortex
Much of the human cortex seems to be organized into topographic cortical
maps. Yet few quantitative methods exist for characterizing these maps. To
address this issue we developed a modeling framework that can reveal
group-level cortical maps based on neuroimaging data. PrAGMATiC, a
probabilistic and generative model of areas tiling the cortex, is a
hierarchical Bayesian generative model of cortical maps. This model assumes
that the cortical map in each individual subject is a sample from a single
underlying probability distribution. Learning the parameters of this
distribution reveals the properties of a cortical map that are common across a
group of subjects while avoiding the potentially lossy step of co-registering
each subject into a group anatomical space. In this report we give a
mathematical description of PrAGMATiC, describe approximations that make it
practical to use, show preliminary results from its application to a real
dataset, and describe a number of possible future extensions.