Bayesian models of category acquisition and meaning development
The ability to organize concepts (e.g., dog, chair) into efficient mental representations,
i.e., categories (e.g., animal, furniture), is a fundamental mechanism which allows humans
to perceive, organize, and adapt to their world. Much research has been dedicated
to the questions of how categories emerge and how they are represented. Experimental
evidence suggests that (i) concepts and categories are represented through sets of
features (e.g., dogs bark, chairs are made of wood) which are structured into different
types (e.g., behavior, material); (ii) categories and their featural representations are
learnt jointly and incrementally; and (iii) categories are dynamic and their representations
adapt to changing environments.
This thesis investigates the mechanisms underlying the incremental and dynamic formation
of categories and their featural representations through cognitively motivated
Bayesian computational models. Models of category acquisition have been extensively
studied in cognitive science and primarily tested on perceptual abstractions or artificial
stimuli. In this thesis, we focus on categories acquired from natural language stimuli,
using nouns as a stand-in for their reference concepts, and their linguistic contexts as
a representation of the concepts’ features. The use of text corpora allows us to (i) develop
large-scale unsupervised models, thus simulating human learning, and (ii) model
child category acquisition, leveraging the linguistic input available to children in the
form of transcribed child-directed language.
In the first part of this thesis we investigate the incremental process of category acquisition.
We present a Bayesian model and an incremental learning algorithm which
sequentially integrates newly observed data. We evaluate our model output against
gold standard categories (elicited experimentally from human participants), and show
that high-quality categories are learnt both from child-directed data and from large,
thematically unrestricted text corpora. We find that the model performs well even under
constrained memory resources, resembling human cognitive limitations. While
lists of representative features for categories emerge from this model, they are neither
structured nor jointly optimized with the categories.
We address these shortcomings in the second part of the thesis, and present a Bayesian
model which jointly learns categories and structured featural representations. We
present both batch and incremental learning algorithms, and demonstrate the model’s
effectiveness on both encyclopedic and child-directed data. We show that high-quality
categories and features emerge in the joint learning process, and that the structured
features are intuitively interpretable through human plausibility judgment evaluation.
In the third part of the thesis we turn to the dynamic nature of meaning: categories and
their featural representations change over time, e.g., children distinguish some types
of features (such as size and shade) less clearly than adults, and word meanings adapt
to our ever changing environment and its structure. We present a dynamic Bayesian
model of meaning change, which infers time-specific concept representations as a set
of feature types and their prevalence, and captures their development as a smooth process.
We analyze the development of concept representations in their complexity over
time from child-directed data, and show that our model captures established patterns of
child concept learning. We also apply our model to diachronic change of word meaning,
modeling how word senses change internally and in prevalence over centuries.
The contributions of this thesis are threefold. Firstly, we show that a variety of experimental
results on the acquisition and representation of categories can be captured
with computational models within the framework of Bayesian modeling. Secondly,
we show that natural language text is an appropriate source of information for modeling
categorization-related phenomena, suggesting that the environmental structure that
drives category formation is encoded in this data. Thirdly, we show that the experimental
findings hold on a larger scale. Our models are trained and tested on a larger
set of concepts and categories than is common in behavioral experiments, and the categories
and featural representations they can learn from linguistic text are in principle
unrestricted.
A Goal-Directed Bayesian Framework for Categorization
Categorization is a fundamental ability for efficient behavioral control. It allows organisms to remember the correct responses to categorical cues rather than to every stimulus encountered (hence avoiding computational cost or complexity), and to generalize appropriate responses to novel stimuli dependent on category assignment. Assuming the brain performs Bayesian inference, based on a generative model of the external world and future goals, we propose a computational model of categorization in which important properties emerge. These properties comprise the ability to infer latent causes of sensory experience, a hierarchical organization of latent causes, and an explicit inclusion of context and action representations. Crucially, these aspects derive from considering the environmental statistics that are relevant to achieve goals, and from the fundamental Bayesian principle that any generative model should be preferred over alternative models based on an accuracy-complexity trade-off. Our account is a step toward elucidating computational principles of categorization and its role within the Bayesian brain hypothesis.
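The accuracy-complexity trade-off mentioned in this abstract is commonly written as a decomposition of the log model evidence. The form below is the standard textbook identity, not necessarily the formulation used by the authors; the symbols y (data), m (model), theta (parameters) and q (an approximate posterior) are notation assumed here for illustration.

```latex
\ln p(y \mid m)
  = \underbrace{\mathbb{E}_{q(\theta)}\!\left[\ln p(y \mid \theta, m)\right]}_{\text{accuracy}}
  - \underbrace{\mathrm{KL}\!\left[\,q(\theta)\,\|\,p(\theta \mid m)\,\right]}_{\text{complexity}}
  + \mathrm{KL}\!\left[\,q(\theta)\,\|\,p(\theta \mid y, m)\,\right]
```

When q equals the exact posterior, the last term vanishes and the evidence is exactly accuracy minus complexity, so a model that fits the data equally well with a posterior closer to its prior is preferred.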
Sketch-a-Net that Beats Humans
We propose a multi-scale multi-channel deep neural network framework that,
for the first time, yields sketch recognition performance surpassing that of
humans. Our superior performance is a result of explicitly embedding the unique
characteristics of sketches in our model: (i) a network architecture designed
for sketch rather than natural photo statistics, (ii) a multi-channel
generalisation that encodes sequential ordering in the sketching process, and
(iii) a multi-scale network ensemble with joint Bayesian fusion that accounts
for the different levels of abstraction exhibited in free-hand sketches. We
show that state-of-the-art deep networks specifically engineered for photos of
natural objects fail to perform well on sketch recognition, regardless of whether
they are trained using photos or sketches. Our network, on the other hand, not only
delivers the best performance on the largest human sketch dataset to date, but
is also small in size, making efficient training possible using just CPUs.
Comment: Accepted to BMVC 2015 (oral)
SERKET: An Architecture for Connecting Stochastic Models to Realize a Large-Scale Cognitive Model
To realize human-like robot intelligence, a large-scale cognitive
architecture is required for robots to understand the environment through a
variety of sensors with which they are equipped. In this paper, we propose a
novel framework named Serket that enables the construction of a large-scale
generative model and its inference easily by connecting sub-modules to allow
the robots to acquire various capabilities through interaction with their
environments and others. We consider that large-scale cognitive models can be
constructed by connecting smaller fundamental models hierarchically while
maintaining their programmatic independence. Moreover, connected modules are
dependent on each other, and parameters are required to be optimized as a
whole. Conventionally, the equations for parameter estimation have to be
derived and implemented depending on the models. However, it becomes harder to
derive and implement those of a larger scale model. To solve these problems, in
this paper, we propose a method for parameter estimation by communicating the
minimal parameters between various modules while maintaining their programmatic
independence. Therefore, Serket makes it easy to construct large-scale models
and estimate their parameters via the connection of modules. Experimental
results demonstrated that the model can be constructed by connecting modules,
the parameters can be optimized as a whole, and they are comparable with the
original models that we have proposed.
A Survey of Adaptive Resonance Theory Neural Network Models for Engineering Applications
This survey samples from the ever-growing family of adaptive resonance theory
(ART) neural network models used to perform the three primary machine learning
modalities, namely, unsupervised, supervised and reinforcement learning. It
comprises a representative list from classic to modern ART models, thereby
painting a general picture of the architectures developed by researchers over
the past 30 years. The learning dynamics of these ART models are briefly
described, and their distinctive characteristics such as code representation,
long-term memory and corresponding geometric interpretation are discussed.
Useful engineering properties of ART (speed, configurability, explainability,
parallelization and hardware implementation) are examined along with current
challenges. Finally, a compilation of online software libraries is provided. It
is expected that this overview will be helpful to new and seasoned ART
researchers.
One-shot learning of object categories
Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned by maximum likelihood (ML) and maximum a posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
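The idea of updating a prior with a single observation can be illustrated with a toy conjugate model. This is a minimal sketch and not the authors' object-category model: it assumes one scalar feature per category, a Gaussian prior on the category mean (standing in for knowledge from previously learned categories), and a known observation noise. All names and numbers are hypothetical.

```python
def posterior_mean_var(prior_mean, prior_var, obs, noise_var):
    """Conjugate normal-normal update for an unknown category mean."""
    n = len(obs)
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(obs) / noise_var)
    return post_mean, post_var


def ml_mean(obs):
    """Maximum-likelihood estimate of the mean: the sample average."""
    return sum(obs) / len(obs)


# Hypothetical values: prior learned from earlier categories, one new example.
prior_mean, prior_var = 10.0, 4.0
noise_var = 1.0
obs = [13.0]

post_mean, post_var = posterior_mean_var(prior_mean, prior_var, obs, noise_var)

# ML ignores the prior entirely; MAP uses the posterior mode as a point estimate
# (equal to post_mean for a Gaussian); the Bayesian predictive additionally
# keeps the remaining uncertainty about the mean.
ml_estimate = ml_mean(obs)
map_estimate = post_mean
bayes_predictive_var = post_var + noise_var
map_predictive_var = noise_var

print(ml_estimate, map_estimate, bayes_predictive_var, map_predictive_var)
```

With one example, the ML estimate simply copies the observation, while the posterior mean is pulled toward the prior and the Bayesian predictive variance stays wider than the plug-in MAP variance, reflecting that one example leaves the category mean uncertain.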