Grounding the Meanings in Sensorimotor Behavior using Reinforcement Learning
The recent surge of interest in cognitive developmental robotics is fueled by the ambition to propose ecologically plausible mechanisms of how, among other things, a learning agent/robot could ground linguistic meanings in its sensorimotor behavior. Along this stream, we propose a model that allows the simulated iCub robot to learn the meanings of actions (point, touch, and push) oriented toward objects in the robot's peripersonal space. In our experiments, the iCub learns to execute motor actions and comment on them. Architecturally, the model is composed of three neural-network-based modules that are trained in different ways. The first module, a two-layer perceptron, is trained by back-propagation to attend to the target position in the visual scene, given the low-level visual information and the feature-based target information. The second module, having the form of an actor-critic architecture, is the most distinctive part of our model; it is trained by a continuous version of reinforcement learning to execute actions as sequences, based on a linguistic command. The third module, an echo-state network, is trained to provide the linguistic description of the executed actions. The trained model generalizes well to novel action-target combinations with randomized initial arm positions, and it can promptly adapt its behavior if the action or target suddenly changes during motor execution.
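The abstract names the three modules but not their update rules. As a rough sketch, a continuous actor-critic of the kind described for the second module reduces to a temporal-difference critic update plus a Gaussian policy-gradient step. Everything concrete below (the 1-D reaching task, linear features, hyperparameters) is an illustrative assumption, not the paper's iCub controller:

```python
import random

# Illustrative sketch of a continuous actor-critic of the kind the abstract
# describes; the 1-D reaching task, linear function approximators, and all
# hyperparameters are assumptions, not the paper's implementation.
TARGET = 0.7          # target position on a 1-D "reach" axis

def distance_feature(pos):
    return -abs(pos - TARGET)   # negative distance: higher is better

critic_w = 0.0        # weight of the linear state-value function
actor_w = 0.0         # weight of the linear policy mean
SIGMA = 0.1           # fixed Gaussian exploration noise
ALPHA_C, ALPHA_A, GAMMA = 0.1, 0.02, 0.95

random.seed(0)
for episode in range(300):
    pos = random.uniform(-1.0, 1.0)              # randomized initial arm position
    for step in range(30):
        mean = actor_w * (TARGET - pos)          # policy mean pushes toward target
        action = mean + random.gauss(0.0, SIGMA) # Gaussian exploration
        new_pos = max(-1.0, min(1.0, pos + 0.1 * action))
        r = distance_feature(new_pos)            # dense reward: closer is better
        # TD error with linear critic V(s) = critic_w * distance_feature(s)
        td = (r + GAMMA * critic_w * distance_feature(new_pos)
                - critic_w * distance_feature(pos))
        critic_w += ALPHA_C * td * distance_feature(pos)
        # Policy-gradient step: reinforce actions with better-than-expected outcomes
        actor_w += ALPHA_A * td * (action - mean) * (TARGET - pos)
        pos = new_pos

final_error = abs(pos - TARGET)
```

In the full architecture, the perceptron's attended target position would supply the state input and the echo-state network would describe the resulting action sequence; this fragment isolates only the reinforcement-learning core.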
An investigation of fast and slow mapping
Children learn words astonishingly skilfully. Even infants can reliably "fast map"
novel category labels to their referents without feedback or supervision (Carey &
Bartlett, 1978; Houston-Price, Plunkett, & Harris, 2005). Using both empirical and
neural network modelling methods, this thesis examines the fast
and slow mapping phases of children's early word learning in the context of object and
action categorisation. A series of empirical experiments investigates the effect
of within-category perceptual variability on two-year-old children's ability to
learn labels for novel categories of objects and actions. Results demonstrate that
variability profoundly affects both noun and verb learning.
A review paper situates empirical word learning research in the context of recent
advances in the application of computational models to developmental research. Data
from the noun experiments are then simulated using a Dynamic Neural Field (DNF)
model (see Spencer & Schöner, 2009), suggesting that children's early object categories
can emerge dynamically from simple label-referent associations strengthened over time.
Novel predictions generated by the model are replicated empirically, providing
proof-of-concept for the use of DNF models in simulations of word learning, as well as
emphasising the strong featural basis of early categorisation.
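The claim that early object categories can emerge from simple label-referent associations strengthened over time can be illustrated outside the full DNF machinery. The sketch below uses a plain Hebbian association matrix, a deliberate simplification of a dynamic neural field; the function names and example data are assumptions for illustration only:

```python
# Hebbian label-referent association sketch (illustrative; not the DNF model
# of Spencer & Schöner, 2009, which simulates continuous neural fields).
# Repeated co-presentation of a label with exemplar features strengthens the
# label-feature mapping until the label alone retrieves the category's
# dominant feature.

LEARNING_RATE = 0.2

def train(pairs, n_features, n_labels):
    """pairs: list of (label_index, feature_vector) co-presentations."""
    weights = [[0.0] * n_features for _ in range(n_labels)]
    for label, features in pairs:
        for j, f in enumerate(features):
            weights[label][j] += LEARNING_RATE * f  # Hebbian strengthening
    return weights

def recall(weights, label):
    """Cue with a label alone; return the most strongly associated feature."""
    row = weights[label]
    return max(range(len(row)), key=row.__getitem__)

# Two noisy exemplars per category: feature 0 dominates category 0 ("ball"),
# feature 2 dominates category 1 ("cup").
presentations = [
    (0, [1.0, 0.2, 0.0]), (0, [0.9, 0.0, 0.1]),
    (1, [0.0, 0.1, 1.0]), (1, [0.2, 0.0, 0.8]),
]
w = train(presentations, n_features=3, n_labels=2)
```

After training, `recall(w, 0)` retrieves feature 0 and `recall(w, 1)` retrieves feature 2: the category structure falls out of accumulated associations rather than any explicit category representation, which is the intuition the DNF simulations develop in full.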
The noun data are further explored using a connectionist architecture (Morse, de
Greef, Belpaeme & Cangelosi, 2010) in a robotic system, providing the groundwork for
future research in cognitive robotics. The implications of these different approaches to
cognitive modelling are discussed, situating the current work firmly in the dynamic
systems tradition whilst emphasising the value of interdisciplinary research in
motivating novel research paradigms.
Grounding semantic cognition using computational modelling and network analysis
The overarching objective of this thesis is to further the field of grounded semantics through a range of computational and empirical studies. Over the past thirty years, there have been many algorithmic advances in the
modelling of semantic cognition. A commonality across these cognitive models is a reliance on hand-engineered "toy models". Despite incorporating newer
techniques (e.g. long short-term memory), the model inputs remain unchanged. We argue that the inputs to these traditional semantic models bear little resemblance to real human experience. In this dissertation, we ground our neural network models by training them on real-world visual scenes using naturalistic photographs. Our approach is an alternative to both hand-coded
features and embodied raw sensorimotor signals.
We conceptually replicate the mutually reinforcing nature of hybrid (feature-based and grounded) representations using silhouettes of concrete concepts as model inputs. We next gradually develop a novel grounded cognitive semantic representation, which we call scene2vec, starting with object co-occurrences and then adding emotions and language-based tags. Limitations of our scene-based representation are identified for more abstract concepts (e.g. freedom). We further present a large-scale human semantics study, which reveals that small-world semantic network topologies are context-dependent and
that scenes are the most dominant cognitive dimension. This finding leads us to conclude that there is no meaning without context. Lastly, scene2vec exhibits
human-like context-sensitive stereotypes (e.g. gender role bias), and we explore how such stereotypes can be reduced by targeted debiasing. In conclusion, this thesis provides support for a novel computational
viewpoint on investigating meaning: scene-based grounded semantics. Future research scaling scene-based semantic models to human levels through virtual grounding has the potential to unearth new insights into the human mind and
concurrently lead to advances in artificial general intelligence by enabling robots, embodied or otherwise, to acquire and represent meaning directly from the environment.
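The abstract describes scene2vec as starting from object co-occurrences in scenes. A minimal sketch of that first stage follows; the construction, function names, and example scenes are illustrative assumptions, not the thesis's actual pipeline:

```python
import math
from collections import defaultdict

# Toy first-stage construction in the spirit of scene2vec: each concept's
# vector counts how often it co-occurs with every other concept across
# annotated scenes (e.g. objects labelled in naturalistic photographs).
def scene_vectors(scenes):
    vocab = sorted({obj for scene in scenes for obj in scene})
    index = {obj: i for i, obj in enumerate(vocab)}
    vectors = defaultdict(lambda: [0.0] * len(vocab))
    for scene in scenes:
        for a in scene:
            for b in scene:
                if a != b:
                    vectors[a][index[b]] += 1.0  # co-occurrence count
    return dict(vectors)

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

# Hypothetical scene annotations: two kitchen scenes, two street scenes.
scenes = [
    {"fork", "plate", "table"},
    {"knife", "plate", "table"},
    {"car", "road", "sign"},
    {"bus", "road", "sign"},
]
vecs = scene_vectors(scenes)
```

Note that "fork" and "knife" never appear in the same scene, yet their shared scene context (plate, table) makes their vectors similar, while "fork" and "car" share no context at all; this is the sense in which meaning here is carried by scene context rather than by direct co-occurrence.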