Encoding of phonology in a recurrent neural model of grounded speech
We study the representation and encoding of phonemes in a recurrent neural
network model of grounded speech. We use a model which processes images and
their spoken descriptions, and projects the visual and auditory representations
into the same semantic space. We perform a number of analyses on how
information about individual phonemes is encoded in the MFCC features extracted
from the speech signal, and the activations of the layers of the model. Via
experiments with phoneme decoding and phoneme discrimination we show that
phoneme representations are most salient in the lower layers of the model,
where low-level signals are processed at a fine-grained level, although a large
amount of phonological information is retained at the top recurrent layer. We
further find that the attention mechanism following the top recurrent layer
significantly attenuates encoding of phonology and makes the utterance
embeddings much more invariant to synonymy. Moreover, a hierarchical clustering
of phoneme representations learned by the network shows an organizational
structure of phonemes similar to those proposed in linguistics.
Comment: Accepted at CoNLL 201
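The hierarchical clustering mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the linkage criterion or distance metric, so average linkage over Euclidean distance is an assumption here, and the two-dimensional `toy` vectors are hypothetical stand-ins for the learned phoneme representations.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(vectors[p], vectors[q]) for p in c1 for q in c2]
    return sum(dists) / len(dists)

def agglomerate(vectors):
    """Greedy bottom-up clustering; returns the merges in the order they occur."""
    clusters = [(name,) for name in sorted(vectors)]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], vectors)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical 2-d "phoneme representations": two consonants, two vowels.
toy = {"p": [0.0, 0.0], "b": [0.2, 0.0], "a": [1.0, 1.0], "i": [1.0, 1.1]}
merges = agglomerate(toy)
for left, right in merges:
    print(left, "+", right)
```

In this toy case the two vowels merge first, then the two consonants, and the two groups join last, which is the kind of tree one would compare against a linguistic phoneme taxonomy.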
Ongoing Emergence: A Core Concept in Epigenetic Robotics
We propose ongoing emergence as a core concept in
epigenetic robotics. Ongoing emergence refers to the
continuous development and integration of new skills
and is exhibited when six criteria are satisfied: (1)
continuous skill acquisition, (2) incorporation of new
skills with existing skills, (3) autonomous development
of values and goals, (4) bootstrapping of initial skills, (5)
stability of skills, and (6) reproducibility. In this paper
we: (a) provide a conceptual synthesis of ongoing
emergence based on previous theorizing, (b) review
current research in epigenetic robotics in light of ongoing
emergence, (c) provide prototypical examples of ongoing
emergence from infant development, and (d) outline
computational issues relevant to creating robots
exhibiting ongoing emergence.
Emerging Linguistic Functions in Early Infancy
This paper presents results from experimental
studies on early language acquisition in infants and
attempts to interpret the experimental results within
the framework of the Ecological Theory of
Language Acquisition (ETLA) recently proposed
by Lacerda et al. (2004a). From this perspective,
the infant's first steps in the acquisition of the
ambient language are seen as a consequence of the
infant's general capacity to represent sensory input
and the infant's interaction with other actors in its
immediate ecological environment. On the basis of
available experimental evidence, it will be argued
that ETLA offers a productive alternative to
traditional descriptive views of the language
acquisition process by presenting an operative
model of how early linguistic function may emerge
through interaction.
Learning weakly supervised multimodal phoneme embeddings
Recent works have explored deep architectures for learning multimodal speech
representation (e.g. audio and images, articulation and audio) in a supervised
way. Here we investigate the role of combining different speech modalities,
i.e. audio and visual information representing lip movements, in a weakly
supervised way using Siamese networks and lexical same-different side
information. In particular, we ask whether one modality can benefit from the
other to provide a richer representation for phone recognition in a weakly
supervised setting. We introduce mono-task and multi-task methods for merging
speech and visual modalities for phone recognition. The mono-task learning
consists of applying a Siamese network to the concatenation of the two
modalities, while the multi-task learning receives several different
combinations of modalities at train time. We show that multi-task learning
enhances discriminability for visual and multimodal inputs while minimally
impacting auditory inputs. Furthermore, we present a qualitative analysis of
the obtained phone embeddings, and show that cross-modal visual input can
improve the discriminability of phonological features which are visually
discernible (rounding, open/close, labial place of articulation), resulting in
representations that are closer to abstract linguistic features than those
based on audio only.
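The mono-task setup described in this abstract (a Siamese network applied to the concatenated modalities, trained with same-different side information) can be sketched roughly as follows. The abstract does not give the loss or the architecture, so the cosine-based loss, the `margin` value, the single shared tanh layer and all feature values are assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def embed(x, weights):
    """Shared (Siamese) branch: one linear layer followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def same_different_loss(x1, x2, same, weights, margin=0.5):
    """Pull 'same' pairs together; push 'different' pairs below the margin."""
    sim = cosine(embed(x1, weights), embed(x2, weights))
    if same:
        return 1.0 - sim
    return max(0.0, sim - margin)

# Hypothetical features: 2 audio dimensions and 2 visual (lip) dimensions.
weights = [[0.5, -0.2, 0.1, 0.3],
           [-0.4, 0.6, 0.2, -0.1]]
audio_a, visual_a = [1.0, 0.0], [0.5, 0.5]
audio_b, visual_b = [0.0, 1.0], [0.2, 0.9]

# Mono-task input: concatenate the two modalities before the shared branch.
x_a = audio_a + visual_a
x_b = audio_b + visual_b

loss_same = same_different_loss(x_a, x_a, True, weights)
loss_diff = same_different_loss(x_a, x_b, False, weights)
print(loss_same, loss_diff)
```

Both inputs pass through the same `weights`, which is what makes the network Siamese. A multi-task variant in the sense of the abstract would feed different modality combinations through the same branch at training time.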
Introduction: The Third International Conference on Epigenetic Robotics
This paper summarizes the paper and poster contributions
to the Third International Workshop on
Epigenetic Robotics. The focus of this workshop is
on the cross-disciplinary interaction of developmental
psychology and robotics. Namely, the general
goal in this area is to create robotic models of the
psychological development of various behaviors. The
term "epigenetic" is used in much the same sense as
the term "developmental" and while we could call
our topic "developmental robotics", developmental
robotics can be seen as having a broader interdisciplinary
emphasis. Our focus in this workshop is
on the interaction of developmental psychology and
robotics and we use the phrase "epigenetic robotics"
to capture this focus.
Evaluating computational models of infant phonetic learning across languages
In the first year of life, infants' speech perception becomes attuned to the
sounds of their native language. Many accounts of this early phonetic learning
exist, but computational models predicting the attunement patterns observed in
infants from the speech input they hear have been lacking. A recent study
presented the first such model, drawing on algorithms proposed for unsupervised
learning from naturalistic speech, and tested it on a single phone contrast.
Here we study five such algorithms, selected for their potential cognitive
relevance. We simulate phonetic learning with each algorithm and perform tests
on three phone contrasts from different languages, comparing the results to
infants' discrimination patterns. The five models display varying degrees of
agreement with empirical observations, showing that our approach can help
decide between candidate mechanisms for early phonetic learning, and providing
insight into which aspects of the models are critical for capturing infants'
perceptual development.
Comment: 7 pages, 1 figure
An emergentist perspective on the origin of number sense
Zorzi, Marco; Testolin, Alberto