6,734 research outputs found
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term, requires an
understanding of the dynamics of symbol systems and is crucially important. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and a
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory--motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.Comment: submitted to Advanced Robotic
Multimodal Hierarchical Dirichlet Process-based Active Perception
In this paper, we propose an active perception method for recognizing object
categories based on the multimodal hierarchical Dirichlet process (MHDP). The
MHDP enables a robot to form object categories using multimodal information,
e.g., visual, auditory, and haptic information, which can be observed by
performing actions on an object. However, performing many actions on a target
object requires a long time. In a real-time scenario, i.e., when the time is
limited, the robot has to determine the set of actions that is most effective
for recognizing a target object. We propose an MHDP-based active perception
method that uses the information gain (IG) maximization criterion and lazy
greedy algorithm. We show that the IG maximization criterion is optimal in the
sense that the criterion is equivalent to a minimization of the expected
Kullback--Leibler divergence between a final recognition state and the
recognition state after the next set of actions. However, a straightforward
calculation of IG is practically impossible. Therefore, we derive an efficient
Monte Carlo approximation method for IG by making use of a property of the
MHDP. We also show that the IG has submodular and non-decreasing properties as
a set function because of the structure of the graphical model of the MHDP.
Therefore, the IG maximization problem is reduced to a submodular maximization
problem. This means that greedy and lazy greedy algorithms are effective and
have a theoretical justification for their performance. We conducted an
experiment using an upper-torso humanoid robot and a second one using synthetic
data. The experimental results show that the method enables the robot to select
a set of actions that allow it to recognize target objects quickly and
accurately. The results support our theoretical outcomes.Comment: submitte
Computational and Robotic Models of Early Language Development: A Review
We review computational and robotics models of early language learning and
development. We first explain why and how these models are used to understand
better how children learn language. We argue that they provide concrete
theories of language learning as a complex dynamic system, complementing
traditional methods in psychology and linguistics. We review different modeling
formalisms, grounded in techniques from machine learning and artificial
intelligence such as Bayesian and neural network approaches. We then discuss
their role in understanding several key mechanisms of language development:
cross-situational statistical learning, embodiment, situated social
interaction, intrinsically motivated learning, and cultural evolution. We
conclude by discussing future challenges for research, including modeling of
large-scale empirical data about language acquisition in real-world
environments.
Keywords: Early language learning, Computational and robotic models, machine
learning, development, embodiment, social interaction, intrinsic motivation,
self-organization, dynamical systems, complexity.Comment: to appear in International Handbook on Language Development, ed. J.
Horst and J. von Koss Torkildsen, Routledg
The cybernetic Bayesian brain: from interoceptive inference to sensorimotor contingencies
Is there a single principle by which neural operations can account for perception, cognition, action, and even consciousness? A strong candidate is now taking shape in the form of âpredictive processingâ. On this theory, brains engage in predictive inference on the causes of sensory inputs by continuous minimization of prediction errors or informational âfree energyâ. Predictive processing can account, supposedly, not only for perception, but also for action and for the essential contribution of the body and environment in structuring sensorimotor interactions. In this paper I draw together some recent developments within predictive processing that involve predictive modelling of internal physiological states (interoceptive inference), and integration with âenactiveâ and âembodiedâ approaches to cognitive science (predictive perception of sensorimotor contingencies). The upshot is a development of predictive processing that originates, not in Helmholtzian perception-as-inference, but rather in 20th-century cybernetic principles that emphasized homeostasis and predictive control. This way of thinking leads to (i) a new view of emotion as active interoceptive inference; (ii) a common predictive framework linking experiences of body ownership, emotion, and exteroceptive perception; (iii) distinct interpretations of active inference as involving disruptive and disambiguatoryânot just confirmatoryâactions to test perceptual hypotheses; (iv) a neurocognitive operationalization of the âmastery of sensorimotor contingenciesâ (where sensorimotor contingencies reflect the rules governing sensory changes produced by various actions); and (v) an account of the sense of subjective reality of perceptual contents (âperceptual presenceâ) in terms of the extent to which predictive models encode potential sensorimotor relations (this being âcounterfactual richnessâ). This is rich and varied territory, and surveying its landmarks emphasizes the need for experimental tests of its key contributions
Multimodal Bayesian Network for Artificial Perception
In order to make machines perceive their external environment coherently, multiple sources of sensory information derived from several different modalities can be used (e.g. cameras, LIDAR, stereo, RGB-D, and radars). All these different sources of information can be efficiently merged to form a robust perception of the environment. Some of the mechanisms that underlie this merging of the sensor information are highlighted in this chapter, showing that depending on the type of information, different combination and integration strategies can be used and that prior knowledge are often required for interpreting the sensory signals efficiently. The notion that perception involves Bayesian inference is an increasingly popular position taken by a considerable number of researchers. Bayesian models have provided insights into many perceptual phenomena, showing that they are a valid approach to deal with real-world uncertainties and for robust classification, including classification in time-dependent problems. This chapter addresses the use of Bayesian networks applied to sensory perception in the following areas: mobile robotics, autonomous driving systems, advanced driver assistance systems, sensor fusion for object detection, and EEG-based mental states classification
- âŠ