32,704 research outputs found
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term are crucially important
and require an understanding of the dynamics of symbol systems. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-the-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory-motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.
Comment: submitted to Advanced Robotic
Real Time Animation of Virtual Humans: A Trade-off Between Naturalness and Control
Virtual humans are employed in many interactive applications using 3D virtual environments, including (serious) games. The motion of such virtual humans should look realistic (or ‘natural’) and allow interaction with the surroundings and other (virtual) humans. Current animation techniques differ in the trade-off they offer between motion naturalness and the control that can be exerted over the motion. We show mechanisms to parametrize, combine (on different body parts) and concatenate motions generated by different animation techniques. We discuss several aspects of motion naturalness and show how it can be evaluated. We conclude by showing the promise of combinations of different animation paradigms to enhance both naturalness and control.
Articulation rate in Swedish child-directed speech increases as a function of the age of the child even when surprisal is controlled for
In earlier work, we have shown that articulation rate in Swedish
child-directed speech (CDS) increases as a function of the age of the child,
even when utterance length and differences in articulation rate between
subjects are controlled for. In this paper we show at the utterance level in
spontaneous Swedish speech that i) for the youngest children, articulation rate
in CDS is lower than in adult-directed speech (ADS), ii) there is a significant
negative correlation between articulation rate and surprisal (the negative log
probability) in ADS, and iii) the increase in articulation rate in Swedish CDS
as a function of the age of the child holds, even when surprisal along with
utterance length and differences in articulation rate between speakers are
controlled for. These results indicate that adults adjust their articulation
rate to make it fit the linguistic capacity of the child.
Comment: 5 pages, Interspeech 201
Fast, invariant representation for human action in the visual system
Humans can effortlessly recognize others' actions in the presence of complex
transformations, such as changes in viewpoint. Several studies have located the
regions in the brain involved in invariant action recognition; however, the
underlying neural computations remain poorly understood. We use
magnetoencephalography (MEG) decoding and a dataset of well-controlled,
naturalistic videos of five actions (run, walk, jump, eat, drink) performed by
different actors at different viewpoints to study the computational steps used
to recognize actions across complex transformations. In particular, we ask when
the brain discounts changes in 3D viewpoint relative to when it initially
discriminates between actions. We measure the latency difference between
invariant and non-invariant action decoding when subjects view full videos as
well as form-depleted and motion-depleted stimuli. Our results show no
difference in decoding latency or temporal profile between invariant and
non-invariant action recognition in full videos. However, when either form or
motion information is removed from the stimulus set, we observe a decrease and
delay in invariant action decoding. Our results suggest that the brain
recognizes actions and builds invariance to complex transformations at the same
time, and that both form and motion information are crucial for fast, invariant
action recognition.
The Evolution of First Person Vision Methods: A Survey
The emergence of new wearable technologies such as action cameras and
smart-glasses has increased the interest of computer vision scientists in the
First Person perspective. Nowadays, this field is attracting attention and
investments of companies aiming to develop commercial devices with First Person
Vision recording capabilities. Due to this interest, an increasing demand for
methods to process these videos, possibly in real time, is expected. Current
approaches present particular combinations of different image features and
quantitative methods to accomplish specific objectives such as object detection,
activity recognition, and user-machine interaction. This paper summarizes
the evolution of the state of the art in First Person Vision video analysis
between 1997 and 2014, highlighting, among others, the most commonly used features,
methods, challenges and opportunities within the field.
Comment: First Person Vision, Egocentric Vision, Wearable Devices, Smart
Glasses, Computer Vision, Video Analytics, Human-machine Interactio
Synthesis of variable dancing styles based on a compact spatiotemporal representation of dance
Dance, as a complex expressive form of motion, is able to convey emotion, meaning and social idiosyncrasies, opening channels for non-verbal communication and promoting rich cross-modal interactions with music and the environment. As such, realistic dancing characters may incorporate cross-modal information and variability of the dance forms through compact representations that may describe the movement structure in terms of its spatial and temporal organization. In this paper, we propose a novel method for synthesizing beat-synchronous dancing motions based on a compact topological model of dance styles, previously captured with a motion capture system. The model was based on the Topological Gesture Analysis (TGA), which conveys a discrete three-dimensional point-cloud representation of the dance, by describing the spatiotemporal variability of its gestural trajectories into uniform spherical distributions, according to classes of the musical meter. The methodology for synthesizing the modeled dance traces back the topological representations, constrained with definable metrical and spatial parameters, into complete dance instances whose variability is controlled by stochastic processes that consider both TGA distributions and the kinematic constraints of the body morphology. In order to assess the relevance and flexibility of each parameter in feasibly reproducing the style of the captured dance, we correlated both captured and synthesized trajectories of samba dancing sequences in relation to the level of compression of the model used, and report on a subjective evaluation over a set of six tests. The achieved results validated our approach, suggesting that a periodic dancing style, and its musical synchrony, can be feasibly reproduced from a suitably parametrized discrete spatiotemporal representation of the gestural motion trajectories, with a notable degree of compression.