5,634 research outputs found
Motor (but not auditory) attention affects syntactic choice
Understanding the determinants of syntactic choice in sentence production is a salient topic
in psycholinguistics. Existing evidence suggests that syntactic choice results from an interplay
between linguistic and non-linguistic factors, and a speaker’s attention to the elements
of a described event represents one such factor. Whereas multimodal accounts of attention
suggest a role for different modalities in this process, existing studies examining attention
effects in syntactic choice are primarily based on visual cueing paradigms. Hence, it remains
unclear whether attentional effects on syntactic choice are limited to the visual modality or
are indeed more general. This issue is addressed by the current study. Native English participants
viewed and described line drawings of simple transitive events while their attention
was directed to the location of the agent or the patient of the depicted event by means of
either an auditory (monaural beep) or a motor (unilateral key press) lateral cue. Our results
show an effect of cue location, with participants producing more passive-voice descriptions
in the patient-cued conditions. Crucially, this cue location effect emerged in the motor-cue
but not (or substantially less so) in the auditory-cue condition, as confirmed by a reliable
interaction between cue location (agent vs. patient) and cue type (auditory vs. motor). Our
data suggest that attentional effects on the speaker’s syntactic choices are modality-specific
and limited to the visual and motor, but not the auditory, domain
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term, requires an
understanding of the dynamics of symbol systems and is crucially important. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and a
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory--motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.Comment: submitted to Advanced Robotic
Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals
Human infants can discover words directly from unsegmented speech signals
without any explicitly labeled data. In this paper, we develop a novel machine
learning method called nonparametric Bayesian double articulation analyzer
(NPB-DAA) that can directly acquire language and acoustic models from observed
continuous speech signals. For this purpose, we propose an integrative
generative model that combines a language model and an acoustic model into a
single generative model called the "hierarchical Dirichlet process hidden
language model" (HDP-HLM). The HDP-HLM is obtained by extending the
hierarchical Dirichlet process hidden semi-Markov model (HDP-HSMM) proposed by
Johnson et al. An inference procedure for the HDP-HLM is derived using the
blocked Gibbs sampler originally proposed for the HDP-HSMM. This procedure
enables the simultaneous and direct inference of language and acoustic models
from continuous speech signals. Based on the HDP-HLM and its inference
procedure, we developed a novel double articulation analyzer. By assuming
HDP-HLM as a generative model of observed time series data, and by inferring
latent variables of the model, the method can analyze latent double
articulation structure, i.e., hierarchically organized latent words and
phonemes, of the data in an unsupervised manner. The novel unsupervised double
articulation analyzer is called NPB-DAA.
The NPB-DAA can automatically estimate double articulation structure embedded
in speech signals. We also carried out two evaluation experiments using
synthetic data and actual human continuous speech signals representing Japanese
vowel sequences. In the word acquisition and phoneme categorization tasks, the
NPB-DAA outperformed a conventional double articulation analyzer (DAA) and
baseline automatic speech recognition system whose acoustic model was trained
in a supervised manner.Comment: 15 pages, 7 figures, Draft submitted to IEEE Transactions on
Autonomous Mental Development (TAMD
On staying grounded and avoiding Quixotic dead ends
The 15 articles in this special issue on The Representation of Concepts illustrate the rich variety of theoretical positions and supporting research that characterize the area. Although much agreement exists among contributors, much disagreement exists as well, especially about the roles of grounding and abstraction in conceptual processing. I first review theoretical approaches raised in these articles that I believe are Quixotic dead ends, namely, approaches that are principled and inspired but likely to fail. In the process, I review various theories of amodal symbols, their distortions of grounded theories, and fallacies in the evidence used to support them. Incorporating further contributions across articles, I then sketch a theoretical approach that I believe is likely to be successful, which includes grounding, abstraction, flexibility, explaining classic conceptual phenomena, and making contact with real-world situations. This account further proposes that (1) a key element of grounding is neural reuse, (2) abstraction takes the forms of multimodal compression, distilled abstraction, and distributed linguistic representation (but not amodal symbols), and (3) flexible context-dependent representations are a hallmark of conceptual processing
Scene extraction in motion pictures
This paper addresses the challenge of bridging the semantic gap between the rich meaning users desire when they query to locate and browse media and the shallowness of media descriptions that can be computed in today\u27s content management systems. To facilitate high-level semantics-based content annotation and interpretation, we tackle the problem of automatic decomposition of motion pictures into meaningful story units, namely scenes. Since a scene is a complicated and subjective concept, we first propose guidelines from fill production to determine when a scene change occurs. We then investigate different rules and conventions followed as part of Fill Grammar that would guide and shape an algorithmic solution for determining a scene. Two different techniques using intershot analysis are proposed as solutions in this paper. In addition, we present different refinement mechanisms, such as film-punctuation detection founded on Film Grammar, to further improve the results. These refinement techniques demonstrate significant improvements in overall performance. Furthermore, we analyze errors in the context of film-production techniques, which offer useful insights into the limitations of our method
Robot Navigation in Unseen Spaces using an Abstract Map
Human navigation in built environments depends on symbolic spatial
information which has unrealised potential to enhance robot navigation
capabilities. Information sources such as labels, signs, maps, planners, spoken
directions, and navigational gestures communicate a wealth of spatial
information to the navigators of built environments; a wealth of information
that robots typically ignore. We present a robot navigation system that uses
the same symbolic spatial information employed by humans to purposefully
navigate in unseen built environments with a level of performance comparable to
humans. The navigation system uses a novel data structure called the abstract
map to imagine malleable spatial models for unseen spaces from spatial symbols.
Sensorimotor perceptions from a robot are then employed to provide purposeful
navigation to symbolic goal locations in the unseen environment. We show how a
dynamic system can be used to create malleable spatial models for the abstract
map, and provide an open source implementation to encourage future work in the
area of symbolic navigation. Symbolic navigation performance of humans and a
robot is evaluated in a real-world built environment. The paper concludes with
a qualitative analysis of human navigation strategies, providing further
insights into how the symbolic navigation capabilities of robots in unseen
built environments can be improved in the future.Comment: 15 pages, published in IEEE Transactions on Cognitive and
Developmental Systems (http://doi.org/10.1109/TCDS.2020.2993855), see
https://btalb.github.io/abstract_map/ for access to softwar
- …