Emotional State Categorization from Speech: Machine vs. Human
This paper presents our investigations of emotional state categorization from
speech signals, comparing a psychologically inspired computational model
against human performance under the same experimental setup. Based on
psychological studies, we propose a multistage categorization strategy that
allows an automatic categorization model to be established flexibly for a given
emotional speech categorization task. We apply the strategy to the Serbian Emotional
Speech Corpus (GEES) and the Danish Emotional Speech Corpus (DES), where human
performance was reported in previous psychological studies. Ours is the
first attempt to apply machine learning to the GEES corpus, for which only
human recognition rates were available prior to our study. Unlike previous
work on the DES corpus, our work focuses on a comparison to human performance
under the same experimental settings. Our studies suggest that
psychology-inspired systems yield behaviours that, to a great extent, resemble
what humans perceive, and that their performance is close to that of humans
under the same experimental setup. Furthermore, our work also uncovers some
differences between machines and humans in emotional state recognition from
speech.
Comment: 14 pages, 15 figures, 12 tables
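The abstract does not spell out the stages of the proposed strategy. As a purely illustrative sketch (not the authors' model), a multistage categorization can be organized as a tree of simple stage classifiers, for example a coarse arousal split followed by finer splits; the two-dimensional arousal/valence features, the emotion labels, and the nearest-centroid stage classifier below are all hypothetical choices:

```python
import numpy as np

class NearestCentroid:
    """Toy per-stage classifier: predict the label of the nearest class centroid."""
    def fit(self, X, y):
        X, y = np.asarray(X, float), np.asarray(y)
        self.labels = sorted(set(y))
        self.centroids = np.array([X[y == l].mean(axis=0) for l in self.labels])
        return self

    def predict_one(self, x):
        d2 = ((self.centroids - np.asarray(x, float)) ** 2).sum(axis=1)
        return self.labels[int(np.argmin(d2))]

def multistage_predict(x, stages, root="root"):
    """Descend the stage tree: each stage either names another stage
    or emits a final emotion label."""
    node = root
    while node in stages:
        node = stages[node].predict_one(x)
    return node

# Hypothetical 2-D features: [arousal, valence].
train = {
    "anger":   [[0.9, -0.8], [0.8, -0.7]],
    "joy":     [[0.9,  0.8], [0.8,  0.7]],
    "sadness": [[0.1, -0.8], [0.2, -0.7]],
    "neutral": [[0.1,  0.1], [0.2,  0.0]],
}
coarse = {"anger": "high", "joy": "high", "sadness": "low", "neutral": "low"}
X = [v for vs in train.values() for v in vs]
y_coarse = [coarse[k] for k, vs in train.items() for _ in vs]

stages = {
    "root": NearestCentroid().fit(X, y_coarse),  # stage 1: arousal split
    "high": NearestCentroid().fit(train["anger"] + train["joy"],
                                  ["anger"] * 2 + ["joy"] * 2),
    "low":  NearestCentroid().fit(train["sadness"] + train["neutral"],
                                  ["sadness"] * 2 + ["neutral"] * 2),
}

print(multistage_predict([0.85, -0.75], stages))
```

The point of the tree layout is flexibility: a stage can be added, removed, or retrained for a given corpus without touching the rest of the pipeline.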
The neural bases of event monitoring across domains: a simultaneous ERP-fMRI study.
The ability to check and evaluate the environment over time with the aim of detecting the occurrence of target stimuli is supported by sustained/tonic as well as transient/phasic control processes, which may collectively be referred to as event monitoring. The neural underpinning of sustained control processes involves a fronto-parietal network. However, it is not yet well defined whether this cortical circuit acts irrespective of the specific material to be monitored, and whether it mediates sustained as well as transient monitoring processes. In the current study, the functional activity of the brain during an event monitoring task was investigated and compared between two cognitive domains whose processing is mediated by differently lateralized areas: participants were asked to monitor sequences of either faces (supported by right-hemisphere regions) or tools (left-hemisphere regions). In order to disentangle sustained from transient components of monitoring, a simultaneous EEG-fMRI technique was adopted within a block design. When contrasting monitoring versus control blocks, the conventional fMRI analysis revealed the sustained involvement of bilateral fronto-parietal regions in both task domains. Event-related potentials (ERPs) showed a more positive amplitude over frontal sites in monitoring compared to control blocks, providing evidence of a transient monitoring component. The joint ERP-fMRI analysis showed that, in the case of face monitoring, these transient processes rely on right-lateralized areas, including the inferior parietal lobule and the middle frontal gyrus. In the case of tools, no fronto-parietal areas correlated with the transient ERP activity, suggesting that in this domain phasic monitoring processes were masked by tonic ones.
Overall, the present findings highlight the role of bilateral fronto-parietal regions in sustained monitoring, independently of the specific task requirements, and suggest that right-lateralized areas subserve transient monitoring processes, at least in some task contexts.
A fine-grained approach to scene text script identification
This paper focuses on the problem of script identification in unconstrained
scenarios. Script identification is an important prerequisite to recognition,
and an indispensable condition for automatic text understanding systems
designed for multi-language environments. Although widely studied for document
images and handwritten documents, it remains an almost unexplored territory for
scene text images.
We detail a novel method for script identification in natural images that
combines convolutional features and the Naive-Bayes Nearest Neighbor
classifier. The proposed framework efficiently exploits the discriminative
power of small stroke parts within a fine-grained classification scheme.
In addition, we propose a new public benchmark dataset for the evaluation of
joint text detection and script identification in natural scenes. Experiments
on this new dataset demonstrate that the proposed method yields
state-of-the-art results, while generalizing well to different datasets and
variable numbers of scripts. The evidence provided shows that multi-lingual
scene text recognition in the wild is a viable proposition. Source code of the
proposed method is made available online.
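The abstract names the image-to-class Naive-Bayes Nearest Neighbor decision rule but not its implementation details. A minimal numpy sketch of that rule, using synthetic stand-ins for the convolutional stroke-part descriptors (the class names and descriptor dimensions below are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def nbnn_classify(query_descriptors, class_descriptor_sets):
    """Image-to-class NBNN: pick the class minimizing the sum, over all
    query descriptors, of the squared distance to that class's nearest
    stored descriptor (no per-image vector quantization step)."""
    query = np.asarray(query_descriptors, float)
    best_label, best_cost = None, np.inf
    for label, descriptors in class_descriptor_sets.items():
        d = np.asarray(descriptors, float)
        # pairwise squared distances, shape (n_query, n_class_descriptors)
        d2 = ((query[:, None, :] - d[None, :, :]) ** 2).sum(axis=-1)
        cost = d2.min(axis=1).sum()  # nearest-neighbor distance per descriptor
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label

# Synthetic "stroke-part" descriptors for two scripts (illustrative only).
rng = np.random.default_rng(0)
classes = {
    "Latin": rng.normal(0.0, 0.5, size=(30, 8)),
    "Greek": rng.normal(5.0, 0.5, size=(30, 8)),
}
query = rng.normal(5.0, 0.5, size=(6, 8))  # descriptors from a "Greek" image
print(nbnn_classify(query, classes))
```

Because NBNN scores each descriptor against a whole class's descriptor pool rather than a single exemplar image, it suits exactly the fine-grained, small-part regime the abstract describes.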
The evolution of grounded spatial language
This book presents groundbreaking robotic experiments on how and why spatial language evolves. It provides detailed explanations of the origins of spatial conceptualization strategies, spatial categories, landmark systems and spatial grammar by tracing the interplay of environmental conditions with communicative and cognitive pressures. The experiments discussed in this book go far beyond previous approaches in grounded language evolution. For the first time, agents can evolve not only particular lexical systems but also the complex conceptualization strategies underlying the emergence of category systems and compositional semantics. Moreover, many issues in cognitive science, ranging from perception and conceptualization to language processing, had to be dealt with to instantiate these experiments, so that this book contributes not only to the study of language evolution but also to the investigation of the cognitive bases of spatial language.
The neural correlates of semantic richness: Evidence from an fMRI study of word learning
We investigated the neural correlates of concrete nouns with either many or few semantic features. A group of 21 participants underwent two days of training and were then asked to categorize 40 newly learned words and a set of matched familiar words as living or nonliving in an MRI scanner. Our results showed that the most reliable effects of semantic richness were located in the left angular gyrus (AG) and middle temporal gyrus (MTG), where activation was higher for semantically rich than poor words. Other areas showing the same pattern included bilateral precuneus and posterior cingulate gyrus. Our findings support the view that AG and anterior MTG, as part of the multimodal network, play a significant role in representing and integrating semantic features from different input modalities. We propose that activation in bilateral precuneus and posterior cingulate gyrus reflects interplay between AG and episodic memory systems during semantic retrieval.
Visual style: Qualitative and context-dependent categorization
Style is an ordering principle by which to structure artifacts in a design domain. The application of a visual order entails some explicit grouping property that is both cognitively plausible and contextually dependent. Central to cognitive-contextual notions are the type of representation used in analysis and the flexibility to allow semantic interpretation. We present a model of visual style based on the concept of similarity as a qualitative, context-dependent categorization. The two core components of the model are semantic feature extraction and self-organizing maps (SOMs). The model proposes a method of categorizing two-dimensional unannotated design diagrams using both low-level geometric and high-level semantic features that are automatically derived from the pictorial content of the design. The operation of the initial model, called Q-SOM, is then extended to include relevance feedback (Q-SOM:RF). The extended model can be seen as a series of sequential processing stages, in which qualitative encoding and feature extraction are followed by iterative recategorization. Categorization is achieved using an unsupervised SOM, and contextual dependencies are integrated via cluster relevance determined by the observer's feedback. The following stages are presented: initial per-feature detection and extraction, selection of feature sets corresponding to different spatial ontologies, unsupervised categorization of design diagrams based on appropriate feature subsets, and integration of design context via relevance feedback. From our experiments we compare different outcomes from consecutive stages of the model. The results show that the model provides a cognitively plausible and context-dependent method for characterizing visual style in design. Copyright © 2006 Cambridge University Press.
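The unsupervised SOM stage at the core of Q-SOM can be illustrated with a generic online self-organizing map; the sketch below is not the authors' pipeline (their features, grid size, and decay schedules are not specified here) but a minimal numpy implementation of the same categorization mechanism, assuming exponentially decaying learning rate and neighbourhood width:

```python
import numpy as np

def train_som(data, grid_w=3, grid_h=3, epochs=40, lr=0.5, sigma=1.5, seed=0):
    """Online SOM training; returns unit weights of shape (grid_h*grid_w, dim)."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    weights = rng.random((grid_h * grid_w, dim))
    # grid coordinates of each unit, for the neighbourhood function
    coords = np.array([(i, j) for i in range(grid_h) for j in range(grid_w)], float)
    for epoch in range(epochs):
        decay = np.exp(-epoch / epochs)  # shrink lr and neighbourhood over time
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            grid_d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2 * (sigma * decay) ** 2))
            weights += (lr * decay) * h[:, None] * (x - weights)
    return weights

def categorize(data, weights):
    """Map each feature vector to its best-matching unit (its category index)."""
    return np.argmin(((data[:, None, :] - weights[None]) ** 2).sum(axis=-1), axis=1)

# Two synthetic "style" clusters in a 2-D feature space (illustrative only).
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                  rng.normal(10.0, 0.1, (20, 2))])
weights = train_som(data)
labels = categorize(data, weights)
```

In the relevance-feedback extension described in the abstract, the observer's judgments would then reweight or merge these unit-level categories; the SOM itself stays unsupervised.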