6,541 research outputs found

    Emotional State Categorization from Speech: Machine vs. Human

    This paper presents our investigations on emotional state categorization from speech signals, comparing a psychologically inspired computational model against human performance under the same experimental setup. Based on psychological studies, we propose a multistage categorization strategy that allows an automatic categorization model to be established flexibly for a given emotional speech categorization task. We apply the strategy to the Serbian Emotional Speech Corpus (GEES) and the Danish Emotional Speech Corpus (DES), for which human performance was reported in previous psychological studies. Our work is the first attempt to apply machine learning to the GEES corpus, for which only human recognition rates were available prior to our study. Unlike previous work on the DES corpus, our work focuses on a comparison to human performance under the same experimental settings. Our studies suggest that psychology-inspired systems yield behaviours that, to a great extent, resemble what humans perceive, and that their performance is close to that of humans under the same experimental setup. Furthermore, our work also uncovers some differences between machines and humans in terms of emotional state recognition from speech. Comment: 14 pages, 15 figures, 12 tables

    The neural bases of event monitoring across domains: a simultaneous ERP-fMRI study.

    The ability to check and evaluate the environment over time in order to detect the occurrence of target stimuli is supported by sustained/tonic as well as transient/phasic control processes, which together might be referred to as event monitoring. The neural underpinning of sustained control processes involves a fronto-parietal network. However, it has not yet been well defined whether this cortical circuit acts irrespective of the specific material to be monitored and whether it mediates sustained as well as transient monitoring processes. In the current study, the functional activity of the brain during an event monitoring task was investigated and compared between two cognitive domains whose processing is mediated by differently lateralized areas. Namely, participants were asked to monitor sequences of either faces (supported by right-hemisphere regions) or tools (left-hemisphere). In order to disentangle sustained from transient components of monitoring, a simultaneous EEG-fMRI technique was adopted within a block design. When contrasting monitoring versus control blocks, the conventional fMRI analysis revealed the sustained involvement of bilateral fronto-parietal regions in both task domains. Event-related potentials (ERPs) showed a more positive amplitude over frontal sites in monitoring compared to control blocks, providing evidence of a transient monitoring component. The joint ERP-fMRI analysis showed that, in the case of face monitoring, these transient processes rely on right-lateralized areas, including the inferior parietal lobule and the middle frontal gyrus. In the case of tools, no fronto-parietal areas correlated with the transient ERP activity, suggesting that in this domain phasic monitoring processes were masked by tonic ones. Overall, the present findings highlight the role of bilateral fronto-parietal regions in sustained monitoring, independently of the specific task requirements, and suggest that right-lateralized areas subtend transient monitoring processes, at least in some task contexts.

    A fine-grained approach to scene text script identification

    This paper focuses on the problem of script identification in unconstrained scenarios. Script identification is an important prerequisite to recognition, and an indispensable condition for automatic text understanding systems designed for multi-language environments. Although widely studied for document images and handwritten documents, it remains an almost unexplored territory for scene text images. We detail a novel method for script identification in natural images that combines convolutional features and the Naive-Bayes Nearest Neighbor classifier. The proposed framework efficiently exploits the discriminative power of small stroke-parts in a fine-grained classification framework. In addition, we propose a new public benchmark dataset for the evaluation of joint text detection and script identification in natural scenes. Experiments on this new dataset demonstrate that the proposed method yields state-of-the-art results, while generalizing well to different datasets and a variable number of scripts. The evidence provided shows that multi-lingual scene text recognition in the wild is a viable proposition. Source code of the proposed method is made available online.
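For readers unfamiliar with the Naive-Bayes Nearest Neighbor (NBNN) classifier named in this abstract, the following is a minimal sketch of the generic NBNN decision rule, not the authors' implementation. It assumes each image is represented as a set of local descriptors; the toy 2-D vectors, the script labels, and the function names `sq_dist` and `nbnn_classify` are hypothetical and stand in for the convolutional stroke-part features described in the abstract.

```python
# Generic NBNN decision rule (illustrative sketch only).
# Toy 2-D descriptors stand in for convolutional stroke-part features.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nbnn_classify(query_descriptors, class_descriptors):
    """Pick the class with the smallest summed descriptor-to-class distance.

    class_descriptors maps a label to the pool of all training descriptors
    of that class (a hypothetical layout chosen for this example).
    """
    totals = {}
    for label, pool in class_descriptors.items():
        # For every query descriptor, take its nearest neighbour in the
        # class pool and accumulate the squared distance.
        totals[label] = sum(
            min(sq_dist(d, p) for p in pool) for d in query_descriptors
        )
    return min(totals, key=totals.get), totals


# Made-up training pools and query descriptors.
pools = {
    "latin": [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)],
    "cyrillic": [(1.0, 1.0), (0.9, 1.1), (1.1, 0.8)],
}
query = [(0.05, 0.1), (0.15, 0.15), (0.8, 0.9)]

label, totals = nbnn_classify(query, pools)
print(label, totals)
```

Because distances are measured from each descriptor to a whole class pool rather than to individual training images, a few outlying descriptors (such as the third query vector here) do not by themselves decide the label.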

    The evolution of grounded spatial language

    This book presents groundbreaking robotic experiments on how and why spatial language evolves. It provides detailed explanations of the origins of spatial conceptualization strategies, spatial categories, landmark systems and spatial grammar by tracing the interplay of environmental conditions and communicative and cognitive pressures. The experiments discussed in this book go far beyond previous approaches in grounded language evolution. For the first time, agents can evolve not only particular lexical systems but also the complex conceptualization strategies underlying the emergence of category systems and compositional semantics. Moreover, many issues in cognitive science, ranging from perception and conceptualization to language processing, had to be dealt with to instantiate these experiments, so that this book contributes not only to the study of language evolution but also to the investigation of the cognitive bases of spatial language.

    The neural correlates of semantic richness: Evidence from an fMRI study of word learning

    We investigated the neural correlates of concrete nouns with either many or few semantic features. A group of 21 participants underwent two days of training and were then asked to categorize 40 newly learned words and a set of matched familiar words as living or nonliving in an MRI scanner. Our results showed that the most reliable effects of semantic richness were located in the left angular gyrus (AG) and middle temporal gyrus (MTG), where activation was higher for semantically rich than for semantically poor words. Other areas showing the same pattern included bilateral precuneus and posterior cingulate gyrus. Our findings support the view that AG and anterior MTG, as part of the multimodal network, play a significant role in representing and integrating semantic features from different input modalities. We propose that activation in bilateral precuneus and posterior cingulate gyrus reflects interplay between AG and episodic memory systems during semantic retrieval.

    Visual style: Qualitative and context-dependent categorization

    Style is an ordering principle by which to structure artifacts in a design domain. The application of a visual order entails some explicit grouping property that is both cognitively plausible and contextually dependent. Central to cognitive-contextual notions are the type of representation used in analysis and the flexibility to allow semantic interpretation. We present a model of visual style based on the concept of similarity as a qualitative context-dependent categorization. The two core components of the model are semantic feature extraction and self-organizing maps (SOMs). The model proposes a method of categorizing two-dimensional unannotated design diagrams using both low-level geometric and high-level semantic features that are automatically derived from the pictorial content of the design. The operation of the initial model, called Q-SOM, is then extended to include relevance feedback (Q-SOM:RF). The extended model can be seen as a series of sequential processing stages, in which qualitative encoding and feature extraction are followed by iterative recategorization. Categorization is achieved using an unsupervised SOM, and contextual dependencies are integrated via cluster relevance determined by the observer's feedback. The following stages are presented: initial per-feature detection and extraction, selection of feature sets corresponding to different spatial ontologies, unsupervised categorization of design diagrams based on appropriate feature subsets, and integration of design context via relevance feedback. In our experiments we compare the outcomes of consecutive stages of the model. The results show that the model provides a cognitively plausible and context-dependent method for characterizing visual style in design. Copyright © 2006 Cambridge University Press
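The unsupervised categorization step mentioned in this abstract relies on a self-organizing map. The following is a minimal sketch of a generic one-dimensional SOM, not the Q-SOM or Q-SOM:RF model itself: it omits the qualitative feature extraction and the relevance-feedback stage entirely. The toy 2-D feature vectors, the deterministic initialization, the decay schedule, and the names `bmu_index` and `train_som` are all assumptions made for illustration.

```python
import math

# Generic 1-D self-organizing map (illustrative sketch only).


def bmu_index(weights, x):
    """Index of the best-matching unit: the node whose weights are closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def train_som(data, n_nodes=4, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a chain of n_nodes units on data and return their weight vectors."""
    dim = len(data[0])
    # Deterministic start: nodes evenly spaced along the unit diagonal.
    weights = [[i / (n_nodes - 1)] * dim for i in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # linear decay factor
        lr = lr0 * frac                      # learning rate shrinks over time
        sigma = max(sigma0 * frac, 0.1)      # neighbourhood width shrinks too
        for x in data:
            b = bmu_index(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood on the node index distance to the BMU.
                h = math.exp(-((i - b) ** 2) / (2 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Made-up feature vectors forming two loose groups.
data = [
    (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
    (0.9, 0.8), (0.8, 0.9), (0.85, 0.85),
]
weights = train_som(data)
group_a = bmu_index(weights, (0.15, 0.15))
group_b = bmu_index(weights, (0.85, 0.85))
print(group_a, group_b)
```

Inputs that map to the same or neighbouring nodes are treated as similar; in a feedback-driven extension, an observer's relevance judgements could be used to reweight features before retraining, though how the paper does this is not specified in the abstract.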