1,355 research outputs found

    Feature extraction based on bio-inspired model for robust emotion recognition

    Emotional state identification is an important issue to achieve more natural speech interactive systems. Ideally, these systems should also be able to work in real environments in which some kind of noise is generally present. Several bio-inspired representations have been applied to artificial systems for speech processing under noise conditions. In this work, an auditory signal representation is used to obtain a novel bio-inspired set of features for emotional speech signals. These characteristics, together with other spectral and prosodic features, are used for emotion recognition under noise conditions. Neural models were trained as classifiers and results were compared to the well-known mel-frequency cepstral coefficients. Results show that using the proposed representations, it is possible to significantly improve the robustness of an emotion recognition system. The results were also validated in a speaker-independent scheme and with two emotional speech corpora.
    Affiliation: Albornoz, Enrique Marcelo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Affiliation: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Affiliation: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina

    Fully generated scripted dialogue for embodied agents

    This paper presents the NECA approach to the generation of dialogues between Embodied Conversational Agents (ECAs). This approach consists of the automated construction of an abstract script for an entire dialogue (cast in terms of dialogue acts), which is incrementally enhanced by a series of modules and finally "performed" by means of text, speech and body language, by a cast of ECAs. The approach makes it possible to automatically produce a large variety of highly expressive dialogues, some of whose essential properties are under the control of a user. The paper discusses the advantages and disadvantages of NECA's approach to Fully Generated Scripted Dialogue (FGSD), and explains the main techniques used in the two demonstrators that were built. The paper can be read as a survey of issues and techniques in the construction of ECAs, focusing on the generation of behaviour (i.e., focusing on information presentation) rather than on interpretation.

    Gesture Theory is Linguistics: On Modelling Multimodality as Prosody

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    Paralinguistic abilities of adults with intellectual disability

    The aim of this research was to determine the ability level of paralinguistic production and comprehension in adults with intellectual disability (ID) with regard to the level of their intellectual functioning and presence of co-morbid psychiatric conditions or dual diagnosis (DD). The sample consisted of 120 participants of both genders, ranging in age between 20 and 56 years (M = 31.82, SD = 8.702). Approximately 50% of the sample comprised participants with a co-existing psychiatric condition. Each of these two sub-samples (those with ID only and those with DD) consisted of 25 participants with mild ID and 35 participants with moderate ID. The paralinguistic scale from The Assessment Battery for Communication (ABaCo; Sacco et al., 2008) was used to assess the abilities of comprehension and production of paralinguistic elements. The results showed that the participants with mild ID are more successful than the participants with moderate ID both in paralinguistic comprehension tasks (p = .000) and in paralinguistic production tasks (p = .001). Additionally, the results indicated the presence of separate influences of both ID levels on all of the paralinguistic abilities (F[116] = 42.549, p = .000) and the existence of DD (F[116] = 18.215, p = .000).
    This is the peer-reviewed version of the article: Dordević, M.; Glumbić, N.; Brojčin, B. Paralinguistic Abilities of Adults with Intellectual Disability. Res. Dev. Disabil. 2016, 48, 211–219. [https://doi.org/10.1016/j.ridd.2015.11.001]

    Using prosodic cues to identify dialogue acts: methodological challenges


    Irony in a second language: exploring the comprehension of Japanese speakers of English

    This thesis focuses on the extent to which non-native speakers of English understand potentially ironic utterances in a similar way to native speakers. Barbe (1995: 4) sees irony as one of ‘the final obstacles before achieving near native-speaker fluency.’ This assumption is supported by the findings of earlier studies (Bouton 1999; Lee 2002; Manowong 2011; Yamanaka 2003) which assumed a Gricean framework seeing irony as communicating the ‘opposite of what is said’ (Grice 1975, 1978). This thesis adopts instead the relevance-theoretic account of irony as echoic (Sperber and Wilson 1995; Wilson and Sperber 2012), arguing that previous work suffers from both problematic theoretical assumptions and flawed experimental methods. The thesis reports the findings of two experiments designed to examine similarities and differences between the responses of non-native speakers of English (here Japanese speakers) and native speakers and how similar or different the effects of prosody are for these groups. The first experiment, conducted by an online survey, provided surprising results, suggesting that Japanese speakers can respond to potentially ironical utterances similarly to native speakers. The second experiment, focusing on the effects of prosody, compared the groups with regard to response trends. Three prosodic contours were used in this study, labelled ‘basic’ (a kind of default, unmarked tone), ‘deadpan’ (with a narrower pitch range), and ‘exaggerated’ (with a wider pitch range). The results indicated that Japanese participants could perceive English prosodic structure in similar ways to native speakers and were affected by prosodic contours in similar ways. It also suggested that Japanese participants were affected less strongly by ‘exaggerated’ intonation and slightly more strongly by ‘deadpan’ tones. These findings suggest that a relevance-theoretic framework provides the means to carry out fuller investigations than carried out previously and to develop a more systematic explanation of the understanding of irony in a second language.

    Learning to adapt in dialogue systems : data-driven models for personality recognition and generation.

    Dialogue systems are artefacts that converse with human users in order to achieve some task. Each step of the dialogue requires understanding the user's input, deciding on what to reply, and generating an output utterance. Although there are many ways to express any given content, most dialogue systems do not take linguistic variation into account in both the understanding and generation phases, i.e. the user's linguistic style is typically ignored, and the style conveyed by the system is chosen once for all interactions at development time. We believe that modelling linguistic variation can greatly improve the interaction in dialogue systems, such as in intelligent tutoring systems, video games, or information retrieval systems, which all require specific linguistic styles. Previous work has shown that linguistic style affects many aspects of users' perceptions, even when the dialogue is task-oriented. Moreover, users attribute a consistent personality to machines, even when exposed to a limited set of cues, thus dialogue systems manifest personality whether designed into the system or not. Over the past few years, psychologists have identified the main dimensions of individual differences in human behaviour: the Big Five personality traits. We hypothesise that the Big Five provide a useful computational framework for modelling important aspects of linguistic variation. This thesis first explores the possibility of recognising the user's personality using data-driven models trained on essays and conversational data. We then test whether it is possible to generate language varying consistently along each personality dimension in the information presentation domain. We present PERSONAGE: a language generator modelling findings from psychological studies to project various personality traits. 
    We use PERSONAGE to compare various generation paradigms: (1) rule-based generation, (2) overgenerate and select, and (3) generation using parameter estimation models, a novel approach that learns to produce recognisable variation along meaningful stylistic dimensions without the computational cost incurred by overgeneration techniques. We also present the first human evaluation of a data-driven generation method that projects multiple stylistic dimensions simultaneously and on a continuous scale.

    Post-training discriminative pruning for RBMs

    One of the major challenges in the area of artificial neural networks is the identification of a suitable architecture for a specific problem. Choosing an unsuitable topology can exponentially increase the training cost, and even hinder network convergence. On the other hand, recent research indicates that larger or deeper nets can map the problem features into a more appropriate space, and thereby improve the classification process, thus leading to an apparent dichotomy. In this regard, it is interesting to inquire whether independent measures, such as mutual information, could provide a clue to finding the most discriminative neurons in a network. In the present work we explore this question in the context of Restricted Boltzmann Machines, by employing different measures to realize post-training pruning. The neurons which are determined by each measure to be the most discriminative are combined and a classifier is applied to the ensuing network to determine its usefulness. We find that two measures in particular seem to be good indicators of the most discriminative neurons, producing savings of generally more than 50% of the neurons, while maintaining an acceptable error rate. Further, it is borne out that starting with a larger network architecture and then pruning is more advantageous than using a smaller network to begin with. Finally, a quantitative index is introduced which can provide information on choosing a suitable pruned network.
    Affiliation: Sánchez Gutiérrez, Máximo. Universidad Autónoma Metropolitana; México
    Affiliation: Albornoz, Enrique Marcelo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
    Affiliation: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina. Universidad Nacional de Entre Ríos; Argentina
    Affiliation: Close, John Goddard. Universidad Autónoma Metropolitana; México
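
The pruning idea in the RBM abstract above (rank hidden neurons by an independent measure such as mutual information, then keep the most discriminative ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 binarization threshold, the `keep_fraction` parameter, and the toy data are all assumptions made for illustration, and the abstract does not say which two measures performed best.

```python
import numpy as np

def mutual_information(unit, labels):
    """Mutual information (in nats) between a binary unit and integer class labels."""
    mi = 0.0
    for u in (0, 1):
        p_u = np.mean(unit == u)
        for c in np.unique(labels):
            p_c = np.mean(labels == c)
            p_uc = np.mean((unit == u) & (labels == c))
            if p_uc > 0:  # zero-probability cells contribute nothing
                mi += p_uc * np.log(p_uc / (p_u * p_c))
    return mi

def prune_hidden_units(hidden, labels, keep_fraction=0.5):
    """Rank hidden units by MI with the labels; return kept indices and all scores.

    hidden: (n_samples, n_hidden) activation probabilities of a trained RBM
            (hypothetical input; any matrix with values in [0, 1] works).
    """
    binary = (hidden > 0.5).astype(int)  # assumed binarization threshold
    scores = np.array([mutual_information(binary[:, j], labels)
                       for j in range(binary.shape[1])])
    n_keep = max(1, int(round(keep_fraction * binary.shape[1])))
    kept = np.sort(np.argsort(scores)[::-1][:n_keep])
    return kept, scores

# Toy data: units 0 and 1 encode the label, units 2 and 3 are noise.
rng = np.random.default_rng(0)
labels = np.array([0, 1] * 100)
hidden = np.column_stack([
    0.05 + 0.9 * labels,        # fires with class 1
    0.95 - 0.9 * labels,        # fires with class 0
    rng.random(labels.size),    # uninformative
    rng.random(labels.size),    # uninformative
])
kept, scores = prune_hidden_units(hidden, labels, keep_fraction=0.5)
print(kept, scores.round(3))
```

In this toy setting the two label-coding units are retained and the noise units are dropped; a downstream classifier would then be trained on `hidden[:, kept]` to check whether the error rate stays acceptable, which is the evaluation step the abstract describes.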