13,171 research outputs found

    SHOE: The extraction of hierarchical structure for machine learning of natural language


    Gender detection in children’s speech utterances for human-robot interaction

    Human speech inherently carries paralinguistic information that is exploited in many real-time applications. Detecting a child's gender from speech is considered a more challenging task than detecting an adult's. In this study, a system for human-robot interaction (HRI) is proposed that detects gender from children's speech utterances without depending on the text. The robot's perception comprises three phases. In the feature extraction phase, four formants are measured at each glottal pulse and a median is computed across these measurements; from the medians, three feature types are derived: formant average (AF), formant dispersion (DF), and formant position (PF). In the feature standardization phase, the measured feature dimensions are standardized using the z-score method. In the semantic understanding phase, the child's gender is detected using a logistic regression classifier. At the same time, the robot's action is delivered as a speech response using the text-to-speech (TTS) technique. Experiments were conducted on the Carnegie Mellon University (CMU) Kids dataset to measure the suggested system's performance. The suggested system reaches an overall accuracy of 98%, a relatively clear improvement of up to 13% in accuracy compared to related works that utilized the CMU Kids dataset.
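The formant-derived feature pipeline described above can be sketched as follows. This is a minimal NumPy/scikit-learn sketch, not the authors' implementation: the exact AF/DF/PF formulas are illustrative assumptions, since the abstract does not spell them out.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def median_formants(tracks):
    """tracks: (n_pulses, 4) array of F1..F4 in Hz, one row per glottal pulse.
    Returns the per-utterance median of each formant."""
    return np.median(tracks, axis=0)

def build_features(F):
    """F: (n_utterances, 4) median formants per utterance.
    Returns a (n_utterances, 3) matrix of [AF, DF, PF] features."""
    af = F.mean(axis=1)                       # formant average (AF)
    df = (F[:, 3] - F[:, 0]) / 3.0            # formant dispersion (DF): mean spacing
    z = (F - F.mean(axis=0)) / F.std(axis=0)  # z-score each formant across utterances
    pf = z.mean(axis=1)                       # formant position (PF): mean standardized formant
    return np.column_stack([af, df, pf])

# Usage (hypothetical labels, e.g. 0 = girl, 1 = boy):
# clf = LogisticRegression().fit(build_features(F_train), y_train)
# y_pred = clf.predict(build_features(F_test))
```

The z-scoring inside `build_features` plays the role of the standardization phase; in a real system it would be fit on training data only and reapplied to test data.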

    A process-oriented language for describing aspects of reading comprehension

    Includes bibliographical references (p. 36-38). The research described herein was supported in part by the National Institute of Education under Contract No. MS-NIE-C-400-76-011.

    c

    In this article, we describe and interpret a set of acoustic and linguistic features that characterise emotional/emotion-related user states, confined to the one database processed: four classes in a German corpus of children interacting with a pet robot. To this end, we collected a very large feature vector consisting of more than 4,000 features extracted at different sites. We performed extensive feature selection (Sequential Forward Floating Search) for seven acoustic and four linguistic feature types, ending up with a small number of 'most important' features, which we interpret by discussing the impact of different feature and extraction types. We establish different measures of impact and discuss the mutual influence of acoustics and linguistics.
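The selection step named above can be illustrated with a minimal Sequential Forward Floating Search. This is a generic sketch under stated assumptions (logistic-regression estimator, 3-fold cross-validation score, fixed target size `k`), not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def sffs(X, y, k, estimator=None):
    """Sequential Forward Floating Search (SFFS): greedy forward
    inclusion followed by conditional backward exclusion steps."""
    est = estimator or LogisticRegression(max_iter=1000)
    score = lambda idx: cross_val_score(est, X[:, idx], y, cv=3).mean()
    selected = []
    while len(selected) < k:
        # forward step: add the single best remaining feature
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        best = max(remaining, key=lambda j: score(selected + [j]))
        selected.append(best)
        # floating step: drop any earlier feature whose removal improves
        # the score (never the feature just added, to avoid cycling)
        improved = True
        while improved and len(selected) > 2:
            improved = False
            current = score(selected)
            for j in [i for i in selected if i != best]:
                if score([i for i in selected if i != j]) > current:
                    selected.remove(j)
                    improved = True
                    break
    return selected
```

The floating (conditional exclusion) step is what distinguishes SFFS from plain forward selection: a feature added early can later be discarded if it becomes redundant.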

    USING DEEP LEARNING-BASED FRAMEWORK FOR CHILD SPEECH EMOTION RECOGNITION

    Biological signals of the body through which human emotion can be detected abound, including heart rate, facial expressions, movement of the eyelids and dilation of the eyes, body posture, skin conductance, and even the speech we make. Speech emotion recognition research started some three decades ago, and the popular Interspeech Emotion Challenge has helped to propagate this research area. However, most speech emotion recognition research focuses on adults, and there is very little research on child speech. This dissertation describes the development and evaluation of a child speech emotion recognition framework. The higher-level components of the framework are designed to sort and separate speech by the speaker's age, ensuring that focus is only on speech produced by children. The framework uses Baddeley's Theory of Working Memory to model a Working Memory Recurrent Network that can process and recognize emotions from speech. Baddeley's theory offers one of the best explanations of how the human brain holds and manipulates temporary information, which is crucial for developing neural networks that learn effectively. Experiments were designed and performed to answer the research questions, evaluate the proposed framework, and benchmark its performance against other methods. Satisfactory results were obtained, and in many cases the framework outperformed other popular approaches. This study has implications for various applications of child speech emotion recognition, such as child abuse detection and child learning robots.

    Contexts for writing: understanding the child’s perspective

    The integration of social theories into a cognitive explanation of the composing process enlarges our notion of context, calling attention to the historical, social and ideological forces that shape the making of knowledge in educational settings. These approaches suggest that context cues certain actions and that students gain entry into academic contexts if they learn the appropriate forms and discourse conventions. However, methodological approaches to teaching do not address how individuals construct meaning, use knowledge for their own purposes, or engage in reflective processes that influence how they will act in a socially-governed situation. Nor do they address how school-acquired knowledge may be transformed to enable individual students to take ownership of their writing. These concerns motivate the attempt to form a cognitive-social epistemic that acknowledges and explains the role of the individual in constructing meaning within culturally-organized activities in primary educational systems. Through questionnaires, interviews and classroom observations, and applying qualitative analytical procedures, the study discloses layers of complexity in a multi-level description of the ways context and cognition interact. At the general level, a comparative analysis of teachers' and pupils' rationales underlying given writing tasks produces converging references to the educational purposes for writing. At a deeper level, the finding that writing possibilities and social possibilities are dynamically interlinked with the emergence of identity suggests that learning is a constructive process of meaning-making, uniquely manifested in diverse ways. Studies of classroom interaction determine the impact of strategies deployed within classroom communication to control the meaning-making process, and make it possible to discuss the efficacy of peer interaction in the classroom.
    A second strand of context-oriented research in a non-school setting, which incorporates the computer as a writing tool, reinforces the view that children are primarily social players negotiating roles and relationships by whatever mediational means are made available to them. In light of these results, the thesis acknowledges the complexity of a largely implicit cultural architecture for directing the context of action, and concludes that this structure will be explicated only by adopting an inclusive research strategy that encompasses simultaneously acting influences.

    Frustration recognition from speech during game interaction using wide residual networks

    Background
    Although frustration is a common emotional reaction during game play, an excessive level of frustration can harm the user's experience, discouraging further game interactions. The automatic detection of players' frustration enables the development of adaptive systems which, through real-time difficulty adjustment, adapt the game to the user's specific needs, thus maximising the player's experience and the game's success. To this end, we present our speech-based approach for the automatic detection of frustration during game interactions, a task still under-explored in research.
    Method
    The experiments were performed on the Multimodal Game Frustration Database (MGFD), an audiovisual dataset, collected within a Wizard-of-Oz framework, specially tailored to investigate verbal and facial expressions of frustration during game interactions. We explored the performance of a variety of acoustic feature sets, including Mel-spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs), as well as the low-dimensional knowledge-based acoustic feature set eGeMAPS. Given the steady improvements achieved by Convolutional Neural Networks (CNNs) in speech recognition tasks, and unlike the MGFD baseline (a Long Short-Term Memory (LSTM) architecture with a Support Vector Machine (SVM) classifier), in the present work we consider commonly used CNNs, including ResNets, VGG, and AlexNet. Furthermore, given the still open debate on the suitability of shallow vs deep networks, we also examine the performance of two of the latest deep CNNs, i.e., WideResNets and EfficientNet.
    Results
    Our best result, achieved with WideResNets and Mel-spectrogram features, increases the system performance from 58.8% Unweighted Average Recall (UAR) to 93.1% UAR for speech-based automatic frustration recognition.
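The log-Mel spectrogram front end used as CNN input above can be sketched in plain NumPy. Parameter values (16 kHz sampling, 512-point FFT, 160-sample hop, 40 Mel bands) are illustrative assumptions, not the paper's settings; production systems typically use librosa or torchaudio instead.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spectrogram(y, sr=16000, n_fft=512, hop=160, n_mels=40):
    """Log-Mel spectrogram: framing + Hann window + |FFT|^2 + Mel filterbank."""
    # slice the signal into overlapping windowed frames
    n_frames = 1 + (len(y) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = y[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2   # (n_frames, n_fft//2 + 1)
    # triangular Mel filterbank, equally spaced on the Mel scale
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fb[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising slope
        fb[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling slope
    return np.log(power @ fb.T + 1e-10)               # (n_frames, n_mels)
```

The resulting (frames x Mel-bands) matrix is treated as a single-channel image, which is what makes image-style CNNs such as WideResNets applicable to speech.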

    The effects of music on brain development

    The influence of music on brain development is a complex and widely studied area of science. Studies show that exposure to music, especially in the early stages of life, can improve cognitive abilities such as language, reasoning, and spatial-temporal skills. Engaging with music also helps in regulating emotions and developing social skills. Furthermore, music education supports creativity and self-expression, which are crucial for overall brain development. With a deeper understanding of how music affects the brain at a neurological level, educators, therapists, and parents can leverage its advantages to enhance well-being and cognitive function for people of all ages.