7,349 research outputs found

    Multimodal Speech Emotion Recognition Using Audio and Text

    Full text link
    Speech emotion recognition is a challenging task, and extensive reliance has been placed on models that use audio features in building well-performing classifiers. In this paper, we propose a novel deep dual recurrent encoder model that utilizes text data and audio signals simultaneously to obtain a better understanding of speech data. As emotional dialogue is composed of sound and spoken content, our model encodes the information from audio and text sequences using dual recurrent neural networks (RNNs) and then combines the information from these sources to predict the emotion class. This architecture analyzes speech data from the signal level to the language level, and it thus utilizes the information within the data more comprehensively than models that focus on audio features. Extensive experiments are conducted to investigate the efficacy and properties of the proposed model. Our proposed model outperforms previous state-of-the-art methods in assigning data to one of four emotion categories (i.e., angry, happy, sad and neutral) when the model is applied to the IEMOCAP dataset, as reflected by accuracies ranging from 68.8% to 71.8%.
    Comment: 7 pages. Accepted as a conference paper at IEEE SLT 2018.
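    The fusion idea is easy to see in code. Below is a minimal PyTorch sketch of a dual recurrent encoder, assuming MFCC-style audio frames and token IDs as inputs; the layer types, dimensions, and names (DualRecurrentEncoder, audio_dim, etc.) are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class DualRecurrentEncoder(nn.Module):
    """Sketch of a dual-RNN encoder: one GRU over audio features
    (e.g., MFCC frames), one over text token embeddings; the final
    hidden states are concatenated and classified into 4 emotions."""

    def __init__(self, audio_dim=39, vocab_size=10000, embed_dim=128,
                 hidden_dim=256, num_classes=4):
        super().__init__()
        self.audio_rnn = nn.GRU(audio_dim, hidden_dim, batch_first=True)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.text_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, audio_frames, token_ids):
        # audio_frames: (batch, time, audio_dim); token_ids: (batch, seq_len)
        _, h_audio = self.audio_rnn(audio_frames)         # (1, batch, hidden)
        _, h_text = self.text_rnn(self.embed(token_ids))  # (1, batch, hidden)
        fused = torch.cat([h_audio[-1], h_text[-1]], dim=-1)
        return self.classifier(fused)                     # (batch, num_classes)

model = DualRecurrentEncoder()
logits = model(torch.randn(8, 100, 39), torch.randint(0, 10000, (8, 20)))
```

    Concatenating the two final hidden states is the simplest fusion choice; richer interactions between the two encoders are possible within the same overall design.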

    A Simple Web Platform Solution for M-Learning

    Get PDF
    Nowadays the role of educational platforms is more than obvious, thanks to websites and modern platforms such as Microsoft SharePoint designed for e-learning. We consider that the next generation of learning platforms will be m-learning platforms. These kinds of platforms offer, first of all, mobility for the potential users of PDAs, pocket PCs, smartphones, and other modern mobile devices developed in recent years. One of the most important aspects of this manner of e-learning is the display mode: classic systems such as personal computers have larger screens, while modern portable devices have screens of only a few inches, so the problem is to adapt the structure of websites and platforms to pocket PC screens while at the same time delivering the same experience and usefulness to all users.
    Keywords: Platform, M-learning, Discussion Forum, Search Engine, JavaScript, IIS, Port Forwarding
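    As an illustration of the display-mode problem, here is a hedged Python/Flask sketch that serves a compact layout to handheld browsers identified by their User-Agent string. The paper's own stack involves JavaScript and IIS, so this is a stand-in, and the templates, route, and marker strings are all hypothetical.

```python
from flask import Flask, request, render_template_string

app = Flask(__name__)

# Hypothetical compact layout for small PDA / Pocket PC screens and a
# fuller layout for desktop browsers; real templates would live in files.
MOBILE_PAGE = "<html><body><h3>{{ title }}</h3>{{ body }}</body></html>"
DESKTOP_PAGE = ("<html><body><h1>{{ title }}</h1>"
                "<div class='sidebar'>...</div>{{ body }}</body></html>")

# Substrings that commonly identified handheld browsers of that era.
HANDHELD_MARKERS = ("Windows CE", "PPC", "Smartphone", "PalmOS")

@app.route("/forum")
def forum():
    ua = request.headers.get("User-Agent", "")
    template = MOBILE_PAGE if any(m in ua for m in HANDHELD_MARKERS) else DESKTOP_PAGE
    return render_template_string(template, title="M-Learning Forum",
                                  body="Latest discussion threads...")
```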

    Wireless Handheld Computers in the Preclinical Undergraduate Curriculum

    Get PDF
    This report presents the results of a pilot project using wireless PDAs as teaching tools in an undergraduate medical curriculum. This technology was used to foster a transition from a passive to an interactive learning environment in the classroom and provided a solution for the implementation of computer-based exams for a large class. Wayne State Medical School recently provided model e570 Toshiba PocketPCs® (personal digital assistants or PDAs), network interface cards, and application software developed by CampusMobility® to 20 sophomore medical students. The pilot group of preclinical students used the PDAs to access web-based course content, communicate, manage schedules, participate in interactive teaching sessions, and complete course evaluations. Another part of this pilot was to utilize the PDAs for computer-based exams in a wireless environment. This report describes the server authentication that restricted access during the exams and the proctoring console used to monitor and record the PDA screens. Results of a student satisfaction survey are also presented.
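    The exam-gating idea can be sketched without the proprietary CampusMobility software. The following Python fragment shows server-side logic that grants exam access only to enrolled students inside the scheduled window; the schedule, roster, and function name are hypothetical, and this is an illustration of the concept rather than the pilot's actual implementation.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical exam window and roster: access is granted only inside the
# scheduled time span and only to enrolled students, mirroring the report's
# server-side authentication that restricted access during the exam.
EXAM_WINDOW = (datetime(2003, 5, 1, 9, 0, tzinfo=timezone.utc),
               datetime(2003, 5, 1, 11, 0, tzinfo=timezone.utc))
ENROLLED = {"student01", "student02"}

def may_take_exam(student_id: str, now: Optional[datetime] = None) -> bool:
    """Return True only when the student is enrolled and the current
    time falls inside the scheduled exam window."""
    now = now or datetime.now(timezone.utc)
    start, end = EXAM_WINDOW
    return student_id in ENROLLED and start <= now <= end
```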

    Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition

    Full text link
    Handwritten mathematical expression recognition is a challenging problem due to the complicated two-dimensional structures, ambiguous handwriting input and variant scales of handwritten math symbols. To address this problem, we utilize an attention-based encoder-decoder model that translates mathematical expression images in two-dimensional layouts into one-dimensional LaTeX strings. We improve the encoder by employing densely connected convolutional networks, as they strengthen feature extraction and facilitate gradient propagation, especially on a small training set. We also present a novel multi-scale attention model that handles the recognition of math symbols at different scales and preserves the fine-grained details that would otherwise be dropped by pooling operations. Validated on the CROHME competition task, the proposed method significantly outperforms the state-of-the-art methods, with an expression recognition accuracy of 52.8% on CROHME 2014 and 50.1% on CROHME 2016, using only the official training dataset.
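    A minimal PyTorch sketch of the two ingredients follows: a small densely connected block, and attention computed over both a high-resolution feature map and its pooled counterpart, with the two context vectors concatenated so detail lost by pooling remains available. Shapes, layer counts, and names are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseBlock(nn.Module):
    """Tiny densely connected block: each conv sees the concatenation
    of all earlier feature maps, which aids gradient flow."""
    def __init__(self, in_ch, growth=16, layers=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch + i * growth, growth, 3, padding=1)
            for i in range(layers))

    def forward(self, x):
        for conv in self.convs:
            x = torch.cat([x, F.relu(conv(x))], dim=1)
        return x

def attend(query, feat):
    """Dot-product attention of a decoder query over a flattened 2-D
    feature map; returns one context vector per batch element."""
    keys = feat.flatten(2).transpose(1, 2)          # (b, h*w, c)
    scores = torch.bmm(keys, query.unsqueeze(2))    # (b, h*w, 1)
    weights = scores.softmax(dim=1)
    return (keys * weights).sum(dim=1)              # (b, c)

# Multi-scale attention: attend over both the high-resolution map and a
# pooled low-resolution map, then concatenate the two context vectors.
encoder = DenseBlock(in_ch=32)
feat_hi = encoder(torch.randn(4, 32, 16, 64))   # high-resolution features
feat_lo = F.max_pool2d(feat_hi, 2)              # coarser scale
query = torch.randn(4, feat_hi.shape[1])        # decoder state (illustrative)
context = torch.cat([attend(query, feat_hi), attend(query, feat_lo)], dim=1)
```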

    Trends in ICT access and use

    Get PDF