862 research outputs found

    Studies on noise robust automatic speech recognition

    Get PDF
    Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK

    Driver Drowsiness Estimation from EEG Signals Using Online Weighted Adaptation Regularization for Regression (OwARR)

    Full text link
    © 1993-2012 IEEE. One big challenge that hinders the transition of brain-computer interfaces (BCIs) from laboratory settings to real-life applications is the availability of high-performance and robust learning algorithms that can effectively handle individual differences, i.e., algorithms that can be applied to a new subject with zero or very little subject-specific calibration data. Transfer learning and domain adaptation have been extensively used for this purpose. However, most previous works focused on classification problems. This paper considers an important regression problem in BCI, namely, online driver drowsiness estimation from EEG signals. By integrating fuzzy sets with domain adaptation, we propose a novel online weighted adaptation regularization for regression (OwARR) algorithm to reduce the amount of subject-specific calibration data, and also a source domain selection (SDS) approach to save about half of the computational cost of OwARR. Using a simulated driving dataset with 15 subjects, we show that OwARR and OwARR-SDS can achieve significantly smaller estimation errors than several other approaches. We also provide comprehensive analyses on the robustness of OwARR and OwARR-SDS

    Histogram equalization for robust text-independent speaker verification in telephone environments

    Get PDF
    Word processed copy. Includes bibliographical references

    Fundamental Approaches to Software Engineering

    Get PDF
    This open access book constitutes the proceedings of the 25th International Conference on Fundamental Approaches to Software Engineering, FASE 2022, which was held during April 4-5, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 17 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. The proceedings also contain 3 contributions from the Test-Comp Competition. The papers deal with the foundations on which software engineering is built, including topics like software engineering as an engineering discipline, requirements engineering, software architectures, software quality, model-driven development, software processes, software evolution, AI-based software engineering, and the specification, design, and implementation of particular classes of systems, such as (self-)adaptive, collaborative, AI, embedded, distributed, mobile, pervasive, cyber-physical, or service-oriented applications

    Naturalistic Emotional Speech Corpora with Large Scale Emotional Dimension Ratings

    Get PDF
    The investigation of the emotional dimensions of speech is dependent on large sets of reliable data. Existing work has been carried out on the creation of emotional speech corpora and the acoustic analysis of emotional speech and this research seeks to buildupon this work while suggesting new methods and areas of potential. A review of the literature determined that a two dimensional emotional model of activation and evaluation was the ideal method for representing the emotional states expressed inspeech. Two case studies were carried out to investigate methods of obtaining naturalunderlying emotional speech in a high quality audio environment, the results of which were used to design a final experimental procedure to elicit natural underlying emotional speech. The speech obtained in this experiment was used in the creation ofa speech corpus that was underpinned by a persistent backend database that incorporated a three-tiered annotation methodology. This methodology was used to comprehensively annotate the metadata, acoustic data and emotional data of the recorded speech. Structuring the three levels of annotation and the assets in a persistent backend database allowed interactive web-based tools to be developed; aweb-based listening tool was developed to obtain a large amount of ratings for the assets that were then written back to the database for analysis. Once a large amount of ratings had been obtained, statistical analysis was used to determine the dimensionalrating for each asset. Acoustic analysis of the underlying emotional speech was then carried out and determined that certain acoustic parameters were correlated with the activation dimension of the dimensional model. This substantiated some of thefindings in the literature review and further determined that spectral energy was strongly correlated with the activation dimension in relation to underlying emotional speech. The lack of a correlation for certain acoustic parameters in relation to the evaluation dimension was also determined, again substantiating some of the findings in the literature.The work contained in this thesis makes a number of contributions to the field: the development of an experimental design to elicit natural underlying emotional speech in a high quality audio environment; the development and implementation of acomprehensive three-tiered corpus annotation methodology; the development and implementation of large scale web based listening tests to rate the emotional dimensions of emotional speech; the determination that certain acoustic parameters are correlated with the activation dimension of a dimensional emotional model inrelation to natural underlying emotional speech and the determination that certain acoustic parameters are not correlated with the evaluation dimension of a twodimensional emotional model in relation to natural underlying emotional speech

    Fundamental Approaches to Software Engineering

    Get PDF
    This open access book constitutes the proceedings of the 25th International Conference on Fundamental Approaches to Software Engineering, FASE 2022, which was held during April 4-5, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 17 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. The proceedings also contain 3 contributions from the Test-Comp Competition. The papers deal with the foundations on which software engineering is built, including topics like software engineering as an engineering discipline, requirements engineering, software architectures, software quality, model-driven development, software processes, software evolution, AI-based software engineering, and the specification, design, and implementation of particular classes of systems, such as (self-)adaptive, collaborative, AI, embedded, distributed, mobile, pervasive, cyber-physical, or service-oriented applications

    Speech Recognition

    Get PDF
    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes

    Models and analysis of vocal emissions for biomedical applications

    Get PDF
    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies

    Finding Frustration: a Dive into the EEG of Drivers

    Get PDF
    Emotion recognition technologies for driving are increasingly used to render automotive travel more pleasurable and, more importantly, safer. Since emotions such as frustration and anger can lead to an increase in traffic accidents, this thesis explored the utility of electroencephalogram (EEG) features to recognize the driver’s frustration level. It, therefore, sought to find a balance between the ecologically valid emotion induction of a driving simulator and the noise-sensitive but highly informative measure of the EEG. Participants’ brain activity was captured with the CGX quick-30 mobile EEG system. 19 participants completed four different frustration-inducing and two baseline driving scenarios in a 360° driving simulator. Subsequently, the participants continuously rated their frustration level based on the replay of each scenario. The resulting subjective measures were used to classify EEG time periods into episodes with or without frustration. Results showed that the frequently used measure of the Alpha Asymmetry Index (AAI) had, as hypothesized, significantly more negative indices for high frustration (vs. no frustration). However, a commingling effect of anger on this result could not be dismissed. The results could not provide evidence for the yet to be replicated previous research of frustration correlates within narrow-band oscillations (delta, theta, alpha, and beta) at specified electrode positions (frontal, central, and posterior). This thesis concludes with suggestions for subsequent research endeavors and forthcoming practical implications in the form of insights acquired
    • …
    corecore