5,608 research outputs found

    Classification and Separation Techniques based on Fundamental Frequency for Speech Enhancement

    This thesis develops new classification and speech enhancement algorithms based, explicitly or implicitly, on the fundamental frequency (F0). The F0 of speech has a number of properties that allow speech to be discriminated from the remaining signals in the acoustic scene, either by defining F0-based signal features (for classification) or F0-based signal models (for separation). The work makes three main contributions: 1) an F0-based acoustic environment classification algorithm for digital hearing aids that classifies the input signal into speech and nonspeech classes; 2) a frame-by-frame voiced speech detection algorithm based on the aperiodicity measure, able to work under non-stationary noise and applicable to speech enhancement; 3) a speech denoising algorithm based on a regularized NMF decomposition, in which the background noise is described generically through mathematical constraints. Thesis, Univ. Jaén, Departamento de Ingeniería de Telecomunicación. Defended on 11 January 201
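The frame-by-frame voiced speech detection mentioned in the abstract above relies on how periodic each frame is. The sketch below is not the thesis's aperiodicity measure; it swaps in a plain normalized-autocorrelation check to illustrate the general idea. The function names, the F0 search range, the 0.5 threshold, and the small tie-breaking margin are all assumptions made for this example.

```python
import math


def normalized_autocorr(frame, lag):
    """Correlation between a frame and a copy of itself shifted by `lag` samples, in [-1, 1]."""
    head = frame[:len(frame) - lag]
    tail = frame[lag:]
    num = sum(a * b for a, b in zip(head, tail))
    den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
    return num / den if den > 0.0 else 0.0


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0, threshold=0.5):
    """Return (f0_hz, periodicity); f0_hz is None when the frame is judged unvoiced.

    Assumed decision rule: the frame is voiced if the best autocorrelation
    peak inside the F0 search range reaches `threshold`.
    """
    min_lag = max(1, int(sample_rate / f0_max))
    max_lag = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = normalized_autocorr(frame, lag)
        # The small margin keeps the shortest lag when multiples of the
        # period score the same up to rounding noise.
        if score > best_score + 1e-3:
            best_lag, best_score = lag, score
    if best_lag is None or best_score < threshold:
        return None, best_score
    return sample_rate / best_lag, best_score
```

In a hearing-aid or enhancement setting the per-frame decision would feed a classifier or a noise estimator; this sketch only covers the periodicity check itself.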

    Mathematics and Digital Signal Processing

    Modern computer technology has opened up new opportunities for the development of digital signal processing methods. The applications of digital signal processing have expanded significantly and today include audio and speech processing, sonar, radar, and other sensor array processing, spectral density estimation, statistical signal processing, digital image processing, signal processing for telecommunications, control systems, biomedical engineering, and seismology, among others. This Special Issue aims at wide coverage of the problems of digital signal processing, from mathematical modeling to the implementation of problem-oriented systems. The basis of digital signal processing is digital filtering. Wavelet analysis implements multiscale signal processing and is used to solve applied problems of de-noising and compression. Processing of visual information, including image and video processing and pattern recognition, is actively used in robotic systems and industrial process control today. Improving digital signal processing circuits and developing new signal processing systems can improve the technical characteristics of many digital devices. The development of new methods of artificial intelligence, including artificial neural networks and brain-computer interfaces, opens up new prospects for the creation of smart technology. This Special Issue contains the latest technological developments in mathematics and digital signal processing. The stated results are of interest to researchers in the field of applied mathematics and developers of modern digital signal processing systems.

    Hierarchical feature extraction from spatiotemporal data for cyber-physical system analytics

    With the advent of ubiquitous sensing, robust communication and advanced computation, data-driven modeling is becoming increasingly popular for many engineering problems. Its advantages, especially for large-scale complex systems, include eliminating the difficulties of physics-based modeling and avoiding simplifying assumptions and ad hoc empirical models. While classical statistics and signal processing algorithms have been widely used by the engineering community, advanced machine learning techniques have not been sufficiently explored in this regard. This study summarizes various categories of machine learning tools that have been applied, or may be candidates, for addressing engineering problems. While the number of machine learning algorithms keeps growing, the main steps involved in applying such techniques are data collection and pre-processing, feature extraction, model training, and inference for decision-making. To support decision-making processes in many applications, hierarchical feature extraction is key. Among various feature extraction principles, recent studies emphasize hierarchical approaches, which extract salient features from data at multiple abstraction levels. In this context, the dissertation focuses on developing hierarchical feature extraction algorithms within the framework of machine learning in order to solve challenging cyber-physical problems in domains such as electromechanical systems and agricultural systems. Furthermore, the feature extraction techniques are described using the spatial, temporal and spatiotemporal data types collected from the systems. The wide applicability of such features in solving selected real-life domain problems is demonstrated throughout this study.

    Single channel signal separation using pseudo-stereo model and time-frequency masking

    PhD thesis. In many practical applications, only one sensor is available to record a mixture of several signals. Single-channel blind signal separation (SCBSS) addresses the problem of recovering the original signals from the observed mixture with little or no prior knowledge of the signals. Given a single mixture, a new pseudo-stereo mixing model is developed. A “pseudo-stereo” mixture is formulated by weighting and time-shifting the original single-channel mixture. This creates an artificial resemblance of a stereo signal as if produced from a single location, so that the source signals share the same time delay but differ in attenuation. The pseudo-stereo mixing model relaxes the underdetermined, ill-conditioned nature of monaural source separation and exploits the relationship between the readily observed mixture and the pseudo-stereo mixture. This research proposes three novel algorithms based on the pseudo-stereo mixing model and the binary time-frequency (TF) mask. Firstly, the proposed SCBSS algorithm estimates the signals’ weighted coefficients from a ratio of the pseudo-stereo mixing model and then constructs a binary maximum-likelihood TF mask for separating the observed mixture. Secondly, a mixture in a noisy background environment is considered. To handle it, a mixture enhancement algorithm has been developed and the proposed SCBSS algorithm is reformulated using an adaptive coefficient estimator, which computes the signal characteristics for each time frame. This property is desirable for both speech and audio signals, as they are aptly characterized as non-stationary AR processes. Finally, a multiple-time-delay (MTD) pseudo-stereo mixture is developed. The MTD mixture improves flexibility and separability over the originally proposed pseudo-stereo mixing model. The separation algorithm for the MTD mixture has also been derived, and a comparative analysis of the MTD mixture and the pseudo-stereo mixture is presented. All algorithms have been demonstrated on synthesized and real audio signals. Separation performance has been assessed by measuring the distortion between each original source and its estimate using the signal-to-distortion ratio (SDR). Results show that all proposed SCBSS algorithms yield significantly better separation performance, with an average SDR improvement ranging from 2.4 dB to 5 dB per source, and that they are computationally faster than the benchmarked algorithms. Payap University
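The abstract above describes forming a pseudo-stereo mixture by weighting and time-shifting the single-channel mixture, and scoring separation with an SDR. The sketch below illustrates both steps under stated assumptions: the abstract does not give the exact mixing formula, so the normalized form used here, the parameter names `gamma` (weight) and `delta` (delay in samples), and the simplified energy-ratio SDR are illustrative choices rather than the thesis's definitions.

```python
import math


def pseudo_stereo(mixture, gamma, delta):
    """Build an artificial second channel from a mono mixture.

    Assumed form: x2[n] = (x1[n] + gamma * x1[n - delta]) / (1 + |gamma|),
    with samples before the start of the signal treated as zero.
    """
    scale = 1.0 + abs(gamma)
    out = []
    for n, sample in enumerate(mixture):
        delayed = mixture[n - delta] if n >= delta else 0.0
        out.append((sample + gamma * delayed) / scale)
    return out


def sdr_db(reference, estimate):
    """Simplified SDR in dB: reference energy divided by error energy."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_energy / error_energy)
```

For example, `pseudo_stereo([1.0, 2.0, 3.0], gamma=1.0, delta=1)` averages each sample with its predecessor and gives `[0.5, 1.5, 2.5]`. Full evaluation toolkits decompose the error into interference and artifact terms; the ratio here ignores that decomposition.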

    Binary Masking & Speech Intelligibility


    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use cases, and socio-economic and legal aspects. These were assessed through two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, a set of representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to coordinators of EU projects and of national initiatives. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    Emotion Recognition from Speech with Acoustic, Non-Linear and Wavelet-based Features Extracted in Different Acoustic Conditions

    In recent years, there has been great progress in automatic speech recognition. The challenge now is to recognize not only the semantic content of speech but also its so-called "paralinguistic" aspects, including the emotions and personality of the speaker. This research work aims to develop a methodology for automatic emotion recognition from speech signals in non-controlled noise conditions. To that end, different sets of acoustic, non-linear, and wavelet-based features are used to characterize emotions in different databases created for this purpose.

    Metaheuristic design of feedforward neural networks: a review of two decades of research

    Over the past two decades, feedforward neural network (FNN) optimization has been a key interest among researchers and practitioners in multiple disciplines. FNN optimization is often viewed from various perspectives: the optimization of weights, network architecture, activation nodes, learning parameters, learning environment, etc. Researchers adopted these different viewpoints mainly to improve the FNN's generalization ability. Gradient-descent algorithms such as backpropagation have been widely applied to optimize FNNs, and their success is evident from the FNN's application to numerous real-world problems. However, due to the limitations of gradient-based optimization methods, metaheuristic algorithms, including evolutionary algorithms, swarm intelligence, etc., are still being widely explored by researchers aiming to obtain a generalized FNN for a given problem. This article attempts to summarize a broad spectrum of FNN optimization methodologies, including conventional and metaheuristic approaches. It also tries to connect various research directions that have emerged from FNN optimization practices, such as evolving neural networks (NN), cooperative coevolution NN, complex-valued NN, deep learning, extreme learning machines, quantum NN, etc. Additionally, it presents interesting research challenges for future work to cope with the present information processing era.
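As a concrete instance of the metaheuristic weight optimization surveyed in the abstract above, the sketch below trains a tiny FNN without gradients using a (1+1) evolution strategy. It is one simple example of the family, not a method taken from the review; the 2-2-1 network layout, the XOR data, the mutation scale `sigma`, and all function names are assumptions made for illustration.

```python
import math
import random


def forward(weights, x):
    """2-2-1 feedforward network with tanh hidden units; `weights` is a flat list of 9 values."""
    h0 = math.tanh(weights[0] * x[0] + weights[1] * x[1] + weights[2])
    h1 = math.tanh(weights[3] * x[0] + weights[4] * x[1] + weights[5])
    return weights[6] * h0 + weights[7] * h1 + weights[8]


def mse(weights, data):
    """Mean squared error of the network over (input, target) pairs."""
    return sum((forward(weights, x) - y) ** 2 for x, y in data) / len(data)


def evolve(data, generations=2000, sigma=0.3, seed=0):
    """(1+1) evolution strategy: mutate every weight, keep the child only if it is no worse."""
    rng = random.Random(seed)
    parent = [rng.uniform(-1.0, 1.0) for _ in range(9)]
    parent_loss = mse(parent, data)
    initial_loss = parent_loss
    for _ in range(generations):
        child = [w + rng.gauss(0.0, sigma) for w in parent]
        child_loss = mse(child, data)
        if child_loss <= parent_loss:
            parent, parent_loss = child, child_loss
    return parent, initial_loss, parent_loss


XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
```

Calling `evolve(XOR)` returns the final weights together with the initial and final losses. Because a child replaces the parent only when it is no worse, the loss never increases, although nothing guarantees convergence to a global optimum.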