366 research outputs found

    An Experimental Study on Speech Enhancement Based on a Combination of Wavelets and Deep Learning

    The purpose of speech enhancement is to improve the quality of speech signals degraded by noise, reverberation, or other artifacts that can affect intelligibility, automatic recognition, or other attributes involved in speech technologies and telecommunications, among others. In such applications, it is essential to provide methods that enhance the signals so that messages can be understood or the speech adequately processed. For this purpose, during the past few decades, several techniques have been proposed and implemented for the abundance of possible conditions and applications. Recently, methods based on deep learning seem to outperform previous proposals, even in real-time processing. Among the new explorations found in the literature, hybrid approaches have been presented as a possibility to extend the capacity of individual methods and therefore broaden their range of applications. In this paper, we evaluate a hybrid approach that combines deep learning and wavelet transformation. The extensive experimentation performed to select the proper wavelets and to train the neural networks allowed us to assess whether the hybrid approach benefits the speech enhancement task under several types and levels of noise, providing relevant information for future implementations.

    Real-Time Perceptual Moving-Horizon Multiple-Description Audio Coding

    A novel scheme for perceptual coding of audio for robust and real-time communication is designed and analyzed. As an alternative to PCM, DPCM, and more general noise-shaping converters, we propose to use psychoacoustically optimized noise-shaping quantizers based on the moving-horizon principle. In moving-horizon quantization, a few samples look-ahead is allowed at the encoder, which makes it possible to better shape the quantization noise and thereby reduce the resulting distortion over what is possible with conventional noise-shaping techniques. It is first shown that significant gains over linear PCM can be obtained without introducing a delay and without requiring postprocessing at the decoder, i.e., the encoded samples can be stored as, e.g., 16-bit linear PCM on CD-ROMs, and played out on standards-compliant CD players. We then show that multiple-description coding can be combined with moving-horizon quantization in order to combat possible erasures on the wireless link without introducing additional delays
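
    The moving-horizon idea in the abstract above can be illustrated with a small sketch. It is not the authors' coder: the fixed FIR weighting filter `w`, the brute-force search over the horizon, and the function name `moving_horizon_quantize` are all assumptions made for illustration, and the psychoacoustic optimization of the filter is left out. At each sample the encoder tries every combination of quantizer levels over a short look-ahead window, scores each by the energy of the filtered quantization error, and commits only the first decision before sliding the window forward.

```python
from itertools import product


def moving_horizon_quantize(x, levels, w, horizon):
    """Illustrative moving-horizon quantizer (hypothetical helper).

    x       : input samples
    levels  : allowed quantizer output values
    w       : FIR weights applied to the quantization error (assumed,
              stands in for a perceptual noise-shaping filter)
    horizon : number of samples examined jointly (1 = no look-ahead)
    """
    taps = len(w) - 1
    past = [0.0] * taps          # previous errors, most recent first
    out = []
    for n in range(len(x)):
        h = min(horizon, len(x) - n)   # shrink the window near the end
        best_cost, best_first = float("inf"), levels[0]
        # Exhaustive search over all level sequences in the window.
        for cand in product(levels, repeat=h):
            hist = list(past)
            cost = 0.0
            for k, u in enumerate(cand):
                e = x[n + k] - u
                # Filtered error: current error plus weighted past errors.
                f = w[0] * e + sum(w[j] * hist[j - 1] for j in range(1, taps + 1))
                cost += f * f
                hist = ([e] + hist)[:taps]
            if cost < best_cost:
                best_cost, best_first = cost, cand[0]
        # Commit only the first decision, then move the window by one sample.
        out.append(best_first)
        past = ([x[n] - best_first] + past)[:taps]
    return out
```

    With `w = [1.0]` and `horizon = 1` the sketch reduces to plain nearest-level rounding; longer filters and horizons let the choice at one sample account for its effect on the following ones, at a search cost that grows exponentially with the horizon.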

    Adaptive Speech Enhancement Using Partial Differential Equations and Back Propagation Neural Networks

    In this work, we propose a new approach to improve the performance of a speech enhancement technique based on partial differential equations. Because real-world noise is highly random in nature, we focus on the reduction of white Gaussian noise. The proposed method was evaluated on several speakers. The subjective and objective results show that the new method substantially improves speech enhancement. Comparisons of several methods are reported

    Design of Computationally Efficient Digital FIR Filters and Filter Banks

    Ph.D. (Doctor of Philosophy)

    A survey on artificial intelligence-based acoustic source identification

    The concept of Acoustic Source Identification (ASI), which refers to the process of identifying noise sources, has attracted increasing attention in recent years. ASI technology can be used for surveillance, monitoring, and maintenance applications in a wide range of sectors, such as defence, manufacturing, healthcare, and agriculture. Acoustic signature analysis and pattern recognition remain the core technologies for noise source identification. Manual identification of acoustic signatures, however, has become increasingly challenging as dataset sizes grow. As a result, the use of Artificial Intelligence (AI) techniques for identifying noise sources has become increasingly relevant and useful. In this paper, we provide a comprehensive review of AI-based acoustic source identification techniques. We analyze the strengths and weaknesses of AI-based ASI processes and associated methods proposed by researchers in the literature. Additionally, we present a detailed survey of ASI applications in machinery, underwater applications, environment/event source recognition, healthcare, and other fields. We also highlight relevant research directions

    Modeling and rendering for development of a virtual bone surgery system

    A virtual bone surgery system is developed to provide the potential of a realistic, safe, and controllable environment for surgical education. It can be used for training in orthopedic surgery, as well as for planning and rehearsal of bone surgery procedures...Using the developed system, the user can perform virtual bone surgery by simultaneously seeing bone material removal through a graphic display device, feeling the force via a haptic device, and hearing the sound of tool-bone interaction --Abstract, page iii

    A Study into Speech Enhancement Techniques in Adverse Environment

    This dissertation developed speech enhancement techniques that improve speech quality in applications such as mobile communications, teleconferencing, and smart loudspeakers. For these applications it is necessary to suppress noise and reverberation. Thus the contribution of this dissertation is twofold: a single-channel speech enhancement system that exploits the temporal and spectral diversity of the received microphone signal for noise suppression, and a multi-channel speech enhancement method that employs spatial diversity to reduce reverberation

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies