
    Signal processing algorithms for digital hearing aids

    Hearing loss is a problem that severely affects speech communication and prevents most hearing-impaired people from leading a normal life. Although the vast majority of hearing loss cases could be corrected by using hearing aids, only a small fraction of the hearing-impaired people who could benefit from hearing aids actually purchase one. This limited use of hearing aids arises from a problem that, to date, has not been solved effectively and comfortably: the automatic adaptation of the hearing aid to the changing acoustic environment that surrounds its user. There are two approaches aiming to address this problem. On the one hand, the "manual" approach, in which the user has to identify the acoustic situation and choose the adequate amplification program, has been found to be very uncomfortable. The second approach requires including an automatic program selection system within the hearing aid. This latter approach is deemed very useful by most hearing aid users, even if its performance is not perfect. Although the need for the aforementioned sound classification system seems clear, its implementation is a very difficult matter. The development of an automatic sound classification system in a digital hearing aid is a challenging goal because of the inherent limitations of the Digital Signal Processor (DSP) the hearing aid is based on. The underlying reason is that most digital hearing aids have very strong constraints in terms of computational capacity, memory and battery, which seriously limit the implementation of advanced algorithms in them. With this in mind, this thesis focuses on the design and implementation of a prototype digital hearing aid able to automatically classify the acoustic environments hearing aid users face daily and to select the amplification program best adapted to each environment, aiming at enhancing the speech intelligibility perceived by the user. The most important contribution of this thesis is the implementation of such a prototype: a digital hearing aid that automatically classifies the acoustic environment surrounding its user and selects the most appropriate amplification program for that environment, thereby enhancing the sound quality perceived by the user. The battery life of this hearing aid is 140 hours, very similar to that of hearing aids on the market, and, of key importance, about 30% of the DSP resources remain available for implementing other algorithms.
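
    To make the DSP constraints concrete, below is a minimal sketch of the kind of frame-based environment classifier such a hearing aid might run. It is not the thesis's algorithm: the features (short-time energy and zero-crossing rate, both cheap on a fixed-point DSP), the thresholds and the program names are illustrative assumptions.

```python
# Hypothetical frame-based acoustic environment classifier; features,
# thresholds and program names are illustrative, not the thesis's design.
import numpy as np

FRAME_LEN = 256                      # samples per analysis frame (assumed)
PROGRAMS = {"speech": "speech_program",
            "speech_in_noise": "noise_reduction_program",
            "noise": "comfort_program"}

def frame_features(frame: np.ndarray) -> tuple[float, float]:
    """Cheap features computable on a fixed-point DSP."""
    energy = float(np.mean(frame ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0)
    return energy, zcr

def classify_environment(signal: np.ndarray) -> str:
    """Classify a buffered signal into a coarse acoustic class."""
    frames = signal[: len(signal) // FRAME_LEN * FRAME_LEN].reshape(-1, FRAME_LEN)
    feats = np.array([frame_features(f) for f in frames])
    energy_var = feats[:, 0].var()   # speech energy is strongly modulated
    mean_zcr = feats[:, 1].mean()    # noise tends to have a high, flat ZCR
    if energy_var > 1e-4 and mean_zcr < 0.3:
        return "speech"
    if energy_var > 1e-4:
        return "speech_in_noise"
    return "noise"

rng = np.random.default_rng(0)
buffer = rng.standard_normal(16000) * 0.01   # stand-in for 1 s of audio at 16 kHz
print(PROGRAMS[classify_environment(buffer)])
```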

    Speech Mode Classification using the Fusion of CNNs and LSTM Networks

    Speech mode classification is an area that has not been as widely explored in the field of sound classification as others, such as environmental sounds, music genre, and speaker identification. But what is speech mode? While mode is defined as the way or manner in which something occurs or is expressed or done, speech mode is defined as the style in which speech is delivered by a person. There are some reports on classifying speech modes, such as whispered and normally phonated speech, using conventional methods. However, to the best of our knowledge, deep learning-based methods have not been reported in the open literature for this classification scenario. Specifically, in this work we assess the performance of image-based classification algorithms on this challenging speech mode classification problem, including the use of pre-trained deep neural networks, namely AlexNet, ResNet18 and SqueezeNet. Thus, we compare the classification efficiency of a set of deep learning-based classifiers, while we also assess the impact of different 2D image representations (spectrograms, mel-spectrograms, and their image-based fusion) on classification accuracy. These representations are generated from the original audio signals and used as input to the networks. Next, we compare the accuracy of the DL-based classifiers to a set of machine learning (ML) ones that use Mel-Frequency Cepstral Coefficient (MFCC) features as their inputs. Then, after determining the most efficient sampling rate for our classification problem (i.e., 32 kHz), we study the performance of our proposed method of combining CNNs with LSTM (Long Short-Term Memory) networks, using the features extracted from the deep networks of the previous step. We conclude our study by evaluating the role of sampling rate on classification accuracy, generating two sets of 2D image representations: one at 32 kHz and the other at 16 kHz. Experimental results show that, after cross-validation, the accuracy of the DL-based approaches is 15% higher than that of the ML ones, with SqueezeNet yielding an accuracy of more than 91% at 32 kHz, whether we use transfer learning, feature-level fusion or score-level fusion (92.5%). Our proposed method using LSTMs further increased that accuracy by more than 3%, resulting in an average accuracy of 95.7%.
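
    The pipeline described above (log-mel-spectrogram images fed to a pre-trained CNN, whose features are then modelled over time by an LSTM) can be sketched as follows, assuming PyTorch, torchvision and librosa. The one-second chunking scheme, layer sizes and two-class output are assumptions for illustration, not the authors' exact configuration.

```python
# Sketch of a CNN+LSTM fusion classifier: SqueezeNet extracts one feature
# vector per spectrogram chunk, an LSTM models the chunk sequence.
import librosa
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

def mel_image(y: np.ndarray, sr: int = 32000) -> torch.Tensor:
    """Audio -> log-mel-spectrogram as a 3-channel 224x224 image tensor."""
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
    S_db = librosa.power_to_db(S, ref=np.max)
    img = (S_db - S_db.min()) / (S_db.ptp() + 1e-8)       # scale to [0, 1]
    img = torch.tensor(img, dtype=torch.float32)
    img = img.unsqueeze(0).repeat(3, 1, 1)                # grey -> RGB
    return F.interpolate(img.unsqueeze(0), size=(224, 224)).squeeze(0)

class CnnLstmClassifier(nn.Module):
    """Pre-trained SqueezeNet as feature extractor, LSTM over time chunks."""
    def __init__(self, n_classes: int = 2, hidden: int = 128):
        super().__init__()
        backbone = torchvision.models.squeezenet1_1(weights="DEFAULT")
        self.cnn = backbone.features                      # -> (B, 512, 13, 13)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, chunks: torch.Tensor) -> torch.Tensor:
        # chunks: (batch, time_steps, 3, 224, 224), one image per chunk
        b, t = chunks.shape[:2]
        feats = self.pool(self.cnn(chunks.flatten(0, 1))).view(b, t, 512)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])                      # last state -> logits

y, sr = np.random.randn(32000 * 3).astype(np.float32), 32000   # stand-in audio
chunks = torch.stack([mel_image(y[i*sr:(i+1)*sr], sr) for i in range(3)])
logits = CnnLstmClassifier()(chunks.unsqueeze(0))
print(logits.shape)                                       # torch.Size([1, 2])
```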

    Classification and Separation Techniques based on Fundamental Frequency for Speech Enhancement

    This thesis is focused on the development of new classification and speech enhancement algorithms based, explicitly or implicitly, on the fundamental frequency (F0) of the speech signal. The F0 of speech has a number of properties that enable speech discrimination from the remaining signals in the acoustic scene, either by defining F0-based signal features (for classification) or F0-based signal models (for separation). Three main contributions are included in this work: 1) an acoustic environment classification algorithm for hearing aids based on F0 to classify the input signal into speech and non-speech classes; 2) a frame-by-frame voiced speech detection algorithm based on an aperiodicity measure, able to work under non-stationary noise and applicable to speech enhancement; 3) a speech denoising algorithm based on a regularized NMF decomposition, in which the background noise is described in a generic way with mathematical constraints. Doctoral thesis, University of Jaén, Department of Telecommunication Engineering. Defended on 11 January 201
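
    As an illustration of the general idea behind the third contribution, the sketch below denoises magnitude-STFT frames with NMF in a semi-supervised setup: noise bases are pre-learned on a noise-only segment and kept fixed while speech bases adapt on the mixture. The basis sizes, iteration counts and plain Euclidean multiplicative updates are assumptions; the thesis's actual regularized decomposition and noise constraints differ.

```python
# Semi-supervised NMF denoising sketch in the magnitude-STFT domain;
# all sizes and the update rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def nmf(V, k, n_iter=200, eps=1e-9):
    """Euclidean multiplicative-update NMF: V ~ W @ H, all nonnegative."""
    W = rng.random((V.shape[0], k))
    H = rng.random((k, V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def denoise(V_mix, W_n, k_s=16, n_iter=200, eps=1e-9):
    """Fix noise bases W_n, adapt speech bases W_s on the mixture,
    then apply a Wiener-like soft mask to estimate speech magnitudes."""
    W_s = rng.random((V_mix.shape[0], k_s))
    H = rng.random((k_s + W_n.shape[1], V_mix.shape[1]))
    for _ in range(n_iter):
        W = np.hstack([W_s, W_n])
        H *= (W.T @ V_mix) / (W.T @ W @ H + eps)
        W_s *= (V_mix @ H[:k_s].T) / (W @ H @ H[:k_s].T + eps)
    V_s = W_s @ H[:k_s]                                   # speech estimate
    mask = V_s / (np.hstack([W_s, W_n]) @ H + eps)        # soft speech mask
    return mask * V_mix

# Random magnitudes stand in for |STFT| frames (257 frequency bins).
V_noise = rng.random((257, 100))   # noise-only training segment
V_mix = rng.random((257, 120))     # noisy speech to enhance
W_n, _ = nmf(V_noise, k=8)         # noise bases, learned offline
print(denoise(V_mix, W_n).shape)   # (257, 120): enhanced magnitudes
```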

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003 in Firenze, Italy. The workshop is organised every two years and aims to stimulate contacts between specialists active in research and industrial developments in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies.

    Engineering derivatives from biological systems for advanced aerospace applications

    The present study consisted of a literature survey, a survey of researchers, and a workshop on bionics. These tasks produced an extensive annotated bibliography of bionics research (282 citations), a directory of bionics researchers, and a workshop report on specific bionics research topics applicable to space technology. These deliverables are included as Appendix A, Appendix B, and Section 5.0, respectively. To provide organization to this highly interdisciplinary field and to serve as a guide for interested researchers, we have also prepared a taxonomy or classification of the various subelements of natural engineering systems. Finally, we have synthesized the results of the various components of this study into a discussion of the most promising opportunities for accelerated research, seeking solutions that apply engineering principles from natural systems to advanced aerospace problems. A discussion of opportunities within the areas of materials, structures, sensors, information processing, robotics, autonomous systems, life support systems, and aeronautics is given. Following the conclusions are six discipline summaries that highlight the potential benefits of research in these areas for NASA's space technology programs.

    Towards Cognizant Hearing Aids: Modeling of Content, Affect and Attention


    Computational speech segregation inspired by principles of auditory processing
