54 research outputs found

    Low-delay nonuniform pseudo-QMF banks with application to speech enhancement

    Get PDF
    Journal ArticleAbstract-This paper presents a method for designing low-delay nonuniform pseudo quadrature mirror filter (QMF) banks. This method is motivated by the work of Li, Nguyen, and Tantaratana, in which the nonuniform filter bank is realized by combining an appropriate number of adjacent sub-bands of a uniform pseudo-QMF bank. In prior work, the prototype filter of the uniform pseudo-QMF bank was constrained to have linear phase and the overall delay associated with the filter bank was often unacceptably large for filter banks with a large number of sub-bands. This paper proposes a pseudo-QMF filter bank design technique that significantly reduces the delay by relaxing the linear phase constraints. An example in which an oversampled critical-band nonuniform filter bank is designed and applied to a two-state modeling speech enhancement system is presented in this paper. Comparison of the performance of this system to competing methods employing tree-structured, linear phase multiresolution analysis indicates that the approach described in this paper strikes a good balance between system performance and low delay

    Studies in Signal Processing Techniques for Speech Enhancement: A comparative study

    Get PDF
    Speech enhancement is very essential to suppress the background noise and to increase speech intelligibility and reduce fatigue in hearing. There exist many simple speech enhancement algorithms like spectral subtraction to complex algorithms like Bayesian Magnitude estimators based on Minimum Mean Square Error (MMSE) and its variants. A continuous research is going and new algorithms are emerging to enhance speech signal recorded in the background of environment such as industries, vehicles and aircraft cockpit. In aviation industries speech enhancement plays a vital role to bring crucial information from pilot’s conversation in case of an incident or accident by suppressing engine and other cockpit instrument noises. In this work proposed is a new approach to speech enhancement making use harmonic wavelet transform and Bayesian estimators. The performance indicators, SNR and listening confirms to the fact that newly modified algorithms using harmonic wavelet transform indeed show better results than currently existing methods. Further, the Harmonic Wavelet Transform is computationally efficient and simple to implement due to its inbuilt decimation-interpolation operations compared to those of filter-bank approach to realize sub-bands

    Blind dereverberation of speech from moving and stationary speakers using sequential Monte Carlo methods

    Get PDF
    Speech signals radiated in confined spaces are subject to reverberation due to reflections of surrounding walls and obstacles. Reverberation leads to severe degradation of speech intelligibility and can be prohibitive for applications where speech is digitally recorded, such as audio conferencing or hearing aids. Dereverberation of speech is therefore an important field in speech enhancement. Driven by consumer demand, blind speech dereverberation has become a popular field in the research community and has led to many interesting approaches in the literature. However, most existing methods are dictated by their underlying models and hence suffer from assumptions that constrain the approaches to specific subproblems of blind speech dereverberation. For example, many approaches limit the dereverberation to voiced speech sounds, leading to poor results for unvoiced speech. Few approaches tackle single-sensor blind speech dereverberation, and only a very limited subset allows for dereverberation of speech from moving speakers. Therefore, the aim of this dissertation is the development of a flexible and extendible framework for blind speech dereverberation accommodating different speech sound types, single- or multiple sensor as well as stationary and moving speakers. Bayesian methods benefit from – rather than being dictated by – appropriate model choices. Therefore, the problem of blind speech dereverberation is considered from a Bayesian perspective in this thesis. A generic sequential Monte Carlo approach accommodating a multitude of models for the speech production mechanism and room transfer function is consequently derived. In this approach both the anechoic source signal and reverberant channel are estimated using their optimal estimators by means of Rao-Blackwellisation of the state-space of unknown variables. The remaining model parameters are estimated using sequential importance resampling. The proposed approach is implemented for two different speech production models for stationary speakers, demonstrating substantial reduction in reverberation for both unvoiced and voiced speech sounds. Furthermore, the channel model is extended to facilitate blind dereverberation of speech from moving speakers. Due to the structure of measurement model, single- as well as multi-microphone processing is facilitated, accommodating physically constrained scenarios where only a single sensor can be used as well as allowing for the exploitation of spatial diversity in scenarios where the physical size of microphone arrays is of no concern. This dissertation is concluded with a survey of possible directions for future research, including the use of switching Markov source models, joint target tracking and enhancement, as well as an extension to subband processing for improved computational efficiency

    Optimisation techniques for low bit rate speech coding

    Get PDF
    This thesis extends the background theory of speech and major speech coding schemes used in existing networks to an implementation of GSM full-rate speech compression on a RISC DSP and a multirate application for speech coding. Speech coding is the field concerned with obtaining compact digital representations of speech signals for the purpose of efficient transmission. In this thesis, the background of speech compression, characteristics of speech signals and the DSP algorithms used have been examined. The current speech coding schemes and requirements have been studied. The Global System for Mobile communication (GSM) is a digital mobile radio system which is extensively used throughout Europe, and also in many other parts of the world. The algorithm is standardised by the European Telecommunications Standardisation histitute (ETSI). The full-rate and half-rate speech compression of GSM have been analysed. A real time implementation of the full-rate algorithm has been carried out on a RISC processor GEPARD by Austria Mikro Systeme International (AMS). The GEPARD code has been tested with all of the test sequences provided by ETSI and the results are bit-exact. The transcoding delay is lower than the ETSI requirement. A comparison of the half-rate and full-rate compression algorithms is discussed. Both algorithms offer near toll speech quality comparable or better than analogue cellular networks. The half-rate compression requires more computationally intensive operations and therefore a more powerful processor will be needed due to the complexity of the code. Hence the cost of the implementation of half-rate codec will be considerably higher than full-rate. A description of multirate signal processing and its application on speech (SBC) and speech/audio (MPEG) has been given. An investigation into the possibility of combining multirate filtering and GSM fill-rate speech algorithm. The results showed that multirate signal processing cannot be directly applied GSM full-rate speech compression since this method requires more processing power, causing longer coding delay but did not appreciably improve the bit rate. In order to achieve a lower bit rate, the GSM full-rate mathematical algorithm can be used instead of the standardised ETSI recommendation. Some changes including the number of quantisation bits has to be made before the application of multirate signal processing and a new standard will be required

    Perceptual and acoustic impacts of aberrant properties of electrolaryngeal speech.

    Get PDF
    Thesis (Ph. D.)—Harvard-MIT Division of Health Sciences and Technology, 2003.Includes bibliographical references (p. 167-171).This electronic version was prepared by the author. The certified thesis is available in the Institute Archives and Special Collections.Ph. D

    Doctor of Philosophy

    Get PDF
    dissertationHearing aids suffer from the problem of acoustic feedback that limits the gain provided by hearing aids. Moreover, the output sound quality of hearing aids may be compromised in the presence of background acoustic noise. Digital hearing aids use advanced signal processing to reduce acoustic feedback and background noise to improve the output sound quality. However, it is known that the output sound quality of digital hearing aids deteriorates as the hearing aid gain is increased. Furthermore, popular subband or transform domain digital signal processing in modern hearing aids introduces analysis-synthesis delays in the forward path. Long forward-path delays are not desirable because the processed sound combines with the unprocessed sound that arrives at the cochlea through the vent and changes the sound quality. In this dissertation, we employ a variable, frequency-dependent gain function that is lower at frequencies of the incoming signal where the information is perceptually insignificant. In addition, the method of this dissertation automatically identifies and suppresses residual acoustical feedback components at frequencies that have the potential to drive the system to instability. The suppressed frequency components are monitored and the suppression is removed when such frequencies no longer pose a threat to drive the hearing aid system into instability. Together, the method of this dissertation provides more stable gain over traditional methods by reducing acoustical coupling between the microphone and the loudspeaker of a hearing aid. In addition, the method of this dissertation performs necessary hearing aid signal processing with low-delay characteristics. The central idea for the low-delay hearing aid signal processing is a spectral gain shaping method (SGSM) that employs parallel parametric equalization (EQ) filters. Parameters of the parametric EQ filters and associated gain values are selected using a least-squares approach to obtain the desired spectral response. Finally, the method of this dissertation switches to a least-squares adaptation scheme with linear complexity at the onset of howling. The method adapts to the altered feedback path quickly and allows the patient to not lose perceivable information. The complexity of the least-squares estimate is reduced by reformulating the least-squares estimate into a Toeplitz system and solving it with a direct Toeplitz solver. The increase in stable gain over traditional methods and the output sound quality were evaluated with psychoacoustic experiments on normal-hearing listeners with speech and music signals. The results indicate that the method of this dissertation provides 8 to 12 dB more hearing aid gain than feedback cancelers with traditional fixed gain functions. Furthermore, experimental results obtained with real world hearing aid gain profiles indicate that the method of this dissertation provides less distortion in the output sound quality than classical feedback cancelers, enabling the use of more comfortable style hearing aids for patients with moderate to profound hearing loss. Extensive MATLAB simulations and subjective evaluations of the results indicate that the method of this dissertation exhibits much smaller forward-path delays with superior howling suppression capability

    Mathematics and Digital Signal Processing

    Get PDF
    Modern computer technology has opened up new opportunities for the development of digital signal processing methods. The applications of digital signal processing have expanded significantly and today include audio and speech processing, sonar, radar, and other sensor array processing, spectral density estimation, statistical signal processing, digital image processing, signal processing for telecommunications, control systems, biomedical engineering, and seismology, among others. This Special Issue is aimed at wide coverage of the problems of digital signal processing, from mathematical modeling to the implementation of problem-oriented systems. The basis of digital signal processing is digital filtering. Wavelet analysis implements multiscale signal processing and is used to solve applied problems of de-noising and compression. Processing of visual information, including image and video processing and pattern recognition, is actively used in robotic systems and industrial processes control today. Improving digital signal processing circuits and developing new signal processing systems can improve the technical characteristics of many digital devices. The development of new methods of artificial intelligence, including artificial neural networks and brain-computer interfaces, opens up new prospects for the creation of smart technology. This Special Issue contains the latest technological developments in mathematics and digital signal processing. The stated results are of interest to researchers in the field of applied mathematics and developers of modern digital signal processing systems

    Source Separation for Hearing Aid Applications

    Get PDF
    corecore