43 research outputs found

    Speech Signal Enhancement through Adaptive Wavelet Thresholding

    This paper demonstrates the application of the Bionic Wavelet Transform (BWT), an adaptive wavelet transform derived from a non-linear auditory model of the cochlea, to the task of speech signal enhancement. Results, measured objectively by signal-to-noise ratio (SNR) and segmental SNR (SSNR) and subjectively by Mean Opinion Score (MOS), are given for additive white Gaussian noise as well as four different types of realistic noise environments. Enhancement is accomplished by thresholding the adapted BWT coefficients, and the results are compared to a variety of speech enhancement techniques, including Ephraim-Malah filtering, iterative Wiener filtering, and spectral subtraction, as well as to wavelet denoising based on a perceptually scaled wavelet packet transform decomposition. Overall results indicate that SNR and SSNR improvements for the proposed approach are comparable to those of the Ephraim-Malah filter, with BWT enhancement giving the best results of all methods for the noisiest (−10 dB and −5 dB input SNR) conditions. Subjective measurements using MOS surveys across a variety of 0 dB SNR noise conditions indicate enhancement quality competitive with, but still lower than, results for Ephraim-Malah filtering and iterative Wiener filtering, and higher than the perceptually scaled wavelet method.
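
The two objective measures named in this abstract, SNR and segmental SNR, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the frame length, and the clamping range of -10 dB to 35 dB per frame are assumptions (the clamping is a common convention, not something stated in the abstract).

```python
import math


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    signal = sum(c * c for c in clean)
    noise = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    return 10.0 * math.log10(signal / noise)


def segmental_snr_db(clean, enhanced, frame_len=4, floor=-10.0, ceil=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [floor, ceil].

    frame_len, floor and ceil are illustrative assumptions.
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal = sum(x * x for x in c)
        noise = sum((x - y) ** 2 for x, y in zip(c, e))
        if signal == 0.0:
            continue  # skip silent frames, where the ratio is undefined
        frame = ceil if noise == 0.0 else 10.0 * math.log10(signal / noise)
        values.append(min(max(frame, floor), ceil))
    return sum(values) / len(values)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0]
    enhanced = [c + 0.1 for c in clean]  # constant residual error of 0.1
    print(round(snr_db(clean, enhanced), 2))
    print(round(segmental_snr_db(clean, enhanced), 2))
```

Because segmental SNR averages in the log domain frame by frame, quiet frames weigh as much as loud ones, which is why the two measures can rank methods differently.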

    Bionic Wavelet Based Denoising Using Source Separation

    We consider the problem of speech denoising using source separation. We propose a hybrid technique whose first step applies the Bionic Wavelet Transform (BWT) to two different mixtures of the same speech signal with noise. To obtain these two mixtures, the speech signal is corrupted by Gaussian white noise at two different values of the signal-to-noise ratio (SNR). The second step computes the entropy of each bionic wavelet coefficient and finds the two subbands with minimal entropy. These two subbands are used to estimate the matrix that separates the speech signal from the noise by source separation. The proposed technique is evaluated by comparing it with a denoising technique based on source separation in the time domain.
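
The subband selection step in this abstract (compute an entropy per subband, keep the two with the lowest values) can be sketched as below. The abstract does not say which entropy is used, so the Shannon entropy of the normalised coefficient energies is an assumption, and both function names are hypothetical.

```python
import math


def shannon_entropy(coeffs):
    """Shannon entropy of the normalised energy distribution of a subband.

    Assumption: p_i = c_i^2 / sum(c^2); the source does not specify
    the entropy definition.
    """
    energy = sum(c * c for c in coeffs)
    if energy == 0.0:
        return 0.0
    h = 0.0
    for c in coeffs:
        p = c * c / energy
        if p > 0.0:
            h -= p * math.log(p)
    return h


def two_lowest_entropy_subbands(subbands):
    """Return the indices of the two subbands with minimal entropy."""
    ranked = sorted(range(len(subbands)),
                    key=lambda i: shannon_entropy(subbands[i]))
    return ranked[:2]


if __name__ == "__main__":
    subbands = [
        [1.0, 1.0, 1.0, 1.0],   # energy spread evenly: highest entropy
        [4.0, 0.0, 0.0, 0.0],   # energy in one coefficient: zero entropy
        [3.0, 1.0, 0.0, 0.0],
        [1.0, 2.0, 1.0, 2.0],
    ]
    print(two_lowest_entropy_subbands(subbands))
```

Low entropy here means the subband's energy is concentrated in few coefficients, which is one plausible reason for preferring such subbands when estimating a separation matrix.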

    An analogue approach for the processing of biomedical signals

    Constant device scaling has significantly boosted electronic systems design in the digital domain, enabling the incorporation of more functionality within a small silicon area while allowing high-speed computation. This trend has been exploited to develop high-performance miniaturised systems in a number of application areas such as communication, sensor networks, mainframe computers, and biomedical information processing. Although successful, the associated cost comes in the form of high leakage power dissipation and reduced system reliability. With the increase in customer demand for smarter and faster technologies and with the advent of pervasive information processing, these issues may prove to be limiting factors for the application of traditional digital design techniques. Furthermore, as the limit of device scaling nears, the performance of the conventional digital system design methodology cannot be enhanced any further unless innovations in new materials and new transistor designs are made. To this end, an alternative design methodology that may enable performance enhancement without depending on device scaling is much sought after today. Analogue design is one of these alternative techniques and has recently gained considerable interest. Although it is well understood that several roadblocks must still be overcome before analogue-based system design for information processing becomes a mainstream design technique (e.g., lack of automated design tools, noise performance, and efficient implementation of passive components on silicon), it may offer a faster way of realising a system with very few components and may therefore have a positive effect on system performance. The main aim of this thesis is to explore possible ways of processing information using analogue design techniques, in particular in the field of biomedical systems.

    Analogue CMOS Cochlea Systems: A Historic Retrospective


    Decoding Neural Signals with Computational Models: A Systematic Review of Invasive BMI

    There are significant milestones in human civilization at which mankind stepped into a different level of life with a new spectrum of possibilities and comfort. From fire-lighting technology and wheeled wagons to writing, electricity and the Internet, each one changed our lives dramatically. In this paper, we take a deep look into the invasive Brain Machine Interface (BMI), an ambitious and cutting-edge technology which has the potential to be another important milestone in human civilization. Not only beneficial for patients with severe medical conditions, invasive BMI technology can significantly impact different technologies and almost every aspect of human life. We review the biological and engineering concepts that underpin the implementation of BMI applications. Various techniques are essential for making invasive BMI applications a reality. We review these by providing an analysis of (i) possible applications of invasive BMI technology, (ii) the methods and devices for detecting and decoding brain signals, and (iii) possible options for stimulating signals into the human brain. Finally, we discuss the challenges and opportunities of invasive BMI for further development in the area. Comment: 51 pages, 14 figures, review article

    Identification of Transient Speech Using Wavelet Transforms

    It is generally believed that abrupt stimulus changes, which in speech may be time-varying frequency edges associated with consonants, transitions between consonants and vowels, and transitions within vowels, are critical to the perception of speech by humans and to speech recognition by machines. Noise affects speech transitions more than it affects quasi-steady-state speech. I believe that identifying and selectively amplifying speech transitions may enhance the intelligibility of speech in noisy conditions. The purpose of this study is to evaluate the use of wavelet transforms to identify speech transitions. Using wavelet transforms may be computationally efficient and allow for real-time applications. The discrete wavelet transform (DWT), stationary wavelet transform (SWT) and wavelet packets (WP) are evaluated. Wavelet analysis is combined with variable frame rate processing to improve the identification process. Variable frame rate processing can identify time segments in which speech feature vectors are changing rapidly and segments in which they are relatively stationary. Energy profiles for words, which show the energy in each node of a speech signal decomposed using wavelets, are used to identify nodes that include predominantly transient information and nodes that include predominantly quasi-steady-state information, and these are used to synthesize transient and quasi-steady-state speech components. These speech components are estimates of the tonal and nontonal speech components, which Yoo et al. identified using time-varying band-pass filters. Comparison of spectra, a listening test and mean-squared errors between the transient components synthesized using wavelets and Yoo's nontonal components indicated that wavelet packets gave the best estimates of Yoo's components. An algorithm that incorporates variable frame rate analysis into wavelet packet analysis is proposed.
The development of this algorithm involves choosing a wavelet function and a decomposition level. The algorithm itself has 4 steps: wavelet packet decomposition; classification of terminal nodes; incorporation of variable frame rate processing; synthesis of speech components. Combining wavelet analysis with variable frame rate analysis provides the best estimates of Yoo's speech components.
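
The energy profile described in this abstract (the energy in each terminal node of a wavelet packet decomposition) can be sketched with a Haar wavelet packet tree. This is a minimal illustration only: the Haar wavelet, the decomposition level, and all function names are assumptions, since the abstract says the wavelet function and level are chosen during development without naming them. Nodes are returned in natural (tree) order, not frequency order, and the input length must be divisible by 2 raised to the level.

```python
import math


def haar_split(x):
    """One orthonormal Haar analysis step: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_nodes(x, level):
    """Terminal nodes of a full Haar wavelet packet tree, in tree order."""
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def energy_profile(nodes):
    """Energy (sum of squared coefficients) in each terminal node."""
    return [sum(c * c for c in node) for node in nodes]


if __name__ == "__main__":
    steady = [1.0] * 8          # constant segment
    rapid = [1.0, -1.0] * 4     # sample-to-sample alternation
    print(energy_profile(wavelet_packet_nodes(steady, 2)))
    print(energy_profile(wavelet_packet_nodes(rapid, 2)))
```

In this toy case the constant segment puts all its energy in the first node while the rapidly alternating segment puts it in a detail branch, which is the kind of separation a node classification step could exploit.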

    Speech Enhancement with Adaptive Thresholding and Kalman Filtering

    Speech enhancement has been extensively studied for many years and various speech enhancement methods have been developed during the past decades. One of the objectives of speech enhancement is to provide high-quality speech communication in the presence of background noise and concurrent interference signals. In the process of speech communication, the clean speech signal is inevitably corrupted by acoustic noise from the surrounding environment, transmission media, communication equipment, electrical noise, other speakers, and other sources of interference. These disturbances can significantly degrade the quality and intelligibility of the received speech signal. Therefore, it is of great interest to develop efficient speech enhancement techniques to recover the original speech from the noisy observation. In recent years, various techniques have been developed to tackle this problem, which can be classified into single channel and multi-channel enhancement approaches. Since single channel enhancement is easy to implement, it has been a significant field of research and various approaches have been developed. For example, spectral subtraction and Wiener filtering are among the earliest single channel methods, which are based on estimation of the power spectrum of stationary noise. However, when the noise is non-stationary, or there exists music noise and ambient speech noise, the enhancement performance degrades considerably. To overcome this disadvantage, this thesis focuses on single channel speech enhancement in adverse noise environments, especially non-stationary noise environments. Recently, wavelet transform based methods have been widely used to reduce undesired background noise. On the other hand, Kalman filter (KF) methods offer competitive denoising results, especially in non-stationary environments, and have been a popular and powerful tool for speech enhancement during the past decades.
In this regard, a single channel wavelet thresholding based Kalman filter (KF) algorithm is proposed for speech enhancement in this thesis. The wavelet packet (WP) transform is first applied to the noise corrupted speech on a frame-by-frame basis, which decomposes each frame into a number of subbands. A voice activity detector (VAD) is then designed to detect the voiced/unvoiced frames of the subband speech. Based on the VAD result, an adaptive thresholding scheme is applied to each subband speech, followed by WP based reconstruction to obtain the pre-enhanced speech. To achieve a further level of enhancement, an iterative Kalman filter (IKF) is used to process the pre-enhanced speech. The proposed adaptive thresholding iterative Kalman filtering (AT-IKF) method is evaluated and compared with some existing methods under various noise conditions in terms of segmental SNR and perceptual evaluation of speech quality (PESQ), two well-known performance indexes. Firstly, we compare the proposed adaptive thresholding (AT) scheme with three other thresholding schemes: the non-linear universal thresholding (U-T), the non-linear wavelet packet transform thresholding (WPT-T) and the non-linear SURE thresholding (SURE-T). The experimental results show that the proposed AT scheme can significantly improve the segmental SNR and PESQ for all input SNRs compared with the other existing thresholding schemes. Secondly, extensive computer simulations are conducted to evaluate the proposed AT-IKF as opposed to the AT and the IKF as standalone speech enhancement methods. It is shown that the AT-IKF method still performs the best. Lastly, the proposed AT-IKF method is compared with three representative and popular methods: the improved spectral subtraction based speech enhancement algorithm (ISS), the improved Wiener filter based method (IWF) and the representative subband Kalman filter based algorithm (SIKF).
Experimental results demonstrate the effectiveness of the proposed method as compared to some previous works in terms of both segmental SNR and PESQ.
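
For orientation, the universal thresholding baseline (U-T) that this abstract compares against can be sketched as soft thresholding with the classical universal threshold, sigma * sqrt(2 ln N), where sigma is estimated from the median absolute coefficient divided by 0.6745. This is a sketch of that baseline under the assumption that the thesis uses the standard form; it is not the proposed adaptive scheme, whose details the abstract does not give. The function names are hypothetical.

```python
import math
import statistics


def universal_threshold(detail_coeffs):
    """Universal threshold sigma * sqrt(2 ln N).

    sigma is the robust noise estimate median(|d|) / 0.6745.
    """
    sigma = statistics.median([abs(c) for c in detail_coeffs]) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(len(detail_coeffs)))


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by threshold; zero the small ones."""
    out = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        out.append(math.copysign(magnitude, c) if magnitude > 0.0 else 0.0)
    return out


if __name__ == "__main__":
    subband = [3.0, -2.0, 0.5, -0.5]
    print(soft_threshold(subband, 1.0))
    print(universal_threshold([1.0, -1.0, 1.0, -1.0]))
```

A single global threshold like this treats every frame and subband alike, which is the limitation that a VAD-driven adaptive threshold is meant to address.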

    Offline and real time noise reduction in speech signals using the discrete wavelet packet decomposition

    This thesis describes the development of an offline and real time wavelet based speech enhancement system to process speech corrupted with various amounts of white Gaussian noise and other noise types.

    Glucose-powered neuroelectronics

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 157-164). A holy grail of bioelectronics is to engineer biologically implantable systems that can be embedded without disturbing their local environments, while harvesting from their surroundings all of the power they require. As implantable electronic devices become increasingly prevalent in scientific research and in the diagnosis, management, and treatment of human disease, there is correspondingly increasing demand for devices with unlimited functional lifetimes that integrate seamlessly with their hosts in these two ways. This thesis presents significant progress toward establishing the feasibility of one such system: a brain-machine interface powered by a bioimplantable fuel cell that harvests energy from extracellular glucose in the cerebrospinal fluid surrounding the brain. The first part of this thesis describes a set of biomimetic algorithms and low-power circuit architectures for decoding electrical signals from ensembles of neurons in the brain. The decoders are intended for use in the context of neural rehabilitation, to provide paralyzed or otherwise disabled patients with instantaneous, natural, thought-based control of robotic prosthetic limbs and other external devices. This thesis presents a detailed discussion of the decoding algorithms, descriptions of the low-power analog and digital circuit architectures used to implement the decoders, and results validating their performance when applied to decode real neural data. A major constraint on brain-implanted electronic devices is the requirement that they consume and dissipate very little power, so as not to damage surrounding brain tissue.
The systems described here address that constraint, computing in the style of biological neural networks, and using arithmetic-free, purely logical primitives to establish universal computing architectures for neural decoding. The second part of this thesis describes the development of an implantable fuel cell powered by extracellular glucose at concentrations such as those found in the cerebrospinal fluid surrounding the brain. The theoretical foundations, details of design and fabrication, mechanical and electrochemical characterization, as well as in vitro performance data for the fuel cell are presented. By Benjamin Isaac Rapoport. Ph.D.

    Modeling and design of an active silicon cochlea

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references. Silicon cochleas are inspired by the biological cochlea and perform efficient spectrum analysis: They realize a bank of constant-Q Nth-order filters with O(N) efficiency rather than O(N²) efficiency due to their use of an exponentially tapered filter cascade. They are useful in speech-recognition front ends, cochlear implants, and hearing aids, especially as architectures for improving spectral analysis in noisy environments and for performing low-power spectrum analysis. In this thesis I describe four contributions towards improving the state-of-the-art in silicon-cochlea design, two of which involve theoretical modeling, and two of which involve integrated-circuit design. On the theoretical side, I first show that a simple rational approximation to distributed partition impedances in the biological cochlea captures its essential features and enables an efficient artificial implementation achieving maximum gain in a minimum number of stages while still maintaining stability. In particular, I show that the terminating impedance of the cochlea is crucial for its stability and discuss various analytic methods for termination. Second, I derive a novel composite artificial cochlear architecture composed of a cascade of all-pass second-order filters from a first-principles analysis of the biological cochlear transmission line. The novel all-pass architecture reduces phase lag and group delay in the silicon cochlea, a problem in prior designs, sharpens its high-frequency rolloff slopes, increases its frequency selectivity, and improves its nonlinear compression characteristics. On the circuit side, I first present a novel current-mode log-domain topology that simultaneously increases signal-to-noise ratio (SNR) and dynamic range while lowering power consumption in resonant filters with high quality factor Q.
The novel topology is validated in a second-order low-pass resonant filter, which is employed in the silicon cochlea, demonstrating a reduction in power consumption and increase in SNR by a factor of Q. When bias currents in the filter are adjusted as the signal level varies, this technique enables an improvement in maximum SNR by a factor of Q and an increase in maximum non-distorted signal power and dynamic range by a factor of Q⁴. Measurements from a chip in a 0.18-µm 1.1-V CMOS technology achieve a quiescent power consumption of 580 nW at a 15-kHz center frequency with a maximum SNR of 41.3 dB and dynamic range of 76 dB for Q = 4. Finally, I describe a current-mode -stage 0.18-µm silicon cochlea that achieves 79 dB of dynamic range with 41-µW power consumption on a 1-V power supply over a usable 3.5 kHz to 14 kHz frequency range. These numbers represent an 18 dB improvement in dynamic range and a 12.5x reduction in power consumption over prior state-of-the-art silicon cochleas. By Serhii M. Zhak. Ph.D.
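
The O(N) versus O(N²) claim in this abstract can be read as a count of filter sections: in a cascade, tap k has already passed through k sections, so N sections yield N high-order outputs, whereas a parallel bank of N separate Nth-order filters would need about N × N sections. The sketch below illustrates that count together with an exponentially (geometrically) tapered set of centre frequencies over the 3.5 kHz to 14 kHz range quoted above. It is an illustration of the idea, not the thesis design: the stage count is not given in the text, so the value used is purely hypothetical, as are the function names.

```python
def cascade_center_frequencies(f_high, f_low, stages):
    """Geometrically spaced centre frequencies, highest first.

    Each stage is tuned a constant ratio below the previous one,
    which is what an exponential taper means here.
    """
    ratio = (f_low / f_high) ** (1.0 / (stages - 1))
    return [f_high * ratio ** k for k in range(stages)]


def section_counts(n):
    """Sections needed for n outputs: shared cascade vs parallel bank."""
    return {"cascade": n, "parallel_bank": n * n}


if __name__ == "__main__":
    # stages=5 is a hypothetical value chosen only for illustration.
    for f in cascade_center_frequencies(14000.0, 3500.0, 5):
        print(round(f, 1))
    print(section_counts(5))
```

The constant ratio between neighbouring stages is what keeps the quality factor of the resulting channels roughly constant across the frequency range.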