
    Enhancement of speech intelligibility using speech transients extracted by a wavelet packet-based real-time algorithm

    Studies have shown that transient speech, which is associated with consonants, transitions between consonants and vowels, and transitions within some vowels, is an important cue for identifying and discriminating speech sounds. However, compared to the relatively steady-state vowel segments of speech, transient speech has much lower energy and thus is easily masked by background noise. Emphasizing transient speech can improve the intelligibility of speech in background noise, but previous demonstrations of this improvement have either identified transient speech manually or proposed algorithms that cannot run in real time.

    We have developed an algorithm that automatically extracts transient speech in real time. The algorithm uses a function, which we term the transitivity function, to characterize the rate of change of the wavelet coefficients of a wavelet packet transform representation of a speech signal. The transitivity function is large and positive when the signal is changing rapidly and small when the signal is in steady state. Two definitions of the transitivity function, one based on short-time energy and the other on Mel-frequency cepstral coefficients (MFCCs), were evaluated experimentally; the MFCC-based transitivity function produced better results. The extracted transient speech signal is combined with the original speech to create modified speech.

    To facilitate comparison of our transient and modified speech with speech processed by transient-emphasis methods proposed by other researchers, we developed three indices. The indices characterize the extent to which a speech modification or processing method emphasizes (1) a particular region of speech, (2) consonants relative to vowels, and (3) onsets and offsets of formants compared to steady-state formants. These indices are useful because they quantify differences between speech signals that are difficult to show using spectrograms, spectra, and time-domain waveforms.

    The transient extraction algorithm includes parameters that, when varied, influence the intelligibility of the extracted transient speech. The best values for these parameters were selected using psychoacoustic testing. Measurements of speech intelligibility in background noise showed that the modified speech was more intelligible than the original speech, especially at high noise levels (-20 and -15 dB). Incorporating a method that automatically identifies and boosts unvoiced speech was also evaluated; it did not yield additional intelligibility improvements.
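The energy-based variant of the transitivity function can be sketched as a first difference of log short-time energy across frames: large positive values flag rapid change, values near zero indicate steady state. This is a minimal illustration of the idea, not the dissertation's exact definition; the function names, frame sizes, and toy signal below are invented for the sketch.

```python
import numpy as np

def short_time_energy(x, frame_len=256, hop=128):
    """Log short-time energy of a 1-D signal, one value per frame."""
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.lib.stride_tricks.sliding_window_view(x, frame_len)[::hop][:n_frames]
    return np.log(np.sum(frames ** 2, axis=1) + 1e-12)

def transitivity(x, frame_len=256, hop=128):
    """First difference of log frame energy: a crude stand-in for the
    transitivity function (large and positive where the signal changes
    rapidly, small where it is in steady state)."""
    e = short_time_energy(x, frame_len, hop)
    return np.diff(e, prepend=e[0])

# Toy signal: silence followed by an abrupt tone onset, then steady state.
fs = 8000
t = np.arange(fs) / fs
sig = np.concatenate([np.zeros(fs // 2), np.sin(2 * np.pi * 440 * t[: fs // 2])])
tf = transitivity(sig)
# tf peaks at the first frame that overlaps the onset, then returns near zero.
```

A thresholded version of such a function could gate which wavelet packet coefficients are kept as "transient" and which are discarded as steady state.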

    Sound source segregation of multiple concurrent talkers via Short-Time Target Cancellation

    The Short-Time Target Cancellation (STTC) algorithm, developed as part of this dissertation research, is a “Cocktail Party Problem” processor that can boost speech intelligibility for a target talker from a specified “look” direction while suppressing the intelligibility of competing talkers. The algorithm holds promise for both automatic speech recognition and assistive listening device applications. The STTC algorithm operates on a frame-by-frame basis, leverages the computational efficiency of the Fast Fourier Transform (FFT), and is designed to run in real time. Notably, its performance in objective measures of speech intelligibility and sound source segregation is comparable to that of the Ideal Binary Mask (IBM) and Ideal Ratio Mask (IRM). Because the STTC algorithm computes a time-frequency mask that can be applied independently to the left and right signals, binaural cues for spatial hearing, including Interaural Time Differences (ITDs), Interaural Level Differences (ILDs), and spectral cues, can be preserved in potential hearing aid applications. A minimalist design for a proposed STTC Assistive Listening Device (ALD), consisting of six microphones embedded in the frame of a pair of eyeglasses, is presented and evaluated using virtual room acoustics and both objective and behavioral measures. The results suggest that the proposed STTC ALD can provide a significant speech intelligibility benefit in complex auditory scenes composed of multiple spatially separated talkers.
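The cancellation idea behind such a processor can be illustrated in a minimal single-frame sketch (this is an illustration of target cancellation in general, not the published STTC algorithm): for a target at broadside, with zero interaural delay, subtracting the two microphone signals cancels the target, and comparing the cancelled (residual) spectrum with the summed (reference) spectrum yields a per-bin mask that can then be applied to the left and right channels independently. All names and the toy scene below are invented for the sketch.

```python
import numpy as np

def cancellation_mask(left, right, frame_len=512, eps=1e-12):
    """Single-frame time-frequency mask via target cancellation.

    For a target at broadside (identical at both mics), left - right
    cancels the target, so bins where the cancelled energy is small
    relative to the summed energy are presumed target-dominated and
    kept (mask near 1); masker-dominated bins are attenuated."""
    win = np.hanning(frame_len)
    L = np.fft.rfft(win * left[:frame_len])
    R = np.fft.rfft(win * right[:frame_len])
    residual = np.abs(L - R)    # target cancelled, maskers remain
    reference = np.abs(L + R)   # target reinforced
    mask = 1.0 - residual / (reference + residual + eps)
    return np.clip(mask, 0.0, 1.0)

# Toy scene: target identical at both mics; masker present only at the left.
rng = np.random.default_rng(0)
target = rng.standard_normal(512)
masker = rng.standard_normal(512)
m_target_only = cancellation_mask(target, target)         # ~1 in every bin
m_masker_only = cancellation_mask(masker, np.zeros(512))  # attenuated bins
```

Because the same mask is applied to both channels, interaural time and level differences of whatever passes the mask are left intact, which is the property the dissertation highlights for hearing aid use.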

    Doctor of Philosophy

    Hearing aids suffer from acoustic feedback, which limits the gain they can provide. Moreover, the output sound quality of hearing aids may be compromised in the presence of background acoustic noise. Digital hearing aids use advanced signal processing to reduce acoustic feedback and background noise and thereby improve output sound quality. However, the output sound quality of digital hearing aids is known to deteriorate as the hearing aid gain is increased. Furthermore, the subband or transform-domain digital signal processing popular in modern hearing aids introduces analysis-synthesis delays in the forward path. Long forward-path delays are undesirable because the processed sound combines with the unprocessed sound that reaches the cochlea through the vent, changing the sound quality.

    In this dissertation, we employ a variable, frequency-dependent gain function that is lower at frequencies of the incoming signal where the information is perceptually insignificant. In addition, the proposed method automatically identifies and suppresses residual acoustic feedback components at frequencies that have the potential to drive the system into instability. The suppressed frequency components are monitored, and the suppression is removed when those frequencies no longer threaten to drive the hearing aid into instability. Together, these measures provide more stable gain than traditional methods by reducing the acoustic coupling between the microphone and the loudspeaker of a hearing aid. The method also performs the necessary hearing aid signal processing with low delay. The central idea of the low-delay processing is a spectral gain shaping method (SGSM) that employs parallel parametric equalization (EQ) filters. The parameters of the parametric EQ filters and their associated gain values are selected with a least-squares approach to obtain the desired spectral response.

    Finally, the method switches to a least-squares adaptation scheme with linear complexity at the onset of howling; it adapts to the altered feedback path quickly so that the patient does not lose perceivable information. The complexity of the least-squares estimate is reduced by reformulating it as a Toeplitz system and solving it with a direct Toeplitz solver. The increase in stable gain over traditional methods and the output sound quality were evaluated with psychoacoustic experiments on normal-hearing listeners using speech and music signals. The results indicate that the proposed method provides 8 to 12 dB more hearing aid gain than feedback cancelers with traditional fixed gain functions. Furthermore, experiments with real-world hearing aid gain profiles indicate that the method introduces less distortion in the output sound than classical feedback cancelers, enabling more comfortable hearing aid styles for patients with moderate to profound hearing loss. Extensive MATLAB simulations and subjective evaluations indicate that the method exhibits much smaller forward-path delays with superior howling suppression capability.
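The Toeplitz reformulation of the least-squares step can be sketched with NumPy and SciPy. For a least-squares FIR estimate of a feedback path, the normal-equation matrix is the autocorrelation matrix of the loudspeaker signal, which is symmetric Toeplitz, so a Levinson-type direct solver such as `scipy.linalg.solve_toeplitz` applies (roughly O(n²) rather than the O(n³) of a general solve). The setup below is hypothetical and noise-free; it illustrates only this linear-algebra step, not the dissertation's full adaptation scheme.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

rng = np.random.default_rng(1)
n, taps = 4000, 16

# Hypothetical setup: loudspeaker signal u excites an unknown FIR
# feedback path h_true; the microphone picks up d = h_true * u.
u = rng.standard_normal(n)
h_true = 0.1 * rng.standard_normal(taps)
d = np.convolve(u, h_true)[:n]

# Least-squares normal equations R h = p, where R is the symmetric
# Toeplitz autocorrelation matrix of u (first column r) and p is the
# cross-correlation between u and the microphone signal d.
r = np.array([np.dot(u[: n - k], u[k:]) for k in range(taps)])
p = np.array([np.dot(u[: n - k], d[k:]) for k in range(taps)])

# Direct Toeplitz solve via Levinson recursion instead of a general
# dense solver; (r, r) supplies the first column and first row.
h_est = solve_toeplitz((r, r), p)
# h_est closely recovers h_true in this noise-free toy example.
```

In a real feedback canceler this estimate would be recomputed at howling onset from recent signal history, which is where the reduced per-solve cost matters.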

    Adaptiiviset läpikuuluvuuskuulokkeet (Adaptive hear-through headphones)

    Hear-through equalization can be used to make a headset acoustically transparent, i.e. to produce a sound perception similar to perception without the headset. The headset must have microphones outside the earpieces to capture the ambient sound, which is then reproduced through the headset transducers after equalization. The reproduced signal is called the hear-through signal. Equalization is needed because the headset affects the acoustics of the outer ear.

    In addition to the external microphones, the headset used in this study has internal microphones. Together, these microphones can be used to estimate the attenuation of the headset online and to detect poor fit. Because a poorly fitted earpiece leaks sound and attenuates less, the combined effect of the leaked sound and the hear-through signal changes compared to the properly fitted case. The isolation estimate is therefore used to control the hear-through equalization so as to maintain acoustical transparency. Furthermore, the proposed adaptive hear-through algorithm includes manual controls for the equalizers and for the volume of the hear-through signal.

    The proposed algorithm is found to render the headset acoustically transparent. The equalization controls improve the performance of the headset when the fit is poor or when the volume of the hear-through signal is adjusted, by reducing the comb-filtering effect caused by the summation of the leaked sound and the hear-through signal inside the ear canal. The behavior of the proposed algorithm is demonstrated with an implemented Matlab simulator.
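The isolation estimate from paired external and internal microphones can be sketched as a per-frequency magnitude ratio in dB. This is a rough illustration assuming time-aligned microphone signals and no active playback; the signal model and the -10 dB fit threshold are invented for the sketch, whereas the thesis estimates isolation online while the hear-through signal is playing.

```python
import numpy as np

def isolation_estimate_db(external, internal, frame_len=1024):
    """Per-frequency isolation of an earpiece: magnitude ratio of the
    internal-mic spectrum (sound that leaked into the ear canal) to
    the external-mic spectrum, in dB. Negative values mean
    attenuation; values near 0 dB suggest a leaky, poorly fitted
    earpiece."""
    win = np.hanning(frame_len)
    E = np.abs(np.fft.rfft(win * external[:frame_len])) + 1e-12
    I = np.abs(np.fft.rfft(win * internal[:frame_len])) + 1e-12
    return 20.0 * np.log10(I / E)

# Toy example: the earpiece attenuates ambient sound by 20 dB broadband.
rng = np.random.default_rng(0)
ambient = rng.standard_normal(1024)
inside = 0.1 * ambient                 # -20 dB of passive isolation
iso = isolation_estimate_db(ambient, inside)
poor_fit = np.median(iso) > -10.0      # hypothetical fit threshold
```

A per-band version of this estimate could then drive the hear-through equalizer gains, reducing the hear-through level in bands where leakage already delivers the ambient sound.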