280 research outputs found

Techniques for the Regeneration of Wideband Speech from Narrowband Speech

Author: Fuemmeler Jason A.
Gardner William R.
Hardie Russell C.
Publication venue: eCommons
Publication date: 01/01/2001

This paper addresses the problem of reconstructing wideband speech signals from observed narrowband speech signals. The goal of this work is to improve the perceived quality of speech signals that have been transmitted through narrowband channels or degraded during acquisition. We describe a system, based on linear predictive coding, for estimating wideband speech from narrowband speech. This system employs both previously identified and novel techniques. Experimental results are provided to illustrate the system’s ability to improve speech quality. Both objective and subjective criteria are used to evaluate the quality of the processed speech signals.
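The abstract above names linear predictive coding as the basis of the system but gives no detail. As a rough illustration only, and not the authors' method, the following sketch estimates LPC coefficients from one frame with the autocorrelation method and the Levinson-Durbin recursion. The function names, the model order and the synthetic test frame are all assumptions made for this example.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, error) where a[0] == 1 and the prediction error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        # Correlation of the current residual with the sample i steps back.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= 1.0 - k * k  # remaining prediction error energy
    return a, error


# Hypothetical test frame: the impulse response of a one-pole filter with
# pole 0.9, so a[1] should come out close to -0.9 and a[2] close to 0.
frame = [0.9 ** n for n in range(200)]
coeffs, residual_energy = levinson_durbin(autocorrelation(frame, 2), 2)
print(coeffs, residual_energy)
```

In an LPC-based bandwidth extension scheme, coefficients like these describe the narrowband spectral envelope; how the wideband envelope and excitation are then estimated is specific to the paper and is not shown here.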

Directory of Open Access Journals

University of Dayton

Coding Strategies for Cochlear Implants Under Adverse Environments

Author: Tahmina Qudsia
Publication venue: UWM Digital Commons
Publication date: 01/05/2016

Cochlear implants are electronic prosthetic devices that restore partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quiet listening conditions, there remain limitations on speech perception under adverse environments such as background noise, reverberation and band-limited channels. We propose strategies that improve the intelligibility of speech transmitted over telephone networks, reverberated speech and speech in the presence of background noise. For telephone-processed speech, we examine the effects of adding low-frequency and high-frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high-frequency information; this study therefore provides support for the design of algorithms that extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four types of listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing-impaired listeners. Reverberated sounds consist of direct sound, early reflections and late reflections. Late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction (SS) to suppress the reverberant energy from late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3 s and 1.0 s) indicated significant improvement when stimuli were processed with the SS strategy.
The proposed strategy operates with little to no prior information on the signal and the room characteristics and can therefore potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulation in cochlear implants. The proposed strategy is based on harmonic modeling and uses a synthesis-driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work on the development of algorithms to regenerate harmonics of voiced segments in the presence of noise.
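The abstract above mentions spectral subtraction of late-reflection energy without describing the estimator. The sketch below shows one common generic form, assuming the late reverberant power in a frame can be approximated by a scaled copy of the power spectrum a few frames earlier. It is not the thesis's algorithm; the function name and the parameters `delay_frames`, `scale` and `gain_floor` are hypothetical choices for illustration.

```python
def suppress_late_reverb(power_frames, delay_frames=3, scale=0.5, gain_floor=0.1):
    """Spectral subtraction on a list of per-frame power spectra.

    power_frames: list of frames, each a list of non-negative bin powers.
    The late-reverberation estimate for frame t is scale * power_frames[t - delay_frames]
    (an assumption made for this sketch).
    """
    enhanced = []
    for t, frame in enumerate(power_frames):
        if t < delay_frames:
            # No earlier frame to estimate from: pass the frame through unchanged.
            enhanced.append(list(frame))
            continue
        past = power_frames[t - delay_frames]
        out = []
        for p, p_past in zip(frame, past):
            late_estimate = scale * p_past
            if p > 0.0:
                # Subtractive gain, floored to limit musical-noise artifacts.
                gain = max(1.0 - late_estimate / p, gain_floor)
            else:
                gain = gain_floor
            out.append(gain * p)
        enhanced.append(out)
    return enhanced


# Tiny hand-made example with two frames of two bins each.
frames = [[4.0, 2.0], [1.0, 8.0]]
print(suppress_late_reverb(frames, delay_frames=1))
```

A practical system would apply the resulting gains to the complex short-time spectrum and resynthesize by overlap-add; those steps are omitted here.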

University of Wisconsin-Milwaukee

An implementation of a reference symbol approach to generic modulation in fading channels

Author: Lodge J. H.
Pacola L. C.
Young R. J.

As mobile satellite communications systems evolve over the next decade, they will have to adapt to a changing tradeoff between bandwidth and power. This paper presents a flexible approach to digital modulation and coding that will accommodate both wideband and narrowband schemes. This architecture could be the basis for a family of modems, each satisfying a specific power and bandwidth constraint, yet all having a large number of common signal processing blocks. The implementation of this generic approach, with general-purpose digital processors for transmission of 4.8 kbit/s digitally encoded speech, is described.

NASA Technical Reports Server

Glottal-synchronous speech processing

Author: Thomas Mark R P
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/01/2010

Glottal-synchronous speech processing is a field of speech science where the pseudoperiodicity of voiced speech is exploited. Traditionally, speech processing involves segmenting and processing short speech frames of predefined length; this may fail to exploit the inherent periodic structure of voiced speech which glottal-synchronous speech frames have the potential to harness. Glottal-synchronous frames are often derived from the glottal closure instants (GCIs) and glottal opening instants (GOIs). The SIGMA algorithm was developed for the detection of GCIs and GOIs from the electroglottograph signal with a measured accuracy of up to 99.59%. For GCI and GOI detection from speech signals, the YAGA algorithm provides a measured accuracy of up to 99.84%. Multichannel speech-based approaches are shown to be more robust to reverberation than single-channel algorithms. The GCIs are applied to real-world applications including speech dereverberation, where SNR is improved by up to 5 dB, and to prosodic manipulation, where the importance of voicing detection in glottal-synchronous algorithms is demonstrated by subjective testing. The GCIs are further exploited in a new area of data-driven speech modelling, providing new insights into speech production and a set of tools to aid deployment into real-world applications. The technique is shown to be applicable in areas of speech coding, identification and artificial bandwidth extension of telephone speech.

Spiral - Imperial College Digital Repository

OpenGrey Repository

Underwater Object Tracking Using Time Frequency Signatures of Acoustic Signals

Author: Bucur Diana
Candel Ion
Digulescu Angela
Ioana Cornel
Petrut Teodor
Publication venue: HAL CCSD
Publication date: 07/04/2014

Detecting underwater objects is an important task in marine applications. Most techniques are amplitude based, whereby the amplitude of the received echo is used to detect objects within specified bounds. Amplitude techniques, however, are prone to interference and attenuation, which limits the capabilities of such systems. Hence, the aim of this paper is to propose a new technique that detects and tracks underwater moving objects using the turbulence generated by the object. Wideband signals have proven to be a very efficient alternative for merging turbulent flow characteristics and waveform design in order to describe and explain the behavior of turbulence, both artificial and natural. Therefore, constructing waveforms adapted to the natural turbulence embedded in the flow, as well as to the artificial turbulence created by an unknown underwater moving object, may hold the key to a new technique for underwater object tracking. When acoustic signals with a particular Instantaneous Frequency Law travel through the underwater environment and hit a moving object, their Instantaneous Frequency Law captures the object's artificial turbulence, as well as the natural turbulence embedded in the flow. Experimental results obtained in our reduced-scale facility validate the technique.

Crossref

Hal - Université Grenoble Alpes

Interference Unmixing and Estimation Technique for Improvement of Speech Separation Performance

Author: Maurya Ankur
Publication venue: Dublin Institute of Technology
Publication date: 10/12/2017

The presence of noise in speech can be annoying, as it can lead to the loss of important data or create misunderstandings between communicating parties, which in turn can cause major problems in terms of lost time and money. This thesis focuses on filtering out noise from a speech signal, simulated in Matlab/Octave, while comparing the temporal resolution of the signal with its spectral resolution. Parameters such as the window length are varied in order to obtain the best speech separation performance. The aim is to find the spectral and temporal resolution, as a function of window length, that best reveals whether speech is present in the mixture or how strongly the mixture is dominated by the noisy signal. The reconstructed signal is the original speech sound that was applied at the input. The thesis also studies the relationship between window-disjoint orthogonality and window length in order to obtain the best separation performance.
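The tradeoff between temporal and spectral resolution that this abstract refers to follows directly from the window length: a window of N samples at sample rate fs spans N/fs seconds and yields DFT bins spaced fs/N Hz apart. The short sketch below tabulates this for a few window lengths; the 16 kHz sample rate and the function name are assumptions made for illustration and do not come from the thesis.

```python
def stft_resolution(sample_rate_hz, window_length):
    """Return (window duration in seconds, DFT bin spacing in Hz)."""
    time_resolution_s = window_length / sample_rate_hz
    frequency_resolution_hz = sample_rate_hz / window_length
    return time_resolution_s, frequency_resolution_hz


SAMPLE_RATE_HZ = 16000  # assumed sample rate, not stated in the abstract

for window_length in (256, 512, 1024, 2048):
    duration_s, bin_spacing_hz = stft_resolution(SAMPLE_RATE_HZ, window_length)
    print(f"N={window_length:5d}  window={duration_s * 1000:6.1f} ms  "
          f"bin spacing={bin_spacing_hz:6.2f} Hz")
```

Doubling the window length halves the bin spacing but doubles the window duration, which is why separation performance has to be evaluated across window lengths rather than optimized for one resolution alone.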

Arrow@TUDublin

Early FM Radio

Author: Frost Gary L.
Publication venue: Project Muse
Publication date: 15/07/2022

The commonly accepted history of FM radio is one of the twentieth century’s iconic sagas of invention, heroism, and tragedy. Edwin Howard Armstrong created a system of wideband frequency-modulation radio in 1933. The Radio Corporation of America (RCA), convinced that Armstrong’s system threatened its AM empire, failed to develop the new technology and refused to pay Armstrong royalties. Armstrong sued the company at great personal cost. He died despondent, exhausted, and broke. But this account, according to Gary L. Frost, ignores the contributions of scores of other individuals who were involved in the decades-long struggle to realize the potential of FM radio. The first scholar to fully examine recently uncovered evidence from the Armstrong v. RCA lawsuit, Frost offers a thorough revision of the FM story. Frost’s balanced, contextualized approach provides a much-needed corrective to previous accounts. Navigating deftly through the details of a complicated story, he examines the motivations and interactions of the three communities most intimately involved in the development of the technology—Progressive-era amateur radio operators, RCA and Westinghouse engineers, and early FM broadcasters. In the process, Frost demonstrates the tension between competition and collaboration that goes hand in hand with the emergence and refinement of new technologies. Frost's study reconsiders both the social construction of FM radio and the process of technological evolution. Historians of technology, communication, and media will welcome this important reexamination of the canonic story of early FM radio

Directory of Open Access Books (DOAB)

Telephone speech enhancement for the hearing impaired

Author: Murshed C. Z.
Tasnim Bony
Publication venue: BRAC University

This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2009. Cataloged from PDF version of thesis report. Includes bibliographical references (page 48). Bony Tasnim, C. Z. Murshed. B. Computer Science and Engineering

BRAC University Institutional Repository

Configurable EBEN: Extreme Bandwidth Extension Network to enhance body-conducted speech capture

Author: Bavu Éric
Hauret Julien
Joubaud Thomas
Zimpfer Véronique
Publication date: 12/09/2023

This paper presents a configurable version of Extreme Bandwidth Extension Network (EBEN), a Generative Adversarial Network (GAN) designed to improve audio captured with body-conduction microphones. We show that although these microphones significantly reduce environmental noise, this insensitivity to ambient noise happens at the expense of the bandwidth of the speech signal acquired by the wearer of the devices. The obtained captured signals therefore require the use of signal enhancement techniques to recover the full-bandwidth speech. EBEN leverages a configurable multiband decomposition of the raw captured signal. This decomposition allows the data time domain dimensions to be reduced and the full band signal to be better controlled. The multiband representation of the captured signal is processed through a U-Net-like model, which combines feature and adversarial losses to generate an enhanced speech signal. We also benefit from this original representation in the proposed configurable discriminators architecture. The configurable EBEN approach can achieve state-of-the-art enhancement results on synthetic data with a lightweight generator that allows real-time processing. Comment: Accepted in IEEE/ACM Transactions on Audio, Speech and Language Processing on 14/08/202

arXiv.org e-Print Archive