10,141 research outputs found

    A variable rate speech compressor for mobile applications

    Get PDF
    One of the most promising speech coder at the bit rate of 9.6 to 4.8 kbits/s is CELP. Code Excited Linear Prediction (CELP) has been dominating 9.6 to 4.8 kbits/s region during the past 3 to 4 years. Its set back however, is its expensive implementation. As an alternative to CELP, the Base-Band CELP (CELP-BB) was developed which produced good quality speech comparable to CELP and a single chip implementable complexity as reported previously. Its robustness was also improved to tolerate errors up to 1.0 pct. and maintain intelligibility up to 5.0 pct. and more. Although, CELP-BB produces good quality speech at around 4.8 kbits/s, it has a fundamental problem when updating the pitch filter memory. A sub-optimal solution is proposed for this problem. Below 4.8 kbits/s, however, CELP-BB suffers from noticeable quantization noise as a result of the large vector dimensions used. Efficient representation of speech below 4.8 kbits/s is reported by introducing Sinusoidal Transform Coding (STC) to represent the LPC excitation which is called Sine Wave Excited LPC (SWELP). In this case, natural sounding good quality synthetic speech is obtained at around 2.4 kbits/s

    Techniques for the Regeneration of Wideband Speech from Narrowband Speech

    Get PDF
    This paper addresses the problem of reconstructing wideband speech signals from observed narrowband speech signals. The goal of this work is to improve the perceived quality of speech signals which have been transmitted through narrowband channels or degraded during acquisition. We describe a system, based on linear predictive coding, for estimating wideband speech from narrowband. This system employs both previously identified and novel techniques. Experimental results are provided in order to illustrate the system’s ability to improve speech quality. Both objective and subjective criteria are used to evaluate the quality of the processed speech signals

    Reconstruction of Phonated Speech from Whispers Using Formant-Derived Plausible Pitch Modulation

    Get PDF
    Whispering is a natural, unphonated, secondary aspect of speech communications for most people. However, it is the primary mechanism of communications for some speakers who have impaired voice production mechanisms, such as partial laryngectomees, as well as for those prescribed voice rest, which often follows surgery or damage to the larynx. Unlike most people, who choose when to whisper and when not to, these speakers may have little choice but to rely on whispers for much of their daily vocal interaction. Even though most speakers will whisper at times, and some speakers can only whisper, the majority of today’s computational speech technology systems assume or require phonated speech. This article considers conversion of whispers into natural-sounding phonated speech as a noninvasive prosthetic aid for people with voice impairments who can only whisper. As a by-product, the technique is also useful for unimpaired speakers who choose to whisper. Speech reconstruction systems can be classified into those requiring training and those that do not. Among the latter, a recent parametric reconstruction framework is explored and then enhanced through a refined estimation of plausible pitch from weighted formant differences. The improved reconstruction framework, with proposed formant-derived artificial pitch modulation, is validated through subjective and objective comparison tests alongside state-of-the-art alternatives

    Application of advanced on-board processing concepts to future satellite communications systems: Bibliography

    Get PDF
    Abstracts are presented of a literature survey of reports concerning the application of signal processing concepts. Approximately 300 references are included

    Rethinking network governance: new forms of analysis and the implications for IGR/MLG

    Get PDF
    Our position is that network governance can be understood as a communicative arena. Networks, then, are not defined by frequency of interactions between actors but by sharing of and contest between different clusters of ideas, theories and normative orientations (discourses) in relation to the specific context within which actors operate. A discourse comprises an ensemble of ideas, concepts and causal theories that give meaning to and reproduce ways of understanding the world (Chouliaraki and Fairclough 1999). Consequently, network governance can be understood as the inherently political process through which discourses are produced, reproduced and transformed. Democratic network governance thus becomes the study of the way in which the core challenges of democratic practice are addressed – how is legitimacy awarded, by what mechanisms are decisions reached, and how is accountability enabled. Three approaches to the discursive analysis of democracy in network governance are considered - argumentation analysis, inter-subjectivity, and critical discourse analysis – and their implications for the study of intergovernmental relations and multi-level governance (IGR/MLG) are discussed. Case examples are provided. We conclude that the value for the study of MLG/IGR is to complement existing forms of analysis by opening up the communicative and ideational aspects of interactions between levels of government and other actors

    Coding Strategies for Cochlear Implants Under Adverse Environments

    Get PDF
    Cochlear implants are electronic prosthetic devices that restores partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quite listening conditions, there remains limitations on speech perception under adverse environments such as in background noise, reverberation and band-limited channels, and we propose strategies that improve the intelligibility of speech transmitted over the telephone networks, reverberated speech and speech in the presence of background noise. For telephone processed speech, we propose to examine the effects of adding low-frequency and high- frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high frequency information and therefore this study provides support for design of algorithms to extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four types of listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing impaired listeners. Reverberated sounds consists of direct sound, early reflections and late reflections. Late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction to suppress the reverberant energies from late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3s and 1.0s) indicated significant improvement when stimuli was processed with SS strategy. The proposed strategy operates with little to no prior information on the signal and the room characteristics and therefore, can potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulations in cochlear implants. The proposed strategy is based on harmonic modeling and uses synthesis driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work into development of algorithms to regenerate harmonics of voiced segments in the presence of noise
    • …
    corecore