2,154 research outputs found

    Improved signal analysis and time-synchronous reconstruction in waveform interpolation coding

    This paper presents a waveform-matched waveform interpolation (WMWI) technique which enables improved speech analysis over existing WI coders. In WMWI, an accurate representation of speech evolution is produced by extracting critically-sampled pitch periods of a time-warped, constant pitch residual. The technique also offers waveform-matching capabilities by using an inverse warping process to near-perfectly reconstruct the residual. Here, a pitch track optimisation technique is described which ensures the speech residual can be effectively decomposed and quantised. Also, the pitch parameters required to efficiently quantise and recreate the pitch track, on a period-by-period basis, are identified. This allows time-synchrony between the original and decoded signals to be preserved

    A Novel Approach to Stuttered Speech Correction

    Stuttered speech is dysfluency-rich speech and is more prevalent in males than in females. It has been associated with insufficient air pressure or poor articulation, although the root causes are more complex. Its primary features include prolonged and repetitive speech, while its secondary features include anxiety, fear, and shame. This study used LPC analysis and synthesis algorithms to reconstruct the stuttered speech. The results were evaluated using cepstral distance, Itakura-Saito distance, mean square error, and likelihood ratio. These measures implied perfect speech reconstruction quality. ASR was used for further testing, and the results showed that all the reconstructed speech samples were perfectly recognized, while only three samples of the original speech were perfectly recognized.
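
Two of the evaluation measures named in this abstract can be sketched in a few lines. This is a minimal illustration, not the study's code: the abstract does not give its exact definitions, so the function names, the use of plain lists, and the averaged form of the Itakura-Saito distance over power-spectrum bins are all assumptions.

```python
import math


def mean_square_error(x, y):
    """Mean of squared sample differences between two equal-length signals."""
    if len(x) != len(y) or not x:
        raise ValueError("signals must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def itakura_saito(p, p_hat):
    """Itakura-Saito distance between two power spectra (assumed positive).

    Averages P/P_hat - ln(P/P_hat) - 1 over all frequency bins.
    The result is 0 for identical spectra and is not symmetric in its arguments.
    """
    if len(p) != len(p_hat) or not p:
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(p, p_hat):
        ratio = a / b
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)
```

Both measures return 0 when the reconstructed signal or spectrum equals the original, which is the sense in which low values indicate good reconstruction.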

    Quantisation mechanisms in multi-prototype waveform coding

    Prototype Waveform Coding is one of the most promising methods for speech coding at low bit rates over telecommunications networks. This thesis investigates quantisation mechanisms in Multi-Prototype Waveform (MPW) coding, and two prototype waveform quantisation algorithms for speech coding at a bit rate of 2.4 kb/s are proposed. Speech coders based on these algorithms have been found to be capable of producing coded speech with perceptual quality equivalent to that generated by the US Federal Standard 1016 CELP algorithm at 4.8 kb/s. The two proposed prototype waveform quantisation algorithms are based on Prototype Waveform Interpolation (PWI). The first algorithm uses an open-loop architecture (Open Loop Quantisation). In this algorithm, the speech residual is represented as a series of prototype waveforms (PWs). The PWs are extracted in both voiced and unvoiced speech, time-aligned and quantised and, at the receiver, the excitation is reconstructed by smooth interpolation between them. For low bit rate coding, the PW is decomposed into a slowly evolving waveform (SEW) and a rapidly evolving waveform (REW). The SEW is coded using vector quantisation on both magnitude and phase spectra. The SEW codebook search is based on the best match between the SEW and the SEW codebook vector. The REW phase spectrum is not quantised but is recovered using Gaussian noise. The REW magnitude spectrum, on the other hand, can either be quantised at a certain update rate or derived solely from the behaviour of the SEW
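
The SEW/REW split described in this abstract can be illustrated with a short sketch. It assumes the prototype waveforms are already time-aligned and of equal length, and it uses a simple moving average along the evolution axis as a stand-in for whatever low-pass filter the thesis actually uses; the function name and the `window` parameter are hypothetical.

```python
def decompose_prototypes(prototypes, window=3):
    """Split aligned prototype waveforms into SEW and REW parts.

    prototypes: list of equal-length lists, one per extraction instant.
    window: hypothetical odd moving-average length along the evolution axis.
    Returns (sew, rew) such that sew[t][k] + rew[t][k] == prototypes[t][k].
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    n = len(prototypes)
    half = window // 2
    sew, rew = [], []
    for t in range(n):
        # Neighbouring prototypes around instant t (window shrinks at the edges).
        lo, hi = max(0, t - half), min(n, t + half + 1)
        neighbours = prototypes[lo:hi]
        # Slowly evolving part: average of each sample position across neighbours.
        smooth = [sum(col) / len(neighbours) for col in zip(*neighbours)]
        sew.append(smooth)
        # Rapidly evolving part: whatever the smoothing removed.
        rew.append([p - s for p, s in zip(prototypes[t], smooth)])
    return sew, rew
```

With this construction a sequence of identical prototypes yields an all-zero REW, while prototypes that change sign from one instant to the next put most of their energy into the REW.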

    Improvement of Speech Perception for Hearing-Impaired Listeners

    Hearing impairment is becoming a prevalent health problem, affecting 5% of the world's adult population. Hearing aids and cochlear implants have played an essential role in helping patients for decades, but several open problems still prevent them from providing the maximum benefit. For financial and comfort reasons, only one in four patients chooses to use hearing aids, and cochlear implant users always have trouble understanding speech in noisy environments. In this dissertation, we addressed the limitations of hearing aids by proposing a new hearing aid signal processing system named the Open-source Self-fitting Hearing Aids System (OS SF hearing aids). The proposed system adopts state-of-the-art digital signal processing technologies, combined with accurate hearing assessment and a machine-learning-based self-fitting algorithm, to further improve speech perception and comfort for hearing aid users. Informal testing with hearing-impaired listeners showed that results from the proposed system differed by less than 10 dB on average from those obtained with a clinical audiometer. In addition, sixteen-channel filter banks with an adaptive differential microphone array provide up to 6 dB of SNR improvement in noisy environments, and the machine-learning-based self-fitting algorithm provides more suitable hearing aid settings. To maximize cochlear implant users’ speech understanding in noise, sequential (S) and parallel (P) coding strategies were proposed by integrating high-rate desynchronized pulse trains (DPT) into the continuous interleaved sampling (CIS) strategy. Ten participants with severe hearing loss took part in two rounds of cochlear implant testing. The results showed that the CIS-DPT-S strategy significantly improved (11%) speech perception in background noise, while the CIS-DPT-P strategy yielded significant improvements in both quiet (7%) and noisy (9%) environments

    Mach Bands: How Many Models are Possible? Recent Experimental Findings and Modeling Attempts

    Mach bands are illusory bright and dark bands seen where a luminance plateau meets a ramp, as in half-shadows or penumbras. A tremendous amount of work has been devoted to studying the psychophysics and the potential underlying neural circuitry concerning this phenomenon. A number of theoretical models have also been proposed, originating in the seminal studies of Mach himself. The present article reviews the main experimental findings after 1965 and the main recent theories of early vision that have attempted to account for the effect. It is shown that the different theories share working principles and can be grouped in three classes: a) feature-based; b) rule-based; and c) filling-in. In order to evaluate individual proposals it is necessary to consider them in the larger picture of visual science and to determine how they contribute to the understanding of vision in general. Air Force Office of Scientific Research (F49620-92-J-0334); Office of Naval Research (N00014-J-4100); COPPE/UFRJ, Brazil

    A Signal processing approach for preprocessing and 3d analysis of airborne small-footprint full waveform lidar data

    The extraction of structural object metrics from a next generation remote sensing modality, namely waveform light detection and ranging (LiDAR), has garnered increasing interest from the remote sensing research community. However, a number of challenges need to be addressed before structural or 3D vegetation modeling can be accomplished. These include proper processing of complex, often off-nadir waveform signals, extraction of relevant waveform parameters that relate to vegetation structure, and from a quantitative modeling perspective, 3D rendering of a vegetation object from LiDAR waveforms. Three corresponding, broad research objectives therefore were addressed in this dissertation. Firstly, the raw incoming LiDAR waveform typically exhibits a stretched, misaligned, and relatively distorted character. A robust signal preprocessing chain for LiDAR waveform calibration, which includes noise reduction, deconvolution, waveform registration, and angular rectification is presented. This preprocessing chain was validated using both simulated waveform data of high fidelity 3D vegetation models, which were derived via the Digital Imaging and Remote Sensing Image Generation (DIRSIG) modeling environment, and real small-footprint waveform LiDAR data, collected by the Carnegie Airborne Observatory (CAO) in a savanna region of South Africa. Results showed that the preprocessing approach significantly increased our ability to recover the temporal signal resolution, and resulted in improved waveform-based vegetation biomass estimation. Secondly, a model for savanna vegetation biomass was derived using the resultant processed waveform data and by decoding the waveform in terms of feature metrics for woody and herbaceous biomass estimation. The results confirmed that small-footprint waveform LiDAR data have significant potential in the case of this application. Finally, a 3D image clustering-based waveform LiDAR inversion model was developed for first-order (principal branch level) 3D tree reconstruction in both leaf-off and leaf-on conditions. These outputs not only contribute to the visualization of complex tree structures, but also benefit efforts related to the quantification of vegetation structure for natural resource applications from waveform LiDAR data

    Integrated source and channel encoded digital communication system design study

    The particular Ku-band carrier, PN despreading, and symbol synchronization strategies, which were selected for implementation in the Ku-band transponder aboard the orbiter, were assessed and evaluated from a systems performance viewpoint, verifying that system specifications were met. A study was performed of the design and implementation of tracking techniques which are suitable for incorporation into the Orbiter Ku-band communication system. Emphasis was placed on maximizing tracking accuracy and communication system flexibility while minimizing cost, weight, and system complexity of Orbiter and ground systems hardware. The payload communication study assessed the design and performance of the forward link and return link bent-pipe relay modes for attached and detached payloads. As part of this study, a design for a forward link bent-pipe was proposed which employs a residual carrier but which is tracked by the existing Costas loop

    Stereo linear predictive coding of audio
