47 research outputs found

    Over-Sampling for Accurate Masking Threshold Calculation in Wavelet Packet Audio Coders

    Many existing audio coders use a critically sampled discrete wavelet transform (DWT) to decompose audio signals. Although the aliasing present in the wavelet coefficients is cancelled in the decoder, these coders normally calculate the simultaneous masking threshold directly on the aliased coefficients. This paper uses over-sampling in the wavelet packet decomposition to provide alias-free coefficients for accurate simultaneous masking threshold calculation. The proposed technique is compared with masking threshold calculation based on the FFT and on critically sampled wavelet coefficients, and the results show that a bit rate saving of up to 16 kbit/s can be achieved using over-sampling.

    Analysis of orientation error of triaxial accelerometers on the assessment of energy expenditure

    This paper investigates the effects of orientation error in the positioning of triaxial accelerometers on the assessment of energy expenditure. Four subjects walked on a treadmill at velocities ranging from 4 km·h⁻¹ to 5 km·h⁻¹. During each test, a triaxial accelerometer was attached to the lower back at arbitrary orientations to record body accelerations. Energy expenditure was estimated by the sum of the integrals of the absolute value of the accelerometer output over all three measurement directions. Based on theoretical analysis and experimental observations, it is concluded that small orientation errors (< 3°) have no distinguishable effect on the estimation of energy expenditure. We propose an efficient method to compensate for larger orientation errors. The experimental results verified the effectiveness of the proposed compensation method. ©2005 IEEE
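
    The estimator described in this abstract (the sum over the three axes of the integral of the absolute accelerometer output) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names `integral_abs` and `activity_index`, the uniform sampling interval `dt`, and the use of the trapezoidal rule are all assumptions, and the abstract does not say how the resulting quantity is calibrated to energy expenditure.

```python
from typing import Sequence


def integral_abs(samples: Sequence[float], dt: float) -> float:
    """Trapezoidal approximation of the integral of |a(t)|.

    Assumes uniformly spaced samples with interval dt (hypothetical setup).
    """
    if len(samples) < 2:
        return 0.0
    mags = [abs(s) for s in samples]
    # Trapezoidal rule: the two end points get half weight.
    return dt * (sum(mags) - 0.5 * (mags[0] + mags[-1]))


def activity_index(ax: Sequence[float], ay: Sequence[float],
                   az: Sequence[float], dt: float) -> float:
    """Sum of the per-axis integrals of absolute acceleration."""
    return integral_abs(ax, dt) + integral_abs(ay, dt) + integral_abs(az, dt)


if __name__ == "__main__":
    # Tiny made-up signal, for illustration only (not data from the paper).
    print(activity_index([1.0, -1.0, 1.0], [0.0, 0.0, 0.0], [2.0, -2.0, 2.0], 0.5))
```

    Because each axis is rectified separately before summation, the result depends on how the sensor axes are oriented relative to the body, which is why orientation error is a concern for this kind of estimator.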

    Estimation of walking energy expenditure by using support vector regression

    This paper develops a new predictor of walking energy expenditure from wireless measurements of body movements using triaxial accelerometers. Reliable data were collected from repeated walking experiments in different conditions on a treadmill with simultaneous measurement of expired oxygen and carbon dioxide. Support vector regression, a powerful non-linear regression method, was used to process and model the data. This novel processing method sets this investigation apart from existing papers. Good results were achieved in the robust estimation of walking related energy expenditure from a number of variables derived from triaxial accelerometer and treadmill speed. ©2005 IEEE

    Auditory modelling for speech processing in the perceptual domain

    The human hearing system is the most robust speech processor, even in noisy environments. This work presents a new computational model of the auditory system that exploits its psychoacoustical masking properties. The model is then applied to speech coding in the perceptual domain. The coding algorithm is capable of producing high-quality coded speech and audio that account for temporal as well as spectral details. The proposed filterbank is also applied to speech denoising in the perceptual domain. The enhanced speech is of good perceptual quality.

    FORWARD MASKING THRESHOLD ESTIMATION USING NEURAL NETWORKS AND ITS APPLICATION TO PARALLEL SPEECH ENHANCEMENT

    Forward masking models have been used successfully in speech enhancement and audio coding. Presently, forward masking thresholds are estimated using simplified masking models that have been used for audio coding and speech enhancement applications. In this paper, an accurate approximation of forward masking threshold estimation using neural networks is proposed. A performance comparison with other existing masking models in a speech enhancement application is presented. Objective measures using PESQ demonstrate that the proposed forward masking model provides significant improvements (5-15%) over four existing models when tested with speech signals corrupted by various noises at very low signal-to-noise ratios. Moreover, a parallel implementation of the speech enhancement algorithm was developed using the MATLAB Parallel Computing Toolbox.

    The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016

    The 2016 speaker recognition evaluation (SRE'16) is the latest edition in the series of benchmarking events conducted by the National Institute of Standards and Technology (NIST). I4U is a joint entry to SRE'16 resulting from the collaboration and active exchange of information among researchers from sixteen Institutes and Universities across 4 continents. The joint submission and several of its 32 sub-systems were among the top-performing systems. Much effort was devoted to two major challenges, namely unlabeled training data and dataset shift from Switchboard-Mixer to the new Call My Net dataset. This paper summarizes the lessons learned and presents the shared view of the sixteen research groups on recent advances, major paradigm shifts, and the common tool chain used in speaker recognition, as witnessed in SRE'16. More importantly, we look into the intriguing question of fusing a large ensemble of sub-systems and the potential benefit of large-scale collaboration.

    2014 OptoElectronics and Communication Conference, OECC 2014 and Australian Conference on Optical Fibre Technology, ACOFT 2014

    Research on single-mode polymer fiber Bragg gratings (FBGs) and their applications has progressed considerably in recent years. In this paper we report recent research developments in polymer FBG sensor applications. © 2014 Engineers Australia