47 research outputs found

Over-Sampling for Accurate Masking Threshold Calculation in Wavelet Packet Audio Coders

Author: Ambikairajah E.
Bradley A. P.
Sinaga F.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2004

Many existing audio coders use a critically sampled discrete wavelet transform (DWT) for the decomposition of audio signals. While the aliasing present in the wavelet coefficients is cancelled in the decoder, these coders normally calculate the simultaneous masking threshold directly on these aliased coefficients. This paper uses over-sampling in the wavelet packet decomposition to provide alias-free coefficients for accurate simultaneous masking threshold calculation. The proposed technique is compared with masking threshold calculation based upon the FFT and critically sampled wavelet coefficients, and the results show that a bit rate saving of up to 16 kbit/s can be achieved using over-sampling.

Queensland University of Technology ePrints Archive

University of Queensland eSpace

Analysis of orientation error of triaxial accelerometers on the assessment of energy expenditure

Author: Ambikairajah E
Celler BG
Su SW
Wang L
Publication date: 01/12/2005

This paper investigates the effects of orientation error in the positioning of triaxial accelerometers on the assessment of energy expenditure. Four subjects walked on a treadmill at varying velocities ranging from 4 km·h⁻¹ to 5 km·h⁻¹. During each test, a triaxial accelerometer was attached to the lower back at arbitrary orientations to record body accelerations. Energy expenditure was estimated by the sum of the integrals of the absolute value of accelerometer output from all three measurement directions. Based on theoretical analysis and experimental observations, it is concluded that small orientation errors (< 3°) have no distinguishable effects on the estimation of energy expenditure. We propose an efficient method to compensate for larger orientation errors. The experimental results verified the effectiveness of this proposed compensation method. ©2005 IEEE

OPUS - University of Technology Sydney
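
The estimator described in the abstract above (the sum, over the three axes, of the integral of the absolute accelerometer output) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the uniform sampling interval `dt`, and the use of trapezoidal integration are all assumptions, and the abstract does not say how gravity removal, filtering, or calibration to physical energy units were handled, so the result here is only a relative activity measure.

```python
from typing import Sequence


def integral_abs(samples: Sequence[float], dt: float) -> float:
    """Trapezoidal integral of |a(t)| over uniformly spaced samples.

    `dt` is the sampling interval in seconds (an assumed parameter).
    """
    if len(samples) < 2:
        return 0.0
    mags = [abs(s) for s in samples]
    # Trapezoid rule: end points carry half weight.
    return dt * (sum(mags) - 0.5 * (mags[0] + mags[-1]))


def activity_measure(
    ax: Sequence[float],
    ay: Sequence[float],
    az: Sequence[float],
    dt: float,
) -> float:
    """Sum of the per-axis integrals of absolute acceleration.

    Hypothetical helper: returns an uncalibrated quantity, not joules
    or calories.
    """
    return sum(integral_abs(axis, dt) for axis in (ax, ay, az))


if __name__ == "__main__":
    # Toy signals, three samples per axis at dt = 0.5 s.
    x = [1.0, -1.0, 1.0]
    y = [0.0, 0.0, 0.0]
    z = [2.0, 2.0, 2.0]
    print(activity_measure(x, y, z, dt=0.5))
```

Because each axis is rectified before integration, the measure is not invariant to sensor rotation, which is consistent with the paper's interest in orientation error.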

Estimation of walking energy expenditure by using support vector regression

Author: Ambikairajah E
Celler BG
Savkin AV
Su SW
Wang L
Publication date: 01/12/2005

This paper develops a new predictor of walking energy expenditure from wireless measurements of body movements using triaxial accelerometers. Reliable data were collected from repeated walking experiments in different conditions on a treadmill with simultaneous measurement of expired oxygen and carbon dioxide. Support vector regression, a powerful non-linear regression method, was used to process and model the data. This novel processing method sets this investigation apart from existing papers. Good results were achieved in the robust estimation of walking related energy expenditure from a number of variables derived from triaxial accelerometer and treadmill speed. ©2005 IEEE

OPUS - University of Technology Sydney

Auditory modelling for speech processing in the perceptual domain

Author: Ambikairajah E.
Holmes W. H.
Lin L.
Publication venue: Australian Mathematical Society
Publication date: 01/09/2004

The human hearing system is the most robust speech processor, even in noisy environments. This work presents a new computational model of the auditory system that exploits psychoacoustical masking properties. The model is then applied to speech coding in the perceptual domain. The coding algorithm is capable of producing high quality coded speech and audio that preserve temporal as well as spectral details. The proposed filterbank is also applied to speech denoising in the perceptual domain. The enhanced speech is of good perceptual quality.

Australian Mathematical Society (AustMS): E-Journals

Multimodal Affect Models: An Investigation of Relative Salience of Audio and Visual Cues for Emotion Prediction

Author: Ambikairajah E
Dang T
Sethu V
Wu J
Publication venue: Frontiers in Computer Science
Publication date: 10/01/2022

People perceive emotions via multiple cues, predominantly speech and visual cues, and a number of emotion recognition systems utilize both audio and visual cues. Moreover, the static aspects of emotion (the speaker's arousal level is high/low) and the dynamic aspects of emotion (the speaker is becoming more aroused) might be perceived via different expressive cues, and these two aspects are integrated to provide a unified sense of emotion state. However, existing multimodal systems focus on only a single aspect of emotion perception, and the contributions of different modalities toward modeling static and dynamic emotion aspects are not well explored. In this paper, we investigate the relative salience of audio and video modalities to emotion state prediction and emotion change prediction using a Multimodal Markovian affect model. Experiments conducted on the RECOLA database showed that the audio modality is better at modeling the emotion state of arousal and video the emotion state of valence, whereas audio shows clear advantages over video in modeling emotion changes for both arousal and valence.

Apollo (Cambridge)

Forward masking threshold estimation using neural networks and its application to parallel speech enhancement

Author: E. Ambikairajah
O. O. Khalifa
T. S. Gunawan
Publication venue: IIUM Press
Publication date: 01/01/2010

Forward masking models have been used successfully in speech enhancement and audio coding. Presently, forward masking thresholds are estimated using simplified masking models which have been used for audio coding and speech enhancement applications. In this paper, an accurate approximation of forward masking threshold estimation using neural networks is proposed. A performance comparison with other existing masking models in a speech enhancement application is presented. Objective measures using PESQ demonstrate that our proposed forward masking model provides significant improvements (5-15%) over four existing models when tested with speech signals corrupted by various noises at very low signal-to-noise ratios. Moreover, a parallel implementation of the speech enhancement algorithm was developed using the MATLAB Parallel Computing Toolbox.

CiteSeerX

Directory of Open Access Journals

The International Islamic University Malaysia Repository

Speech coding using compressive sensing on a multicore system

Author: Ambikairajah E.
Gunawan Teddy Surya
Khalifa Othman Omran
Shafie Amir Akramin
Publication venue: IIUM Press
Publication date: 01/01/2011

The International Islamic University Malaysia Repository

The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016

Author: Ajili M.
Alegre F.
Ambikairajah E.
Aronowitz H.
Bahmaninezhad F.
Bonastre J. F.
Bousquet P. M.
Busch C.
Chng E. S.
Delgado H.
Evans N.
Fauve B.
Halonen M.
Hansen J. H.L.
Hautamäki V.
Isadskiy S.
Jin R.
Kanervisto A.
Kheder W. B.
Kinnunen T.
Larcher A.
Le Lan G.
Lee K. A.
Li H.
Li Haizhou
Lim Z. H.
Lin W. W.
Liu Gang
Ma B.
Ma J.
Mak M. W.
Matrouf D.
Nautsch A.
Nguyen T. H.
Qian Q.
Rao W.
Rathgeb C.
Rouvier M.
Saeidi R.
Sahidullah M.
Sarkar A. K.
Sethu V.
Sizov A.
Sriskandaraja K.
Stafylakis T.
Sun H.
Tan Z. H.
Thomsen D. A.L.
Todisco M.
Tzimiropoulos G.
Vestman V.
Wang G.
Wang Tianzhou
Wang Z.
Xiao X.
Xu C.
Xu H.
Xue J.
Zhang C.
Zhao Q.
Zhao T.
Zhu S.
Publication venue: International Speech Communication Association
Publication date: 01/01/2017

The 2016 speaker recognition evaluation (SRE'16) is the latest edition in the series of benchmarking events conducted by the National Institute of Standards and Technology (NIST). I4U is a joint entry to SRE'16 resulting from the collaboration and active exchange of information among researchers from sixteen institutes and universities across four continents. The joint submission and several of its 32 sub-systems were among the top-performing systems. Considerable effort has been devoted to two major challenges, namely, unlabeled training data and dataset shift from Switchboard-Mixer to the new Call My Net dataset. This paper summarizes the lessons learned and presents the shared view of the sixteen research groups on recent advances, major paradigm shifts, and the common tool chain used in speaker recognition, as witnessed in SRE'16. More importantly, we look into the intriguing question of fusing a large ensemble of sub-systems and the potential benefit of large-scale collaboration.

Aaltodoc Publication Archive

VBN

Electromagnetic Transient Analysis of Stator Winding Insulation Failure in Induction Machines

Author: Ambikairajah E
Malekpour M
Phung TT
Publication date: 28/08/2015

UNSWorks

2014 OptoElectronics and Communication Conference, OECC 2014 and Australian Conference on Optical Fibre Technology, ACOFT 2014

Author: Ambikairajah E
Bhowmik K
Peng GD
Rajan G
Publication date: 01/01/2014

Research on single-mode polymer fiber Bragg gratings (FBGs) and their applications has progressed considerably in recent years. In this paper we report recent research developments in polymer FBG sensor applications. © 2014 Engineers Australia

UNSWorks