Codage des signaux par EMD [Signal coding by EMD]
In this letter, a new signal coding framework based on the Empirical Mode Decomposition (EMD) is introduced. The EMD breaks down any signal into a small number of oscillating components called Intrinsic Mode Functions (IMFs). Based on IMF properties, different coding strategies are presented. No assumptions about linearity or stationarity are made on the signal to be coded. Results obtained on ECG signals are presented and compared to those of wavelet coding.
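For reference, the EMD expansion that this abstract relies on is usually written as below. The symbols $K$, $\mathrm{IMF}_k$ and $r_K$ are notation assumed here, not taken from the letter.

```latex
x(t) = \sum_{k=1}^{K} \mathrm{IMF}_k(t) + r_K(t)
```

Here $x(t)$ is the signal to be coded, $K$ is the number of extracted modes, and $r_K(t)$ is the final residual left once no further oscillating component can be extracted.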
Tatouage audio par EMD [Audio watermarking by EMD]
In this paper, a new adaptive audio watermarking algorithm based on Empirical Mode Decomposition (EMD) is introduced. The audio signal is divided into frames and each one is decomposed adaptively, by EMD, into intrinsic oscillatory components called Intrinsic Mode Functions (IMFs). The watermark and the synchronization codes are embedded into the extrema of the last IMF, a low-frequency mode that is stable under different attacks and preserves the perceptual quality of the host audio signal. The data embedding rate of the proposed algorithm is 46.9 to 50.3 b/s. Relying on exhaustive simulations, we show the robustness of the hidden watermark against additive noise, MP3 compression, re-quantization, filtering, cropping and resampling. The comparative analysis shows that our method performs better than recently reported watermarking schemes.
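The abstract does not say how bits are written into the extrema. One common choice is quantization index modulation (QIM), which is assumed in the minimal sketch below. The extrema values, the bit sequence, the step size, and the function names are all hypothetical.

```python
import math

def embed_bit(value: float, bit: int, step: float) -> float:
    """Move `value` to the point of its quantization cell that encodes `bit`."""
    base = math.floor(value / step) * step
    # Lower quarter of the cell encodes 0, upper quarter encodes 1.
    return base + (0.75 * step if bit else 0.25 * step)

def extract_bit(value: float, step: float) -> int:
    """Read the bit back from the position of `value` inside its cell."""
    offset = value - math.floor(value / step) * step
    return 1 if offset >= 0.5 * step else 0

extrema = [0.113, -0.274, 0.391, -0.052]  # hypothetical extrema of the last IMF
bits = [1, 0, 0, 1]                       # hypothetical watermark bits
step = 0.02                               # hypothetical quantization step (assumption)

marked = [embed_bit(e, b, step) for e, b in zip(extrema, bits)]
recovered = [extract_bit(m, step) for m in marked]
```

In this sketch each extremum moves by less than one step, and a larger step would trade audible distortion for robustness. The actual embedding rule and parameters of the paper may differ.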
Rehaussement du signal de parole par EMD et opérateur de Teager-Kaiser [Speech signal enhancement by EMD and the Teager-Kaiser operator]
The authors would like to thank Professor Mohamed Bahoura from Université du Québec à Rimouski for fruitful discussions on time-adaptive thresholding. In this paper, a speech denoising strategy based on time-adaptive thresholding of the intrinsic mode functions (IMFs) of the signal, extracted by empirical mode decomposition (EMD), is introduced. The denoised signal is reconstructed by superposing its adaptively thresholded IMFs. Adaptive thresholds are estimated using the Teager-Kaiser energy operator (TKEO) of the signal IMFs. More precisely, the TKEO identifies the type of frame by amplifying the differences between speech and non-speech frames in each IMF. Based on the EMD, the proposed speech denoising scheme is a fully data-driven approach. The method is tested on speech signals with different noise levels, and the results are compared to EMD-shrinkage and to the wavelet transform (WT) coupled with TKEO. Speech enhancement performance is evaluated using the output signal-to-noise ratio (SNR) and the perceptual evaluation of speech quality (PESQ) measure. On the analyzed speech signals, the proposed enhancement scheme performs better than the WT-TKEO and EMD-shrinkage approaches in terms of output SNR and PESQ. Time-adaptive thresholding reduces the noise considerably more than universal thresholding does. The study is limited to signals corrupted by additive white Gaussian noise.
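As an illustration of the operator named in this abstract, the standard discrete-time Teager-Kaiser energy is psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below applies it to a pure tone, for which the output is the constant (A sin(omega))^2. The test tone and variable names are assumptions made for the example, and the sketch does not reproduce the paper's threshold estimation.

```python
import math

def tkeo(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    # The first and last samples have no two-sided neighbours and are skipped.
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

amplitude, omega = 2.0, 0.3  # hypothetical tone parameters
tone = [amplitude * math.cos(omega * n) for n in range(50)]

energy = tkeo(tone)
expected = (amplitude * math.sin(omega)) ** 2  # closed form for a pure tone
```

Because the output grows with both amplitude and frequency, it tends to separate energetic speech frames from low-level background frames, which is the property the abstract exploits.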
Codage audio perceptuel à bas débit par Décomposition Modale Empirique (EMD) [Low-bit-rate perceptual audio coding by Empirical Mode Decomposition (EMD)]
This article presents a new low-bit-rate audio compression technique based on the Empirical Mode Decomposition (EMD) combined with an auditory model (masking threshold). Through a sifting process, the audio signal is decomposed into a finite sum of AM-FM components called Intrinsic Mode Functions (IMFs), which are fully described by their extrema. With an appropriate thresholding controlled by the psychoacoustic model, only the relevant extrema of an IMF are coded. The number of bits allocated to coding the thresholded extrema varies from one IMF to another and respects the constraint that the quantization error remain inaudible. The extrema thresholding and bit allocation techniques on which the proposed compression method relies ensure a low bit rate and good listening quality of the coded-decoded signal. The results obtained on various audio signals highlight the value of the proposed approach. Compared with wavelet compression and the MP3 coder, the proposed coder shows a significant performance gain in terms of compression ratio and listening quality.
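The idea of coding only the relevant extrema of an IMF can be sketched as follows. This is a simplified illustration: the fixed amplitude threshold is a stand-in for the psychoacoustic masking threshold used in the article, and the sample values and function names are hypothetical.

```python
def local_extrema(x):
    """Return (index, value) pairs where x has a strict local maximum or minimum."""
    return [
        (n, x[n])
        for n in range(1, len(x) - 1)
        # The slope changes sign exactly at a strict local extremum.
        if (x[n] - x[n - 1]) * (x[n + 1] - x[n]) < 0
    ]

def keep_relevant(extrema, threshold):
    """Keep only the extrema whose magnitude reaches the threshold."""
    return [(n, v) for n, v in extrema if abs(v) >= threshold]

imf = [0.0, 0.8, 0.1, -0.05, 0.02, -0.9, -0.2, 0.03, 0.0]  # hypothetical IMF samples
extrema = local_extrema(imf)
kept = keep_relevant(extrema, threshold=0.1)  # fixed threshold: an assumption
```

Only the surviving (index, value) pairs would then be quantized and transmitted, which is where the bit-rate saving comes from.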
Speech Enhancement via EMD
In this study, two new approaches for speech signal noise reduction based on the empirical mode decomposition (EMD), recently introduced by Huang et al. (1998), are proposed. Based on the EMD, both reduction schemes are fully data-driven approaches. The noisy signal is decomposed adaptively into oscillatory components called intrinsic mode functions (IMFs), using a temporal decomposition called the sifting process. Two strategies for noise reduction are proposed: filtering and thresholding. The basic principle of these two methods is to reconstruct the signal from IMFs that have previously been either filtered, using the minimum mean-squared error (MMSE) filter introduced by I. Y. Soon et al. (1998), or thresholded, using a shrinkage function. The performance of these methods is analyzed and compared with that of the MMSE filter and wavelet shrinkage. The study is limited to signals corrupted by additive white Gaussian noise. The results show that the proposed denoising schemes perform better than the MMSE filter and the wavelet approach.
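The abstract does not specify which shrinkage function is used. Soft thresholding is a common choice and is assumed in the sketch below. The sample values and the threshold `tau` are hypothetical.

```python
import math

def soft_threshold(x, tau):
    """Shrink every sample toward zero by tau; samples smaller than tau become 0."""
    return [math.copysign(max(abs(v) - tau, 0.0), v) for v in x]

imf = [0.5, -0.05, -0.3, 0.1]  # hypothetical noisy IMF samples
denoised = soft_threshold(imf, tau=0.1)
```

In an EMD-based scheme of this kind, the function would be applied to each IMF separately, and the denoised signal would be the sum of the thresholded IMFs.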
Audio encoding using Huang and Hilbert transforms
In this paper, an audio coding scheme based on the Empirical Mode Decomposition (EMD) in association with the Hilbert transform is presented. The audio signal is decomposed adaptively by EMD into intrinsic oscillatory components called Intrinsic Mode Functions (IMFs), and the associated instantaneous amplitudes and instantaneous phases are calculated. The basic principle of the proposed approach consists in encoding the instantaneous amplitudes by linear prediction and the instantaneous phases by scalar quantization. The decoder recovers the original signal by reconstructing the IMFs through demodulation and summing them. The compression method is applied to different audio signals, and the results are compared to MP3, a variable bit rate coder, and to wavelet approaches.
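The instantaneous amplitude and phase mentioned in this abstract are conventionally defined through the analytic signal of each IMF. The notation below ($z_k$, $a_k$, $\varphi_k$, $\mathcal{H}$, and hats for decoded quantities) is assumed for illustration and is not taken from the paper.

```latex
z_k(t) = \mathrm{IMF}_k(t) + j\,\mathcal{H}\{\mathrm{IMF}_k\}(t) = a_k(t)\, e^{j\varphi_k(t)},
\qquad a_k(t) = |z_k(t)|, \quad \varphi_k(t) = \arg z_k(t)

\hat{x}(t) = \sum_{k=1}^{K} \hat{a}_k(t) \cos \hat{\varphi}_k(t)
```

The first line gives the quantities that are encoded; the second is the demodulation and summation step performed by the decoder, written here without a residual term.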
Audio encoding based on the empirical mode decomposition
This paper deals with a new approach for perceptual audio encoding based on the Empirical Mode Decomposition (EMD). The audio signal is decomposed adaptively by EMD into intrinsic oscillatory components called Intrinsic Mode Functions (IMFs), which can be fully described by their extrema. These extrema are encoded after an appropriate thresholding scheme controlled by a psychoacoustic model. The decoder recovers the original signal by reconstructing the IMFs with spline interpolation and summing them. The proposed approach is applied to different audio signals, and the results are compared to wavelet and MPEG-1 Layer 3 (MP3) approaches. Relying on exhaustive simulations, the results show that the proposed compression scheme performs better than the MP3 and wavelet approaches in terms of bit rate and audio quality.
Classification of MRI brain tumors based on registration preprocessing and deep belief networks
In recent years, augmented reality has emerged as a technology with huge potential in image-guided surgery, and its application in brain tumor surgery in particular seems promising. Augmented reality can be divided into two parts: hardware and software. Further, artificial intelligence, and deep learning in particular, have attracted great interest from researchers in the medical field, especially for the diagnosis of brain tumors. In this paper, we focus on the software part of an augmented reality scenario. The main objective of this study was to develop a classification technique based on a deep belief network (DBN) and a softmax classifier to (1) distinguish a benign brain tumor from a malignant one by exploiting the spatial heterogeneity of cancer tumors and homologous anatomical structures, and (2) extract the brain tumor features. Our classification method comprises three steps. In the first step, a global affine transformation is applied as registration preprocessing to obtain the same or similar results for different locations (voxels, ROI). In the next step, an unsupervised DBN with unlabeled features is used for the learning process. The discriminative subsets of features obtained in the first two steps serve as input to the classifier and are used in the third step for evaluation by a hybrid system combining the DBN and a softmax classifier. For the evaluation, we used data from Harvard Medical School to train the DBN with softmax regression. The model performed well in the classification phase, achieving an improved accuracy of 97.2%.
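The softmax classifier at the end of this pipeline maps class scores to probabilities. The sketch below shows only that final step, in its usual numerically stable form. The two scores are hypothetical values, not outputs of the DBN described in the paper.

```python
import math

def softmax(scores):
    """Map real-valued scores to probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the two classes (benign, malignant).
probs = softmax([1.2, -0.4])
predicted = probs.index(max(probs))  # index of the most probable class
```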
Traitement et analyse des signaux sonores par la transformée de Huang (EMD) [Processing and analysis of sound signals by the Huang transform (EMD)]
This dissertation explores the potential of EMD as an analysis tool for audio and speech processing. This signal expansion into IMFs is adaptive and makes no prior assumptions (stationarity or linearity) about the signal to be analyzed. Salient properties of EMD, such as its dyadic filter bank structure, the quasi-symmetry of IMFs and the full description of an IMF by its extrema, are exploited for denoising, coding and watermarking purposes. For speech signal denoising, we initially proposed a technique based on IMF thresholding and carried out a comparative analysis of its performance against wavelet-based denoising. Then, to remedy the problem of MMSE filters, which require an estimate of the spectral properties of the noise, we introduced the ACWA filter into the denoising procedure. The proposed approach consists in filtering all IMFs of the noisy signal with the ACWA filter. This filtering is implemented in the time domain and is also applicable to colored noise. Finally, to handle hybrid speech frames, which are composed of voiced and unvoiced speech, we introduced a stationarity index into the denoising approach to detect the transitions between voiced and unvoiced sounds. For audio signal coding, we proposed four compression approaches. The first two are based on the EMD, and the other two exploit the EMD in association with the Hilbert transform. In particular, we proposed a predictive coding of the instantaneous amplitude and frequency of the IMFs. Finally, we studied audio signal watermarking in the context of copyright protection. The number of IMFs can vary depending on the attack type. The proposed approach inserts the mark in the extrema of the last IMF. In addition, we introduced a synchronization code into the procedure in order to facilitate the extraction of the mark.
These contributions are illustrated on synthetic and real data, and the results are compared to well-established methods such as the MMSE filter, wavelet approaches, and the MP3 and AAC coders, showing the good performance of EMD-based signal processing. These findings demonstrate the real potential of EMD as an adaptive analysis tool in speech and audio processing.
This thesis explores the contributions of EMD (Empirical Mode Decomposition) to audio and speech signal processing. The decomposition represents the signal as a sum of orthogonal modes, or IMFs (Intrinsic Mode Functions). It is adaptive and makes no assumptions of stationarity or linearity about the signal to be analyzed. The dyadic filter bank behavior of the EMD and the quasi-symmetry of the modes, which allows them to be represented from their extrema, are the properties behind the tools developed in this thesis, which addresses more specifically the denoising, coding and watermarking of audio and speech signals. For speech denoising, we initially proposed a technique based on IMF thresholding and compared its performance with wavelet-based denoising. Then, to remedy the problem of MMSE filters, which require an estimate of the spectral properties of the noise, we introduced the ACWA filter into the denoising procedure. The proposed algorithm filters all IMFs of the noisy speech signal, either with an ACWA filter or by thresholding. This filtering, implemented in the time domain, makes it possible in particular to handle colored noise. Finally, to handle hybrid speech frames, made up of mixtures of voiced and unvoiced sequences, we introduced a stationarity index into the denoising procedure in order to detect the transition frames between voiced and unvoiced sounds. For audio and speech coding, we proposed four compression techniques. The first two approaches are based on the EMD and the other two exploit the EMD in association with the Hilbert transform. In particular, we proposed a predictive coding of the instantaneous amplitude and frequency of the IMFs. Finally, we also studied the watermarking of audio and speech signals in the context of copyright protection. The number of IMFs can vary with the attack applied, but the proposed procedure, which inserts the watermark into the coding of the extrema of the last IMF, remains robust to classical attacks. In addition, we introduced a synchronization code for the mark in order to facilitate its extraction. These contributions are illustrated on synthetic and real data, and the results are compared with those of proven methods such as the MMSE filter for denoising, wavelet processing and the AAC and MP3 codecs for coding, or the main watermarking techniques. These tests show the good performance of the algorithms developed around the EMD and illustrate the power of this tool for the analysis and processing of audio and speech signals.
HHT-based audio coding
In this paper, a new audio coding scheme combining the Hilbert transform and the Empirical Mode Decomposition (EMD) is introduced. Based on the EMD, the coding is a fully data-driven approach. The audio signal is first decomposed adaptively, by EMD, into intrinsic oscillatory components called Intrinsic Mode Functions (IMFs). The key idea of this work is to code both the instantaneous amplitude (IA) and the instantaneous frequency (IF) of the extracted IMFs, calculated using the Hilbert transform. Since IA (resp. IF) is strongly correlated, it is encoded via a linear prediction technique. The decoder recovers the original signal by superposition of the demodulated IMFs. The proposed approach is applied to audio signals, and the results are compared to those obtained by the AAC (Advanced Audio Coding) and MP3 codecs and by wavelet-based compression. Coding performance is evaluated using the bit rate, the Objective Difference Grade (ODG) and the Noise to Mask Ratio (NMR). On the analyzed audio signals, our coding scheme performs better overall than wavelet compression and the AAC and MP3 codecs. The results also show that this new scheme achieves good coding performance without significant perceptual distortion, with an ODG in the range [-1, 0] and large negative NMR values.
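The reason linear prediction suits a strongly correlated sequence such as the IA can be seen in a minimal sketch. It assumes a first-order predictor fitted by least squares and omits quantization, so it is lossless. The predictor order, the sample values and the function names are assumptions, not details from the paper.

```python
def fit_predictor(x):
    """Least-squares coefficient a for the first-order predictor x[n] ~ a * x[n-1]."""
    num = sum(x[n] * x[n - 1] for n in range(1, len(x)))
    den = sum(v * v for v in x[:-1])
    return num / den if den else 0.0

def encode(x, a):
    """Keep the first sample, then store only the prediction residuals."""
    return [x[0]] + [x[n] - a * x[n - 1] for n in range(1, len(x))]

def decode(residuals, a):
    """Invert encode(): add each residual to the prediction from the previous output."""
    out = [residuals[0]]
    for r in residuals[1:]:
        out.append(a * out[-1] + r)
    return out

ia = [1.00, 1.02, 1.05, 1.07, 1.08, 1.06, 1.03]  # hypothetical slowly varying IA
a = fit_predictor(ia)
residuals = encode(ia, a)
restored = decode(residuals, a)
```

The residuals are much smaller than the samples themselves, so they can be represented with fewer bits. A real coder would quantize them, which makes the scheme lossy.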