3,352 research outputs found

Improving subband spectral estimation using modified AR model

Author: Bonacci David
Mailhes Corinne
Publication venue: 'Elsevier BV'
Publication date: 01/05/2007

It has already been shown that spectral estimation can be improved when applied to subband outputs of an adapted filterbank rather than to the original fullband signal. In the present paper, this procedure is applied jointly to a novel predictive autoregressive (AR) model. The model exploits time-shifting and is therefore referred to as the time-shift AR (TS-AR) model. Estimators are proposed for the unknown TS-AR parameters and the spectrum of the observed signal. The TS-AR model yields improved spectrum estimation by taking advantage of the correlation between subseries after decimation. Simulation results on signals with continuous and line spectra demonstrate the performance of the proposed method.

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Theory of optimal orthonormal subband coders

Author: Vaidyanathan P. P.
Publication date: 01/06/1998

The theory of the orthogonal transform coder and methods for its optimal design have been known for a long time. We derive a set of necessary and sufficient conditions for the coding-gain optimality of an orthonormal subband coder for given input statistics. We also show how these conditions can be satisfied by the construction of a sequence of optimal compaction filters one at a time. Several theoretical properties of optimal compaction filters and optimal subband coders are then derived, especially pertaining to behavior as the number of subbands increases. Significant theoretical differences between optimum subband coders, transform coders, and predictive coders are summarized. Finally, conditions are presented under which optimal orthonormal subband coders yield as much coding gain as biorthogonal ones for a fixed number of subbands
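
As an illustration of the coding-gain quantity this abstract refers to, the sketch below uses the standard high-rate expression for a uniformly decimated orthonormal subband coder with optimal bit allocation: the ratio of the arithmetic mean to the geometric mean of the subband variances. This formula is an assumption taken from the general subband-coding literature, not quoted from the abstract, and the function name `coding_gain` and the example variances are hypothetical.

```python
import numpy as np

def coding_gain(subband_variances):
    """Arithmetic-mean / geometric-mean ratio of the subband variances.

    Assumes equal-width (uniformly decimated) subbands, strictly positive
    variances, and high-rate optimal bit allocation.
    """
    v = np.asarray(subband_variances, dtype=float)
    arithmetic_mean = v.mean()
    # Geometric mean computed in the log domain for numerical stability.
    geometric_mean = np.exp(np.log(v).mean())
    return arithmetic_mean / geometric_mean

# Equal variances give no gain; unequal variances give a gain above 1.
flat_gain = coding_gain([1.0, 1.0, 1.0, 1.0])
skewed_gain = coding_gain([4.0, 1.0])
```

By the AM-GM inequality the ratio is never below 1, and it grows as energy concentrates in fewer subbands, which is why energy compaction matters in this setting.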

Caltech Authors

A Fully Time-domain Neural Model for Subband-based Speech Synthesizer

Author: Kim Geonmin
Kim Tae-Ho
Lee Soo-Young
Rabiee Azam
Publication date: 01/07/2019

This paper introduces a deep neural network model for a subband-based speech synthesizer. The model benefits from the narrow bandwidth of the subband signals to reduce the complexity of the time-domain speech generator. We employed multi-level wavelet analysis/synthesis to decompose/reconstruct the signal into subbands in the time domain. Inspired by WaveNet, a convolutional neural network (CNN) model predicts subband speech signals fully in the time domain. Due to the narrow bandwidth of the subbands, a simple network architecture is enough to learn the simple patterns of the subbands accurately. In the ground-truth experiments with teacher forcing, the subband synthesizer outperforms the fullband model significantly in terms of both subjective and objective measures. In addition, by conditioning the model on the phoneme sequence using a pronunciation dictionary, we have achieved a fully time-domain neural model for a subband-based text-to-speech (TTS) synthesizer, which is nearly end-to-end. The generated speech of the subband TTS shows quality comparable to the fullband one with a lighter network architecture for each subband. Comment: 5 pages, 3 figur

arXiv.org e-Print Archive

Adaptive filtering techniques for gravitational wave interferometric data: Removing long-term sinusoidal disturbances and oscillatory transients

Author: A. Abramovici
A. Abramovici et al.
B. Allen
B. Caron
B. S. Sathyaprakash
B. Widrow
E. Chassande-Mottin
H. Lück
J. R. Zeidler
K. Danzmann
K. Kudora
K. S. Thorne
O. Macchi
P. R. Saulson
S. D. Mohanty
S. Haykin
S. V. Dhurandhar
Publication venue: 'American Physical Society (APS)'
Publication date: 27/03/2000

It is known from the experience gained with gravitational wave detector prototypes that the interferometric output signal will be corrupted by a significant amount of non-Gaussian noise, a large part of it being essentially composed of long-term sinusoids with slowly varying envelope (such as violin resonances in the suspensions, or main power harmonics) and short-term ringdown noise (which may emanate from servo control systems, electronics in a non-linear state, etc.). Since non-Gaussian noise components make the detection and estimation of the gravitational wave signature more difficult, a denoising algorithm based on adaptive filtering techniques (LMS methods) is proposed to separate and extract them from the stationary and Gaussian background noise. The strength of the method is that it does not require any precise model of the observed data: the signals are distinguished on the basis of their autocorrelation time. We believe that the robustness and simplicity of this method make it useful for data preparation and for the understanding of the first interferometric data. We present the detailed structure of the algorithm and its application to both simulated data and real data from the LIGO 40-meter prototype. Comment: 16 pages, 9 figures, submitted to Phys. Rev.

arXiv.org e-Print Archive

Information fusion for subband-HMM speaker recognition

Author: Damper R. I.
Dodd T. J.
Higgins J. E.
Publication date: 01/01/2001

Southampton (e-Prints Soton)

Online Monaural Speech Enhancement Using Delayed Subband LSTM

Author: Horaud Radu
Li Xiaofei
Publication date: 11/05/2020

This paper proposes a delayed subband LSTM network for online monaural (single-channel) speech enhancement. The proposed method is developed in the short-time Fourier transform (STFT) domain. Online processing requires frame-by-frame signal reception and processing. A paramount feature of the proposed method is that the same LSTM is used across frequencies, which drastically reduces the number of network parameters, the amount of training data and the computational burden. Training is performed in a subband manner: the input consists of one frequency, together with a few context frequencies. The network learns a speech-to-noise discriminative function relying on the signal stationarity and on the local spectral pattern, based on which it predicts a clean-speech mask at each frequency. To exploit future information, i.e. look-ahead, we propose an output-delayed subband architecture, which allows the unidirectional forward network to process a few future frames in addition to the current frame. We use the proposed method to participate in the DNS real-time speech enhancement challenge. Experiments with the DNS dataset show that the proposed method achieves better performance scores than the DNS baseline method, which learns the full-band spectra using a gated recurrent unit network. Comment: Paper submitted to Interspeech 202

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1

Applications of wavelet-based compression to multidimensional Earth science data

Author: Bradley Jonathan N.
Brislawn Christopher M.
Publication date: 01/01/1993

A data compression algorithm involving vector quantization (VQ) and the discrete wavelet transform (DWT) is applied to two different types of multidimensional digital earth-science data. The algorithm (WVQ) is optimized for each particular application through an optimization procedure that assigns VQ parameters to the wavelet transform subbands subject to constraints on compression ratio and encoding complexity. Preliminary results of compressing global ocean model data generated on a Thinking Machines CM-200 supercomputer are presented. The WVQ scheme is used in both a predictive and nonpredictive mode. Parameters generated by the optimization algorithm are reported, as are signal-to-noise (SNR) measurements of actual quantized data. The problem of extrapolating hydrodynamic variables across the continental landmasses in order to compute the DWT on a rectangular grid is discussed. Results are also presented for compressing Landsat TM 7-band data using the WVQ scheme. The formulation of the optimization problem is presented along with SNR measurements of actual quantized data. Postprocessing applications are considered in which the seven spectral bands are clustered into 256 clusters using a k-means algorithm and analyzed using the Los Alamos multispectral data analysis program, SPECTRUM, both before and after being compressed using the WVQ program.

NASA Technical Reports Server

UNT Digital Library

A Subband-Based SVM Front-End for Robust ASR

Author: Ager Matthew
Cvetkovic Zoran
Sollich Peter
Yousafzai Jibran
Publication date: 24/12/2013

This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) front-end that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. The proposed front-end is compared with state-of-the-art ASR front-ends in terms of robustness to additive noise and linear filtering. Experiments performed on the TIMIT phoneme classification task demonstrate the benefits of the proposed subband-based SVM front-end: it outperforms the standard cepstral front-end in the presence of noise and linear filtering for signal-to-noise ratios (SNR) below 12 dB. A combination of the proposed front-end with a conventional front-end such as MFCC yields further improvements over the individual front-ends across the full range of noise levels.

arXiv.org e-Print Archive

King's Research Portal

MDL Denoising Revisited

Author: Myllymäki Petri
Rissanen Jorma
Roos Teemu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/09/2006

We refine and extend an earlier MDL denoising criterion for wavelet-based denoising. We start by showing that the denoising problem can be reformulated as a clustering problem, where the goal is to obtain separate clusters for informative and non-informative wavelet coefficients, respectively. This suggests two refinements: adding a code-length for the model index, and extending the model in order to account for subband-dependent coefficient distributions. A third refinement is the derivation of soft thresholding inspired by predictive universal coding with weighted mixtures. We propose a practical method incorporating all three refinements, which is shown to achieve good performance and robustness in denoising both artificial and natural signals. Comment: Submitted to IEEE Transactions on Information Theory, June 200

arXiv.org e-Print Archive

CiteSeerX

Crossref

Feature Extracting in the Presence of Environmental Noise, using Subband Adaptive Filtering

Author: Samad Salina Abdul
Publication date: 01/01/2009

In this work, a new feature extraction method for noisy environments is proposed. The approach is based on subband decomposition of speech signals followed by adaptive filtering in the noisiest subbands of speech. The speech decomposition is obtained using a low-complexity octave filter bank, while adaptive filtering is performed using the normalized least mean square algorithm. The performance of the new feature was evaluated for isolated word speech recognition in the presence of car noise. The proposed method showed higher recognition accuracy than conventional methods in noisy environments.
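
The normalized least mean square (NLMS) update named in this abstract can be sketched as follows. This is a minimal, generic illustration of the textbook NLMS recursion, not the authors' implementation: the function name `nlms_filter`, the tap count, the step size `mu`, and the synthetic system-identification data are all assumptions, and the octave filter bank and feature extraction stages are not shown.

```python
import numpy as np

def nlms_filter(x, d, num_taps=8, mu=0.5, eps=1e-8):
    """Adapt FIR weights w so that the filtered input x tracks the target d.

    Returns the filter output y, the error e = d - y, and the final weights.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    w = np.zeros(num_taps)
    y = np.zeros(len(x))
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        # Most recent num_taps input samples, newest first.
        window = x[n - num_taps + 1:n + 1][::-1]
        y[n] = w @ window
        e[n] = d[n] - y[n]
        # Step size normalized by the input power in the window.
        w += mu * e[n] * window / (window @ window + eps)
    return y, e, w

# Hypothetical check: identify a known 3-tap FIR system from white noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
h = np.array([0.6, -0.3, 0.1])
d = np.convolve(x, h)[:len(x)]
y, e, w = nlms_filter(x, d, num_taps=3, mu=0.5)
```

In a subband setting, a filter of this kind would be run separately on each selected subband signal; normalizing by the window power keeps the effective step size stable when subband energies differ.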

EEPIS Repository