    Model-Based Speech Enhancement

    A method of speech enhancement is developed that reconstructs clean speech from a set of acoustic features using a harmonic plus noise model of speech. This is a significant departure from traditional filtering-based methods of speech enhancement. A major challenge with this approach is to estimate the acoustic features (voicing, fundamental frequency, spectral envelope and phase) accurately from noisy speech. This is achieved using maximum a posteriori (MAP) estimation methods that operate on the noisy speech. In each case a prior model of the relationship between the noisy speech features and the estimated acoustic feature is required. These models are approximated using speaker-independent GMMs of the clean speech features, which are adapted to speaker-dependent models using MAP adaptation and to noise using the Unscented Transform. Objective results are presented to optimise the proposed system, and a set of subjective tests compares the approach with traditional enhancement methods. Three-way listening tests examining signal quality, background noise intrusiveness and overall quality show the proposed system to be highly robust to noise, performing significantly better than conventional methods of enhancement in terms of background noise intrusiveness. However, the proposed method is shown to reduce signal quality, with overall quality measured to be roughly equivalent to that of the Wiener filter.
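
    As a rough illustration of what reconstructing speech from acoustic features with a harmonic plus noise model can look like, the sketch below synthesises one frame as a sum of harmonics of the fundamental frequency plus scaled white noise. It is not the paper's implementation: the function name, the parameters, the 8 kHz sample rate and the use of white Gaussian noise for the noise component are all assumptions made for illustration, and the harmonic amplitudes are assumed to have been obtained already (for example by sampling an estimated spectral envelope).

```python
import math
import random


def synthesize_hnm_frame(f0, harmonic_amps, harmonic_phases, noise_gain,
                         sample_rate=8000, num_samples=160, seed=0):
    """Synthesise one frame: harmonics of f0 plus white Gaussian noise.

    All names and defaults here are hypothetical. An unvoiced frame can be
    represented by passing empty harmonic lists and a non-zero noise_gain.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        # Harmonic part: the k-th harmonic sits at (k + 1) * f0.
        voiced = sum(
            amp * math.cos(2.0 * math.pi * (k + 1) * f0 * t + phase)
            for k, (amp, phase) in enumerate(zip(harmonic_amps, harmonic_phases))
        )
        # Noise part: white Gaussian noise scaled by noise_gain.
        noise = noise_gain * rng.gauss(0.0, 1.0)
        frame.append(voiced + noise)
    return frame
```

    With `noise_gain=0.0` and a single harmonic the frame reduces to a pure cosine at `f0`, which is an easy sanity check on the harmonic part.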

    A Primal-Dual Proximal Algorithm for Sparse Template-Based Adaptive Filtering: Application to Seismic Multiple Removal

    Unveiling meaningful geophysical information from seismic data requires dealing with both random and structured "noises". As their amplitude may be greater than that of the signals of interest (primaries), additional prior information is especially important in performing efficient signal separation. We address here the problem of multiple reflections, caused by wave-fields bouncing between layers. Since only approximate models of these phenomena are available, we propose a flexible framework for time-varying adaptive filtering of seismic signals, using sparse representations, based on inaccurate templates. We recast the joint estimation of adaptive filters and primaries in a new convex variational formulation. This approach allows us to incorporate plausible knowledge about noise statistics, data sparsity and slow filter variation in parsimony-promoting wavelet frames. The designed primal-dual algorithm solves a constrained minimization problem that alleviates standard regularization issues in finding hyperparameters. The approach performs well in low signal-to-noise ratio conditions, both for simulated and real field seismic data.
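
    Proximal algorithms of this kind are built from simple operators applied to each term of the objective. As a generic illustration only (the abstract does not say which operators the authors use), the sketch below shows soft-thresholding, which is the proximal operator of the scaled l1 norm and a common way of promoting sparse coefficients. The function name is hypothetical.

```python
def soft_threshold(values, threshold):
    """Proximal operator of threshold * ||x||_1, applied element-wise.

    Entries with magnitude at most `threshold` are set to zero; the rest
    are shrunk towards zero by `threshold`.
    """
    shrunk = []
    for v in values:
        if v > threshold:
            shrunk.append(v - threshold)
        elif v < -threshold:
            shrunk.append(v + threshold)
        else:
            shrunk.append(0.0)
    return shrunk
```

    Applied to a vector of wavelet coefficients, this zeroes the small entries and keeps shrunken versions of the large ones, which is the sense in which it favours sparsity.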

    Real-time detection of auditory steady-state brainstem potentials evoked by auditory stimuli

    The auditory steady-state response (ASSR) has an advantage over other hearing assessment techniques because it provides objective and frequency-specific information. The objectives are to reduce the lengthy test duration and to improve the signal detection rate and the robustness of detection against background noise and unwanted artefacts. Two prominent state estimation techniques, the Luenberger observer and the Kalman filter, have been used in the development of the autonomous ASSR detection scheme. Both techniques can be implemented in real time; the challenges in applying them are the very poor SNR of ASSRs (which could be as low as −30 dB) and the unknown statistics of the noise. A dual-channel architecture is proposed, in which one channel estimates the sinusoid and the other estimates the background noise. Simulation and experimental studies were also conducted to evaluate the performance of the developed ASSR detection scheme and to compare the new method with conventional techniques. In general, both state estimation techniques within the detection scheme produced results comparable to those of the conventional techniques, but achieved a significant reduction in measurement time in some cases. A guide is given for the determination of the observer gains, while an adaptive algorithm has been used to adjust the gains in the Kalman filters. In order to enhance the robustness of the ASSR detection scheme with adaptive Kalman filters against possible artefacts (outliers), a multisensory data fusion approach is used to combine both the standard mean operation and the median operation in the ASSR detection algorithm. In addition, self-tuned statistical thresholding using a regression technique is applied in the autonomous ASSR detection scheme. The scheme with adaptive Kalman filters is capable of estimating the variances of the system and background noise to improve the ASSR detection rate.
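
    To make the idea of estimating a sinusoid buried in noise with a Kalman filter concrete, here is a minimal sketch. It is not the detection scheme described above: it assumes a single channel, a known stimulus frequency, a fixed and known measurement noise variance (no adaptive gain tuning), and a state made of the in-phase and quadrature amplitudes `a` and `b` in `y[n] = a*cos(w*n) + b*sin(w*n) + noise`. All names are hypothetical.

```python
import math


def kalman_sinusoid_amplitude(samples, freq_hz, sample_rate, meas_var,
                              process_var=0.0):
    """Estimate the amplitude of a sinusoid of known frequency in noise.

    State x = [a, b] follows a random walk with variance `process_var`;
    each sample is a scalar measurement with noise variance `meas_var`.
    """
    a, b = 0.0, 0.0                    # state estimate
    p11, p12, p22 = 1.0, 0.0, 1.0      # symmetric 2x2 state covariance
    w = 2.0 * math.pi * freq_hz / sample_rate
    for n, y in enumerate(samples):
        # Predict: random-walk model only inflates the covariance.
        p11 += process_var
        p22 += process_var
        # Measurement row H = [cos(w n), sin(w n)].
        h1, h2 = math.cos(w * n), math.sin(w * n)
        ph1 = p11 * h1 + p12 * h2      # first entry of P H^T
        ph2 = p12 * h1 + p22 * h2      # second entry of P H^T
        innovation_var = h1 * ph1 + h2 * ph2 + meas_var
        k1, k2 = ph1 / innovation_var, ph2 / innovation_var  # Kalman gain
        residual = y - (h1 * a + h2 * b)
        a += k1 * residual
        b += k2 * residual
        # Covariance update P <- P - K (P H^T)^T.
        p11 -= k1 * ph1
        p12 -= k1 * ph2
        p22 -= k2 * ph2
    return math.hypot(a, b)
```

    A detector could compare the returned amplitude with a threshold derived from a separate noise estimate; how that threshold is chosen is outside the scope of this sketch.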

    Studies on noise robust automatic speech recognition

    Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK.

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference. Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1

    Multi-Condition Training for Unknown Environment Adaptation in Robust ASR Under Real Conditions

    Automatic speech recognition (ASR) systems frequently work in noisy environments. As they are often trained on clean speech data, noise reduction or adaptation techniques are applied to decrease the influence of background disturbance, even in the case of unknown conditions. Speech data mixed with noise recordings from a particular environment are often used for model adaptation. This paper analyses the improvement in recognition performance achieved by such adaptation when multi-condition training data from a real environment are used to train the initial models. Although the quality of such models can decrease with the presence of noise in the training material, they are assumed to include initial information about noise and consequently to support the adaptation procedure. Experimental results show a significant improvement from the proposed training method in a robust ASR task under unknown noisy conditions. Compared with training on clean speech data, word error rate decreased by 29% for the non-adapted system and by 14% for the adapted system.
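
    Mixing speech with a noise recording at a chosen signal-to-noise ratio is a common way to prepare data for adaptation. The sketch below shows one simple way to do it and is not taken from the paper: the function name is hypothetical, and it assumes both signals are lists of samples at the same sample rate, with the noise at least as long as the speech.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaling the noise to reach `snr_db`.

    SNR is defined here as the ratio of average speech power to average
    noise power over the whole utterance, in decibels.
    """
    if not speech:
        raise ValueError("speech must not be empty")
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(v * v for v in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise must not be silent")
    # Gain that makes speech_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(speech, noise)]
```

    Calling this with several noise recordings and SNR values on a clean corpus yields a mixed set that covers more than one acoustic condition.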