4 research outputs found

Unsupervised Learning Algorithm for Noise Suppression and Speech Enhancement Applications

Author: Alsheibi Abdullah Zaini
Publication venue: Digital Commons @ DU
Publication date: 01/03/2023

Smart and intelligent devices are increasingly integrated into day-to-day life to perform a multitude of tasks. These tasks include, but are not limited to, job automation and smart utility management, with the aim of improving quality of life and making everyday chores as effortless as possible. These smart devices may or may not be connected to the internet to accomplish their tasks. Additionally, human-machine interaction with such devices may be touch-screen based or based on voice commands. To understand and act upon received voice commands, these devices need to enhance the (clean) speech signal and distinguish it from the recorded noisy signal, which is contaminated by interference and background noise. The enhanced speech signal is then analyzed locally or in the cloud to extract the command. This speech enhancement task can be achieved effectively if the number of recording microphones is large, but incorporating many microphones is only possible in large and expensive devices. With multiple microphones present, the computational complexity of speech enhancement algorithms is high, as are their power consumption requirements. If the device under consideration is small, with limited power and computational capabilities, having multiple microphones is not possible; hearing aids and cochlear implant devices are examples. Thus, most of these devices have been developed with a single microphone. As a result of this handicap, developing a speech enhancement algorithm for assisted learning devices with a single microphone, while keeping the computational complexity and power consumption of the algorithm low, is a challenging problem. There has been considerable research on solving this problem with good speech enhancement performance. However, most real-time speech enhancement algorithms lose their effectiveness if the level of noise present in the recorded speech is high.
This dissertation deals with this problem: the objective is to develop a method that enhances performance by reducing the noise level of the input signal. To this end, it proposes a pre-processing step applied before the speech enhancement algorithms. This pre-processing performs noise suppression in the transformed domain by generating an approximation of the noisy signal's short-time Fourier transform. The approximated signal, with an improved input signal-to-noise ratio, is then used by other speech enhancement algorithms to recover the underlying clean signal. The approximation is performed using the proposed Block-Principal Component Analysis (Block-PCA) algorithm. To illustrate the efficacy of the methodology, a detailed performance analysis under multiple noise types and noise levels is carried out, which demonstrates that including the pre-processing step considerably improves the performance of speech enhancement algorithms compared to approaches with no pre-processing step.
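
The abstract does not define Block-PCA, so the following is only a rough sketch of the general idea it describes: approximating a noisy short-time Fourier transform block by block with a low-rank model. The block size, the rank, the use of a truncated SVD (PCA without mean-centering), and the function names `stft` and `blockwise_low_rank` are all assumptions made for illustration and are not taken from the dissertation.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Hann-windowed STFT; returns a complex array of shape (bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)

def blockwise_low_rank(S, block_frames=8, rank=2):
    """Replace each block of consecutive frames by its best rank-`rank`
    approximation (truncated SVD). Hypothetical stand-in for Block-PCA."""
    out = np.empty_like(S)
    for start in range(0, S.shape[1], block_frames):
        block = S[:, start : start + block_frames]
        U, s, Vh = np.linalg.svd(block, full_matrices=False)
        k = min(rank, len(s))
        out[:, start : start + block_frames] = (U[:, :k] * s[:k]) @ Vh[:k]
    return out

# Synthetic check: a 440 Hz tone in white noise (all values are illustrative).
rng = np.random.default_rng(0)
t = np.arange(4096) / 8000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)

S_clean, S_noisy = stft(clean), stft(noisy)
S_approx = blockwise_low_rank(S_noisy)

# Distance to the clean STFT before and after the low-rank approximation.
err_noisy = np.linalg.norm(S_noisy - S_clean)
err_approx = np.linalg.norm(S_approx - S_clean)
print(err_noisy, err_approx)
```

In this toy setting the clean tone occupies a very low-dimensional subspace of each block, so discarding the remaining components removes most of the noise energy. Real speech is far less structured, and the choice of block size and rank would need to be tuned.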

University of Denver

Técnicas baseadas em subespaços e aplicações (Subspace-based techniques and applications)

Author: Teixeira Ana Rita Assunção
Publication venue: Universidade de Aveiro
Publication date: 01/01/2011

Doctorate in Electronic Engineering

This work studies linear and non-linear subspace projective techniques with two aims: noise elimination in time series and feature extraction for supervised classification problems. The study takes the Singular Spectrum Analysis (SSA) and Kernel PCA (KPCA) algorithms as its starting point. Several proposals to optimize the algorithms are presented, along with a description of the algorithms that differs from the one found in the literature.
All methods, linear or non-linear, follow a consistent algebraic formulation. The subspace model is obtained from the eigendecomposition of kernel or correlation/covariance matrices computed on multidimensional data sets. The complexity of non-linear subspace techniques is discussed, namely the pre-image problem and the eigendecomposition of high-dimensional matrices. Different pre-image algorithms are presented together with alternative proposals to optimize them. An eigendecomposition of the kernel matrix based on low-rank approximations of that matrix leads to a more efficient algorithm, Greedy KPCA. Throughout the thesis, the algorithms are applied to artificial signals to study the influence of the various parameters on their performance. Furthermore, these techniques are extended to artefact removal in univariate biomedical time series, namely EEG signals.

FCT - SFRH/BD/28404/200
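
To make the linear subspace idea concrete, here is a minimal sketch of basic SSA denoising: embed the series in a trajectory matrix, keep the leading singular components, and average back along the anti-diagonals. It follows the textbook form of SSA rather than the optimized variants proposed in the thesis; the function name `ssa_denoise`, the window length, and the rank are assumptions chosen for illustration.

```python
import numpy as np

def ssa_denoise(x, window=50, rank=2):
    """Basic SSA: low-rank approximation of the trajectory matrix,
    followed by diagonal averaging back to a 1-D series."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    X = np.stack([x[j : j + window] for j in range(k)], axis=1)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Diagonal averaging: entry (i, j) contributes to sample i + j.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j : j + window] += X_low[:, j]
        counts[j : j + window] += 1
    return recon / counts

# Synthetic check on an artificial signal (values are illustrative).
rng = np.random.default_rng(1)
t = np.arange(500)
clean = np.sin(2 * np.pi * t / 40.0)
noisy = clean + 0.5 * rng.standard_normal(t.size)
denoised = ssa_denoise(noisy)

err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(err_noisy, err_denoised)
```

A pure sinusoid yields a rank-2 trajectory matrix, which is why `rank=2` suits this toy signal. The non-linear (KPCA) variant replaces the SVD step with an eigendecomposition of a kernel matrix and then requires solving the pre-image problem to return to the signal domain.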

Repositório Institucional da Universidade de Aveiro