199 research outputs found

    Sequential and adaptive Bayesian computation for inference and optimization

    Get PDF
    With the advent of cheap and ubiquitous measurement devices, today more data is measured, recorded, and archived in a relatively short span of time than all data recorded throughout history. Moreover, advances in computation have made it possible to model much more complicated phenomena and to use the vast amounts of data to calibrate the resulting high-dimensional models. In this thesis, we are interested in two fundamental problems which are repeatedly being faced in practice as the dimension of the models and datasets are growing steadily: the problem of inference in high-dimensional models and the problem of optimization for problems when the number of data points is very large. The inference problem gets difficult when the model one wants to calibrate and estimate is defined in a high-dimensional space. The behavior of computational algorithms in high-dimensional spaces is complicated and defies intuition. Computational methods which work accurately for inferring low-dimensional models, for example, may fail to generalize the same performance to high-dimensional models. In recent years, due to the significant interest in high-dimensional models, there has been a plethora of work in signal processing and machine learning to develop computational methods which are robust in high-dimensional spaces. In particular, the high-dimensional stochastic filtering problem has attracted significant attention as it arises in multiple fields which are of crucial importance such as geophysics, aerospace, control. In particular, a class of algorithms called particle filters has received attention and become a fruitful field of research because of their accuracy and robustness in low-dimensional systems. In short, these methods keep a cloud of particles (samples in a state space), which describe the empirical probability distribution over the state variable of interest. The particle filters use a model of the phenomenon of interest to propagate and predict the future states and use an observation model to assimilate the observations to correct the state estimates. The most common particle filter, called the bootstrap particle filter (BPF), consists of an iterative sampling-weighting-resampling scheme. However, BPFs also largely fail at inferring high-dimensional dynamical systems due to a number of reasons. In this work, we propose a novel particle filter, named the nudged particle filter (NuPF), which specifically aims at improving the performance of particle filters in high-dimensional systems. The algorithm relies on the idea of nudging, which has been widely used in the geophysics literature to tackle high-dimensional inference problems. In particular, in addition to standard sampling-weighting-resampling steps of the particle filter, we define a general nudging step based on the gradient of the likelihoods, which generalize some of the nudging schemes proposed in the literature. This step is based on modifying the particles, generated in the sampling step, using the gradients of the likelihoods. In particular, the nudging step moves a fraction of the particles to the regions under which they have high-likelihoods. This scheme results in significantly improved behavior in high-dimensional models. The resulting NuPF is able to track high-dimensional systems successfully. Unlike the proposed nudging schemes in the literature, the NuPF does not rely on Gaussianity assumptions and can be defined for a general likelihood. We analytically prove that, because we only move a fraction of the particles and not all of them, the algorithm has a convergence rate that matches standard Monte Carlo algorithms. More precisely, the NuPF has the same asymptotic convergence guarantees as the bootstrap particle filter. As a byproduct, we also show that the nudging step improves the robustness of the particle filter against model misspecification. In particular, model misspecification occurs when the true data-generating system and the model posed by the user of the algorithm differ significantly. In this case, a majority of computational inference methods fail due to the discrepancy between the modeling assumptions and the observed data. We show that the nudging step increases the robustness of particle filters against model misspecification. Specifically, we prove that the NuPF generates particle systems which have provably higher marginal likelihoods compared to the standard bootstrap particle filter. This theoretical result is attained by showing that the NuPF can be interpreted as a bootstrap particle filter for a modified state-space model. Finally, we demonstrate the empirical behavior of the NuPF with several examples. In particular, we show results on high-dimensional linear state-space models, a misspecified Lorenz 63 model, a high-dimensional Lorenz 96 model, and a misspecified object tracking model. In all examples, the NuPF infers the states successfully. The second problem, the so-called scability problem in optimization, occurs because of the large number of data points in modern datasets. With the increasing abundance of data, many problems in signal processing, statistical inference, and machine learning turn into a large-scale optimization problems. For example, in signal processing, one might be interested in estimating a sparse signal given a large number of corrupted observations. Similarly, maximum-likelihood inference problems in statistics result in large-scale optimization problems. Another significant application domain is machine learning, where all important training methods are defined as optimization problems. To tackle these problems, computational optimization methods developed over the past decades are inefficient since they need to compute function evaluations or gradients over all the data for a single iteration. Because of this reason, a class of optimization methods, termed stochastic optimization methods, have emerged. The algorithms of this class are designed to tackle problems which are defined over a big number of data points. In short, these methods utilize a subsample of the dataset in order to update the parameter estimate and do so iteratively until some convergence criterion is met. However, there is a major difficulty that has to be addressed: Although the convergence theory for these algorithms is understood, they can have unstable behavior in practice. In particular, the most commonly used stochastic optimization method, namely the stochastic gradient descent, can diverge easily if its step-size is poorly set. Over the years, practitioners have developed a number of rules of thumb to alleviate stability issues. We argue in this thesis that one way to develop robust stochastic optimization methods is to frame them as inference methods. In particular, we show that stochastic optimization schemes can be recast as inference methods and can be understood as inference algorithms. Framing the problem as an inference problem opens the way to compare these methods to the optimal inference algorithms and understand why they might be failing or producing unstable behavior. In this vein, we show that there is an intrinsic relationship between a class of stochastic optimization methods, called incremental proximal methods, and Kalman (and extended Kalman) filters. The filtering approach to stochastic optimization results in an automatic calibration of the step-size, which removes the instability problems depending on the step-sizes. The probabilistic interpretation of stochastic optimization problems also paves the way to develop new optimization methods based on strategies which are popular in the inference literature. In particular, one can use a set of sampling methods in order to solve the inference problem and hence obtain the global minimum. In this manner, we propose a parallel sequential Monte Carlo optimizer (PSMCO), which is aiming at solving stochastic optimization problems. The PSMCO is designed as a zeroth order method which does not use gradients. It only uses subsets of the data points in order to move at each iteration. The PSMCO obtains an estimate of a global minimum at each iteration by utilizing a cheap kernel density estimator. We prove that the resulting estimator converges to a global minimum almost surely as the number of Monte Carlo samples tends to infinity. We also empirically demonstrate that the algorithm is able to reconstruct multiple global minima and solve difficult global optimization problems. By further exploiting the relationship between inference and optimization, we also propose a probabilistic and online matrix factorization method, termed the dictionary filter to solve large-scale matrix factorization problems. Matrix factorization methods have received significant interest from the machine learning community due to their expressive representations of high-dimensional data and interpretability of their estimates. As the majority of the matrix factorization methods are defined as optimization problems, they suffer from the same issues as stochastic optimization methods. In particular, when using stochastic gradient descent, one might need to try and err many times before deciding to use a step-size. To alleviate these problems, we introduce a matrix-variate probabilistic model for which inference results in a matrix factorization scheme. The scheme is online, in the sense that it only uses a single data point at a time to update the factors. The algorithm bears relationship with optimization schemes, namely with the incremental proximal method defined over a matrix-variate cost function. By way of intuition we developed for the optimization-inference relationship, we devise a model which results in similar update rules for matrix factorization as for the incremental proximal method. However, the probabilistic updates are more stable and efficient. Moreover, the algorithm does not have a step-size parameter to tune, as its role is played by the posterior covariance matrix. We demonstrate the utility of the algorithm on a missing data problem and a video processing problem. We show that the algorithm can be successfully used in machine learning problems and several promising extensions of the method can be constructed easily.Programa Oficial de Doctorado en Multimedia y ComunicacionesPresidente: Ricardo Cao Abad.- Secretario: Michael Peter Wiper.- Vocal: Nicholas Paul Whitele

    Adaptive Hidden Markov Noise Modelling for Speech Enhancement

    Get PDF
    A robust and reliable noise estimation algorithm is required in many speech enhancement systems. The aim of this thesis is to propose and evaluate a robust noise estimation algorithm for highly non-stationary noisy environments. In this work, we model the non-stationary noise using a set of discrete states with each state representing a distinct noise power spectrum. In this approach, the state sequence over time is conveniently represented by a Hidden Markov Model (HMM). In this thesis, we first present an online HMM re-estimation framework that models time-varying noise using a Hidden Markov Model and tracks changes in noise characteristics by a sequential model update procedure that tracks the noise characteristics during the absence of speech. In addition the algorithm will when necessary create new model states to represent novel noise spectra and will merge existing states that have similar characteristics. We then extend our work in robust noise estimation during speech activity by incorporating a speech model into our existing noise model. The noise characteristics within each state are updated based on a speech presence probability which is derived from a modified Minima controlled recursive averaging method. We have demonstrated the effectiveness of our noise HMM in tracking both stationary and highly non-stationary noise, and shown that it gives improved performance over other conventional noise estimation methods when it is incorporated into a standard speech enhancement algorithm

    Message Passing-based Inference in Hierarchical Autoregressive Models

    Get PDF

    주변 환경에 강인한 음성인식을 위한 모델 및 데이터기반 기법

    Get PDF
    학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2015. 8. 김남수.In this thesis, we propose model-based and data-driven techniques for environment-robust automatic speech recognition. The model-based technique is the feature enhancement method in the reverberant noisy environment to improve the performance of Gaussian mixture model-hidden Markov model (HMM) system. It is based on the interacting multiple model (IMM), which was originally developed in single-channel scenario. We extend the single-channel IMM algorithm such that it can handle the multi-channel inputs under the Bayesian framework. The multi-channel IMM algorithm is capable of tracking time-varying room impulse responses and background noises by updating the relevant parameters in an on-line manner. In order to reduce the computation as the number of microphones increases, a computationally efficient algorithm is also devised. In various simulated and real environmental conditions, the performance gain of the proposed method has been confirmed. The data-driven techniques are based on deep neural network (DNN)-HMM hybrid system. In order to enhance the performance of DNN-HMM system in the adverse environments, we propose three techniques. Firstly, we propose a novel supervised pre-training technique for DNN-HMM system to achieve robust speech recognition in adverse environments. In the proposed approach, our aim is to initialize the DNN parameters such that they yield abstract features robust to acoustic environment variations. In order to achieve this, we first derive the abstract features from an early fine-tuned DNN model which is trained based on a clean speech database. By using the derived abstract features as the target values, the standard error back-propagation algorithm with the stochastic gradient descent method is performed to estimate the initial parameters of the DNN. The performance of the proposed algorithm was evaluated on Aurora-4 DB and better results were observed compared to a number of conventional pre-training methods. Secondly, a new DNN-based robust speech recognition approaches taking advantage of noise estimates are proposed. A novel part of the proposed approaches is that the time-varying noise estimates are applied to the DNN as additional inputs. For this, we extract the noise estimates in a frame-by-frame manner from the IMM algorithm which has been known to show good performance in tracking slowly-varying background noise. The performance of the proposed approaches is evaluated on Aurora-4 DB and better performance is observed compared to the conventional DNN-based robust speech recognition algorithms. Finally, a new approach to DNN-based robust speech recognition using soft target labels is proposed. The soft target labeling means that each target value of the DNN output is not restricted to 0 or 1 but takes non negative values in (0,1) and their sum equals 1. In this study, the soft target labels are obtained from the forward-backward algorithm well-known in HMM training. The proposed method makes the DNN training be more robust in noisy and unseen conditions. The performance of the proposed approach was evaluated on Aurora-4 DB and various mismatched noise test conditions, and found better compared to the conventional hard target labeling method. Furthermore, in the data-driven approaches, an integrated technique using above three algorithms and model-based technique is described. In matched and mismatched noise conditions, the performance results are discussed. In matched noise conditions, the initialization method for the DNN was effective to enhance the recognition performance. In mismatched noise conditions, the combination of using the noise estimates as an DNN input and soft target labels showed the best recognition results in all the tested combinations of the proposed techniques.Abstract i Contents iv List of Figures viii List of Tables x 1 Introduction 1 2 Experimental Environments and Database 7 2.1 ASR in Hands-Free Scenario and Feature Extraction 7 2.2 Relationship between Clean and Distorted Speech in Feature Domain 10 2.3 Database 12 2.3.1 TI Digits Corpus 13 2.3.2 Aurora-4 DB 15 3 Previous Robust ASR Approaches 17 3.1 IMM-Based Feature Compensation in Noise Environment 18 3.2 Single-Channel Reverberation and Noise-Robust Feature Enhancement Based on IMM 24 3.3 Multi-Channel Feature Enhancement for Robust Speech Recognition 26 3.4 DNN-Based Robust Speech Recognition 27 4 Multi-Channel IMM-Based Feature Enhancement for Robust Speech Recognition 31 4.1 Introduction 31 4.2 Observation Model in Multi-Channel Reverberant Noisy Environment 33 4.3 Multi-Channel Feature Enhancement in a Bayesian Framework 35 4.3.1 A Priori Clean Speech Model 37 4.3.2 A Priori Model for RIR 38 4.3.3 A Priori Model for Background Noise 39 4.3.4 State Transition Formulation 40 4.3.5 Function Linearization 41 4.4 Feature Enhancement Algorithm 42 4.5 Incremental State Estimation 48 4.6 Experiments 52 4.6.1 Simulation Data 52 4.6.2 Live Recording Data 54 4.6.3 Computational Complexity 55 4.7 Summary 56 5 Supervised Denoising Pre-Training for Robust ASR with DNN-HMM 59 5.1 Introduction 59 5.2 Deep Neural Networks 61 5.3 Supervised Denoising Pre-Training 63 5.4 Experiments 65 5.4.1 Feature Extraction and GMM-HMM System 66 5.4.2 DNN Structures 66 5.4.3 Performance Evaluation 68 5.5 Summary 69 6 DNN-Based Frameworks for Robust Speech Recognition Using Noise Estimates 71 6.1 Introduction 71 6.2 DNN-Based Frameworks for Robust ASR 73 6.2.1 Robust Feature Enhancement 74 6.2.2 Robust Model Training 75 6.3 IMM-Based Noise Estimation 77 6.4 Experiments 78 6.4.1 DNN Structures 78 6.4.2 Performance Evaluations 79 6.5 Summary 82 7 DNN-Based Robust Speech Recognition Using Soft Target Labels 83 7.1 Introduction 83 7.2 DNN-HMM Hybrid System 85 7.3 Soft Target Label Estimation 87 7.4 Experiments 89 7.4.1 DNN Structures 89 7.4.2 Performance Evaluation 90 7.4.3 Effects of Control Parameter ξ 91 7.4.4 An Integration with SDPT and ESTN Methods 92 7.4.5 Performance Evaluation on Various Noise Types 93 7.4.6 DNN Training and Decoding Time 95 7.5 Summary 96 8 Conclusions 99 Bibliography 101 요약 108Docto

    Online Audio-Visual Multi-Source Tracking and Separation: A Labeled Random Finite Set Approach

    Get PDF
    The dissertation proposes an online solution for separating an unknown and time-varying number of moving sources using audio and visual data. The random finite set framework is used for the modeling and fusion of audio and visual data. This enables an online tracking algorithm to estimate the source positions and identities for each time point. With this information, a set of beamformers can be designed to separate each desired source and suppress the interfering sources

    Sistemas granulares evolutivos

    Get PDF
    Orientador: Fernando Antonio Campos GomideTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: Recentemente tem-se observado um crescente interesse em abordagens de modelagem computacional para lidar com fluxos de dados do mundo real. Métodos e algoritmos têm sido propostos para obtenção de conhecimento a partir de conjuntos de dados muito grandes e, a princípio, sem valor aparente. Este trabalho apresenta uma plataforma computacional para modelagem granular evolutiva de fluxos de dados incertos. Sistemas granulares evolutivos abrangem uma variedade de abordagens para modelagem on-line inspiradas na forma com que os humanos lidam com a complexidade. Esses sistemas exploram o fluxo de informação em ambiente dinâmico e extrai disso modelos que podem ser linguisticamente entendidos. Particularmente, a granulação da informação é uma técnica natural para dispensar atenção a detalhes desnecessários e enfatizar transparência, interpretabilidade e escalabilidade de sistemas de informação. Dados incertos (granulares) surgem a partir de percepções ou descrições imprecisas do valor de uma variável. De maneira geral, vários fatores podem afetar a escolha da representação dos dados tal que o objeto representativo reflita o significado do conceito que ele está sendo usado para representar. Neste trabalho são considerados dados numéricos, intervalares e fuzzy; e modelos intervalares, fuzzy e neuro-fuzzy. A aprendizagem de sistemas granulares é baseada em algoritmos incrementais que constroem a estrutura do modelo sem conhecimento anterior sobre o processo e adapta os parâmetros do modelo sempre que necessário. Este paradigma de aprendizagem é particularmente importante uma vez que ele evita a reconstrução e o retreinamento do modelo quando o ambiente muda. Exemplos de aplicação em classificação, aproximação de função, predição de séries temporais e controle usando dados sintéticos e reais ilustram a utilidade das abordagens de modelagem granular propostas. O comportamento de fluxos de dados não-estacionários com mudanças graduais e abruptas de regime é também analisado dentro do paradigma de computação granular evolutiva. Realçamos o papel da computação intervalar, fuzzy e neuro-fuzzy em processar dados incertos e prover soluções aproximadas de alta qualidade e sumário de regras de conjuntos de dados de entrada e saída. As abordagens e o paradigma introduzidos constituem uma extensão natural de sistemas inteligentes evolutivos para processamento de dados numéricos a sistemas granulares evolutivos para processamento de dados granularesAbstract: In recent years there has been increasing interest in computational modeling approaches to deal with real-world data streams. Methods and algorithms have been proposed to uncover meaningful knowledge from very large (often unbounded) data sets in principle with no apparent value. This thesis introduces a framework for evolving granular modeling of uncertain data streams. Evolving granular systems comprise an array of online modeling approaches inspired by the way in which humans deal with complexity. These systems explore the information flow in dynamic environments and derive from it models that can be linguistically understood. Particularly, information granulation is a natural technique to dispense unnecessary details and emphasize transparency, interpretability and scalability of information systems. Uncertain (granular) data arise from imprecise perception or description of the value of a variable. Broadly stated, various factors can affect one's choice of data representation such that the representing object conveys the meaning of the concept it is being used to represent. Of particular concern to this work are numerical, interval, and fuzzy types of granular data; and interval, fuzzy, and neurofuzzy modeling frameworks. Learning in evolving granular systems is based on incremental algorithms that build model structure from scratch on a per-sample basis and adapt model parameters whenever necessary. This learning paradigm is meaningful once it avoids redesigning and retraining models all along if the system changes. Application examples in classification, function approximation, time-series prediction and control using real and synthetic data illustrate the usefulness of the granular approaches and framework proposed. The behavior of nonstationary data streams with gradual and abrupt regime shifts is also analyzed in the realm of evolving granular computing. We shed light upon the role of interval, fuzzy, and neurofuzzy computing in processing uncertain data and providing high-quality approximate solutions and rule summary of input-output data sets. The approaches and framework introduced constitute a natural extension of evolving intelligent systems over numeric data streams to evolving granular systems over granular data streamsDoutoradoAutomaçãoDoutor em Engenharia Elétric
    corecore