284 research outputs found
Neural Networks for Complex Data
Artificial neural networks are simple and efficient machine learning tools.
Defined originally in the traditional setting of simple vector data, neural
network models have evolved to address more and more difficulties of complex
real world problems, ranging from time evolving data to sophisticated data
structures such as graphs and functions. This paper summarizes advances on
those themes from the last decade, with a focus on results obtained by members
of the SAMM team of Université Paris
Speech Recognition Using Augmented Conditional Random Fields
Acoustic modeling based on hidden Markov models (HMMs) is employed by state-of-the-art stochastic speech recognition systems. Although HMMs are a natural choice to warp the time axis and model the temporal phenomena in the speech signal, their conditional independence properties limit their ability to model spectral phenomena well. In this paper, a new acoustic modeling paradigm based on augmented conditional random fields (ACRFs) is investigated and developed. This paradigm addresses some limitations of HMMs while maintaining many of the aspects which have made them successful. In particular, the acoustic modeling problem is reformulated in a data-driven, sparse, augmented space to increase discrimination. Acoustic context modeling is explicitly integrated to handle the sequential phenomena of the speech signal. We present an efficient framework for estimating these models that ensures scalability and generality. In the TIMIT phone recognition task, a phone error rate of 23.0% was recorded on the full test set, a significant improvement over comparable HMM-based systems.
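The abstract above contrasts HMMs with conditional random fields for modeling sequential structure in speech. As a rough illustration of the shared decoding machinery (not the paper's ACRF formulation), here is a minimal Viterbi decoder over a linear chain with hypothetical emission and transition log-scores; the "phone" labels and all numbers are invented for the example:

```python
def viterbi(obs_scores, trans, states):
    """Find the highest-scoring state sequence in a linear-chain model.

    obs_scores: list of dicts, obs_scores[t][s] = log-score of state s at time t
    trans: dict, trans[(s_prev, s)] = log transition score
    states: iterable of state labels
    """
    # best[t][s] = best log-score of any path ending in state s at time t
    best = [dict(obs_scores[0])]
    back = [{}]
    for t in range(1, len(obs_scores)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + trans[(p, s)]) for p in states),
                key=lambda x: x[1],
            )
            best[t][s] = score + obs_scores[t][s]
            back[t][s] = prev
    # Trace back from the best final state
    last = max(best[-1], key=best[-1].get)
    path = [last]
    for t in range(len(obs_scores) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))


# Toy example: three time frames, two phone-like states "a" and "b"
states = ["a", "b"]
obs = [{"a": 0.0, "b": -2.0}, {"a": -1.0, "b": -0.5}, {"a": -3.0, "b": 0.0}]
trans = {("a", "a"): -0.1, ("a", "b"): -0.7,
         ("b", "a"): -0.7, ("b", "b"): -0.1}
print(viterbi(obs, trans, states))  # best path: ['a', 'b', 'b']
```

The same dynamic program underlies decoding in both HMMs and linear-chain CRFs; the models differ in how the emission and transition scores are defined and trained.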
Sparse Bayesian neural networks for regression: Tackling overfitting and computational challenges in uncertainty quantification
Neural networks (NNs) are primarily developed within the frequentist
statistical framework. Nevertheless, frequentist NNs lack the capability to
provide uncertainties in their predictions, and hence their robustness cannot be
adequately assessed. Conversely, Bayesian neural networks (BNNs) naturally
offer predictive uncertainty by applying Bayes' theorem. However, their
computational requirements pose significant challenges. Moreover, both
frequentist NNs and BNNs suffer from overfitting when dealing with noisy
and sparse data, which renders their predictions unreliable away from the
available data. To address both these problems simultaneously, we
leverage insights from a hierarchical setting in which the parameter priors are
conditional on hyperparameters to construct a BNN by applying a semi-analytical
framework known as nonlinear sparse Bayesian learning (NSBL). We call our
network a sparse Bayesian neural network (SBNN), which aims to address the
practical and computational issues associated with BNNs. Simultaneously,
imposing a sparsity-inducing prior encourages the automatic pruning of
redundant parameters based on the automatic relevance determination (ARD)
concept. This process involves removing redundant parameters by optimally
selecting the precision of the parameters' prior probability density functions
(pdfs), resulting in a tractable treatment for overfitting. To demonstrate the
benefits of the SBNN algorithm, the study presents an illustrative regression
problem and compares the results of a BNN using standard Bayesian inference,
hierarchical Bayesian inference, and a BNN equipped with the proposed
algorithm. Subsequently, we demonstrate the importance of considering the full
parameter posterior by comparing the results with those obtained using the
Laplace approximation with and without NSBL.
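The ARD mechanism this abstract relies on, pruning parameters by re-estimating the precisions of their priors, can be sketched in the simpler setting of sparse Bayesian linear regression (the classic ARD update equations, not the paper's neural-network formulation; the data, noise precision, and pruning threshold below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 10 candidate features, only two actually relevant
N, M = 200, 10
Phi = rng.normal(size=(N, M))
w_true = np.zeros(M)
w_true[[2, 7]] = [1.5, -2.0]
t = Phi @ w_true + 0.1 * rng.normal(size=N)

# ARD: one prior precision alpha_i per weight, iteratively re-estimated
alpha = np.ones(M)
beta = 100.0  # noise precision (assumed known here for simplicity)
for _ in range(50):
    Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))  # posterior cov
    m = beta * Sigma @ Phi.T @ t                                # posterior mean
    gamma = 1.0 - alpha * np.diag(Sigma)   # how well-determined each weight is
    alpha = gamma / (m ** 2 + 1e-12)       # re-estimate prior precisions

# A diverging alpha_i means the prior pins weight i to zero: prune it
kept = np.where(alpha < 1e3)[0]
print(kept, np.round(m[kept], 2))
```

Running this, the precisions of the eight irrelevant weights grow without bound while the two relevant weights survive with posterior means near their true values, which is the pruning behavior the abstract describes at the level of network parameters.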
Small-variance asymptotics for Bayesian neural networks
Bayesian neural networks (BNNs) are a rich and flexible class of models that have several advantages over standard feedforward networks, but are typically expensive to train on large-scale data. In this thesis, we explore the use of small-variance asymptotics (an approach to yielding fast algorithms from probabilistic models) on various Bayesian neural network models. We first demonstrate how small-variance asymptotics shows precise connections between standard neural networks and BNNs; for example, particular sampling algorithms for BNNs reduce to standard backpropagation in the small-variance limit. We then explore a more complex BNN where the number of hidden units is additionally treated as a random variable in the model. While standard sampling schemes would be too slow to be practical, our asymptotic approach yields a simple method for extending standard backpropagation to the case where the number of hidden units is not fixed. We show on several data sets that the resulting algorithm has benefits over backpropagation on networks with a fixed architecture.
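The small-variance idea in this thesis abstract, probabilistic objectives collapsing to familiar deterministic ones as noise variances shrink, can be illustrated outside neural networks with a Gaussian linear model: the MAP estimate under a Gaussian likelihood and Gaussian prior is a ridge-regression solution whose regularization scales with the noise variance, and it converges to the ordinary least-squares fit as that variance goes to zero. The model and numbers here are illustrative, not taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -0.5, 2.0]) + 0.05 * rng.normal(size=50)

# Ordinary least squares: the deterministic, "backprop-like" solution
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# MAP estimate with Gaussian likelihood (variance s2) and unit Gaussian prior:
#   w_map = argmin (1/(2 s2)) ||y - X w||^2 + (1/2) ||w||^2
#         = (X^T X + s2 I)^{-1} X^T y  ->  w_ols as s2 -> 0
for s2 in [1.0, 1e-2, 1e-4, 1e-6]:
    w_map = np.linalg.solve(X.T @ X + s2 * np.eye(3), X.T @ y)
    print(s2, np.linalg.norm(w_map - w_ols))
```

The printed gap shrinks toward zero as the variance decreases, mirroring (in a far simpler setting) how sampling-based BNN training collapses to standard backpropagation in the small-variance limit.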