145 research outputs found
Speaker Identification by BYY Automatic Local Factor Analysis based Three-Level Voting Combination
Local Factor Analysis (LFA) is known to be more general and powerful than the Gaussian Mixture Model (GMM) for unsupervised learning with local subspace structure analysis. In the text-independent speaker identification literature, GMM has been widely used and investigated, often with preprocessing or postprocessing approaches, while LFA remains largely unexplored for this task. In pursuit of a fast implementation of LFA modeling, this paper focuses on Bayesian Ying-Yang automatic learning with data-smoothing-based regularization (BYY-A), which performs automatic model selection during parameter learning. Furthermore, for sequence classification based on the trained LFA models, we design and analyze a three-level combination at the sequence, classifier, and committee levels. Different combination approaches are designed with varying sequential topologies and voting schemes. Experimental results on the KING speech corpus demonstrate the effectiveness and potential of the proposed approaches.
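At its base level, the combination described above reduces to voting over per-segment decisions. A minimal sketch of that idea, assuming hypothetical per-segment speaker log-likelihood scores rather than the paper's actual BYY-A LFA models:

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label among per-segment decisions."""
    return Counter(labels).most_common(1)[0][0]

def identify_speaker(segment_scores):
    """segment_scores: list of {speaker_id: log-likelihood} dicts, one per
    segment of the test sequence. Each segment votes for its best-scoring
    speaker; votes are then combined by simple majority (sequence level)."""
    votes = [max(scores, key=scores.get) for scores in segment_scores]
    return majority_vote(votes)

# toy example: three segments, two candidate speakers (illustrative numbers)
scores = [
    {"spk1": -10.0, "spk2": -12.5},
    {"spk1": -11.2, "spk2": -9.8},
    {"spk1": -8.7,  "spk2": -13.0},
]
winner = identify_speaker(scores)  # spk1 wins 2 of 3 segment votes
```

The classifier- and committee-level combinations in the paper stack further voting layers on top of decisions like this one.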
Contributions to generative models and their applications
Generative models are a large class of machine learning models for unsupervised learning. They have various applications in machine learning and artificial intelligence. In this thesis, we discuss many aspects of generative models and their applications to other machine learning problems. In particular, we discuss several important topics in generative models, including how to stabilize discrete GAN training with importance sampling, how to sample better from GANs using a connection with energy-based models, and how to better train auto-regressive models with the help of an energy-based model formulation, as well as two applications of generative models to other machine learning problems: one about residual networks, the other about safety verification.
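The GAN-to-energy-based-model connection mentioned above can be illustrated with a generic sketch (this is not the thesis's specific method): a well-trained discriminator's logit d(x) approximates log p_data(x) - log p_gen(x), so exp(d(x)) can serve as an importance weight for reweighting or resampling generator outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy_weights(logits):
    """Turn discriminator logits into normalized importance weights.
    Under the EBM view, d(x) ~ log p_data(x) - log p_gen(x), so
    exp(d(x)) upweights samples the discriminator finds realistic."""
    w = np.exp(logits - logits.max())  # subtract max for numerical stability
    return w / w.sum()

def resample(samples, logits, n):
    """Resample generator outputs in proportion to their importance weights."""
    idx = rng.choice(len(samples), size=n, p=energy_weights(logits))
    return samples[idx]

# toy demo with made-up logits: sample 4 looks most "real" to the discriminator
samples = np.arange(5)
logits = np.array([-2.0, -1.0, 0.0, 1.0, 4.0])
picked = resample(samples, logits, n=1000)
```

With these illustrative numbers, the resampled set is dominated by the highest-logit sample, which is the intended effect of energy-based post-hoc correction of a generator.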
Harmonized-Multinational qEEG Norms (HarMNqEEG)
This paper extends frequency-domain quantitative electroencephalography (qEEG) methods in pursuit of higher sensitivity for detecting brain developmental disorders. Prior qEEG work did not integrate cross-spectral information, omitting important functional connectivity descriptors, and its lack of geographical diversity precluded accounting for site-specific variance, increasing qEEG nuisance variance. We ameliorate these weaknesses. i) We create lifespan Riemannian multinational qEEG norms for cross-spectral tensors. These norms result from the HarMNqEEG project fostered by the Global Brain Consortium. We calculate the norms with data from 9 countries, 12 devices, and 14 studies, comprising 1564 subjects. Instead of raw data, only anonymized metadata and EEG cross-spectral tensors were shared. After visual and automatic quality control, developmental equations for the mean and standard deviation of traditional and Riemannian qEEG descriptive parameters (DPs) were calculated using additive mixed-effects models. We demonstrate qEEG "batch effects" and provide methods to calculate harmonized z-scores. ii) We also show that the multinational harmonized Riemannian norms produce z-scores with increased diagnostic accuracy for predicting school-age brain dysfunction produced by malnutrition restricted to the first year of life. iii) We offer open code and data to calculate different individual z-scores from the HarMNqEEG dataset. These results contribute to developing bias-free, low-cost neuroimaging technologies applicable in various health settings.
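The harmonized z-score idea can be sketched in a few lines: standardize an individual's measurement against a normative developmental mean and standard deviation, after correcting the mean for a site-specific "batch effect" offset. Everything here (the log-age trend, the offsets, the function names) is a hypothetical illustration, not the HarMNqEEG models themselves.

```python
import math

def harmonized_z(value, age, site, norm_mean, norm_sd, site_offset):
    """Hypothetical harmonized z-score: adjust the normative mean by a
    known per-site additive bias before standardising.
    norm_mean/norm_sd: callables giving developmental mean and SD at an age;
    site_offset: dict of per-site additive biases (assumed already estimated)."""
    mu = norm_mean(age) + site_offset.get(site, 0.0)
    return (value - mu) / norm_sd(age)

# toy developmental curves, purely illustrative
norm_mean = lambda age: 10.0 + 2.0 * math.log(age)
norm_sd = lambda age: 1.5
offsets = {"site_A": 0.5, "site_B": -0.3}

z = harmonized_z(value=14.0, age=8.0, site="site_A",
                 norm_mean=norm_mean, norm_sd=norm_sd, site_offset=offsets)
```

The same measurement yields different z-scores at different sites, which is exactly the nuisance variance the harmonization is meant to absorb.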
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.
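The tensor train (TT) decomposition highlighted above can be computed by sequential truncated SVDs (the standard TT-SVD algorithm). A minimal NumPy sketch, not tied to any particular library:

```python
import numpy as np

def tt_svd(tensor, eps=1e-10):
    """Decompose a d-way tensor into TT cores via sequential truncated SVDs.
    Returns 3-way cores G_k of shape (r_{k-1}, n_k, r_k) with r_0 = r_d = 1."""
    shape = tensor.shape
    d = len(shape)
    cores, r, C = [], 1, tensor
    for k in range(d - 1):
        C = C.reshape(r * shape[k], -1)            # unfold current remainder
        U, S, Vt = np.linalg.svd(C, full_matrices=False)
        rank = max(1, int(np.sum(S > eps)))        # truncate negligible modes
        cores.append(U[:, :rank].reshape(r, shape[k], rank))
        C = S[:rank, None] * Vt[:rank]             # carry the rest forward
        r = rank
    cores.append(C.reshape(r, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract TT cores back into the full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(out.shape[1:-1])

# toy demo: a rank-1 tensor compresses to TT ranks of 1
rng = np.random.default_rng(0)
T = np.einsum('i,j,k->ijk', rng.normal(size=4),
              rng.normal(size=5), rng.normal(size=6))
cores = tt_svd(T)
T_hat = tt_reconstruct(cores)
```

The distributed-computation benefit discussed in the text comes from working with the small cores (here, three 3-way arrays) instead of the full tensor.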
Advances in variational Bayesian nonlinear blind source separation
Linear data analysis methods such as factor analysis (FA), independent component analysis (ICA) and blind source separation (BSS) as well as state-space models such as the Kalman filter model are used in a wide range of applications. In many of these, linearity is just a convenient approximation while the underlying effect is nonlinear. It would therefore be more appropriate to use nonlinear methods.
In this work, nonlinear generalisations of FA and ICA/BSS are presented. The methods are based on a generative model, with a multilayer perceptron (MLP) network modelling the nonlinearity from the latent variables to the observations. The model is estimated using variational Bayesian learning, which is well suited to nonlinear data analysis problems. The approach is also theoretically interesting, as essentially the same method is used in several different fields and can be derived from several different starting points, including statistical physics, information theory, Bayesian statistics, and information geometry. These complementary views can aid interpretation of the learning method and its results.
Much of the work presented in this thesis consists of improvements that make the nonlinear factor analysis and blind source separation methods faster and more stable, while remaining applicable to other learning problems as well. The improvements include methods to accelerate convergence of alternating optimisation algorithms such as the EM algorithm, and an improved approximation of the moments of a nonlinear transform of a multivariate probability distribution. These improvements can easily be applied to models other than FA and ICA/BSS, such as nonlinear state-space models. A specialised version of the nonlinear factor analysis method for post-nonlinear mixtures is also presented.
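The generative model underlying nonlinear FA is x = f(s) + n, with f realised by an MLP and both the sources s and the noise n Gaussian. A minimal sketch of sampling from such a model (random untrained weights, purely illustrative; the thesis learns these by variational Bayes):

```python
import numpy as np

rng = np.random.default_rng(42)

def mlp(s, W1, b1, W2, b2):
    """One-hidden-layer MLP mapping latent sources to observation space,
    playing the role of f in the generative model x = f(s) + noise."""
    h = np.tanh(s @ W1 + b1)
    return h @ W2 + b2

# dimensions: 2 latent sources -> 8 hidden units -> 5 observed variables
W1, b1 = rng.normal(size=(2, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 5)), rng.normal(size=5)

def generate(n, noise_sd=0.1):
    """Draw n observations: Gaussian latent sources pushed through the
    MLP, plus isotropic Gaussian observation noise."""
    s = rng.normal(size=(n, 2))         # latent factors
    x = mlp(s, W1, b1, W2, b2)          # nonlinear mixing
    return x + noise_sd * rng.normal(size=x.shape), s

X, S = generate(100)
```

Nonlinear FA/BSS then amounts to inverting this process: inferring a posterior over s and the network weights given only X, which is where the variational Bayesian machinery described above comes in.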