15,953 research outputs found

    Latent Variable Models with Applications to Spectral Data Analysis

    Recent technological advances in automatic data acquisition have created an ever increasing need to extract meaningful information from huge amounts of data. Multivariate predictive models have become important statistical tools for solving modern engineering problems. The purpose of this thesis is to develop novel predictive methods based on latent variable models and to validate these methods by applying them to spectral data analysis. In this thesis, hybrid models of principal components regression (PCR) and partial least squares regression (PLS) are proposed. The basic idea of the hybrid models is to develop more accurate prediction techniques by combining the merits of PCR and PLS: both the principal components of PCR and the latent variables of PLS enter a common regression step. Another major contribution of this work is the robust probabilistic multivariate calibration model (RPMC), which overcomes the Gaussian assumption made by most latent variable models. The RPMC is made robust to outliers by adopting a Student-t distribution in place of the Gaussian distribution, and an efficient Expectation-Maximization algorithm is derived for its parameter estimation. It can also be shown that popular latent variable models such as probabilistic PCA (PPCA) and supervised probabilistic PCA (SPPCA) are special cases of the RPMC. Both predictive models developed in this thesis were assessed on real-life spectral datasets: the hybrid models were applied to a shaft misalignment prediction problem, and the RPMC was tested on a near-infrared (NIR) dataset. For the classification problem on the NIR data, a fusion of regularized discriminant analysis (RDA) and principal components analysis (PCA) was also proposed. The experimental results demonstrate the effectiveness and efficiency of the proposed methods.
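
    As a rough illustration of the hybrid idea, the sketch below combines principal component scores and PLS latent-variable scores in one common regression, using scikit-learn. The synthetic spectra, the component counts, and the plain least-squares final step are illustrative assumptions, not the thesis's exact formulation.

```python
# Minimal sketch of the hybrid PCR/PLS idea: principal component scores
# and PLS latent-variable scores enter one common regression step.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))          # e.g. spectra: 100 samples, 50 wavelengths
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

pca = PCA(n_components=5).fit(X)
pls = PLSRegression(n_components=5).fit(X, y)

T_pcr = pca.transform(X)                # principal component scores
T_pls = pls.transform(X)                # PLS latent-variable scores

# Combine both sets of latent variables in a single regression.
Z = np.hstack([T_pcr, T_pls])
model = LinearRegression().fit(Z, y)
print("R^2 on training data:", model.score(Z, y))
```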

    Sample Complexity Analysis for Learning Overcomplete Latent Variable Models through Tensor Methods

    We provide guarantees for learning latent variable models, with an emphasis on the overcomplete regime, where the dimensionality of the latent space can exceed the observed dimensionality. In particular, we consider multiview mixtures, spherical Gaussian mixtures, ICA, and sparse coding models. We provide tight concentration bounds for empirical moments through novel covering arguments, and we analyze parameter recovery through a simple tensor power update algorithm. In the semi-supervised setting, we exploit label or prior information to obtain a rough estimate of the model parameters, and then refine it using the tensor method on unlabeled samples. We establish that learning is possible when the number of components scales as k = o(d^{p/2}), where d is the observed dimension and p is the order of the observed moment employed in the tensor method. Our concentration bound analysis also yields a minimax sample complexity for semi-supervised learning of spherical Gaussian mixtures. In the unsupervised setting, we use a simple initialization algorithm based on the SVD of tensor slices, and provide guarantees under the stricter condition that k ≤ βd (where the constant β can be larger than 1), in which case the tensor method recovers the components in time polynomial in the dimension (and exponential in β). Our analysis establishes that a wide range of overcomplete latent variable models can be learned efficiently, with low computational and sample complexity, through tensor decomposition methods.
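
    The tensor power update at the heart of the recovery analysis can be sketched as follows: for a symmetric third-order tensor T, iterate v ← T(I, v, v) / ||T(I, v, v)||. The synthetic tensor with orthonormal components below is an illustrative assumption; the paper works with empirical moment tensors and the overcomplete regime.

```python
# Minimal sketch of the tensor power update on a symmetric 3rd-order tensor.
import numpy as np

rng = np.random.default_rng(1)
d, k = 8, 3
A = np.linalg.qr(rng.normal(size=(d, k)))[0]  # orthonormal ground-truth components

# Build T = sum_j a_j (x) a_j (x) a_j.
T = np.einsum("ij,kj,lj->ikl", A, A, A)

v = rng.normal(size=d)
v /= np.linalg.norm(v)
for _ in range(50):
    v = np.einsum("ikl,k,l->i", T, v, v)  # multilinear contraction T(I, v, v)
    v /= np.linalg.norm(v)

# v converges to (plus or minus) one column of A.
print("max correlation with a true component:", np.abs(A.T @ v).max())
```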

    Score Function Features for Discriminative Learning: Matrix and Tensor Framework

    Feature learning forms the cornerstone for tackling challenging learning problems in domains such as speech, computer vision and natural language processing. In this paper, we consider a novel class of matrix- and tensor-valued features, which can be pre-trained using unlabeled samples. We present efficient algorithms for extracting discriminative information, given these pre-trained features and labeled samples for any related task. Our features are based on higher-order score functions, which capture local variations in the probability density function of the input. We establish a theoretical framework to characterize the nature of discriminative information that can be extracted from score-function features when used in conjunction with labeled samples, and we employ efficient spectral decomposition algorithms (on matrices and tensors) for extracting discriminative components. The advantage of employing tensor-valued features is that richer discriminative information can be extracted, in the form of an overcomplete representation. We thus present a novel framework for employing generative models of the input for discriminative learning.
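
    A minimal sketch of the score-function idea, assuming a Gaussian input whose first-order score is s(x) = -Σ⁻¹(x - μ): by Stein's identity, the cross-moment E[y · s(x)] exposes the direction through which the label depends on the input. The data-generating model below is an illustrative assumption, not the paper's general framework.

```python
# First-order score features for a Gaussian input: s(x) = -Sigma^{-1}(x - mu).
import numpy as np

rng = np.random.default_rng(2)
n, d = 5000, 10
mu, Sigma = np.zeros(d), np.eye(d)
X = rng.multivariate_normal(mu, Sigma, size=n)

w = np.zeros(d); w[0] = 1.0             # hidden discriminative direction
y = X @ w + 0.1 * rng.normal(size=n)    # labels depend on x only through w

score = -np.linalg.solve(Sigma, (X - mu).T).T   # s(x) for each sample

# Stein's identity: E[y * s(x)] = -E[grad_x y], i.e. proportional to -w here.
m1 = (y[:, None] * score).mean(axis=0)
print("recovered direction (up to sign/scale):", m1.round(2))
```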

    The supervised IBP: neighbourhood preserving infinite latent feature models

    We propose a probabilistic model to infer supervised latent variables in the Hamming space from observed data. Our model allows simultaneous inference of the number of binary latent variables and their values. The latent variables preserve the neighbourhood structure of the data, in the sense that objects in the same semantic concept have similar latent values, and objects in different concepts have dissimilar latent values. We formulate the supervised infinite latent variable problem based on an intuitive principle of pulling objects together if they are of the same type, and pushing them apart if they are not, and we combine this principle with a flexible Indian Buffet Process prior on the latent variables. We show that the inferred supervised latent variables can be used directly to perform a nearest neighbour search for the purpose of retrieval. We introduce a new application of dynamically extending hash codes, and show how to effectively couple the structure of the hash codes with the continuously growing structure of the neighbourhood preserving infinite latent feature space.
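
    A minimal sketch of the retrieval step, assuming binary codes are already available: search reduces to a nearest-neighbour lookup under Hamming distance. The random codes below stand in for codes inferred with the supervised IBP and are purely illustrative.

```python
# Nearest-neighbour retrieval over binary latent codes in Hamming space.
import numpy as np

rng = np.random.default_rng(3)
n_items, n_bits = 100, 16
codes = rng.integers(0, 2, size=(n_items, n_bits))   # binary latent features

def hamming_search(query, codes, top_k=5):
    """Return indices of the top_k codes closest to query in Hamming distance."""
    dists = (codes != query).sum(axis=1)
    return np.argsort(dists)[:top_k]

query = codes[0].copy()
query[:2] ^= 1                          # corrupt two bits of a known item
print("nearest items:", hamming_search(query, codes))  # item 0 ranks near the top
```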