6 research outputs found
Isotone additive latent variable models
For manifest variables with additive noise and for a given number of latent variables with an assumed distribution, we propose to nonparametrically estimate the association between latent and manifest variables. Our estimation is a two-step procedure: first, it employs standard factor analysis to estimate the latent variables as theoretical quantiles of the assumed distribution; second, it employs the additive models' backfitting procedure to estimate the monotone nonlinear associations between latent and manifest variables. The estimated fit may suggest a different latent distribution or point to nonlinear associations. We show on simulated data how, based on mean squared errors, the nonparametric estimation improves on factor analysis. We then employ the new estimator on real data to illustrate its use for exploratory data analysis.
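The two-step idea in this abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes a single latent variable with a standard normal distribution, uses the first principal component as a stand-in for factor analysis, and replaces backfitting with one isotonic fit per manifest variable (which is what backfitting reduces to when there is only one additive term). The function names `pava`, `latent_quantiles`, and `isotone_fit` are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def pava(y):
    """Pool-adjacent-violators: least-squares nondecreasing fit to the sequence y."""
    blocks = []  # each block holds [mean, number of pooled points]
    for v in y:
        blocks.append([float(v), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    return np.concatenate([np.full(w, m) for m, w in blocks])


def latent_quantiles(X):
    """Step 1 (stand-in): score rows on the first principal component, then
    replace the scores by standard-normal quantiles of their ranks."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    v = vt[0] if vt[0].sum() >= 0 else -vt[0]  # fix the sign ambiguity
    ranks = (Z @ v).argsort().argsort()
    nd = NormalDist()
    return np.array([nd.inv_cdf((r + 0.5) / len(ranks)) for r in ranks])


def isotone_fit(X, z):
    """Step 2: nondecreasing fit of each manifest variable on the latent z."""
    order = np.argsort(z)
    fitted = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        fitted[order, j] = pava(X[order, j])
    return fitted


# Simulated data (assumed setup): three increasing functions of one latent.
rng = np.random.default_rng(0)
z_true = rng.standard_normal(500)
signal = np.column_stack([np.exp(z_true), z_true ** 3, z_true])
X = signal + 0.3 * rng.standard_normal(signal.shape)

z_hat = latent_quantiles(X)
iso = isotone_fit(X, z_hat)

# Linear least-squares fit on the same estimated latent, for comparison.
A = np.column_stack([np.ones_like(z_hat), z_hat])
lin = A @ np.linalg.lstsq(A, X, rcond=None)[0]
mse_iso = ((X - iso) ** 2).mean(axis=0)
mse_lin = ((X - lin) ** 2).mean(axis=0)
```

Here `mse_iso` and `mse_lin` are in-sample residual mean squared errors against the observed data, which is only a rough proxy for the comparison reported in the abstract.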
Identifiable and interpretable nonparametric factor analysis
Factor models have been widely used to summarize the variability of
high-dimensional data through a set of factors with much lower dimensionality.
Gaussian linear factor models have been particularly popular due to their
interpretability and ease of computation. However, in practice, data often
violate the multivariate Gaussian assumption. To characterize higher-order
dependence and nonlinearity, models that include factors as predictors in
flexible multivariate regression are popular, with GP-LVMs using Gaussian
process (GP) priors for the regression function and VAEs using deep neural
networks. Unfortunately, such approaches lack identifiability and
interpretability and tend to produce brittle and non-reproducible results. To
address these problems by simplifying the nonparametric factor model while
maintaining flexibility, we propose the NIFTY framework, which parsimoniously
transforms uniform latent variables using one-dimensional nonlinear mappings
and then applies a linear generative model. The induced multivariate
distribution falls into a flexible class while maintaining simple computation
and interpretation. We prove that this model is identifiable and empirically
study NIFTY using simulated data, observing good performance in density
estimation and data visualization. We then apply NIFTY to bird song data in an
environmental monitoring application.
Comment: 50 pages, 17 figures
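The generative structure described in this abstract (uniform latent variables, one-dimensional nonlinear mappings, then a linear model) can be sketched as a simulation. This is an illustration under stated assumptions, not the NIFTY implementation: the two mappings, the random loadings, the Gaussian noise, and the function name `simulate_transformed_uniform_factors` are all hypothetical choices.

```python
import numpy as np


def simulate_transformed_uniform_factors(n, loadings, mappings, noise_sd, rng):
    """Draw uniform latents, transform each with its own 1-D mapping,
    then apply a linear model with additive Gaussian noise (assumed)."""
    p, k = loadings.shape
    assert k == len(mappings), "one mapping per latent variable"
    u = rng.uniform(size=(n, k))                      # latents in [0, 1)
    eta = np.column_stack([g(u[:, j]) for j, g in enumerate(mappings)])
    x = eta @ loadings.T + noise_sd * rng.standard_normal((n, p))
    return u, eta, x


rng = np.random.default_rng(1)
mappings = [
    lambda u: 2.0 * u - 1.0,   # keeps the factor uniform, on [-1, 1)
    lambda u: -np.log1p(-u),   # exponential quantile function: a skewed factor
]
loadings = rng.standard_normal((5, 2))  # hypothetical 5 x 2 loading matrix
u, eta, x = simulate_transformed_uniform_factors(1000, loadings, mappings, 0.1, rng)
```

Because each mapping acts on a single latent coordinate, the shape of each transformed factor can be inspected on its own, which is the interpretability argument the abstract makes.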
Generalised latent variable models for location, scale, and shape parameters
Latent Variable Models (LVM) are widely used in social, behavioural, and educational sciences to uncover underlying associations in multivariate data using a smaller number of latent variables. However, the classical LVM framework has certain assumptions that can be restrictive in empirical applications. In particular, it assumes that the distribution of the observed variables belongs to the exponential family and that the latent variables influence only the conditional mean of the observed variables. This thesis addresses these limitations and contributes to the current literature in two ways. First, we propose a novel class of models called Generalised Latent Variable Models for Location, Scale, and Shape parameters (GLVM-LSS). These models use linear functions of latent factors to model location, scale, and shape parameters of the items’ conditional distributions. By doing so, we model higher order moments such as variance, skewness, and kurtosis in terms of the latent variables, providing a more flexible framework than classical factor models. The model parameters are estimated using maximum likelihood estimation. Second, we address the challenge of interpreting the GLVM-LSS, which can be complex due to its increased number of parameters. We propose a penalised maximum likelihood estimation approach with automatic selection of tuning parameters. This extends previous work on penalised estimation in the LVM literature to cases without closed-form solutions. Our findings suggest that modelling the entire distribution of items, not just the conditional mean, leads to improved model fit and deeper insights into how the items reflect the latent constructs they are intended to measure. To assess the performance of the proposed methods, we conduct extensive simulation studies and apply them to real-world data from educational testing and public opinion research. The results highlight the efficacy of the GLVM-LSS framework in capturing complex relationships between observed variables and latent factors, providing valuable insights for researchers in various fields.
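To make the idea of a latent factor driving more than the conditional mean concrete, the following sketch simulates items whose location and scale both depend linearly on a single factor. It is an illustration only: the normal conditional distribution, the log link for the scale, the coefficient values, and the function name `simulate_location_scale_items` are assumptions, and no shape parameters or estimation are included.

```python
import numpy as np


def simulate_location_scale_items(z, loc_coef, log_scale_coef, rng):
    """Items are normal given z (assumed), with
    mean = a0 + a1 * z and log(sd) = b0 + b1 * z for each item."""
    z = np.asarray(z)[:, None]                               # shape (n, 1)
    mu = loc_coef[:, 0] + loc_coef[:, 1] * z                 # shape (n, p)
    sigma = np.exp(log_scale_coef[:, 0] + log_scale_coef[:, 1] * z)
    return mu + sigma * rng.standard_normal(mu.shape)


rng = np.random.default_rng(2)
z = rng.standard_normal(5000)
loc_coef = np.array([[0.0, 1.0], [0.5, 0.8]])          # rows: items; columns: a0, a1
log_scale_coef = np.array([[-0.5, 0.6], [-0.5, 0.0]])  # item 2 has constant scale
y = simulate_location_scale_items(z, loc_coef, log_scale_coef, rng)

# Residual spread around the true conditional mean, split by the factor's sign.
resid = y - (loc_coef[:, 0] + loc_coef[:, 1] * z[:, None])
sd_low = resid[z < 0].std(axis=0)
sd_high = resid[z >= 0].std(axis=0)
```

For the first item the residual spread grows with the factor, while for the second it stays roughly constant; a model that links the factor only to the conditional mean could not represent the first pattern.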