Semiparametric Latent Factor Models
We propose a semiparametric model for regression and classification problems involving multiple response variables. The model makes use of a set of Gaussian processes to model the relationship to the inputs in a nonparametric fashion. Conditional dependencies between the responses can be captured through a linear mixture of the driving processes. This feature becomes important if some of the responses of predictive interest are less densely supplied by observed data than related auxiliary ones. We propose an efficient approximate inference scheme for this semiparametric model whose complexity is linear in the number of training data points.
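To make the "linear mixture of the driving processes" concrete, here is a minimal generative sketch. It assumes 1-D inputs, a squared-exponential kernel, and Gaussian observation noise; the function names, the `mixing` matrix, and all parameter values are hypothetical choices for illustration, not taken from the paper. It only samples from such a model and does not implement the approximate inference scheme the abstract mentions.

```python
import numpy as np

def rbf_kernel(x, lengthscale=1.0):
    """Squared-exponential covariance matrix for 1-D inputs x (assumed kernel)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_mixed_gp_responses(x, mixing, noise_std=0.1, seed=0):
    """Draw C response curves as a linear mixture of P independent latent GPs.

    x      : (n,) input locations
    mixing : (C, P) mixing matrix (hypothetical name)
    Returns (responses of shape (n, C), latent draws of shape (n, P)).
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    n_latent = mixing.shape[1]
    # Small jitter keeps the Cholesky factorization numerically stable.
    K = rbf_kernel(x) + 1e-6 * np.eye(n)
    chol = np.linalg.cholesky(K)
    # Each column is one independent GP draw evaluated at x.
    latent = chol @ rng.standard_normal((n, n_latent))
    # Responses share the latent processes through the mixing matrix.
    responses = latent @ mixing.T
    noise = noise_std * rng.standard_normal(responses.shape)
    return responses + noise, latent

x = np.linspace(0.0, 1.0, 50)
mixing = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])  # 3 responses, 2 latent GPs
y, u = sample_mixed_gp_responses(x, mixing)
```

Because the second response loads on both latent processes, it is correlated with the other two, which is the mechanism that would let a sparsely observed response borrow strength from densely observed ones.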
A Latent Variable Transformation Model Approach for Exploring Dysphagia
Multiple outcomes are often collected in applications where the quantity of interest cannot be measured directly or is difficult or expensive to measure. In a head and neck cancer study conducted at Dana‐Farber Cancer Institute, the investigators wanted to determine the effect of clinical and treatment factors on unobservable dysphagia through collected multiple outcomes of mixed types. Latent variable models are commonly adopted in this setting. These models stipulate that multiple collected outcomes are conditionally independent given the latent factor. Mixed types of outcomes (e.g., continuous vs. ordinal) and censored outcomes present statistical challenges, however, as a natural analog of the multivariate normal distribution does not exist for mixed data. Recently, Lin et al. proposed a semiparametric latent variable transformation model for mixed outcome data; however, it may not readily accommodate event time outcomes where censoring is present. In this paper, we extend the work of Lin et al. by proposing both semiparametric and parametric latent variable models that allow for the estimation of the latent factor in the presence of measurable outcomes of mixed types, including censored outcomes. Both approaches allow for a direct estimate of the treatment (or other covariate) effect on the unobserved latent variable, greatly enhancing the interpretability of the models. The semiparametric approach has the added advantage of allowing the relationship between the measurable outcomes and latent variables to be unspecified, rendering more robust inference. The parametric and semiparametric models can also be used together, providing a comprehensive modeling strategy for complicated latent variable problems. Copyright © 2014 John Wiley & Sons, Ltd. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/108613/1/sim6239.pd
Identification of a competing risks model with unknown transformations of latent failure times
This paper is concerned with identification of a competing risks model with unknown transformations of latent failure times. The model in this paper includes, as special cases, competing risks versions of proportional hazards, mixed proportional hazards, and accelerated failure time models. It is shown that covariate effects on latent failure times, cause-specific link functions, and the joint survivor function of the disturbance terms can be identified without relying on modelling the dependence between latent failure times parametrically or using an exclusion restriction among covariates. As a result, the paper provides an identification result on the joint survivor function of the latent failure times conditional on covariates.
Beyond Standard Assumptions - Semiparametric Models, A Dyadic Item Response Theory Model, and Cluster-Endogenous Random Intercept Models
In most statistical analyses, quantitative education researchers make simplifying assumptions regarding the manner in which their data were generated in order to answer their research questions. These assumptions can help to reduce the complexity of the problem, and allow the researcher to describe their data using a simpler, and oftentimes more interpretable, statistical model. However, making some of these assumptions when they are not true can lead to biased estimates and misleading answers. While the standard sets of assumptions associated with commonly-used statistical models are usually sufficient in a wide range of contexts, it will always be beneficial for education researchers to understand what they are, when they are reasonable, and how to modify them if necessary. This dissertation focuses on three of the most common models used in quantitative education research (viz. parametric models like Linear Models (LMs), Item Response Theory (IRT) models, and Random-Intercept Models (RIMs)), discusses the standard sets of assumptions that accompany these models, and then describes related models with less stringent sets of assumptions. In each of the following three chapters, we either explicitly unpack existing models that are useful but are currently still uncommon in the field of education research, or propose novel models and/or estimation strategies for these models. We begin in Chapter 1 with a common parametric model known as the Gaussian LM, and use it as a scaffold to better understand semiparametric models and their estimation. We begin by reviewing how the coefficients of the Gaussian LM are usually estimated using Maximum Likelihood (ML) or Least-Squares (LS). We then introduce the notion of an M-estimator as well as that of a Regular Asymptotically Linear estimator, and show how they relate to the ML estimator.
In particular, we introduce the notion of influence functions/curves and discuss their geometry together with concepts such as Hilbert spaces and tangent spaces. We then demonstrate, concretely, how to derive the so-called efficient influence function under the Gaussian LM, and show that it is precisely the influence function of the ML and (Ordinary) LS estimators. This shows that the ML estimator (at least under the Gaussian LM) is efficient. Using the foundation built, we move on from the Gaussian LM by relaxing both the assumption that the residuals are normally distributed, as well as the assumption that they have a constant variance, and define this as the Heteroskedastic Linear Model. Unlike the Gaussian LM, this is a semiparametric model. Where possible, we make use of intuition and analogous results from the parametric setting to help describe the workflow for obtaining an efficient estimator for the coefficients of the Heteroskedastic Linear Model. In particular, we derive the nuisance tangent space for this semiparametric model, and use it to obtain the efficient influence function for our model. We then show how to use the efficient influence function to obtain an efficient estimator (which happens to be the Weighted LS estimator) from the (Ordinary) LS estimator via a one-step approach as well as an estimating equations approach. We then conclude by directing readers to more advanced material, including references on more modern approaches to estimating more general semiparametric models such as Targeted Maximum Likelihood Estimation. In Chapter 2, we focus on a class of measurement models known as Item Response Theory models which are useful for measuring latent traits of a subject based on the subject's response to items. 
We relax the condition that the responses are only a result of the individual's latent trait (and possibly an external rater), and propose a dyadic Item Response Theory (dIRT) model for measuring interactions of pairs of individuals when the responses to items represent the actions (or behaviors, perceptions, etc.) of each individual (actor) made within the context of a dyad formed with another individual (partner). Examples of its use in education include the assessment of collaborative problem solving among students, or the evaluation of intra-departmental dynamics among teachers. The dIRT model generalizes both Item Response Theory models for measurement and the Social Relations Model for dyadic data. Here, the responses of an actor when paired with a partner are modeled as a function of not only the actor's inclination to act and the partner's tendency to elicit that action, but also the unique relationship of the pair, represented by two directional, possibly correlated, interaction latent variables. We discuss generalizations such as accommodating triads or larger groups, but focus on demonstrating the key idea in the dyadic case. We show that estimation may be performed using Markov-chain Monte Carlo implemented in Stan, making it straightforward to extend the dIRT model in various ways. Specifically, we show how the basic dIRT model can be extended to accommodate latent regressions, random effects, and distal outcomes. We perform a simulation study that demonstrates that our estimation approach performs well.
In the absence of educational data of this form, we demonstrate the usefulness of our proposed approach using speed-dating data instead, and find new evidence of pairwise interactions between participants, describing a mutual attraction that is inadequately characterized by individual properties alone. Finally, in Chapter 3, we consider the often implicit assumption made when estimating the coefficients of structural Random Intercept Models (RIMs) that covariates at all levels do not co-vary with the random intercepts. A violation of this assumption (called cluster-level endogeneity) leads to inconsistent estimates when using standard estimation procedures. For two-level RIMs with such endogeneity, Hausman and Taylor (HT) devised a consistent multi-step instrumental variable estimator using only internal instruments. We, instead, approach this problem by explicitly modeling the endogeneity using a Structural Equation Model (SEM). In this chapter, we compare, through simulation, the HT and SEM estimators, and evaluate their asymptotic and finite sample properties. We show that the SEM approach is also flexible enough to deal with different exchangeability assumptions for the covariates (e.g., whether the correlations between pairs of all units in a cluster are the same) and investigate how these exchangeability assumptions affect finite sample properties of the HT estimator. For the simulations, we propose a new procedure for generating cluster- and unit-level covariates and random intercepts with a fully flexible covariance structure. We also compare our approach to another common approach known as Multilevel Matching using data from the High School and Beyond survey.
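The Chapter 1 claim in the abstract above, that a one-step correction of the Ordinary LS estimator yields the Weighted LS estimator, can be checked numerically. The sketch below is an illustration under strong assumptions and is not the dissertation's code: the conditional variances are treated as known (in practice they would have to be estimated), and the function names and simulated data are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def wls(X, y, variance):
    """Weighted least squares with weights 1 / variance."""
    w = 1.0 / variance
    return np.linalg.solve((X * w[:, None]).T @ X, X.T @ (w * y))

def one_step_update(X, y, beta_init, variance):
    """One-step update of beta_init using the variance-weighted score.

    Assumes the conditional variances are known; the update adds the
    inverse weighted information times the weighted residual score.
    """
    w = 1.0 / variance
    resid = y - X @ beta_init
    info = (X * w[:, None]).T @ X   # sum_i x_i x_i^T / sigma_i^2
    score = X.T @ (w * resid)       # sum_i x_i r_i / sigma_i^2
    return beta_init + np.linalg.solve(info, score)

# Simulated heteroskedastic data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.uniform(1.0, 3.0, n)])
variance = X[:, 1] ** 2             # error variance grows with the covariate
y = X @ np.array([1.0, 2.0]) + np.sqrt(variance) * rng.standard_normal(n)

beta_ols = ols(X, y)
beta_one_step = one_step_update(X, y, beta_ols, variance)
beta_wls = wls(X, y, variance)
```

Because the model is linear in the coefficients, the one-step update lands exactly on the WLS solution whatever the starting value, so `beta_one_step` and `beta_wls` agree up to floating-point error.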
A Bayesian semiparametric latent variable model for mixed responses
In this article we introduce a latent variable model (LVM) for mixed ordinal and continuous responses, where covariate effects on the continuous latent variables are modelled through a flexible semiparametric predictor. We extend existing LVMs with simple linear covariate effects by including nonparametric components for nonlinear effects of continuous covariates and interactions with other covariates as well as spatial effects. Full Bayesian modelling is based on penalized spline and Markov random field priors and is performed by computationally efficient Markov chain Monte Carlo (MCMC) methods. We apply our approach to a large German social science survey which motivated our methodological development.
Geoadditive Latent Variable Modelling of Count Data on Multiple Sexual Partnering in Nigeria
The 2005 National HIV/AIDS and Reproductive Health Survey in Nigeria provides evidence that multiple sexual partnering increases the risk of contracting HIV and other sexually transmitted diseases. Therefore, partner reduction is one of the prevention strategies to accomplish the Millennium Development Goal of halting and reversing the spread of HIV/AIDS. In order to explore possible associations between sexual partnering and some risk factors, this paper utilizes a novel Bayesian geoadditive latent variable model for count outcomes. This allows us to simultaneously analyze linear and nonlinear effects of covariates as well as spatial variations of one or more latent variables, such as attitude towards multiple partnering, which in turn directly influences the multivariate observable outcomes or indicators. Influence of demographic factors such as age, gender, locality, state of residence, educational attainment, etc., and knowledge about HIV/AIDS on attitude towards multiple partnering is also investigated. Results can provide insights to policy makers with the aim of reducing the spread of HIV and AIDS among the Nigerian populace through partner reduction.
A geoadditive Bayesian latent variable model for Poisson indicators
We introduce a new latent variable model with count variable indicators, where usual linear parametric effects of covariates, nonparametric effects of continuous covariates and spatial effects on the continuous latent variables are modelled through a geoadditive predictor. Bayesian modelling of nonparametric functions and spatial effects is based on penalized spline and Markov random field priors. Full Bayesian inference is performed via an auxiliary variable Gibbs sampling technique, using a recent suggestion of Frühwirth-Schnatter and Wagner (2006). As an advantage, our Poisson indicator latent variable model can be combined with semiparametric latent variable models for mixed binary, ordinal and continuous indicator variables within a unified and coherent framework for modelling and inference. A simulation study investigates performance, and an application to post-war human security in Cambodia illustrates the approach.