Sparse Probit Linear Mixed Model
Linear Mixed Models (LMMs) are important tools in statistical genetics. When
used for feature selection, they can identify a sparse set of genetic traits
that best predict a continuous phenotype of interest, while simultaneously
correcting for various confounding factors such as age, ethnicity and
population structure. Formulated as models for linear regression, LMMs have
been restricted to continuous phenotypes. We introduce the Sparse Probit Linear
Mixed Model (Probit-LMM), where we generalize the LMM modeling paradigm to
binary phenotypes. As a technical challenge, the model no longer possesses a
closed-form likelihood function. In this paper, we present a scalable
approximate inference algorithm that lets us fit the model to high-dimensional
data sets. We show on three real-world examples from different domains that in
the setup of binary labels, our algorithm leads to better prediction accuracies
and also selects features which show less correlation with the confounding
factors.
Comment: Published version, 21 pages, 6 figure
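To make concrete why moving from continuous to binary phenotypes removes the closed-form likelihood, here is a hedged sketch of a generic probit model with correlated noise. The symbols ($X$, $w$, $\epsilon$, $\Sigma$) are illustrative assumptions and are not taken from the paper's notation.

```latex
% Binary label obtained by thresholding a latent linear model with correlated noise
y_i = \operatorname{sign}\!\left(x_i^\top w + \epsilon_i\right),
\qquad \epsilon \sim \mathcal{N}(0, \Sigma)

% The likelihood is a Gaussian orthant probability over the latent vector z
P(y \mid X, w) = \int_{\{z \,:\, y_i z_i \ge 0 \ \forall i\}}
  \mathcal{N}(z;\, Xw,\, \Sigma)\, \mathrm{d}z
```

When $\Sigma$ is not diagonal, this integral does not factorize over samples, which is one way to see the need for approximate inference.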
A vine copula mixed effect model for trivariate meta-analysis of diagnostic test accuracy studies accounting for disease prevalence
A bivariate copula mixed model has recently been proposed to synthesize diagnostic test accuracy studies, and it has been shown to be superior to the standard generalized linear mixed model in this context. Here, we use trivariate vine copulas to extend the bivariate meta-analysis of diagnostic test accuracy studies by accounting for disease prevalence. Our vine copula mixed model includes the trivariate generalized linear mixed model as a special case and can also operate on the original scale of sensitivity, specificity, and disease prevalence. Our general methodology is illustrated by re-analyzing the data of two published meta-analyses. Our study suggests that there can be an improvement over the trivariate generalized linear mixed model in fit to the data, and it makes the argument for moving to vine copula random effects models, especially because of their richness, including reflection asymmetric tail dependence, and their computational feasibility despite their three-dimensionality.
Model Selection in Linear Mixed Models
Linear mixed effects models are highly flexible in handling a broad range of
data types and are therefore widely used in applications. A key part in the
analysis of data is model selection, which often aims to choose a parsimonious
model with other desirable properties from a possibly very large set of
candidate statistical models. Over the last 5-10 years the literature on model
selection in linear mixed models has grown extremely rapidly. The problem is
much more complicated than in linear regression because selection on the
covariance structure is not straightforward due to computational issues and
boundary problems arising from positive semidefinite constraints on covariance
matrices. To obtain a better understanding of the available methods, their
properties and the relationships between them, we review a large body of
literature on linear mixed model selection. We arrange, implement, discuss and
compare model selection methods based on four major approaches: information
criteria such as AIC or BIC, shrinkage methods based on penalized loss
functions such as LASSO, the Fence procedure and Bayesian techniques.
Comment: Published at http://dx.doi.org/10.1214/12-STS410 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
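As a small illustration of the information-criteria approach named in the abstract above, the following Python sketch computes AIC and BIC from a fitted model's maximized log-likelihood. The candidate names, log-likelihood values, and parameter counts are made-up numbers for illustration only; how to count parameters in a mixed model is itself part of the selection problem and is simply assumed here.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "random_intercept": (-520.3, 4),
    "random_intercept_and_slope": (-518.9, 6),
}
n_obs = 200  # hypothetical sample size

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs))
    for name, (ll, k) in candidates.items()
}

# Lower is better for both criteria.
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

With more than about 7 observations, BIC penalizes each extra parameter more heavily than AIC does, so it tends to prefer the more parsimonious candidate.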
Functional Linear Mixed Models for Irregularly or Sparsely Sampled Data
We propose an estimation approach to analyse correlated functional data which
are observed on unequal grids or even sparsely. The model we use is a
functional linear mixed model, a functional analogue of the linear mixed model.
Estimation is based on dimension reduction via functional principal component
analysis and on mixed model methodology. Our procedure allows the decomposition
of the variability in the data as well as the estimation of mean effects of
interest and borrows strength across curves. Confidence bands for mean effects
can be constructed conditional on estimated principal components. We provide
R-code implementing our approach. The method is motivated by and applied to
data from speech production research.
Latitude: A Model for Mixed Linear-Tropical Matrix Factorization
Nonnegative matrix factorization (NMF) is one of the most frequently-used
matrix factorization models in data analysis. A significant reason for the
popularity of NMF is its interpretability and the 'parts of whole'
interpretation of its components. Recently, max-times, or subtropical, matrix
factorization (SMF) has been introduced as an alternative model with an equally
interpretable 'winner takes it all' interpretation. In this paper we propose a
new mixed linear-tropical model, and a new algorithm, called Latitude, that
combines NMF and SMF, being able to smoothly alternate between the two. In our
model, the data is modeled using the latent factors and latent parameters that
control whether the factors are interpreted as NMF or SMF features, or their
mixtures. We present an algorithm for our novel matrix factorization. Our
experiments show that our algorithm improves over both baselines, and can yield
interpretable results that reveal more of the latent structure than either NMF
or SMF alone.
Comment: 14 pages, 6 figures. To appear in 2018 SIAM International Conference
on Data Mining (SDM '18). For the source code, see
https://people.mpi-inf.mpg.de/~pmiettin/linear-tropical
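To illustrate the two models that the abstract above says are combined, the sketch below contrasts the ordinary (linear) matrix product used in NMF with the max-times product used in subtropical factorization. It is not the Latitude algorithm and does not implement the mixing parameters; the function names and the small matrices are hypothetical.

```python
import numpy as np


def linear_product(B: np.ndarray, C: np.ndarray) -> np.ndarray:
    """Ordinary matrix product: entry (i, j) is sum_k B[i, k] * C[k, j]."""
    return B @ C


def max_times_product(B: np.ndarray, C: np.ndarray) -> np.ndarray:
    """Max-times product: entry (i, j) is max_k B[i, k] * C[k, j]."""
    # Broadcast to shape (rows of B, inner dimension, columns of C),
    # then take the maximum over the inner dimension instead of the sum.
    return (B[:, :, None] * C[None, :, :]).max(axis=1)


# Small nonnegative factors, chosen only for illustration.
B = np.array([[1.0, 2.0], [0.0, 3.0]])
C = np.array([[4.0, 0.0], [1.0, 2.0]])

nmf_reconstruction = linear_product(B, C)     # "parts of whole": contributions add up
smf_reconstruction = max_times_product(B, C)  # "winner takes it all": largest term wins
```

For nonnegative factors, each max-times entry is at most the corresponding linear entry, since a maximum of nonnegative terms cannot exceed their sum.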
Linear mixed models with endogenous covariates: modeling sequential treatment effects with application to a mobile health study
Mobile health is a rapidly developing field in which behavioral treatments
are delivered to individuals via wearables or smartphones to facilitate
health-related behavior change. Micro-randomized trials (MRT) are an
experimental design for developing mobile health interventions. In an MRT the
treatments are randomized numerous times for each individual over the course of the
trial. Along with assessing treatment effects, behavioral scientists aim to
understand between-person heterogeneity in the treatment effect. A natural
approach is the familiar linear mixed model. However, directly applying linear
mixed models is problematic because potential moderators of the treatment
effect are frequently endogenous, that is, they may depend on prior treatment. We
discuss model interpretation and biases that arise in the absence of additional
assumptions when endogenous covariates are included in a linear mixed model. In
particular, when there are endogenous covariates, the coefficients no longer
have the customary marginal interpretation. However, these coefficients still
have a conditional-on-the-random-effect interpretation. We provide an
additional assumption that, if true, allows scientists to use standard software
to fit linear mixed models with endogenous covariates, and person-specific
predictions of effects can be provided. As an illustration, we assess the
effect of activity suggestion in the HeartSteps MRT and analyze the
between-person treatment effect heterogeneity.
A Note on the Identifiability of Generalized Linear Mixed Models
I present here a simple proof that, under general regularity conditions, the
standard parametrization of generalized linear mixed models is identifiable. The
proof is based on the assumptions of generalized linear mixed models on the
first and second order moments and some general mild regularity conditions,
and, therefore, is extensible to quasi-likelihood based generalized linear
models. In particular, binomial and Poisson mixed models with dispersion
parameter are identifiable when equipped with the standard parametrization.
Comment: 9 pages, no figure
