6,043 research outputs found

Sparse Probit Linear Mixed Model

Author: Cunningham John P.
Kloft Marius
Lippert Christoph
Mandt Stephan
Nakajima Shinichi
Wenzel Florian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/07/2017

Linear Mixed Models (LMMs) are important tools in statistical genetics. When used for feature selection, they make it possible to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting for various confounding factors such as age, ethnicity and population structure. Formulated as models for linear regression, LMMs have been restricted to continuous phenotypes. We introduce the Sparse Probit Linear Mixed Model (Probit-LMM), where we generalize the LMM modeling paradigm to binary phenotypes. As a technical challenge, the model no longer possesses a closed-form likelihood function. In this paper, we present a scalable approximate inference algorithm that lets us fit the model to high-dimensional data sets. We show on three real-world examples from different domains that in the setup of binary labels, our algorithm leads to better prediction accuracies and also selects features which show less correlation with the confounding factors.

Comment: Published version, 21 pages, 6 figures

arXiv.org e-Print Archive

MDC Repository

Differential Privacy Applications to Bayesian and Linear Mixed Model Estimation

Author: Abowd John M.
Schneider Matthew J
Vilhuber Lars
Publication venue: DigitalCommons@ILR
Publication date: 01/01/2012

We consider a particular maximum likelihood estimator (MLE) and a computationally intensive Bayesian method for differentially private estimation of the linear mixed-effects model (LMM) with normal random errors. The LMM is important because it is used in small area estimation and detailed industry tabulations that present significant challenges for confidentiality protection of the underlying data. The differentially private MLE performs well compared to the regular MLE, but deteriorates as the protection increases for a problem in which the small-area variation is at the county level. More dimensions of random effects are needed to adequately represent the time dimension of the data, and for these cases the differentially private MLE cannot be computed. The direct Bayesian approach for the same model uses an informative, but reasonably diffuse, prior to compute the posterior predictive distribution for the random effects. The differential privacy of this approach is estimated by direct computation of the relevant odds ratios after deleting influential observations according to various criteria.

CiteSeerX

DigitalCommons@ILR

eCommons@Cornell

Tensor Regression with Applications in Neuroimaging Data Analysis

Author: Caffo B.
Casey B.
Davatzikos C.
de Lathauwer L.
de Leeuw J.
Fan J.
Frank I. E.
Friston K. J.
Hinrichs C.
Hongtu Zhu
Hua Zhou
Hung H.
Kang H.
Kolda T. G.
Lange K.
Lazar N. A.
Lexin Li
Li B.
Li Y.
Lindquist M.
Liu X.
Martino F. D.
McCullagh P.
Park S. W.
Polzehl J.
Qiu P.
Rao C. R.
Reiss P.
Rothenberg T. J.
Ryali S.
Sidiropoulos N. D.
Sowell E. R.
Tibshirani R.
Valera E. M.
van der Vaart A. W.
Worsley K. J.
Yue Y.
Zhou H.
Zou H.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2012

Classical regression methods treat covariates as a vector and estimate a corresponding vector of regression coefficients. Modern applications in medical imaging generate covariates of more complex form, such as multidimensional arrays (tensors). Traditional statistical and computational methods are proving insufficient for the analysis of these high-throughput data because of their ultrahigh dimensionality and complex structure. In this article, we propose a new family of tensor regression models that efficiently exploit the special structure of tensor covariates. Under this framework, ultrahigh dimensionality is reduced to a manageable level, resulting in efficient estimation and prediction. A fast and highly scalable estimation algorithm is proposed for maximum likelihood estimation, and its associated asymptotic properties are studied. Effectiveness of the new methods is demonstrated on both synthetic and real MRI imaging data.

Comment: 27 pages, 4 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

PubMed Central

Carolina Digital Repository