A Cluster Elastic Net for Multivariate Regression
We propose a method for estimating coefficients in multivariate regression
when there is a clustering structure to the response variables. The proposed
method includes a fusion penalty, to shrink the difference in fitted values
from responses in the same cluster, and an L1 penalty for simultaneous variable
selection and estimation. The method can be used when the grouping structure of
the response variables is known or unknown. When the clustering structure is
unknown, the method will simultaneously estimate the clusters of the response
and the regression coefficients. Theoretical results are presented for the
penalized least squares case, including asymptotic results allowing for p >> n.
We extend our method to the setting where the responses are binomial variables.
We propose a coordinate descent algorithm for both the normal and binomial
likelihood, which can easily be extended to other generalized linear model
(GLM) settings. Simulations and data examples from business operations and
genomics are presented to show the merits of both the least squares and
binomial methods.
Comment: 37 pages, 11 figures
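As a rough illustration of the penalty structure described in this abstract for the least squares case, the sketch below evaluates an objective that combines squared-error loss, an L1 penalty on the coefficients, and a fusion penalty on differences in fitted values between responses in the same cluster. The function name, the tuning-parameter names `lam` and `gamma`, and the scaling constants are assumptions made for this example; the abstract does not give the exact formulation, and the paper's own may differ.

```python
import numpy as np

def cluster_elastic_net_objective(X, Y, B, clusters, lam, gamma):
    """Evaluate a penalized least-squares objective (illustrative sketch).

    X: (n, p) predictors; Y: (n, q) responses; B: (p, q) coefficients.
    clusters: list of lists of response (column) indices, assumed known here.
    lam, gamma: hypothetical names for the L1 and fusion tuning parameters.
    """
    n = X.shape[0]
    fitted = X @ B

    # Squared-error loss summed over all responses (scaling is an assumption).
    loss = np.sum((Y - fitted) ** 2) / (2 * n)

    # L1 penalty for simultaneous variable selection and estimation.
    l1 = lam * np.sum(np.abs(B))

    # Fusion penalty: squared differences in fitted values for every pair
    # of responses that share a cluster.
    fusion = 0.0
    for members in clusters:
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                diff = fitted[:, members[a]] - fitted[:, members[b]]
                fusion += np.sum(diff ** 2)
    fusion *= gamma / (2 * n)

    return loss + l1 + fusion
```

When all responses in a cluster share the same coefficient vector, the fusion term is zero, so increasing `gamma` only penalizes coefficient matrices whose fitted values disagree within a cluster.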
Bayesian correction for covariate measurement error: a frequentist evaluation and comparison with regression calibration
Bayesian approaches for handling covariate measurement error are well
established, and yet arguably are still relatively little used by researchers.
For some this is likely due to unfamiliarity or disagreement with the Bayesian
inferential paradigm. For others a contributory factor is the inability of
standard statistical packages to perform such Bayesian analyses. In this paper
we first give an overview of the Bayesian approach to handling covariate
measurement error, and contrast it with regression calibration (RC), arguably
the most commonly adopted approach. We then argue why the Bayesian approach has
a number of statistical advantages compared to RC, and demonstrate that
implementing the Bayesian approach is usually quite feasible for the analyst.
Next we describe the closely related maximum likelihood and multiple imputation
approaches, and explain why we believe the Bayesian approach to generally be
preferable. We then empirically compare the frequentist properties of RC and
the Bayesian approach through simulation studies. The flexibility of the
Bayesian approach to handle both measurement error and missing data is then
illustrated through an analysis of data from the Third National Health and
Nutrition Examination Survey.
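To make the regression calibration (RC) idea concrete, the following is a minimal simulation sketch under a classical measurement error model with a single covariate and a known error variance. It does not implement the Bayesian approach discussed in the abstract, and the function name, parameter names, and simulated data are all assumptions made for this example.

```python
import numpy as np

def simulate_regression_calibration(n=20000, beta=2.0, sigma_x=1.0,
                                    sigma_u=1.0, seed=0):
    """Compare a naive slope with a regression-calibration slope (sketch)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_x, n)         # true covariate, unobserved in practice
    w = x + rng.normal(0.0, sigma_u, n)     # error-prone measurement (classical error)
    y = beta * x + rng.normal(0.0, 1.0, n)  # outcome depends on the true covariate

    # Naive analysis: regress y on w directly; the slope is attenuated.
    naive = np.cov(w, y)[0, 1] / np.var(w, ddof=1)

    # RC step: replace w with an estimate of E[x | w], treating the
    # measurement error variance sigma_u**2 as known (an assumption).
    reliability = (np.var(w, ddof=1) - sigma_u ** 2) / np.var(w, ddof=1)
    x_hat = w.mean() + reliability * (w - w.mean())
    rc = np.cov(x_hat, y)[0, 1] / np.var(x_hat, ddof=1)

    return naive, rc

naive_slope, rc_slope = simulate_regression_calibration()
```

With equal variances for the true covariate and the error, the naive slope should sit near half the true slope, while the RC slope should be close to the true value of `beta`.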