Analyse von Längsschnittdaten mit fehlenden Werten: Grundlagen, Verfahren und Anwendungen (Analysis of Longitudinal Data with Missing Values: Foundations, Methods, and Applications)
The first part gives an overview of the foundations of empirical social research and an introduction to the estimation of linear fixed and random effects panel models. In addition, the semiparametric estimation of binary panel models based on generalized estimating equations (GEE) is addressed. The standard GEE approach, in which the covariance structure parameters are treated as nuisance parameters, is then generalized to include estimating equations for both mean and covariance structure parameters. This approach allows the estimation of simultaneous equations panel models with mixed continuous and ordered categorical outcomes, which is discussed in detail. As a measure of the explanatory power of the model, a pseudo-R^2 measure is developed and evaluated. In the second part, fundamental concepts for the analysis of data sets with missing values are introduced and discussed, and various approaches and methods to compensate for missing data are reviewed. The method of multiple imputation and its application are treated in detail. The approaches and techniques proposed and discussed in the first two parts are tested and illustrated with the help of various simulation studies and examples, respectively. The last chapter deals with possibly time-varying effects of variables that can be interpreted as social investments on variables that can be interpreted as subjective and objective gratification variables. The resulting two-equation panel model with mixed continuous and ordered categorical outcomes is estimated with the approach described in the first part, based on a data set with missing values. To compensate for missing data, a mixed weighting and multiple imputation approach is adopted.
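The abstract above treats multiple imputation in detail but does not say how results from the imputed data sets are combined. A minimal sketch of the standard pooling step (Rubin's rules) follows, assuming each of m imputed data sets has already produced a point estimate and a squared standard error. The function name `pool_rubin` and the numbers are hypothetical illustrations, not taken from the work itself.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical results from m = 3 imputed data sets (made-up numbers).
est, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The total variance adds the between-imputation spread, inflated by (1 + 1/m), to the average within-imputation variance, so uncertainty due to the missing values is carried into the pooled standard error.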
Analysis on binary responses with ordered covariates and missing data
We consider the situation of two ordered categorical variables and a binary outcome variable, where one or both of the categorical variables may have missing values. The goal is to estimate the probability of response of the outcome variable for each cell of the contingency table of categorical variables while incorporating the fact that the categorical variables are ordered. The probability of response is assumed to change monotonically as each of the categorical variables changes level. A probability model is used in which the response is binomial with parameters p_ij for each cell (i, j) and the number of observations in each cell is multinomial. Estimation approaches that incorporate Gibbs sampling with order restrictions on p_ij induced via a prior distribution, two-dimensional isotonic regression and multiple imputation to handle missing values are considered. The methods are compared in a simulation study. Using a fully Bayesian approach with a strong prior distribution to induce ordering can lead to large gains in efficiency, but can also induce bias. Utilizing isotonic regression can lead to modest gains in efficiency, while minimizing bias and guaranteeing that the order constraints are satisfied. A hybrid of isotonic regression and Gibbs sampling appears to work well across a variety of scenarios. The methods are applied to a pancreatic cancer case–control study with two biomarkers. Copyright © 2007 John Wiley & Sons, Ltd. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/56130/1/2815_ftp.pd
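To make the order restriction concrete, here is a minimal sketch of isotonic regression for binomial proportions using the pool-adjacent-violators algorithm. It is a one-dimensional simplification: the abstract describes a two-dimensional version over cells (i, j), which is not reproduced here. The function name `pava` and the counts are hypothetical.

```python
def pava(successes, totals):
    """Nondecreasing fit to successes/totals, weighted by cell size.

    Assumes every entry of `totals` is positive.
    """
    blocks = []  # each block: [sum of successes, sum of totals, number of cells]
    for s, n in zip(successes, totals):
        blocks.append([s, n, 1])
        # Merge while the previous block's proportion exceeds the last one's.
        # Cross-multiplying avoids a division: s1/n1 > s2/n2 <=> s1*n2 > s2*n1.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s2, n2, c2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += n2
            blocks[-1][2] += c2
    fitted = []
    for s, n, c in blocks:
        fitted.extend([s / n] * c)  # every cell in a block shares one estimate
    return fitted


# Hypothetical counts: the raw proportions 0.2, 0.1, 0.6 violate monotonicity.
p_hat = pava([2, 1, 6], [10, 10, 10])
```

In this example the first two cells are pooled into a single estimate of 3/20, so the fitted proportions are nondecreasing while the third cell keeps its raw value.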
Penalized EM algorithm and copula skeptic graphical models for inferring networks for mixed variables
In this article, we consider the problem of reconstructing networks for
continuous, binary, count and discrete ordinal variables by estimating a sparse
precision matrix in Gaussian copula graphical models. We propose two
approaches: penalized extended rank likelihood with Monte Carlo
Expectation-Maximization algorithm (copula EM glasso) and copula skeptic with
pair-wise copula estimation for copula Gaussian graphical models. The proposed
approaches help to infer networks arising from nonnormal and mixed variables.
We demonstrate the performance of our methods through simulation studies and
analysis of breast cancer genomic and clinical data and maize genetics data.
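As a rough illustration of the rank-based idea behind a skeptic-type estimator, the sketch below computes Spearman's rho for one pair of variables and maps it to a Gaussian copula correlation with the transform 2·sin(π·rho/6). It is an assumption that this matches the pairwise estimation step the authors use. The sparse precision matrix step (graphical lasso) is omitted, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with ties given the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def skeptic_correlation(x, y):
    """Spearman's rho mapped to a Gaussian copula correlation estimate."""
    rho = pearson(average_ranks(x), average_ranks(y))
    return 2 * math.sin(math.pi / 6 * rho)


# Made-up data: y is a monotone but nonlinear function of x.
r = skeptic_correlation([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
```

Because only ranks enter the estimate, any monotone transformation of a variable leaves it unchanged, which is what makes this approach suitable for nonnormal and ordinal data.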
Kernel-based system identification from noisy and incomplete input-output data
In this contribution, we propose a kernel-based method for the identification
of linear systems from noisy and incomplete input-output datasets. We model the
impulse response of the system as a Gaussian process whose covariance matrix is
given by the recently introduced stable spline kernel. We adopt an empirical
Bayes approach to estimate the posterior distribution of the impulse response
given the data. The noiseless and missing data samples, together with the
kernel hyperparameters, are estimated by maximizing the joint marginal likelihood
of the input and output measurements. To compute the marginal-likelihood
maximizer, we build a solution scheme based on the Expectation-Maximization
method. Simulations on a benchmark dataset show the effectiveness of the
method. Comment: 16 pages, submitted to IEEE Conference on Decision and Control 201