    Analyse von Längsschnittdaten mit fehlenden Werten: Grundlagen, Verfahren und Anwendungen [Analysis of longitudinal data with missing values: foundations, methods, and applications]

    The first part gives an overview of the foundations of empirical social research and an introduction to the estimation of linear fixed and random effects panel models. In addition, the semiparametric estimation of binary panel models based on generalized estimating equations (GEE) is addressed. The standard GEE approach, in which the covariance structure parameters are treated as nuisance parameters, is then generalized to include estimating equations for both mean and covariance structure parameters. This approach allows the estimation of simultaneous equations panel models with mixed continuous and ordered categorical outcomes, which is discussed in detail. As a measure of the explanatory power of the model, a pseudo-R^2 measure is developed and evaluated. In the second part, fundamental concepts for the analysis of data sets with missing values are introduced and discussed, and various approaches and methods to compensate for missing data are reviewed. The method of multiple imputation and its application are treated in detail. The approaches and techniques proposed and discussed in the first two parts are tested in various simulation studies and illustrated with examples. The last chapter deals with possibly time-varying effects of variables that can be interpreted as social investments on variables that can be interpreted as subjective and objective gratification variables. The resulting two-equation panel model with mixed continuous and ordered categorical outcomes is estimated with the approach described in the first part, based on a data set with missing values. To compensate for missing data, a combined weighting and multiple imputation approach is adopted.
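
    The multiple imputation step mentioned in this abstract is usually completed by pooling the analyses of the M imputed data sets. As a minimal sketch, assuming the standard pooling rules attributed to Rubin (the abstract does not state which pooling rule the author uses), the pooled estimate is the mean of the M estimates, and the total variance is the average within-imputation variance plus (1 + 1/M) times the between-imputation variance. The function name `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M completed-data estimates and their variances (Rubin's rules).

    Hypothetical helper for illustration; not taken from the thesis.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor M - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Made-up estimates of one coefficient from M = 5 imputed data sets.
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]
q_bar, total = pool_rubin(estimates, variances)
print(round(q_bar, 3), round(total, 3))
```

    The between-imputation term is what distinguishes this from simply averaging standard errors: it carries the extra uncertainty caused by the missing values.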

    Analysis on binary responses with ordered covariates and missing data

    We consider the situation of two ordered categorical variables and a binary outcome variable, where one or both of the categorical variables may have missing values. The goal is to estimate the probability of response of the outcome variable for each cell of the contingency table of categorical variables while incorporating the fact that the categorical variables are ordered. The probability of response is assumed to change monotonically as each of the categorical variables changes level. A probability model is used in which the response is binomial with parameters p_ij for each cell (i, j) and the number of observations in each cell is multinomial. Estimation approaches that incorporate Gibbs sampling with order restrictions on p_ij induced via a prior distribution, two-dimensional isotonic regression and multiple imputation to handle missing values are considered. The methods are compared in a simulation study. Using a fully Bayesian approach with a strong prior distribution to induce ordering can lead to large gains in efficiency, but can also induce bias. Utilizing isotonic regression can lead to modest gains in efficiency, while minimizing bias and guaranteeing that the order constraints are satisfied. A hybrid of isotonic regression and Gibbs sampling appears to work well across a variety of scenarios. The methods are applied to a pancreatic cancer case–control study with two biomarkers. Copyright © 2007 John Wiley & Sons, Ltd. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/56130/1/2815_ftp.pd
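
    To illustrate how isotonic regression enforces an ordering on response probabilities, here is a minimal sketch of the pool-adjacent-violators algorithm. It is a deliberate simplification: it handles a single ordered variable, whereas the paper uses two-dimensional isotonic regression over cells (i, j), and it is not the authors' implementation. The function name `pava` and the counts are hypothetical.

```python
def pava(values, weights):
    """Weighted least-squares fit that is non-decreasing in the index.

    One-dimensional pool-adjacent-violators; weights must be positive.
    """
    blocks = []  # each block: [weighted mean, total weight, number of cells]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            w_tot = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w_tot, w_tot, n1 + n2])
    fitted = []
    for m, _, n in blocks:
        fitted.extend([m] * n)  # every cell in a block gets the block mean
    return fitted


# Made-up responder counts and sample sizes for four ordered levels.
successes = [2, 6, 4, 9]
trials = [10, 10, 10, 10]
raw = [s / n for s, n in zip(successes, trials)]
fitted = pava(raw, trials)
print([round(p, 3) for p in fitted])
```

    Using the cell sample sizes as weights makes each pooled value the observed response proportion of the merged cells, so the fitted probabilities stay in [0, 1] and respect the assumed monotone order.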

    Penalized EM algorithm and copula skeptic graphical models for inferring networks for mixed variables

    In this article, we consider the problem of reconstructing networks for continuous, binary, count and discrete ordinal variables by estimating a sparse precision matrix in Gaussian copula graphical models. We propose two approaches: $\ell_1$-penalized extended rank likelihood with a Monte Carlo Expectation-Maximization algorithm (copula EM glasso) and copula skeptic with pair-wise copula estimation for copula Gaussian graphical models. The proposed approaches help to infer networks arising from nonnormal and mixed variables. We demonstrate the performance of our methods through simulation studies and analysis of breast cancer genomic and clinical data and maize genetics data.
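
    The idea behind a rank-based ("skeptic") correlation estimate can be sketched for a pair of continuous variables. Under a Gaussian copula, Kendall's tau and the latent correlation rho satisfy rho = sin(pi/2 * tau), so a rank statistic yields a correlation estimate without assuming normal margins. This sketch assumes tie-free data and covers only that continuous case. The paper's pair-wise copula estimation for binary, count and ordinal variables is not reproduced here, and the function names are hypothetical.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length samples; assumes no ties."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2 * s / (n * (n - 1))


def skeptic_correlation(x, y):
    """Latent Gaussian-copula correlation implied by Kendall's tau."""
    return math.sin(math.pi / 2 * kendall_tau(x, y))


# Made-up data: a monotone but nonlinear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(v) for v in x]
print(skeptic_correlation(x, y))
```

    Because the estimate depends only on ranks, any strictly increasing transformation of a variable leaves it unchanged. A matrix of such pairwise estimates could then be passed to a sparse precision-matrix estimator such as the graphical lasso.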

    Kernel-based system identification from noisy and incomplete input-output data

    In this contribution, we propose a kernel-based method for the identification of linear systems from noisy and incomplete input-output datasets. We model the impulse response of the system as a Gaussian process whose covariance matrix is given by the recently introduced stable spline kernel. We adopt an empirical Bayes approach to estimate the posterior distribution of the impulse response given the data. The noiseless and missing data samples, together with the kernel hyperparameters, are estimated by maximizing the joint marginal likelihood of the input and output measurements. To compute the marginal-likelihood maximizer, we build a solution scheme based on the Expectation-Maximization method. Simulations on a benchmark dataset show the effectiveness of the method. Comment: 16 pages, submitted to IEEE Conference on Decision and Control 201