166 research outputs found

    Stop or Continue Data Collection: A Nonignorable Missing Data Approach for Continuous Variables

    We present an approach to inform decisions about nonresponse follow-up sampling. The basic idea is (i) to create completed samples by imputing nonrespondents' data under various assumptions about the nonresponse mechanisms, (ii) to take hypothetical samples of varying sizes from the completed samples, and (iii) to compute and compare measures of accuracy and cost for the different proposed sample sizes. As part of the methodology, we present a new approach for generating imputations for multivariate continuous data with nonignorable unit nonresponse. We fit mixtures of multivariate normal distributions to the respondents' data, and adjust the probabilities of the mixture components to generate nonrespondents' distributions with desired features. We illustrate the approaches using data from the 2007 U.S. Census of Manufactures.
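    The component-reweighting step described in this abstract can be pictured with a small sketch. It is a univariate simplification under stated assumptions, not the authors' procedure: the fitted mixture parameters, the multipliers, and the helper names `reweight`, `mixture_mean`, and `draw` are all hypothetical, and the abstract's method works with multivariate normal mixtures fitted to real respondent data.

```python
import random

# Hypothetical fitted mixture for respondents: (probability, mean, sd) per component.
# These numbers are illustrative assumptions, not estimates from any data set.
respondent_mixture = [(0.5, 10.0, 1.0), (0.3, 20.0, 2.0), (0.2, 35.0, 3.0)]


def reweight(mixture, multipliers):
    """Scale each component probability by a multiplier, then renormalize."""
    raw = [w * m for (w, _, _), m in zip(mixture, multipliers)]
    total = sum(raw)
    return [(r / total, mu, sd) for r, (_, mu, sd) in zip(raw, mixture)]


def mixture_mean(mixture):
    """Mean of a normal mixture: probability-weighted average of component means."""
    return sum(w * mu for w, mu, _ in mixture)


def draw(mixture, n, rng):
    """Draw n values: pick a component by its probability, then sample from it."""
    probabilities = [w for w, _, _ in mixture]
    components = rng.choices(mixture, weights=probabilities, k=n)
    return [rng.gauss(mu, sd) for _, mu, sd in components]


# Assumed nonresponse scenario: nonrespondents sit more often in the
# high-valued component and less often in the low-valued one.
nonrespondent_mixture = reweight(respondent_mixture, [0.5, 1.0, 2.0])

rng = random.Random(0)
imputations = draw(nonrespondent_mixture, 5, rng)
```

    Changing the multipliers gives a different assumed nonresponse mechanism, so a sensitivity analysis would repeat the draw over several multiplier sets.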

    Mixed hidden Markov quantile regression models for longitudinal data with possibly incomplete sequences

    Quantile regression provides a detailed and robust picture of the distribution of a response variable, conditional on a set of observed covariates. Recently, it has been extended to the analysis of longitudinal continuous outcomes using either time-constant or time-varying random parameters. However, in real-life data, we frequently observe both temporal shocks in the overall trend and individual-specific heterogeneity in model parameters. A benchmark dataset on HIV progression gives a clear example. Here, the evolution of the CD4 log counts exhibits both sudden temporal changes in the overall trend and heterogeneity in the effect of the time since seroconversion on the response dynamics. To accommodate such situations, we propose a quantile regression model, where time-varying and time-constant random coefficients are jointly considered. Since observed data may be incomplete due to early drop-out, we also extend the proposed model in a pattern mixture perspective. We assess the performance of the proposals via a large-scale simulation study and the analysis of the CD4 count data.

    Multiple Imputation Using Influential Exponential Tilting in Case of Non-Ignorable Missing Data

    Modern research strategies rely predominantly on three steps: data collection, data analysis, and inference. If the data are not collected as designed, researchers may face the challenge of incomplete data, especially when the missingness is non-ignorable. Such situations make the subsequent steps of evaluation difficult to perform. Inference with incomplete data is a challenging task in data analysis and clinical trials when the missingness is related to the condition under study. Moreover, results obtained from incomplete data are prone to bias. Parameter estimation with non-ignorable missing data is even more challenging. This dissertation proposes a method based on the influential tilting resampling approach to address non-ignorable missing data in statistical inference. This robust approach is motivated by the use of the importance resampling approach by Samawi et al. (1998) for power estimation, and by the exponential tilting for non-ignorable missing data proposed by Kim & Yu (2011). One basis of the proposed approach is the assumption that the non-respondents' model corresponds to an exponential tilting of the respondents' model, where the specified function of the tilted model is the influential function of the function of interest (parameter). The other basis is the use of importance resampling techniques to draw inference about some model parameters. Extensive simulation studies were conducted to investigate the performance of the proposed methods. We provide theoretical justification as well as an application to real data.

    Multiple Imputation and Quantile Regression Methods for Biomarker Data subject to Detection Limits

    Biomarkers are increasingly used in biomedical studies to better understand the natural history and development of a disease, identify patients at high risk, and guide therapeutic strategies for intervention. However, the measurement of these markers is often limited by the sensitivity of the given assay, resulting in data that are censored at either the lower or the upper limit of detection. Ignoring the censoring in an analysis may lead to biased results. For a regression analysis where multiple censored biomarkers are included as predictors, we develop multiple imputation methods based on a Gibbs sampling approach. The simulation study shows that our method significantly reduces the estimation bias compared with other simple imputation methods when the correlation between markers is high or the censoring proportion is high. Likelihood-based mean regression for repeatedly measured biomarkers often assumes a multivariate normal distribution, which may not hold for biomarker data even after transformations. We consider a robust alternative, median regression, for censored longitudinal data. We develop an estimating equation approach that can incorporate the serial correlations between repeated measurements. We conduct simulation studies to evaluate the proposed estimators and compare the median regression model with mixed models under various specifications of distributions and covariance structures. Missing data are a common problem in longitudinal studies. Under the assumptions that the missing pattern is monotonic and that the missingness may depend only on the observed data, we propose a weighted estimating equation approach for censored quantile regression models. The contribution of each individual to the estimating equation is weighted by the inverse probability of dropout at the given occasion. The resultant regression estimators are consistent when the dropout process is correctly specified. The performance of our estimating procedure is evaluated via a simulation study. We illustrate all the proposed methods using the biomarker data of the Genetic and Inflammatory Markers of Sepsis (GenIMS) study. Appropriate handling of censored data in biomarker analysis is of public health importance because it will improve the understanding of the biological mechanisms of the underlying disease and aid in the successful development of future effective treatments.
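    The inverse-probability-weighting idea in this abstract can be sketched in its simplest form, an intercept-only quantile. This is an illustration under stated assumptions, not the thesis's estimator: the outcomes, the retention probabilities, and the helper `weighted_quantile` are hypothetical, the probabilities are taken as known rather than estimated from a dropout model, and censoring at detection limits is ignored. A weighted quantile of this kind is a minimizer of the weighted check loss, which is the link to a weighted quantile estimating equation.

```python
def weighted_quantile(values, weights, tau):
    """Return the smallest value whose cumulative weight reaches tau of the total."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= tau * total:
            return value
    return pairs[-1][0]


# Hypothetical outcomes for subjects still under observation at some occasion.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical probabilities of remaining in the study; subjects with larger
# outcomes are assumed more likely to have dropped out.
prob_observed = [0.9, 0.9, 0.8, 0.5, 0.25]

# Each observed subject stands in for 1 / probability similar subjects.
ipw = [1.0 / p for p in prob_observed]

unweighted_median = weighted_quantile(observed, [1.0] * len(observed), 0.5)
ipw_median = weighted_quantile(observed, ipw, 0.5)
```

    Here the weighted median moves toward the larger outcomes, because the subjects most likely to drop out receive the largest weights.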

    Posterior Inference in Bayesian Quantile Regression with Asymmetric Laplace Likelihood

    Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/135059/1/insr12114.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/135059/2/insr12114_am.pd

    Empirical likelihood for estimating equations with nonignorably missing data

    We develop empirical likelihood (EL) inference on parameters in generalized estimating equations with nonignorably missing response data. We consider an exponential tilting model for the nonignorable missingness mechanism, and propose modified estimating equations by imputing missing data through a kernel regression method. We establish some asymptotic properties of the EL estimators of the unknown parameters under different scenarios. With the use of auxiliary information, the EL estimators are statistically more efficient. Simulation studies are used to assess the finite sample performance of our proposed EL estimators. We apply our EL estimators to investigate a data set on earnings obtained from the New York Social Indicators Survey.
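    One common form of exponential tilting takes the nonrespondents' outcome density to be proportional to the respondents' density multiplied by exp(gamma * y). The sketch below applies that form to an empirical respondent sample. It is a simplified illustration, not the paper's estimator: covariates and the kernel regression step are left out, the tilting parameter `gamma` is treated as known, and the data and helper names `tilted_weights` and `tilted_mean` are hypothetical.

```python
import math


def tilted_weights(respondent_values, gamma):
    """Weights proportional to exp(gamma * y), normalized to sum to one."""
    # Subtract the largest exponent before exponentiating to avoid overflow.
    shift = max(gamma * y for y in respondent_values)
    raw = [math.exp(gamma * y - shift) for y in respondent_values]
    total = sum(raw)
    return [r / total for r in raw]


def tilted_mean(respondent_values, gamma):
    """Mean of the tilted distribution, an imputed mean for nonrespondents."""
    weights = tilted_weights(respondent_values, gamma)
    return sum(w * y for w, y in zip(weights, respondent_values))


# Hypothetical respondent outcomes.
respondents = [1.0, 2.0, 3.0, 4.0]

ignorable_mean = tilted_mean(respondents, 0.0)   # gamma = 0: no tilt
upward_mean = tilted_mean(respondents, 0.5)      # nonrespondents assumed larger
downward_mean = tilted_mean(respondents, -0.5)   # nonrespondents assumed smaller
```

    With `gamma` set to zero the tilted mean equals the respondent mean, which corresponds to ignorable missingness; nonzero values encode an assumed departure from it.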