173 research outputs found

    Bayesian modeling of longitudinal dyadic data with nonignorable dropout, with application to a breast cancer study

    Dyadic data are common in the social and behavioral sciences, where members of a dyad are correlated because of the interdependence structure within the dyad. The analysis of longitudinal dyadic data becomes complex when nonignorable dropouts occur. We propose a fully Bayesian selection-model-based approach to analyze longitudinal dyadic data with nonignorable dropouts. We model repeated measures on subjects by a transition model and account for within-dyad correlations by random effects. In the model, we allow a subject's outcome to depend on his/her own characteristics and measurement history, as well as those of the other member of the dyad. We further account for the nonignorable missing-data mechanism using a selection model in which the probability of dropout depends on the missing outcome. We propose a Gibbs sampler algorithm to fit the model. Simulation studies show that the proposed method effectively addresses the problem of nonignorable dropouts. We illustrate our methodology using a longitudinal breast cancer study. (Published in the Annals of Applied Statistics, http://dx.doi.org/10.1214/11-AOAS515, by the Institute of Mathematical Statistics, http://www.imstat.org/aoas/.)
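    As a rough illustration of the selection-model idea above, the dropout probability can be written as a logistic function of both the previously observed outcome and the possibly missing current outcome. The sketch below is a toy version, not the paper's model; the coefficient values are illustrative.

```python
import math

def dropout_prob(y_prev, y_curr, alpha=(-1.0, 0.3, 0.5)):
    """Selection-model dropout probability: the logit depends on the
    previous outcome and on the possibly MISSING current outcome.
    (alpha values are illustrative, not taken from the paper.)"""
    a0, a1, a2 = alpha
    logit = a0 + a1 * y_prev + a2 * y_curr
    return 1.0 / (1.0 + math.exp(-logit))
```

    Because the coefficient on the current outcome is nonzero, subjects with larger unobserved outcomes are more likely to drop out, which is exactly what makes the mechanism nonignorable.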

    A Bayesian shrinkage model for incomplete longitudinal binary data with application to the Breast Cancer Prevention Trial

    We consider inference in randomized studies in which repeatedly measured outcomes may be informatively missing due to dropout. In this setting, it is well known that full-data estimands are not identified unless unverified assumptions are imposed. We assume a non-future-dependence model for the dropout mechanism and posit an exponential tilt model that links non-identifiable and identifiable distributions. This model is indexed by non-identified parameters, which are assumed to have an informative prior distribution elicited from subject-matter experts. Under this model, full-data estimands are shown to be expressible as functionals of the distribution of the observed data. To avoid the curse of dimensionality, we model the distribution of the observed data using a Bayesian shrinkage model. In a simulation study, we compare our approach to a fully parametric and a fully saturated model for the distribution of the observed data. Our methodology is motivated by and applied to data from the Breast Cancer Prevention Trial.
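    For a discrete outcome, the exponential tilt that links the identifiable (observed-data) distribution to the non-identifiable (missing-data) one can be sketched in a few lines. Here gamma plays the role of the non-identified sensitivity parameter; the code is a minimal generic illustration, not the paper's implementation.

```python
import math

def exponential_tilt(probs, values, gamma):
    """Tilt an identified distribution by exp(gamma * y) and renormalize.
    gamma = 0 recovers the observed-data distribution (ignorable case);
    gamma != 0 shifts mass toward larger or smaller outcomes."""
    weights = [p * math.exp(gamma * y) for p, y in zip(probs, values)]
    total = sum(weights)
    return [w / total for w in weights]
```

    In practice gamma cannot be estimated from the observed data, which is why the paper places an elicited informative prior on it.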

    Reducing Attrition Bias using Targeted Refreshment Sampling and Matching

    This paper examines the possibility of reducing attrition bias in panel data using targeted refreshment sampling and matched imputation. The targeted refreshment sampling approach consists of collecting new data from the original sampling population, from individuals who would never usually respond to surveys. It is suggested that, by using propensity score matching and imputation in conjunction with refreshment sampling, the dropouts from a panel can effectively be 'replaced'. The procedure allows us to identify underlying joint distributions in the data. The method is illustrated using data from the Youth Cohort Surveys in the UK, which suffer 45% attrition in the second wave. A comparison of the results of this method with other techniques for attrition modeling suggests that the technique could be an effective way to overcome a substantial part of the bias associated with attrition. Keywords: attrition, refreshment sampling.
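    The matched-imputation step can be sketched as nearest-neighbor matching on the propensity score: each panel dropout is 'replaced' by the refreshment-sample respondent with the closest score. This is a toy one-nearest-neighbor version under assumed inputs, not the paper's estimator.

```python
def match_impute(dropout_scores, refresh_scores, refresh_outcomes):
    """For each panel dropout, impute the later-wave outcome from the
    refreshment-sample respondent with the nearest propensity score
    (1-NN matching; ties go to the first candidate)."""
    imputed = []
    for s in dropout_scores:
        j = min(range(len(refresh_scores)),
                key=lambda k: abs(refresh_scores[k] - s))
        imputed.append(refresh_outcomes[j])
    return imputed
```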

    A shared-parameter continuous-time hidden Markov and survival model for longitudinal data with informative dropout

    A shared-parameter approach for jointly modeling longitudinal and survival data is proposed. Compared with available approaches, it allows for time-varying random effects that affect both the longitudinal and the survival processes. The distribution of these random effects is modeled according to a continuous-time hidden Markov chain, so that transitions may occur at any time point. For maximum likelihood estimation, we propose an algorithm based on a discretization of time until censoring into an arbitrary number of time windows. The observed information matrix is used to obtain standard errors. We illustrate the approach by simulation, including the effect of the number of time windows on the precision of the estimates, and by an application to data on patients suffering from mildly dilated cardiomyopathy.
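    The 'transitions at any time point' property can be made concrete with a standard fact about continuous-time Markov chains: for a two-state latent chain with generator Q = [[-q01, q01], [q10, -q10]], the transition matrix exp(Qt) has a closed form. This is a generic CTMC identity, not the authors' estimation code.

```python
import math

def ctmc_transition(q01, q10, t):
    """Closed-form transition matrix exp(Qt) for a 2-state
    continuous-time Markov chain with transition rates q01 (0 -> 1)
    and q10 (1 -> 0); valid for any elapsed time t >= 0."""
    s = q01 + q10
    e = math.exp(-s * t)
    p00 = q10 / s + (q01 / s) * e  # P(stay in state 0 over time t)
    p11 = q01 / s + (q10 / s) * e  # P(stay in state 1 over time t)
    return [[p00, 1.0 - p00], [1.0 - p11, p11]]
```

    At t = 0 this reduces to the identity matrix, and as t grows the rows converge to the stationary distribution (q10/s, q01/s), so the time-window discretization in the estimation algorithm only affects precision, not the model itself.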

    Mixed hidden Markov quantile regression models for longitudinal data with possibly incomplete sequences

    Quantile regression provides a detailed and robust picture of the distribution of a response variable, conditional on a set of observed covariates. Recently, it has been extended to the analysis of longitudinal continuous outcomes using either time-constant or time-varying random parameters. However, in real-life data, we frequently observe both temporal shocks in the overall trend and individual-specific heterogeneity in model parameters. A benchmark dataset on HIV progression gives a clear example. Here, the evolution of the CD4 log counts exhibits both sudden temporal changes in the overall trend and heterogeneity in the effect of the time since seroconversion on the response dynamics. To accommodate such situations, we propose a quantile regression model where time-varying and time-constant random coefficients are jointly considered. Since observed data may be incomplete due to early dropout, we also extend the proposed model in a pattern-mixture perspective. We assess the performance of the proposals via a large-scale simulation study and the analysis of the CD4 count data.
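    The workhorse behind any quantile regression model is the check (pinball) loss, whose expected value is minimized by the tau-th conditional quantile. A one-line version makes the asymmetry explicit:

```python
def check_loss(residual, tau):
    """Check (pinball) loss at quantile level tau: under-predictions
    (residual > 0) are weighted by tau, over-predictions by 1 - tau,
    so minimizing it targets the tau-th conditional quantile."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))
```

    At tau = 0.5 the loss is symmetric and recovers (half) the absolute error, i.e. median regression; at tau = 0.9 under-predictions are penalized nine times as heavily as over-predictions.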

    Statistical Methods for Non-Ignorable Missing Data With Applications to Quality-of-Life Data.

    Researchers increasingly use survey studies, and design medical studies, to better understand the relationships among patients, physicians, their health care system utilization, and their decision-making processes in disease prevention and management. Longitudinal data are widely used to capture trends occurring over time. Each subject is observed as time progresses, but a common problem is that repeated measurements are not fully observed due to missing responses or loss to follow-up. An individual can move in and out of the observed data set during a study, giving rise to a large class of distinct non-monotone missingness patterns. In such medical studies, sample sizes are often limited due to restrictions on disease type, study design, and medical information availability. Small sample sizes with large proportions of missing information are problematic for researchers trying to understand the experience of the total population. The data collected may produce biased estimators if, for example, the patients who don't respond have worse outcomes, or the patients who answered "unknown" are those without access to medical or non-medical information or care. Data modeled without considering this missing information may yield biased results. A first-order Markov dependence structure is a natural way to model the tendency of changes. In my first project, we developed a Markov transition model using a full-likelihood-based algorithm to provide robust estimation accounting for "non-ignorable" missingness, and applied it to data from the Penn Center of Excellence in Cancer Communication Research. In my second project, we extended the method to a pseudo-likelihood-based approach by considering only pairs of adjacent observations, which significantly eases the computational complexity of the full-likelihood method proposed in the first project.
In my third project, we proposed a two-stage pseudo hidden Markov model to analyze the association between quality-of-life measurements and cancer treatments in a randomized phase III trial (RTOG 9402) in brain cancer patients. By incorporating selection models and shared-parameter models with a hidden Markov model, this approach provides targeted identification of treatment effects.
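    The pairwise idea in the second project can be sketched generically: a first-order Markov pseudo-likelihood sums log transition probabilities over adjacent observed pairs only, skipping pairs broken by a missing visit. This toy version ignores the non-ignorability adjustment that the thesis builds on top of it.

```python
import math

def pairwise_loglik(sequence, P):
    """Pairwise pseudo-log-likelihood of a first-order Markov chain.
    sequence holds state indices, with None marking a missing visit;
    only pairs of adjacent fully observed visits contribute."""
    ll = 0.0
    for a, b in zip(sequence, sequence[1:]):
        if a is not None and b is not None:
            ll += math.log(P[a][b])
    return ll
```

    Because each term involves at most two visits, the sum is far cheaper to evaluate than a full likelihood that integrates over every non-monotone missingness pattern.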

    Identification and estimation of panel data models with attrition using refreshment samples

    This thesis deals with attrition in panel data. The problem with attrition is that it can lead to estimation results that suffer from selection bias. This can be avoided by using attrition models that are sufficiently unrestrictive to allow for a wide range of potential selection. In chapter 2, I propose the Sequential Additively Nonignorable (SAN) attrition model. This model combines an additive nonignorability assumption with a sequential attrition assumption to just-identify the joint population distribution in panel data with any number of waves. The identification requires the availability of refreshment samples. Just-identification means that the SAN model has no testable implications; in other words, less restrictive identified models do not exist. To estimate SAN models, I propose a weighted Generalized Method of Moments (GMM) estimator and derive its repeated-sampling behaviour in large samples. This estimator is applied to the Dutch Transportation Panel and the English Longitudinal Study of Ageing. In chapter 4, a likelihood-based alternative estimation approach is proposed, by means of an EM algorithm. Maximum likelihood estimates can be useful if it is hard to obtain an explicit expression for the score function implied by the likelihood; in that case, the weighted GMM approach is not applicable.
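    A weighted GMM estimator can be illustrated in its simplest special case: a single moment condition sum_i w_i (y_i - mu) = 0, where the w_i are attrition-correction weights. This is a generic sketch under assumed weights; in the thesis the weights themselves are identified from the refreshment samples.

```python
def weighted_gmm_mean(y, weights):
    """Solve the single weighted moment condition
    sum_i w_i * (y_i - mu) = 0 for mu. With attrition-correction
    weights, this reweights stayers to represent the full population."""
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
```

    With equal weights this is the ordinary sample mean; up-weighting respondents who resemble the dropouts shifts the estimate toward the population target.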

    Handling non-ignorable dropouts in longitudinal data: A conditional model based on a latent Markov heterogeneity structure

    We illustrate a class of conditional models for the analysis of longitudinal data suffering from attrition within a random-effects framework, where the subject-specific random effects are assumed to be discrete and to follow a time-dependent latent process. The latent process accounts for unobserved heterogeneity and correlation between individuals in a dynamic fashion, and for dependence between the observed process and the missing-data mechanism. Of particular interest is the case where the missingness mechanism is non-ignorable. To deal with this, we introduce a model that is specified conditionally on the dropout occurrence. A shape change in the random-effects distribution is considered by directly modeling the effect of the missing-data process on the evolution of the latent structure. To estimate the resulting model, we rely on the conditional maximum likelihood approach, and for this aim we outline an EM algorithm. The proposal is illustrated via simulations and then applied to a dataset concerning skin cancers. Comparisons with other well-established methods are provided as well.
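    With discrete random effects, the E-step of such an EM algorithm reduces to Bayes' rule over the latent support points: each subject gets a posterior weight for every latent class. A generic sketch of that step (not the authors' code), taking per-class log-likelihoods and prior class weights as assumed inputs:

```python
import math

def e_step(loglik_by_class, prior):
    """E-step for a discrete random-effects (latent-class) model:
    posterior class probabilities proportional to
    prior weight * likelihood, renormalized to sum to one."""
    weights = [math.exp(l) * p for l, p in zip(loglik_by_class, prior)]
    total = sum(weights)
    return [w / total for w in weights]
```

    The M-step would then maximize the expected complete-data log-likelihood using these posterior weights, iterating until convergence.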