thesis

Analysis of Non-ignorable Missing and Left-Censored Longitudinal Biomarker Data

Abstract

In a longitudinal study of biomarker data collected during a hospital stay, observations may be missing for administrative reasons, because the subject died, or because the subject was discharged from the hospital, resulting in non-ignorable missing data. Standard likelihood-based methods for the analysis of longitudinal data, e.g., mixed models, do not include a mechanism that accounts for the different reasons for missingness. Rather than specifying a full likelihood function for the observed and missing data, we propose a weighted pseudo likelihood (WPL) method. With this method, a model is built from the available data, and the unobserved data are accounted for through weights, which are then treated as nuisance parameters in the model. The WPL method accounts for these nuisance parameters in the computation of the variances of the parameter estimates. The performance of the proposed method has been compared with a number of widely used methods. The WPL method is illustrated using an example from the Genetic and Inflammatory Markers of Sepsis (GenIMS) study. A simulation study has been conducted to study the properties of the proposed method, and the results are competitive with the widely used methods.

In the second part, our goal is to address the problem of analyzing left-censored, longitudinally measured biomarker data when subjects are lost for the reasons mentioned above. We propose to analyze one such biomarker, IL-6, obtained from the GenIMS study, using a weighted random effects Tobit (WRT) model. We have compared the results of the WRT model with those of the random effects Tobit model. The simulation study shows that the WRT model estimates are approximately unbiased. The correct standard error has been computed using asymptotic pseudo likelihood theory. The use of multiple weights across the panel improves the estimates and produces a smaller root mean square error. Therefore, the WRT model with multiple weights across panels is the recommended model for analyzing non-ignorable missing and left-censored longitudinal biomarker data. Model selection is an extremely important part of the analysis of any data set. As illustrated in these analyses, conclusions, which can directly impact public health, depend heavily on the data analytic approach.
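
To make the idea of weighting a left-censored likelihood concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a simple linear mean with no random effects, a single known detection limit, and weights supplied as fixed inputs; how the weights are estimated, and the variance correction for treating them as nuisance parameters, are not shown. The function name `weighted_tobit_loglik` and all of its parameters are hypothetical.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_logcdf(z):
    """Log of the standard normal CDF at z (adequate for moderate z)."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def weighted_tobit_loglik(y, x, w, beta0, beta1, sigma, lod):
    """Weighted pseudo log-likelihood for left-censored responses.

    y     : observed responses; values at or below `lod` are treated as censored
    x     : a single covariate (illustrative assumption)
    w     : per-observation weights, taken as given (assumption)
    lod   : lower limit of detection (assumption: known and common to all)
    """
    total = 0.0
    for yi, xi, wi in zip(y, x, w):
        mu = beta0 + beta1 * xi
        if yi <= lod:
            # Censored: only know the true value lies at or below the limit.
            contrib = norm_logcdf((lod - mu) / sigma)
        else:
            # Observed: ordinary normal log density.
            contrib = norm_logpdf((yi - mu) / sigma) - math.log(sigma)
        total += wi * contrib
    return total
```

Maximizing this function over `beta0`, `beta1`, and `sigma` with any general-purpose optimizer would give weighted Tobit estimates under the stated assumptions; setting every weight to 1 recovers the unweighted Tobit log-likelihood.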
