77 research outputs found

    Asymptotic Variance Estimation for the Misclassification SIMEX

    Most epidemiological studies suffer from misclassification in the response and/or the covariates. Since ignoring misclassification induces bias in the parameter estimates, correction for such errors is important. For measurement error, the continuous analog to misclassification, a general approach for bias correction is SIMEX (simulation extrapolation), originally suggested by Cook and Stefanski (1994). This approach has recently been extended to regression models with a possibly misclassified categorical response and/or covariates by Küchenhoff et al. (2005), and is called the MC-SIMEX approach. To assess the importance of a regressor, not only its (corrected) estimate is needed, but also its standard error. For the original SIMEX approach, Carroll et al. (1996) developed a method for estimating the asymptotic variance. Here we derive the asymptotic variance estimators for the MC-SIMEX approach, extending the methodology of Carroll et al. (1996). We also include the case where the misclassification probabilities are estimated by a validation study. An extensive simulation study shows the good performance of our approach. The approach is illustrated using an example in caries research including a logistic regression model, where the response and a binary covariate are possibly misclassified.

    Some Recent Advances in Measurement Error Models and Methods

    A measurement error model is a regression model with (substantial) measurement errors in the variables. Disregarding these measurement errors in estimating the regression parameters results in asymptotically biased estimators. Several methods have been proposed to eliminate, or at least to reduce, this bias, and the relative efficiency and robustness of these methods have been compared. The paper gives an account of these endeavors. In another context, when data are of a categorical nature, classification errors play a role similar to that of measurement errors in continuous data. The paper also reviews some recent advances in this field.

    Application of the Misclassification Simulation Extrapolation (MC-SIMEX) Procedure to Log-Logistic Accelerated Failure Time (AFT) Models in Survival Analysis

    Survival analysis is the study of time-to-event outcomes. Accelerated failure time (AFT) models serve as a useful tool in survival analysis to study the time of occurrence of an event and its relation to the covariates of interest. The accuracy of parameter estimation in a model depends upon the correct measurement of covariates. Considering that perfect measurement of covariates is highly unlikely, it is imperative that the performance of the existing bias-correction methods be analyzed in AFT models. However, certain areas of bias correction in AFT models remain unexplored. One of these unexplored areas is the situation where the survival times follow a log-logistic distribution. In this dissertation, we evaluate the performance of the misclassification simulation extrapolation (MC-SIMEX) procedure, a well-known procedure for correcting bias due to misclassification, in AFT models where the survival times follow a standard log-logistic distribution. In addition, a modified version of the MC-SIMEX procedure is proposed that provides an advantage in situations where the sensitivity and specificity of classification are unknown. Lastly, the performance of the original MC-SIMEX procedure on lung cancer data provided by the North Central Cancer Treatment Group (NCCTG) is also evaluated.

    Adjustment of Recall Errors in Duration Data Using SIMEX

    It is widely accepted that, due to memory failures, retrospective survey questions tend to be prone to measurement error. However, the proportion of studies using such data that attempt to adjust for the measurement problem is shockingly low. Arguably, to a great extent this is due to both the complexity of the methods available and the need to access a subsample containing either a gold standard or replicated values. Here I suggest the implementation of a version of SIMEX capable of adjusting for the types of multiplicative measurement errors associated with memory failures in the retrospective report of durations of life-course events. SIMEX is relatively simple to implement and does not require replicated or validation data so long as the error process can be adequately specified. To assess the effectiveness of the method I use simulated data. I create twelve scenarios based on the combinations of three outcome models (linear, logit and Poisson) and four types of multiplicative errors (non-systematic, systematic negative, systematic positive and heteroscedastic) affecting one of the explanatory variables. I show that SIMEX can be satisfactorily implemented in each of these scenarios. Furthermore, the method can also achieve partial adjustments even in scenarios where the actual distribution and prevalence of the measurement error differ substantially from what is assumed in the adjustment, which makes it an interesting sensitivity tool in those cases where all that is known about the error process is an educated guess.
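
    For orientation, the general SIMEX idea can be sketched in a few lines. This is not the multiplicative-error variant the abstract proposes: the sketch below assumes classical additive normal error with a known standard deviation, a simple linear outcome model, and a quadratic extrapolant. The function name simex_slope and all of its parameters are hypothetical choices made for illustration.

```python
import numpy as np

def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), n_sim=200, seed=0):
    """Return (naive_slope, simex_slope) for y regressed on error-prone w.

    Assumes w = x + u with u ~ N(0, sigma_u^2) and sigma_u known
    (an illustrative assumption, not taken from the abstract).
    """
    rng = np.random.default_rng(seed)
    lam_grid = [0.0]
    slopes = [np.polyfit(w, y, 1)[0]]  # naive slope, no extra error added
    for lam in lambdas:
        sims = []
        for _ in range(n_sim):
            # Simulation step: inflate the error variance by a factor (1 + lam).
            w_b = w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.shape)
            sims.append(np.polyfit(w_b, y, 1)[0])
        lam_grid.append(lam)
        slopes.append(np.mean(sims))
    # Extrapolation step: fit a quadratic in lam and evaluate it at lam = -1,
    # which corresponds to the hypothetical case of no measurement error.
    coef = np.polyfit(lam_grid, slopes, 2)
    return float(slopes[0]), float(np.polyval(coef, -1.0))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(2000)                     # true covariate
    y = 1.0 + 2.0 * x + 0.5 * rng.standard_normal(2000)
    w = x + 0.5 * rng.standard_normal(2000)           # observed with error
    naive, corrected = simex_slope(w, y, sigma_u=0.5, n_sim=50)
    print("naive slope:", naive, "SIMEX slope:", corrected)
```

    The quadratic extrapolant is only an approximation, so the corrected slope typically moves toward the true value without reaching it exactly.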

    Measurement Error and Misclassification in Interval-Censored Life History Data

    In practice, data are frequently incomplete in one way or another. It can be a significant challenge to make valid inferences about the parameters of interest in this situation. In this thesis, three problems involving such data are addressed. The first two problems involve interval-censored life history data with mismeasured covariates. Data of this type are incomplete in two ways. First, the exact event times are unknown due to censoring. Second, the true covariate is missing for most, if not all, individuals. This work focuses primarily on the impact of covariate measurement error in progressive multi-state models with data arising from panel (i.e., interval-censored) observation. These types of problems arise frequently in clinical settings (e.g., when disease progression is of interest and patient information is collected during irregularly spaced clinic visits). Two- and three-state models are considered in this thesis. This work is motivated by a research program on psoriatic arthritis (PsA) where the effects of error-prone covariates on rates of disease progression are of interest and patient information is collected at clinic visits (Gladman et al. 1995; Bond et al. 2006). Information regarding the error distributions was available based on results from a separate study conducted to evaluate the reliability of clinical measurements that are used in PsA treatment and follow-up (Gladman et al. 2004). The asymptotic bias of covariate effects obtained ignoring error in covariates is investigated and shown to be substantial in some settings. In a series of simulation studies, the performance of corrected likelihood methods and methods based on a simulation-extrapolation (SIMEX) algorithm (Cook & Stefanski 1994) was investigated to address covariate measurement error. The methods implemented were shown to result in much smaller empirical biases and empirical coverage probabilities which were closer to the nominal levels.
    The third problem considered involves an extreme case of interval censoring known as current status data. Current status data arise when individuals are observed only at a single point in time and it is then determined whether they have experienced the event of interest. To complicate matters, in the problem considered here, an unknown proportion of the population will never experience the event of interest. Again, this type of data is incomplete in two ways. One assessment is made on each individual to determine whether or not an event has occurred. Therefore, the exact event times are unknown for those who will eventually experience the event. In addition, whether or not the individuals will ever experience the event is unknown for those who have not experienced the event by the assessment time. This problem was motivated by a series of orthopedic trials looking at the effect of blood thinners in hip and knee replacement surgeries. These blood thinners can cause a negative serological response in some patients. This response was the outcome of interest and the only available information regarding it was the seroconversion time under current status observation. In this thesis, latent class models with parametric, nonparametric and piecewise constant forms of the seroconversion time distribution are described. They account for the fact that only a proportion of the population will experience the event of interest. Estimators based on an EM algorithm were evaluated via simulation and the orthopedic surgery data were analyzed based on this methodology.

    Partially Identified Prevalence Estimation under Misclassification using the Kappa Coefficient

    We discuss a new strategy for prevalence estimation in the presence of misclassification. Our method is applicable when misclassification probabilities are unknown but independent replicate measurements are available. This yields the kappa coefficient, which indicates the agreement between the two measurements. From this information, a direct correction for misclassification is not feasible due to non-identifiability. However, it is possible to derive estimation intervals relying on the concept of partial identification. These intervals give interesting insights into possible bias due to misclassification. Furthermore, confidence intervals can be constructed. Our method is illustrated in several theoretical scenarios and in an example from oral health, where prevalence estimation of caries in children is the issue.
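
    As a small illustration of the agreement measure mentioned above, the following sketch computes a kappa coefficient from two replicate classifications. It assumes the coefficient in question is Cohen's kappa (the abstract does not say which variant is used), and it does not implement the partial-identification intervals themselves. The function name cohen_kappa is a hypothetical choice.

```python
import numpy as np

def cohen_kappa(r1, r2):
    """Cohen's kappa for two replicate classifications of the same subjects."""
    r1 = np.asarray(r1)
    r2 = np.asarray(r2)
    categories = np.union1d(r1, r2)
    # Observed agreement: share of subjects classified identically.
    p_o = np.mean(r1 == r2)
    # Agreement expected by chance from the two marginal distributions.
    p_e = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in categories)
    # Undefined (division by zero) when p_e == 1, i.e. a single category.
    return (p_o - p_e) / (1.0 - p_e)

if __name__ == "__main__":
    first = [1, 1, 0, 0]   # e.g. caries yes/no at the first measurement
    second = [1, 0, 0, 0]  # replicate measurement on the same children
    print(cohen_kappa(first, second))
```

    A value of 1 indicates perfect agreement between the replicates, and 0 indicates agreement no better than chance.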

    Generalizations to Corrections of Measurement Error Effects for Dynamic Treatment Regimes

    Measurement error is a pervasive issue in questions of estimation and inference. Generally, any data which are measured with error will render the results of an analysis which ignores this error unreliable. This is a particular concern in health research, where many quantities of interest are typically subject to measurement error. One particular field of health research, precision medicine, has not yet seen a substantive attempt to account for measurement error. Dynamic treatment regimes (DTRs), which can be used to represent sequences of treatment decisions in a medical setting, have historically been analyzed assuming, implicitly, that all quantities are perfectly observable. We consider the problem of optimal DTR estimation where quantities of interest may be subject to measurement error. The nature of this problem is such that many existing techniques to account for the effects of measurement error need to be expanded in order to accommodate the data which are available in practice. This expansion further highlights theoretical shortcomings in the existing methodologies. This thesis begins by expanding existing methods for correcting for the effects of measurement error to accommodate issues which are frequently observed in real-world data. We expand the most commonly applied measurement error corrections (regression calibration and simulation extrapolation), demonstrating how they are able to be conducted with non-identically distributed replicate measurements. We further expand simulation extrapolation, which typically assumes normality of the underlying error terms, proposing a nonparametric simulation extrapolation. These expansions are conducted generally, separate from the specific context of optimal DTR estimation. Following the expansion of these extant techniques, we consider the problem of errors in covariates within the DTR framework. 
    We apply the aforementioned generalized error correction techniques to this setting, and demonstrate how valid estimation and inference can proceed. Finally, we consider problems which are present when there is treatment misclassification in DTRs, proposing techniques to restore consistency and perform valid inference. To our knowledge this work represents the first substantive attempt to explore these problems. Thus, in addition to proposing methodological solutions, we also elucidate the particular challenges of estimation in this setting. All proposed techniques are explored theoretically, using simulation studies, and through real-world data analyses.