
    Robust and Efficient Statistical Inference for Clustered Observational Data in Comparative Effectiveness Research

    Treatment allocations in observational studies are nonrandom, which gives rise to confounding and potentially biased treatment effect estimates. Propensity score (PS) methods are commonly used in practice to address the confounding problem. Among different PS methods, PS regression is frequently used in clinical research. Even though the treatment effect estimate from the PS regression model is unbiased under the strongly ignorable treatment assignment assumption, the default variance estimate is biased. In the first topic of this dissertation, an improved variance estimator for the treatment effect estimate is proposed. Many observational data are clustered, for example, by physicians, and are therefore not independent. A few PS methods consider correlated or clustered samples using mixed effects models with a strong normality assumption on the cluster effects. In the second part of this dissertation, a robust semi-nonparametric propensity score (SNP-PS) regression model is proposed. We relax the normality assumption and model the complex heterogeneity structure in the treatment allocation process nonparametrically. The proposed SNP-PS model is robust and provides unbiased treatment effect estimates, while parametric mixed effects PS models fail to do so when the cluster effects are non-normally distributed. We establish the asymptotic result for the treatment effect estimate and propose an unbiased variance estimator for it. Computationally, we propose an adaptive quadrature integration EM (expectation-maximization) algorithm to avoid the potentially large Monte Carlo errors of existing Monte Carlo EM algorithms. Many real-world medical record data are not only clustered but also multilevel clustered, with millions of samples and hundreds of thousands of clusters. The SNP-PS framework is in theory applicable to these large datasets. However, in practice, it is computationally prohibitive.
In the third topic of this dissertation, we propose a flexible mixed effects PS model (FM-PS) that is computationally efficient for large multilevel clustered data. The FM-PS model relaxes a critical independence assumption made in the standard mixed effects PS (SM-PS) models, namely that the random effects are independent of the fixed effect covariates. The FM-PS model provides an unbiased treatment effect estimate regardless of whether the independence assumption holds. Though the treatment effect estimate from the SM-PS model is biased when the independence assumption does not hold, it is unbiased and more efficient than the estimate from the FM-PS model when the independence assumption holds. We propose a likelihood ratio statistic for testing the independence assumption, which allows us to choose between the FM-PS and SM-PS models. A cluster bootstrapping procedure to estimate the variance of the treatment effect estimate is proposed. The FM-PS model is robust to various model misspecifications, as demonstrated by our extensive simulations. Doctor of Philosophy
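The cluster bootstrapping idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's procedure: the function name `cluster_bootstrap_variance`, the simulated data, and the use of a simple mean as the estimator are all assumptions made for illustration. The only point shown is that whole clusters, not individual observations, are resampled with replacement.

```python
import numpy as np

def cluster_bootstrap_variance(cluster_ids, data, estimator, n_boot=500, seed=0):
    """Bootstrap variance of `estimator`, resampling whole clusters (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    clusters = np.unique(cluster_ids)
    # Row indices belonging to each cluster, computed once.
    rows_by_cluster = {c: np.flatnonzero(cluster_ids == c) for c in clusters}
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Draw clusters with replacement and keep every row of each drawn cluster.
        drawn = rng.choice(clusters, size=len(clusters), replace=True)
        idx = np.concatenate([rows_by_cluster[c] for c in drawn])
        estimates[b] = estimator(data[idx])
    return estimates.var(ddof=1)

# Simulated clustered outcome (assumed setup): 50 clusters of 20 observations,
# each sharing a cluster-level random effect.
rng = np.random.default_rng(1)
n_clusters, per_cluster = 50, 20
cluster_ids = np.repeat(np.arange(n_clusters), per_cluster)
y = rng.normal(size=n_clusters)[cluster_ids] + rng.normal(size=n_clusters * per_cluster)

# The mean stands in for a treatment effect estimator.
cluster_var = cluster_bootstrap_variance(cluster_ids, y, np.mean)
# Variance that would be reported if observations were treated as independent.
naive_var = y.var(ddof=1) / len(y)
print(cluster_var, naive_var)
```

In this simulated setting the cluster-level resampling yields a noticeably larger variance than the formula that ignores clustering, which is the reason clustered data call for such a procedure.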

    Nasal Nitric Oxide and Lifestyle Exposure to Tobacco Smoke

    Nitric oxide (NO) is a reactive gas generated by inflammatory cells and mucosal epithelial cells of the nose and paranasal sinuses and is an important mediator in nonspecific host defense against infectious agents. However, NO also mediates physiologic events such as vasodilation, mucus hypersecretion, and mucosal disruption that are associated with inflammatory conditions, and it is a regulator of ciliary beat frequency. In the present study, we hypothesized that lifestyle exposure to tobacco smoke, whether through active smoking or by inadvertent exposure to secondhand tobacco smoke, would result in higher detectable levels of nasal NO (nNO) than are found in well-documented nonsmokers.

    A mixed-effects two-part model for twin-data and an application on identifying important factors associated with extremely preterm children’s health disorders

    Our recent studies identifying factors significantly associated with the positive child health index (PCHI) in a mixed cohort of preterm-born singletons, twins, and triplets posed some analytic and modeling challenges. The PCHI transforms the total number of health disorders experienced (of the eleven ascertained) to a scale from 0 to 100%. While some of the children had none of the eleven health disorders (i.e., PCHI = 1), others experienced a subset or all (i.e., 0 ≤ PCHI < 1). This indicates the existence of two distinct data processes, one for the healthy children and another for those with at least one health disorder, necessitating a two-part model to accommodate both. Further, the scores for twins and triplets are potentially correlated, since these children share similar genetics and early environments. The existing approach for analyzing PCHI data dichotomizes the data (i.e., number of health disorders) and uses a mixed-effects logistic or multiple logistic regression to model the binary feature of the PCHI (1 vs. < 1). To provide an alternative analytic framework, in this study we jointly model the two data processes under a mixed-effects two-part model framework that accounts for the sample correlations between and within the two data processes. The proposed method increases power to detect factors associated with disorders. Extensive numerical studies demonstrate that the proposed joint-test procedure consistently outperforms the existing method when the type I error is controlled at the same level. Our numerical studies also show that the proposed method is robust to model misspecifications and that it is applicable to a set of correlated semi-continuous data.

    On variance estimate for covariate adjustment by propensity score analysis

    Propensity score (PS) methods have been used extensively to adjust for confounding factors in the statistical analysis of observational data in comparative effectiveness research. There are four major PS-based adjustment approaches: PS matching, PS stratification, covariate adjustment by PS, and PS-based inverse probability weighting (IPW). Though covariate adjustment by PS is one of the most frequently used PS-based methods in clinical research, the conventional variance estimator of the treatment effect estimate under covariate adjustment by PS is biased. As Stampf et al. have shown, this bias in variance estimation is likely to lead to invalid statistical inference and could result in erroneous public health conclusions (e.g., food and drug safety, adverse events surveillance). To address this issue, we propose a two-stage analytic procedure to develop a valid variance estimator for the covariate adjustment by PS analysis strategy. We also carry out a simple empirical bootstrap resampling scheme. Both proposed procedures are implemented in an R function for public use. Extensive simulation results demonstrate the bias in the conventional variance estimator and show that both proposed variance estimators offer valid estimates of the true variance and are robust to complex confounding structures. The proposed methods are illustrated with a post-surgery pain study.
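As a rough illustration of covariate adjustment by PS combined with an empirical bootstrap, the following sketch may help. It is not the authors' R function and does not implement their two-stage analytic estimator: the helper names (`fit_logistic`, `ps_adjusted_effect`, `bootstrap_variance`), the simulated data, and the choice of a linear outcome model are assumptions for illustration. The key point is that the propensity score is re-estimated inside every bootstrap replicate, so its estimation uncertainty is carried into the variance.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta

def ps_adjusted_effect(x, t, y):
    """Estimate the PS, then regress the outcome on treatment and the estimated PS."""
    X = np.column_stack([np.ones(len(t)), x])
    ps = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))
    Z = np.column_stack([np.ones(len(t)), t, ps])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[1]  # coefficient on the treatment indicator

def bootstrap_variance(x, t, y, n_boot=200, seed=0):
    """Empirical bootstrap: resample subjects and redo both steps in each replicate."""
    rng = np.random.default_rng(seed)
    n = len(t)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        estimates[b] = ps_adjusted_effect(x[idx], t[idx], y[idx])
    return estimates.var(ddof=1)

# Simulated data (assumed setup): two confounders, true treatment effect of 1.0.
rng = np.random.default_rng(42)
n = 2000
x = rng.normal(size=(n, 2))
true_ps = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] - 0.5 * x[:, 1])))
t = rng.binomial(1, true_ps)
y = 1.0 * t + 0.5 * x[:, 0] + rng.normal(size=n)

effect = ps_adjusted_effect(x, t, y)
var_hat = bootstrap_variance(x, t, y)
print(effect, var_hat)
```

A default variance read off the second-stage regression would treat the estimated PS as a fixed covariate; resampling subjects and refitting both steps avoids that simplification at the cost of extra computation.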

    Antithymocyte Globulin Plus G-CSF Combination Therapy Leads to Sustained Immunomodulatory and Metabolic Effects in a Subset of Responders With Established Type 1 Diabetes.

    Low-dose antithymocyte globulin (ATG) plus pegylated granulocyte colony-stimulating factor (G-CSF) preserves β-cell function for at least 12 months in type 1 diabetes. Herein, we describe metabolic and immunological parameters 24 months following treatment. Patients with established type 1 diabetes (duration 4-24 months) were randomized to ATG and pegylated G-CSF (ATG+G-CSF) (N = 17) or placebo (N = 8). Primary outcomes included C-peptide area under the curve (AUC) following a mixed-meal tolerance test (MMTT) and flow cytometry. "Responders" (12-month C-peptide ≥ baseline), "super responders" (24-month C-peptide ≥ baseline), and "nonresponders" (12-month C-peptide < baseline) were evaluated for biomarkers of outcome. At 24 months, MMTT-stimulated AUC C-peptide was not significantly different in ATG+G-CSF (0.49 nmol/L/min) versus placebo (0.29 nmol/L/min). Subjects treated with ATG+G-CSF demonstrated reduced CD4+ T cells and CD4+/CD8+ T-cell ratio and increased CD16+CD56hi natural killer cells (NK), CD4+ effector memory T cells (Tem), CD4+PD-1+ central memory T cells (Tcm), Tcm PD-1 expression, and neutrophils. FOXP3+Helios+ regulatory T cells (Treg) were elevated in ATG+G-CSF subjects at 6, 12, and 18 but not 24 months. Immunophenotyping identified differential HLA-DR expression on monocytes and NK and altered CXCR3 and PD-1 expression on T-cell subsets. As such, a group of metabolic and immunological responders was identified. A phase II study of ATG+G-CSF in patients with new-onset type 1 diabetes is ongoing and may support ATG+G-CSF as a prevention strategy in high-risk subjects.

    On model selections for repeated measurement data in clinical studies

    Repeated measurement designs have been widely used in various randomized controlled trials for evaluating long-term intervention efficacies. For some clinical trials, the primary research question is to compare two treatments at a fixed time, using a t-test. Though simple, robust, and convenient, this type of analysis fails to utilize a large amount of collected information. Alternatively, the mixed effects model is commonly used for repeated measurement data. It models all available data jointly and allows explicit assessment of the overall treatment effects across the entire time spectrum. In this paper, we propose an analytic strategy for longitudinal clinical trial data where the mixed effects model is coupled with a model selection scheme. The proposed test statistics not only make full use of all available data but also utilize the information from the model deemed optimal for the data. The performance of the proposed method under various setups, including different data missing mechanisms, is evaluated via extensive Monte Carlo simulations. Our numerical results demonstrate that the proposed analytic procedure is more powerful than the t-test when the primary interest is to test for the treatment effect at the last time point. Simulations also reveal that the proposed method outperforms the usual mixed effects model for testing the overall treatment effects across time. In addition, the proposed framework is more robust and flexible in dealing with missing data compared to several competing methods. The utility of the proposed method is demonstrated by analyzing a clinical trial on the cognitive effect of testosterone in geriatric men with low baseline testosterone levels.