    Analysis of Randomised Trials Including Multiple Births When Birth Size Is Informative

    BACKGROUND: Informative birth size occurs when the average outcome depends on the number of infants per birth. Although analysis methods have been proposed for handling informative birth size, their performance is not well understood. Our aim was to evaluate the performance of these methods and to provide recommendations for their application in randomised trials including infants from single and multiple births. METHODS: Three generalised estimating equation (GEE) approaches were considered for estimating the effect of treatment on a continuous or binary outcome: cluster-weighted GEEs, which produce treatment effects with a mother-level interpretation when birth size is informative; standard GEEs with an independence working correlation structure, which produce treatment effects with an infant-level interpretation when birth size is informative; and standard GEEs with an exchangeable working correlation structure, which do not account for informative birth size. The methods were compared through simulation and analysis of an example dataset. RESULTS: Treatment effect estimates were affected by informative birth size in the simulation study when the effect of treatment in singletons differed from that in multiples (i.e. in the presence of a treatment group by multiple birth interaction). The strength of evidence supporting the effectiveness of treatment varied between methods in the example dataset. CONCLUSIONS: Informative birth size is always a possibility in randomised trials including infants from both single and multiple births, and analysis methods should be pre-specified with this in mind. We recommend estimating treatment effects using standard GEEs with an independence working correlation structure to give an infant-level interpretation.
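
    As a concrete illustration, the sketch below fits all three GEE variants with statsmodels on simulated infant-level data. The dataset, the variable names, and the 1/(infants per birth) weights used for the cluster-weighted fit are assumptions for demonstration, not the paper's actual analysis code.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated infant-level data (illustrative): mothers are randomised and
# contribute one infant (singleton) or two (twins).
rng = np.random.default_rng(0)
n_mothers = 200
n_infants = rng.choice([1, 2], size=n_mothers, p=[0.9, 0.1])
treat = rng.integers(0, 2, size=n_mothers)
df = pd.DataFrame({
    "mother": np.repeat(np.arange(n_mothers), n_infants),
    "treat": np.repeat(treat, n_infants),
})
df["y"] = 1.0 * df["treat"] + rng.normal(size=len(df))  # continuous outcome

X = sm.add_constant(df[["treat"]])

# (1) Standard GEE, independence working correlation: infant-level effect.
fit_ind = sm.GEE(df["y"], X, groups=df["mother"],
                 cov_struct=sm.cov_struct.Independence()).fit()

# (2) Standard GEE, exchangeable working correlation: does not account for
#     informative birth size.
fit_exc = sm.GEE(df["y"], X, groups=df["mother"],
                 cov_struct=sm.cov_struct.Exchangeable()).fit()

# (3) Cluster-weighted GEE: each infant weighted by 1 / (infants per birth),
#     giving a mother-level interpretation when birth size is informative.
df["w"] = 1.0 / df.groupby("mother")["mother"].transform("size")
fit_cw = sm.GEE(df["y"], X, groups=df["mother"], weights=df["w"],
                cov_struct=sm.cov_struct.Independence()).fit()

print(fit_ind.params["treat"], fit_exc.params["treat"], fit_cw.params["treat"])
```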

    Assessing the causal effect of binary interventions from observational panel data with few treated units

    Researchers are often challenged with assessing the impact of an intervention on an outcome of interest in situations where the intervention is non-randomised, the intervention is only applied to one or few units, the intervention is binary, and outcome measurements are available at multiple time points. In this paper, we review existing methods for causal inference in these situations. We detail the assumptions underlying each method, emphasize connections between the different approaches and provide guidelines regarding their practical implementation. Several open problems are identified, thus highlighting the need for future research.
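
    The synthetic control method is the canonical approach in this one-or-few-treated-units panel setting; assuming it is among the reviewed methods, a minimal sketch looks as follows. Donor-pool weights are chosen to reproduce the treated unit's pre-intervention trajectory, and the post-intervention gap estimates the effect. All data and dimensions below are invented.

```python
import numpy as np
from scipy.optimize import minimize

# Invented panel: rows are units, columns are time periods; unit 0 is the
# single treated unit, treated from period T0 onward.
rng = np.random.default_rng(1)
n_units, n_periods, T0 = 11, 20, 15
Y = rng.normal(size=(n_units, n_periods)).cumsum(axis=1)
Y[0, T0:] += 2.0  # true effect added to the treated unit's post-period

Y1_pre, Y0_pre = Y[0, :T0], Y[1:, :T0]

# Choose donor weights (non-negative, summing to one) that best reproduce
# the treated unit's pre-intervention outcomes.
n_donors = n_units - 1
res = minimize(lambda w: np.sum((Y1_pre - w @ Y0_pre) ** 2),
               x0=np.full(n_donors, 1.0 / n_donors),
               bounds=[(0.0, 1.0)] * n_donors,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})

# Post-period gap between the treated unit and its synthetic control.
effect = Y[0, T0:] - res.x @ Y[1:, T0:]
print(effect.mean())
```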

    Estimation of required sample size for external validation of risk models for binary outcomes

    Risk-prediction models for health outcomes are used in practice as part of clinical decision-making, and it is essential that their performance be externally validated. An important aspect in the design of a validation study is choosing an adequate sample size. In this paper, we investigate the sample size requirements for validation studies with binary outcomes to estimate measures of predictive performance (the C-statistic for discrimination, and the calibration slope and calibration-in-the-large for calibration). We aim for sufficient precision in the estimated measures. In addition, we investigate the sample size to achieve sufficient power to detect a difference from a target value. Under normality assumptions on the distribution of the linear predictor, we obtain simple estimators for sample size calculations based on the measures above. Simulation studies show that the estimators perform well for common values of the C-statistic and outcome prevalence when the linear predictor is marginally normal. Their performance deteriorates only slightly when the normality assumptions are violated. We also propose estimators which do not require normality assumptions but require specification of the marginal distribution of the linear predictor and the use of numerical integration. These estimators were also seen to perform very well under marginal normality. Our sample size equations require a specified standard error (SE) and the anticipated C-statistic and outcome prevalence. The sample size requirement varies according to the prognostic strength of the model, outcome prevalence, choice of the performance measure and study objective. For example, to achieve an SE < 0.025 for the C-statistic, 60-170 events are required if the true C-statistic and outcome prevalence are between 0.64-0.85 and 0.05-0.3, respectively. For the calibration slope and calibration-in-the-large, achieving SE < 0.15 would require 40-280 and 50-100 events, respectively. Our estimators may also be used for survival outcomes when the proportion of censored observations is high.
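
    The authors' closed-form estimators are not given in the abstract, so the sketch below substitutes the classic Hanley-McNeil (1982) approximation to the SE of the C-statistic to find the event count meeting a precision target. It lands in the same ballpark as the 60-170 events quoted above, but it is not the paper's estimator.

```python
import math

def c_statistic_se(auc, n_events, n_nonevents):
    # Hanley & McNeil (1982) approximation to the SE of the C-statistic.
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc**2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_events - 1.0) * (q1 - auc**2)
           + (n_nonevents - 1.0) * (q2 - auc**2)) / (n_events * n_nonevents)
    return math.sqrt(var)

def events_for_target_se(auc, prevalence, target_se=0.025):
    # Smallest event count (scanning total sample size n) whose implied
    # SE of the C-statistic falls below the target.
    for n in range(50, 200_000, 10):
        events = n * prevalence
        if c_statistic_se(auc, events, n - events) < target_se:
            return round(events)

print(events_for_target_se(auc=0.75, prevalence=0.1))  # on the order of 10^2
```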

    Response to Klebanoff

    Joint modelling rationale for chained equations

    BACKGROUND: Chained equations imputation is widely used in medical research. It uses a set of conditional models, so is more flexible than joint modelling imputation for the imputation of different types of variables (e.g. binary, ordinal or unordered categorical). However, chained equations imputation does not correspond to drawing from a joint distribution when the conditional models are incompatible. Concurrently with our work, other authors have shown the equivalence of the two imputation methods in finite samples. METHODS: Taking a different approach, we prove, in finite samples, sufficient conditions for chained equations and joint modelling to yield imputations from the same predictive distribution. Further, we apply this proof in four specific cases and conduct a simulation study which explores the consequences when the conditional models are compatible but the conditions otherwise are not satisfied. RESULTS: We provide an additional “non-informative margins” condition which, together with compatibility, is sufficient. We show that the non-informative margins condition is not satisfied, despite compatible conditional models, in a situation as simple as two continuous variables and one binary variable. Our simulation study demonstrates that as a consequence of this violation order effects can occur; that is, systematic differences depending upon the ordering of the variables in the chained equations algorithm. However, the order effects appear to be small, especially when associations between variables are weak. CONCLUSIONS: Since chained equations imputation is typically used in medical research for datasets with different types of variables, researchers must be aware that order effects are likely to be ubiquitous, but our results suggest they may be small enough to be negligible.
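
    Order effects are easy to probe empirically: run a chained-equations-style imputer under two variable orderings and compare the imputed summaries. The sketch below uses scikit-learn's IterativeImputer as a rough stand-in for a full MICE implementation, on data echoing the paper's two-continuous-plus-one-binary example; the binary column is imputed on a continuous scale here, a simplification.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Illustrative data: two continuous variables and one binary variable,
# with 20% of values set missing completely at random.
rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
x3 = (x1 + x2 > 0).astype(float)
X = np.column_stack([x1, x2, x3])
X[rng.random(X.shape) < 0.2] = np.nan

# Impute under two variable orderings; differences between the imputed
# summaries are the "order effects" discussed above.
for order in ("ascending", "descending"):
    imputed = IterativeImputer(imputation_order=order,
                               sample_posterior=True,
                               random_state=0).fit_transform(X)
    print(order, imputed.mean(axis=0).round(3))
```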

    Comparative Effectiveness of Adalimumab vs Tofacitinib in Patients With Rheumatoid Arthritis in Australia

    Importance: There is a need for observational studies to supplement evidence from clinical trials, and the target trial emulation (TTE) framework can help avoid biases that can be introduced when treatments are compared crudely using observational data by applying design principles for randomized clinical trials. Adalimumab (ADA) and tofacitinib (TOF) were shown to be equivalent in patients with rheumatoid arthritis (RA) in a randomized clinical trial, but to our knowledge, these drugs have not been compared head-to-head using routinely collected clinical data and the TTE framework. Objective: To emulate a randomized clinical trial comparing ADA vs TOF in patients with RA who were new users of a biologic or targeted synthetic disease-modifying antirheumatic drug (b/tsDMARD). Design, Setting, and Participants: This comparative effectiveness study emulating a randomized clinical trial of ADA vs TOF included Australian adults aged 18 years or older with RA in the Optimising Patient Outcomes in Australian Rheumatology (OPAL) data set. Patients were included if they initiated ADA or TOF between October 1, 2015, and April 1, 2021; were new b/tsDMARD users; and had at least 1 component of the disease activity score in 28 joints using C-reactive protein (DAS28-CRP) recorded at baseline or during follow-up. Intervention: Treatment with either ADA (40 mg every 14 days) or TOF (10 mg daily). Main Outcomes and Measures: The main outcome was the estimated average treatment effect, defined as the difference in mean DAS28-CRP among patients receiving TOF compared with those receiving ADA at 3 and 9 months after initiating treatment. Missing DAS28-CRP data were multiply imputed. Stable balancing weights were used to account for nonrandomized treatment assignment. Results: A total of 842 patients were identified, including 569 treated with ADA (387 [68.0%] female; median age, 56 years [IQR, 47-66 years]) and 273 treated with TOF (201 [73.6%] female; median age, 59 years [IQR, 51-68 years]). After applying stable balancing weights, mean DAS28-CRP in the ADA group was 5.3 (95% CI, 5.2-5.4) at baseline, 2.6 (95% CI, 2.5-2.7) at 3 months, and 2.3 (95% CI, 2.2-2.4) at 9 months; in the TOF group, it was 5.3 (95% CI, 5.2-5.4) at baseline, 2.4 (95% CI, 2.2-2.5) at 3 months, and 2.3 (95% CI, 2.1-2.4) at 9 months. The estimated average treatment effect was -0.2 (95% CI, -0.4 to -0.03; P = .02) at 3 months and -0.03 (95% CI, -0.2 to 0.1; P = .60) at 9 months. Conclusions and Relevance: In this study, there was a modest but statistically significant reduction in DAS28-CRP at 3 months for patients receiving TOF compared with those receiving ADA and no difference between treatment groups at 9 months. Three months of treatment with either drug led to clinically relevant average reductions in mean DAS28-CRP, consistent with remission.
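
    The study combined multiple imputation with stable balancing weights; as a simplified stand-in, the sketch below uses ordinary inverse-probability-of-treatment weights from a logistic propensity model to show the shape of the estimator. All variable names and values are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Invented patient-level data: treat = 1 for TOF, 0 for ADA.
rng = np.random.default_rng(3)
n = 842
df = pd.DataFrame({
    "age": rng.normal(58, 10, n),
    "female": rng.integers(0, 2, n),
    "das28_baseline": rng.normal(5.3, 0.8, n),
    "treat": rng.integers(0, 2, n),
})
df["das28_3m"] = 2.6 - 0.2 * df["treat"] + rng.normal(0, 1.0, n)

# Propensity scores from baseline covariates, then inverse-probability
# weights (a simple stand-in for the paper's stable balancing weights).
covs = df[["age", "female", "das28_baseline"]]
ps = LogisticRegression().fit(covs, df["treat"]).predict_proba(covs)[:, 1]
w = np.where(df["treat"] == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# Weighted difference in mean DAS28-CRP at 3 months (TOF minus ADA).
tof, ada = df["treat"] == 1, df["treat"] == 0
ate = (np.average(df.loc[tof, "das28_3m"], weights=w[tof])
       - np.average(df.loc[ada, "das28_3m"], weights=w[ada]))
print(round(ate, 2))
```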

    Practical Issues in Imputation-Based Association Mapping

    Imputation-based association methods provide a powerful framework for testing untyped variants for association with phenotypes and for combining results from multiple studies that use different genotyping platforms. Here, we consider several issues that arise when applying these methods in practice, including: (i) factors affecting imputation accuracy, including choice of reference panel; (ii) the effects of imputation accuracy on power to detect associations; (iii) the relative merits of Bayesian and frequentist approaches to testing imputed genotypes for association with phenotype; and (iv) how to quickly and accurately compute Bayes factors for testing imputed SNPs. We find that imputation-based methods can be robust to imputation accuracy and can improve power to detect associations, even when average imputation accuracy is poor. We explain how ranking SNPs for association by a standard likelihood ratio test gives the same results as a Bayesian procedure that uses an unnatural prior assumption—specifically, that difficult-to-impute SNPs tend to have larger effects—and assess the power gained from using a Bayesian approach that does not make this assumption. Within the Bayesian framework, we find that good approximations to a full analysis can be achieved by simply replacing unknown genotypes with a point estimate—their posterior mean. This approximation considerably reduces computational expense compared with published sampling-based approaches, and the methods we present are practical on a genome-wide scale with very modest computational resources (e.g., a single desktop computer). The approximation also facilitates combining information across studies, using only summary data for each SNP. Methods discussed here are implemented in the software package BIMBAM, which is available from http://stephenslab.uchicago.edu/software.html
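
    The posterior-mean approximation is simple to express in code: replace each unknown genotype with its expected dosage under the imputation posterior and test the phenotype against that dosage. The data below are invented, and the single-SNP linear regression is a generic frequentist stand-in for BIMBAM's Bayes-factor computation.

```python
import numpy as np
from scipy import stats

# Invented imputed SNP: per-individual posterior probabilities over the
# three genotype classes (0, 1 or 2 copies of the minor allele).
rng = np.random.default_rng(4)
n = 1000
probs = rng.dirichlet([5.0, 3.0, 1.0], size=n)
dosage = probs @ np.array([0.0, 1.0, 2.0])   # posterior-mean genotype
y = 0.3 * dosage + rng.normal(size=n)        # simulated phenotype

# Score the SNP by regressing the phenotype on the posterior-mean dosage,
# rather than sampling genotypes from their posterior.
fit = stats.linregress(dosage, y)
print(f"beta = {fit.slope:.3f}, p = {fit.pvalue:.2e}")
```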

    Propensity score analysis with partially observed covariates: How should multiple imputation be used?

    Inverse probability of treatment weighting is a popular propensity score-based approach to estimate marginal treatment effects in observational studies at risk of confounding bias. A major issue when estimating the propensity score is the presence of partially observed covariates. Multiple imputation is a natural approach to handle missing data on covariates: covariates are imputed and a propensity score analysis is performed in each imputed dataset to estimate the treatment effect. The treatment effect estimates from each imputed dataset are then combined to obtain an overall estimate. We call this method MIte. However, an alternative approach has been proposed, in which the propensity scores are combined across the imputed datasets (MIps). Therefore, there are remaining uncertainties about how to implement multiple imputation for propensity score analysis: (a) should we apply Rubin's rules to the inverse probability of treatment weighting treatment effect estimates or to the propensity score estimates themselves? (b) does the outcome have to be included in the imputation model? (c) how should we estimate the variance of the inverse probability of treatment weighting estimator after multiple imputation? We studied the consistency and balancing properties of the MIte and MIps estimators and performed a simulation study to empirically assess their performance for the analysis of a binary outcome. We also compared the performance of these methods to complete case analysis and the missingness pattern approach, which uses a different propensity score model for each pattern of missingness, and a third multiple imputation approach in which the propensity score parameters are combined rather than the propensity scores themselves (MIpar). Under a missing at random mechanism, complete case and missingness pattern analyses were biased in most cases for estimating the marginal treatment effect, whereas multiple imputation approaches were approximately unbiased as long as the outcome was included in the imputation model. Only MIte was unbiased in all the studied scenarios and Rubin's rules provided good variance estimates for MIte. The propensity score estimated in the MIte approach showed good balancing properties. In conclusion, when using multiple imputation in the inverse probability of treatment weighting context, MIte with the outcome included in the imputation model is the preferred approach.
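
    A minimal sketch of the recommended MIte procedure follows, using scikit-learn's IterativeImputer as a stand-in for a full chained-equations implementation and omitting Rubin's variance pooling for brevity. Data and model choices are illustrative assumptions.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LogisticRegression

# Invented data: confounder x1 fully observed, x2 partially observed,
# binary treatment and binary outcome.
rng = np.random.default_rng(5)
n, m = 1000, 10
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
treat = rng.binomial(1, 1.0 / (1.0 + np.exp(-x1)))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(x1 + x2 + treat - 1.0))))
x2[rng.random(n) < 0.3] = np.nan

ates = []
for k in range(m):
    # MIte: impute (with the outcome included in the imputation model, as
    # recommended), fit the propensity score, and estimate the weighted
    # treatment effect within each imputed dataset.
    imp = IterativeImputer(sample_posterior=True, random_state=k)
    Xk = imp.fit_transform(np.column_stack([x1, x2, treat, y]))[:, :2]
    ps = LogisticRegression().fit(Xk, treat).predict_proba(Xk)[:, 1]
    w = np.where(treat == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    ates.append(np.average(y[treat == 1], weights=w[treat == 1])
                - np.average(y[treat == 0], weights=w[treat == 0]))

# Rubin's rules point estimate: the average over imputed datasets (the
# full rules also pool within- and between-imputation variance).
print(np.mean(ates))
```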

    Identifying the favored mutation in a positive selective sweep.

    Most approaches that capture signatures of selective sweeps in population genomics data do not identify the specific mutation favored by selection. We present iSAFE (for "integrated selection of allele favored by evolution"), a method that enables researchers to accurately pinpoint the favored mutation in a large region (∼5 Mbp) by using a statistic derived solely from population genetics signals. iSAFE does not require knowledge of demography, the phenotype under selection, or functional annotations of mutations.