Hierarchical Bias-Driven Stratification for Interpretable Causal Effect Estimation
Interpretability and transparency are essential for incorporating causal
effect models from observational data into policy decision-making. They can
provide trust for the model in the absence of ground truth labels to evaluate
the accuracy of such models. To date, attempts at transparent causal effect
estimation consist of applying post hoc explanation methods to black-box
models, which are not interpretable. Here, we present BICauseTree: an
interpretable balancing method that identifies clusters where natural
experiments occur locally. Our approach builds on decision trees with a
customized objective function to improve balancing and reduce treatment
allocation bias. Consequently, it can additionally detect subgroups presenting
positivity violations, exclude them, and provide a covariate-based definition
of the target population we can infer from and generalize to. We evaluate the
method's performance using synthetic and realistic datasets, explore its
bias-interpretability tradeoff, and show that it is comparable with existing
approaches.
Validity of machine learning in biology and medicine increased through collaborations across fields of expertise
Machine learning (ML) has become an essential asset for the life sciences and medicine. We selected 250 articles describing ML applications from 17 journals sampling 26 different fields between 2011 and 2016. Independent evaluation by two readers highlighted three results. First, only half of the articles shared software, 64% shared data and 81% applied any kind of evaluation. Although crucial for ensuring the validity of ML applications, these aspects were met more by publications in lower-ranked journals. Second, the authors’ scientific backgrounds highly influenced how technical aspects were addressed: reproducibility and computational evaluation methods were more prominent with computational co-authors; experimental proofs more with experimentalists. Third, 73% of the ML applications resulted from interdisciplinary collaborations comprising authors from at least two of the three disciplines: computational sciences, biology, and medicine. The results suggested collaborations between computational and experimental scientists to generate more scientifically sound and impactful work integrating knowledge from both domains. Although scientifically more valid solutions and collaborations involving diverse expertise did not correlate with impact factors, such collaborations provide opportunities to both sides: computational scientists are given access to novel and challenging real-world biological data, increasing the scientific impact of their research, and experimentalists benefit from more in-depth computational analyses improving the technical correctness of work.