    Creating Fair Models of Atherosclerotic Cardiovascular Disease Risk

    Guidelines for the management of atherosclerotic cardiovascular disease (ASCVD) recommend the use of risk stratification models to identify patients most likely to benefit from cholesterol-lowering and other therapies. These models show differential performance across race and gender groups, with inconsistent behavior across studies, potentially resulting in an inequitable distribution of beneficial therapy. In this work, we leverage adversarial learning and a large observational cohort extracted from electronic health records (EHRs) to develop a "fair" ASCVD risk prediction model with reduced variability in error rates across groups. We empirically demonstrate that our approach is capable of aligning the distribution of risk predictions conditioned on the outcome across several groups simultaneously for models built from high-dimensional EHR data. We also discuss the relevance of these results in the context of the empirical trade-off between fairness and model performance.
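    A minimal sketch of this kind of adversarial training loop in PyTorch, under assumptions of my own: a tabular feature tensor X, a binary ASCVD outcome y, and group labels g, with an adversary that tries to recover group membership from the risk score conditioned on the outcome. The layer sizes, names, and loss weighting are illustrative, not the authors' actual model.

    ```python
    import torch
    import torch.nn as nn

    # Hypothetical dimensions: 256 EHR features, 4 demographic groups.
    predictor = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))
    adversary = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 4))

    opt_p = torch.optim.Adam(predictor.parameters(), lr=1e-3)
    opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
    bce, ce, lam = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss(), 1.0

    def train_step(X, y, g):
        # 1) Update the adversary: predict group from (risk score, outcome).
        logits = predictor(X).squeeze(-1)
        adv_in = torch.stack([torch.sigmoid(logits), y], dim=1)
        adv_loss = ce(adversary(adv_in.detach()), g)
        opt_a.zero_grad(); adv_loss.backward(); opt_a.step()

        # 2) Update the predictor: fit the outcome while fooling the adversary,
        #    which pushes predictions conditioned on the outcome to look alike
        #    across groups (an equalized-odds-style objective).
        logits = predictor(X).squeeze(-1)
        adv_in = torch.stack([torch.sigmoid(logits), y], dim=1)
        pred_loss = bce(logits, y) - lam * ce(adversary(adv_in), g)
        opt_p.zero_grad(); pred_loss.backward(); opt_p.step()
        return pred_loss.item(), adv_loss.item()
    ```

    The weight lam controls the fairness/performance trade-off discussed in the abstract: larger values penalize group-predictable scores more heavily at some cost in predictive accuracy.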

    Machine Learning Framework for Real-World Electronic Health Records Regarding Missingness, Interpretability, and Fairness

    Machine learning (ML) and deep learning (DL) techniques have shown promising results in healthcare applications using Electronic Health Records (EHR) data. However, their adoption in real-world healthcare settings is hindered by three major challenges. First, real-world EHR data typically contain numerous missing values. Second, traditional ML/DL models are typically considered black boxes, whereas interpretability is required for real-world healthcare applications. Finally, differences in data distributions may lead to unfairness and performance disparities, particularly in subpopulations. This dissertation proposes methods to address missing data, interpretability, and fairness issues. The first work proposes an ensemble prediction framework for EHR data with high missing rates that trains on multiple feature subsets with lower missing rates. The second method integrates medical knowledge graphs and a double attention mechanism with the long short-term memory (LSTM) model to enhance interpretability by providing knowledge-based model interpretation. The third method develops an LSTM variant that integrates medical knowledge graphs and additional time-aware gates to handle multivariate temporal missingness and interpretability concerns. Finally, a transformer-based model is proposed to learn unbiased and fair representations of diverse subpopulations using domain classifiers and three attention mechanisms.
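    A minimal sketch of the first idea, the subset ensemble for high-missingness EHR data: train one model per feature subset with a lower missing rate, then average whichever members are applicable to each patient. The feature groupings, the choice of logistic regression, and the averaging rule are assumptions for illustration, not the dissertation's exact framework.

    ```python
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    def fit_subset_ensemble(df, feature_subsets, label_col="outcome"):
        """Fit one member per feature subset, using rows complete for that subset."""
        models = []
        for cols in feature_subsets:
            rows = df[cols + [label_col]].dropna()
            clf = LogisticRegression(max_iter=1000).fit(rows[cols], rows[label_col])
            models.append((cols, clf))
        return models

    def predict_subset_ensemble(models, df):
        """Average predictions of the members whose features are observed per patient."""
        preds = np.full((len(df), len(models)), np.nan)
        for j, (cols, clf) in enumerate(models):
            mask = df[cols].notna().all(axis=1)
            if mask.any():
                preds[mask.values, j] = clf.predict_proba(df.loc[mask, cols])[:, 1]
        return np.nanmean(preds, axis=1)
    ```

    The appeal of this design is that no imputation is needed: each member only ever sees complete cases for its own, lower-missingness subset, and patients still receive a prediction as long as at least one subset is fully observed for them.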

    Learning Tasks for Multitask Learning: Heterogeneous Patient Populations in the ICU

    Machine learning approaches have been effective in predicting adverse outcomes in different clinical settings. These models are often developed and evaluated on datasets with heterogeneous patient populations. However, good predictive performance on the aggregate population does not imply good performance for specific groups. In this work, we present a two-step framework to 1) learn relevant patient subgroups, and 2) predict an outcome for separate patient populations in a multi-task framework, where each population is a separate task. We demonstrate how to discover relevant groups in an unsupervised way with a sequence-to-sequence autoencoder. We show that using these groups in a multi-task framework leads to better predictive performance of in-hospital mortality both across groups and overall. We also highlight the need for more granular evaluation of performance when dealing with heterogeneous populations.
    Comment: KDD 201
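    A minimal sketch of the two-step framework, under assumptions of my own: (1) embed each patient's time series with an LSTM sequence autoencoder and cluster the embeddings into subgroups, (2) predict in-hospital mortality with a shared LSTM encoder plus one output head per subgroup, treating each subgroup as a task. Dimensions, module names, and the KMeans clustering choice are illustrative, not the paper's exact architecture.

    ```python
    import torch
    import torch.nn as nn
    from sklearn.cluster import KMeans

    class SeqAutoencoder(nn.Module):
        """Unsupervised patient embeddings from ICU time series (step 1)."""
        def __init__(self, n_feats=16, hidden=32):
            super().__init__()
            self.enc = nn.LSTM(n_feats, hidden, batch_first=True)
            self.dec = nn.LSTM(hidden, n_feats, batch_first=True)

        def forward(self, x):                        # x: (batch, time, n_feats)
            _, (h, _) = self.enc(x)
            z = h[-1]                                # per-patient embedding
            recon, _ = self.dec(z.unsqueeze(1).repeat(1, x.size(1), 1))
            return recon, z

    class MultiTaskMortality(nn.Module):
        """Shared encoder with one prediction head per discovered subgroup (step 2)."""
        def __init__(self, n_feats=16, hidden=32, n_groups=3):
            super().__init__()
            self.shared = nn.LSTM(n_feats, hidden, batch_first=True)
            self.heads = nn.ModuleList([nn.Linear(hidden, 1) for _ in range(n_groups)])

        def forward(self, x, group_ids):             # group_ids: subgroup index per patient
            _, (h, _) = self.shared(x)
            out = torch.stack([self.heads[g](h[-1][i]) for i, g in enumerate(group_ids)])
            return out.squeeze(-1)                   # mortality logit per patient

    # Subgroups are discovered on the learned embeddings, e.g.:
    # _, z = SeqAutoencoder()(x)
    # groups = KMeans(n_clusters=3).fit_predict(z.detach().numpy())
    ```

    Evaluating each head separately, rather than only the pooled predictions, gives the more granular, per-subgroup performance view the abstract argues for.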