5 research outputs found

Prediction-Constrained Topic Models for Antidepressant Recommendation

Author: Doshi-Velez Finale
Hope Gabriel
Hughes Michael C.
McCoy Thomas H.
Perlis Roy H.
Sudderth Erik B.
Weiner Leah
Publication date: 01/12/2017

Supervisory signals can help topic models discover low-dimensional data representations that are more interpretable for clinical tasks. We propose a framework for training supervised latent Dirichlet allocation that balances two goals: faithful generative explanations of high-dimensional data and accurate prediction of associated class labels. Existing approaches fail to balance these goals by not properly handling a fundamental asymmetry: the intended task is always predicting labels from data, not data from labels. Our new prediction-constrained objective trains models that predict labels from heldout data well while also producing good generative likelihoods and interpretable topic-word parameters. In a case study on predicting depression medications from electronic health records, we demonstrate improved recommendations compared to previous supervised topic models and high-dimensional logistic regression from words alone. Comment: Accepted poster at NIPS 2017 Workshop on Machine Learning for Health (https://ml4health.github.io/2017/

arXiv.org e-Print Archive

Prediction Focused Topic Models for Electronic Health Records

Author: Doshi-Velez Finale
Kunes Russell
Ren Jason
Publication date: 15/11/2019

Electronic Health Record (EHR) data can be represented as discrete counts over a high-dimensional set of possible procedures, diagnoses, and medications. Supervised topic models present an attractive option for incorporating EHR data as features into a prediction problem: given a patient's record, we estimate a set of latent factors that are predictive of the response variable. However, existing methods for supervised topic modeling struggle to balance prediction quality and coherence of the latent factors. We introduce a novel approach, the prediction-focused topic model, that uses the supervisory signal to retain only features that improve, or do not hinder, prediction performance. By removing features with irrelevant signal, the topic model is able to learn task-relevant, interpretable topics. We demonstrate on an EHR dataset and a movie review dataset that, compared to existing approaches, prediction-focused topic models are able to learn much more coherent topics while maintaining competitive predictions. Comment: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract. arXiv admin note: substantial text overlap with arXiv:1910.0549

arXiv.org e-Print Archive

Projected BNNs: Avoiding weight-space pathologies by learning latent representations of neural network weights

Author: Doshi-Velez Finale
Ghosh Soumya
Pan Weiwei
Pradier Melanie F.
Yao Jiayu
Publication date: 12/06/2019

As machine learning systems are widely adopted for high-stakes decisions, quantifying uncertainty over predictions becomes crucial. While modern neural networks are making remarkable gains in terms of predictive accuracy, characterizing uncertainty over the parameters of these models is challenging because of the high dimensionality and complex correlations of the network parameter space. This paper introduces a novel variational inference framework for Bayesian neural networks that (1) encodes complex distributions in high-dimensional parameter space with representations in a low-dimensional latent space, and (2) performs inference efficiently on the low-dimensional representations. Across a large array of synthetic and real-world datasets, we show that our method improves uncertainty characterization and model generalization when compared with methods that work directly in the parameter space.

arXiv.org e-Print Archive

Prediction Focused Topic Models via Feature Selection

Author: Doshi-Velez Finale
Kunes Russell
Ren Jason
Publication date: 29/02/2020

Supervised topic models are often sought to balance prediction quality and interpretability. However, when models are (inevitably) misspecified, standard approaches rarely deliver on both. We introduce a novel approach, the prediction-focused topic model, that uses the supervisory signal to retain only vocabulary terms that improve, or at least do not hinder, prediction performance. By removing terms with irrelevant signal, the topic model is able to learn task-relevant, coherent topics. We demonstrate on several data sets that, compared to existing approaches, prediction-focused topic models learn much more coherent topics while maintaining competitive predictions. Comment: AISTATS 2020. arXiv admin note: substantial text overlap with arXiv:1911.0855

arXiv.org e-Print Archive

Machine Learning and Visualization in Clinical Decision Support: Current State and Future Directions

Author: Elhadad Noémie
Kuperman Gilad J.
Levy-Fix Gal
Publication date: 06/06/2019

Deep learning, an area of machine learning, is set to revolutionize patient care. But it is not yet part of standard of care, especially when it comes to individual patient care. In fact, it is unclear to what extent data-driven techniques are being used to support clinical decision making (CDS). Heretofore, there has not been a review of ways in which research in machine learning and other types of data-driven techniques can contribute effectively to clinical care and the types of support they can bring to clinicians. In this paper, we consider ways in which two data-driven domains, machine learning and data visualization, can contribute to the next generation of clinical decision support systems. We review the literature regarding the ways heuristic knowledge, machine learning, and visualization are, and can be, applied to three types of CDS. There has been substantial research into the use of predictive modeling for alerts; however, current CDS systems are not utilizing these methods. Approaches that leverage interactive visualizations and machine-learning inferences to organize and review patient data are gaining popularity but are still at the prototype stage and are not yet in use. CDS systems that could benefit from prescriptive machine learning (e.g., treatment recommendations for specific patients) have not yet been developed. We discuss potential reasons for the lack of deployment of data-driven methods in CDS and directions for future research.

arXiv.org e-Print Archive