A Framework for Understanding Unintended Consequences of Machine Learning
As machine learning increasingly affects people and society, it is important
that we strive for a comprehensive and unified understanding of potential
sources of unwanted consequences. For instance, downstream harms to particular
groups are often blamed on "biased data," but this concept encompass too many
issues to be useful in developing solutions. In this paper, we provide a
framework that partitions sources of downstream harm in machine learning into
six distinct categories spanning the data generation and machine learning
pipeline. We describe how these issues arise, how they are relevant to
particular applications, and how they motivate different solutions. In doing
so, we aim to facilitate the development of solutions that stem from an
understanding of application-specific populations and data generation
processes, rather than relying on general statements about what may or may not
be "fair."Comment: 6 pages, 2 figures; updated with corrected figure
Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs
Interpretability methods aim to help users build trust in and understand the
capabilities of machine learning models. However, existing approaches often
rely on abstract, complex visualizations that poorly map to the task at hand or
require non-trivial ML expertise to interpret. Here, we present two visual
analytics modules that facilitate an intuitive assessment of model reliability.
To help users better characterize and reason about a model's uncertainty, we
visualize raw and aggregate information about a given input's nearest
neighbors. Using an interactive editor, users can manipulate this input in
semantically-meaningful ways, determine the effect on the output, and compare
against their prior expectations. We evaluate our interface using an
electrocardiogram beat classification case study. Compared to a baseline
feature importance interface, we find that 14 physicians are better able to
align the model's uncertainty with domain-relevant factors and build intuition
about its capabilities and limitations.
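The modules themselves are interactive visualizations, but the nearest-neighbor idea behind the first one is easy to sketch. Below is a minimal, hypothetical Python illustration (random placeholder embeddings and labels, scikit-learn's NearestNeighbors) of retrieving an input's closest training examples and summarizing their labels as a rough uncertainty signal; it is not the paper's actual implementation.

```python
# Minimal, hypothetical sketch of the nearest-neighbor module's core idea:
# fetch a test input's closest training examples and summarize their labels
# as a rough proxy for model uncertainty. Embeddings and labels below are
# random placeholders, not the paper's ECG data or implementation.
import numpy as np
from collections import Counter
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 32))    # placeholder embedded training inputs
y_train = rng.integers(0, 2, size=1000)  # placeholder class labels

nn = NearestNeighbors(n_neighbors=10).fit(X_train)

def neighbor_label_distribution(x):
    """Label distribution among x's nearest training neighbors."""
    _, idx = nn.kneighbors(x.reshape(1, -1))
    labels = y_train[idx[0]]
    # A mixed distribution among close neighbors suggests the model is in
    # an ambiguous region; a pure one suggests a well-supported prediction.
    counts = Counter(labels.tolist())
    return {label: n / len(labels) for label, n in counts.items()}

print(neighbor_label_distribution(rng.normal(size=32)))
```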
Feminicide & machine learning : detecting gender-based violence to strengthen civil sector activism
Although governments have passed legislation criminalizing feminicide, such legislation is unaccompanied by relevant policy or robust data collection. This participatory action research project is designed to help sustain activist efforts to collect feminicide data through partially automated detection using machine learning. As a way to counter the impunity surrounding feminicide, activists have taken it upon themselves to do the work that states have neglected. Partially automating detection supports efforts to systematize and sort data collection across contexts, and helps to inform policy advocacy through standardizing definitions and taxonomies. The ability to prioritize articles by likelihood of feminicide will make this intense research less gruelling.
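The abstract does not name a model; the snippet below is only a hedged sketch of the prioritization idea, with a simple bag-of-words classifier over invented example articles standing in for whatever the project actually uses, ranking incoming texts by predicted likelihood so the most probable feminicide reports surface first.

```python
# Illustrative sketch only: rank incoming articles by a classifier's
# predicted probability of reporting a feminicide so that reviewers can
# prioritize the most likely cases. Training texts, labels, and the model
# choice are hypothetical placeholders, not the project's actual system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "woman found dead, partner arrested on homicide charge",
    "council approves new transit budget for next year",
]
train_labels = [1, 0]  # 1 = likely feminicide report, 0 = unrelated news

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

def prioritize(articles):
    """Return articles sorted by predicted feminicide likelihood, highest first."""
    scores = model.predict_proba(articles)[:, 1]
    return sorted(zip(scores, articles), reverse=True)

for score, text in prioritize(["police report a woman killed at home",
                               "local team wins championship"]):
    print(f"{score:.2f}  {text}")
```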
MIMIC-Extract: A Data Extraction, Preprocessing, and Representation Pipeline for MIMIC-III
Robust machine learning relies on access to data that can be used with
standardized frameworks in important tasks and the ability to develop models
whose performance can be reasonably reproduced. In machine learning for
healthcare, the community faces reproducibility challenges due to a lack of
publicly accessible data and a lack of standardized data processing frameworks.
We present MIMIC-Extract, an open-source pipeline for transforming raw
electronic health record (EHR) data for critical care patients contained in the
publicly available MIMIC-III database into dataframes that are directly usable
in common machine learning pipelines. MIMIC-Extract addresses three primary
challenges in making complex health records data accessible to the broader
machine learning community. First, it provides standardized data processing
functions, including unit conversion, outlier detection, and aggregating
semantically equivalent features, thus accounting for duplication and reducing
missingness. Second, it preserves the time series nature of clinical data and
can be easily integrated into clinically actionable prediction tasks in machine
learning for health. Finally, it is highly extensible so that other researchers
with related questions can easily use the same pipeline. We demonstrate the
utility of this pipeline by showcasing several benchmark tasks and baseline
results.
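MIMIC-Extract's real interface lives in its open-source repository; the snippet below is only a hedged pandas illustration of the kinds of steps the abstract lists (unit conversion, merging semantically equivalent features, outlier filtering, and hourly aggregation that preserves the time series), with invented item IDs, values, and plausibility ranges.

```python
# Hedged pandas illustration of steps the abstract describes: unit
# conversion, merging semantically equivalent features, dropping
# implausible outliers, and hourly aggregation that keeps the time series
# shape. Item IDs, values, and ranges are invented; this is NOT
# MIMIC-Extract's actual API.
import pandas as pd

raw = pd.DataFrame({
    "charttime": pd.to_datetime(["2019-01-01 00:10", "2019-01-01 00:40",
                                 "2019-01-01 01:20", "2019-01-01 01:25"]),
    "itemid": ["temp_f", "temp_c", "hr", "heart_rate"],  # duplicated concepts
    "value": [98.6, 37.2, 80.0, 500.0],                  # 500 bpm: outlier
})

# 1) Unit conversion: harmonize Fahrenheit readings to Celsius.
is_f = raw["itemid"] == "temp_f"
raw.loc[is_f, "value"] = (raw.loc[is_f, "value"] - 32) * 5 / 9

# 2) Merge semantically equivalent item IDs under one feature name.
alias = {"temp_f": "temperature", "temp_c": "temperature",
         "hr": "heart_rate", "heart_rate": "heart_rate"}
raw["feature"] = raw["itemid"].map(alias)

# 3) Outlier detection: drop physiologically implausible values.
plausible = {"temperature": (25.0, 45.0), "heart_rate": (20.0, 300.0)}
lo = raw["feature"].map(lambda f: plausible[f][0])
hi = raw["feature"].map(lambda f: plausible[f][1])
clean = raw[(raw["value"] >= lo) & (raw["value"] <= hi)]

# 4) Hourly aggregation preserving the time series structure.
hourly = (clean.set_index("charttime")
               .groupby("feature")["value"]
               .resample("1h").mean()
               .unstack(level=0))
print(hourly)
```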
Non-task expert physicians benefit from correct explainable AI advice when reviewing X-rays
Artificial intelligence (AI)-generated clinical advice is becoming more prevalent in healthcare. However, the impact of AI-generated advice on physicians’ decision-making is underexplored. In this study, physicians received X-rays with correct diagnostic advice and were asked to make a diagnosis, rate the advice’s quality, and judge their own confidence. We manipulated whether the advice came with or without a visual annotation on the X-rays, and whether it was labeled as coming from an AI or a human radiologist. Overall, receiving annotated advice from an AI resulted in the highest diagnostic accuracy. Physicians rated the quality of AI advice higher than human advice. We did not find a strong effect of either manipulation on participants’ confidence. The magnitude of the effects varied between task experts and non-task experts, with the latter benefiting considerably from correct explainable AI advice. These findings raise important considerations for the deployment of diagnostic advice in healthcare.