
    A Framework for Understanding Unintended Consequences of Machine Learning

    As machine learning increasingly affects people and society, it is important that we strive for a comprehensive and unified understanding of potential sources of unwanted consequences. For instance, downstream harms to particular groups are often blamed on "biased data," but this concept encompasses too many issues to be useful in developing solutions. In this paper, we provide a framework that partitions sources of downstream harm in machine learning into six distinct categories spanning the data generation and machine learning pipeline. We describe how these issues arise, how they are relevant to particular applications, and how they motivate different solutions. In doing so, we aim to facilitate the development of solutions that stem from an understanding of application-specific populations and data generation processes, rather than relying on general statements about what may or may not be "fair."
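    As a rough illustration of how such a taxonomy can be made operational, the sketch below encodes a set of harm-source categories as a small Python data structure and maps them to pipeline stages. The abstract does not name the six categories, so the labels, stage annotations, and the `triage` helper here are illustrative placeholders, not the paper's definitions.

```python
from enum import Enum

# Illustrative sketch only: category names and pipeline-stage notes are
# placeholders, not quoted from the paper.
class HarmSource(Enum):
    HISTORICAL = "data generation: the world the data reflects"
    REPRESENTATION = "data generation: who is sampled"
    MEASUREMENT = "data generation: proxies and label quality"
    AGGREGATION = "model building: one model across distinct groups"
    EVALUATION = "model building: choice of benchmarks and metrics"
    DEPLOYMENT = "deployment: mismatch with intended use"

def triage(issue_description: str) -> list[HarmSource]:
    """Toy helper: map keywords in an issue report to candidate harm sources."""
    keywords = {
        HarmSource.REPRESENTATION: ["underrepresented", "sampling"],
        HarmSource.MEASUREMENT: ["proxy", "label quality"],
        HarmSource.DEPLOYMENT: ["off-label", "unintended use"],
    }
    text = issue_description.lower()
    return [src for src, kws in keywords.items() if any(k in text for k in kws)]

print(triage("model trained on data where one group is underrepresented"))
```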

    Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs

    Interpretability methods aim to help users build trust in and understand the capabilities of machine learning models. However, existing approaches often rely on abstract, complex visualizations that poorly map to the task at hand or require non-trivial ML expertise to interpret. Here, we present two visual analytics modules that facilitate an intuitive assessment of model reliability. To help users better characterize and reason about a model's uncertainty, we visualize raw and aggregate information about a given input's nearest neighbors. Using an interactive editor, users can manipulate this input in semantically meaningful ways, determine the effect on the output, and compare against their prior expectations. We evaluate our interface using an electrocardiogram beat classification case study. Compared to a baseline feature importance interface, we find that 14 physicians are better able to align the model's uncertainty with domain-relevant factors and to build intuition about its capabilities and limitations.
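    A minimal sketch of the neighbor-based uncertainty idea described above: given an embedded input, retrieve its nearest training examples and summarize their labels. The data, embedding, and interface below are stand-ins, not the paper's actual ECG pipeline or visualization.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Placeholder data standing in for embedded ECG beats and their class labels.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 32))
y_train = rng.integers(0, 4, size=500)

nn = NearestNeighbors(n_neighbors=10).fit(X_train)

def neighbor_summary(x_query: np.ndarray) -> dict:
    """Return the label distribution among the query's nearest neighbors,
    a rough proxy for how settled the training evidence is for this input."""
    _, idx = nn.kneighbors(x_query.reshape(1, -1))
    labels, counts = np.unique(y_train[idx[0]], return_counts=True)
    return {int(label): int(count) for label, count in zip(labels, counts)}

# A near-uniform distribution suggests higher uncertainty; a dominant class
# suggests the input lies in a well-populated region of the training data.
print(neighbor_summary(rng.normal(size=32)))
```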

    Feminicide & machine learning : detecting gender-based violence to strengthen civil sector activism

    Although governments have passed legislation criminalizing feminicide, such legislation is unaccompanied by relevant policy or robust data collection. This participatory action research project is designed to help sustain activist efforts to collect feminicide data through partially automated detection using machine learning. As a way to counter the impunity surrounding feminicide, activists have taken it upon themselves to do the work that states have neglected. Partially automating detection supports efforts to systematize and sort data collection across contexts, and helps to inform policy advocacy by standardizing definitions and taxonomies. The ability to prioritize articles by likelihood of feminicide will make this intense research less gruelling.
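    The abstract does not describe the project's actual model or taxonomy; the sketch below shows one generic way to score incoming news articles with a text classifier and sort the review queue by predicted likelihood. The training data, labels, and headlines are hypothetical placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled articles: 1 = reports a feminicide, 0 = does not.
train_texts = [
    "police report describes the killing of a woman by her partner",
    "city council approves new budget for road repairs",
]
train_labels = [1, 0]

# Simple bag-of-words classifier; the real system's features and model
# are not specified in the abstract.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_texts, train_labels)

incoming_articles = [
    "woman found dead at home, ex-partner detained",
    "local team wins regional football championship",
]
scores = clf.predict_proba(incoming_articles)[:, 1]

# Rank the queue so reviewers see the most likely cases first.
for score, headline in sorted(zip(scores, incoming_articles), reverse=True):
    print(f"{score:.2f}  {headline}")
```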

    MIMIC-Extract: A Data Extraction, Preprocessing, and Representation Pipeline for MIMIC-III

    Robust machine learning relies on access to data that can be used with standardized frameworks in important tasks and the ability to develop models whose performance can be reasonably reproduced. In machine learning for healthcare, the community faces reproducibility challenges due to a lack of publicly accessible data and a lack of standardized data processing frameworks. We present MIMIC-Extract, an open-source pipeline for transforming raw electronic health record (EHR) data for critical care patients contained in the publicly available MIMIC-III database into dataframes that are directly usable in common machine learning pipelines. MIMIC-Extract addresses three primary challenges in making complex health records data accessible to the broader machine learning community. First, it provides standardized data processing functions, including unit conversion, outlier detection, and aggregation of semantically equivalent features, thus accounting for duplication and reducing missingness. Second, it preserves the time series nature of clinical data and can be easily integrated into clinically actionable prediction tasks in machine learning for health. Finally, it is highly extensible so that other researchers with related questions can easily use the same pipeline. We demonstrate the utility of this pipeline by showcasing several benchmark tasks and baseline results.
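    MIMIC-Extract's own API is not shown in the abstract; the pandas sketch below only illustrates the kinds of transformations it describes (merging semantically equivalent items, dropping implausible outliers, and resampling to an hourly grid), using made-up rows and thresholds.

```python
import pandas as pd

# Toy stand-in for chart-event rows; not MIMIC-III data or MIMIC-Extract's API.
raw = pd.DataFrame({
    "subject_id": [1, 1, 1, 1],
    "charttime": pd.to_datetime(
        ["2100-01-01 00:10", "2100-01-01 00:40",
         "2100-01-01 01:05", "2100-01-01 01:20"]),
    "item": ["heart rate", "hr", "heart rate", "hr"],
    "value": [82.0, 85.0, 300.0, 88.0],  # 300 is a physiologically implausible outlier
})

# 1. Aggregate semantically equivalent feature names under one canonical label.
raw["item"] = raw["item"].replace({"hr": "heart rate"})

# 2. Drop implausible outliers (thresholds here are invented for the example).
raw = raw[raw["value"].between(20, 250)]

# 3. Resample to an hourly grid per patient, preserving the time-series structure.
hourly = (raw.set_index("charttime")
             .groupby(["subject_id", "item"])["value"]
             .resample("1h").mean()
             .unstack("item"))
print(hourly)
```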

    Non-task expert physicians benefit from correct explainable AI advice when reviewing X-rays

    Artificial intelligence (AI)-generated clinical advice is becoming more prevalent in healthcare. However, the impact of AI-generated advice on physicians’ decision-making is underexplored. In this study, physicians received X-rays with correct diagnostic advice and were asked to make a diagnosis, rate the advice’s quality, and judge their own confidence. We manipulated whether the advice came with or without a visual annotation on the X-rays, and whether it was labeled as coming from an AI or a human radiologist. Overall, receiving annotated advice from an AI resulted in the highest diagnostic accuracy. Physicians rated the quality of AI advice higher than that of human advice. We did not find a strong effect of either manipulation on participants’ confidence. The magnitude of the effects varied between task experts and non-task experts, with the latter benefiting considerably from correct explainable AI advice. These findings raise important considerations for the deployment of diagnostic advice in healthcare.