9 research outputs found

    Local Explanation-Based Method for Healthcare Risk Stratification

    No full text
    Decision support tools in healthcare require strong confidence in the developed Machine Learning (ML) models, both in terms of performance and in their ability to provide users with a deeper understanding of the underlying situation. This study presents a novel method to construct a risk stratification based on ML and local explanations. An open-source dataset was used to demonstrate the efficiency of this method, which correctly identified the main subgroups of patients. This method could therefore help practitioners adjust and build protocols to improve care delivery so that it better reflects each patient's risk level and profile.
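
    A minimal sketch of how such a stratification could look in practice, assuming a SHAP-plus-clustering pipeline on an open-source stand-in dataset (an illustration, not the authors' exact method):

        # Hedged illustration: group patients on their local explanations, then order
        # the groups by mean predicted risk to obtain risk strata.
        import shap
        from sklearn.datasets import load_breast_cancer   # open-source stand-in dataset
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.cluster import KMeans

        X, y = load_breast_cancer(return_X_y=True)
        model = GradientBoostingClassifier(random_state=0).fit(X, y)

        # Local explanations: one influence vector per patient.
        influences = shap.TreeExplainer(model).shap_values(X)

        # Subgroups of patients with similar explanation profiles.
        groups = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(influences)

        # Rank the subgroups by mean predicted risk to form the stratification.
        risk = model.predict_proba(X)[:, 1]
        for level, g in enumerate(sorted(range(4), key=lambda g: risk[groups == g].mean())):
            print(f"risk level {level}: subgroup {g}, mean predicted risk {risk[groups == g].mean():.2f}")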

    Coalitional Strategies for Efficient Individual Prediction Explanation

    No full text
    As Machine Learning (ML) is now widely applied in many domains, in both research and industry, understanding what is happening inside the black box has become a growing demand, especially from non-experts of these models. Several approaches have thus been developed to provide clear insights into a model's prediction for a particular observation, but at the cost of long computation times or restrictive hypotheses that do not fully take into account interactions between attributes. This paper provides methods based on the detection of relevant groups of attributes, named coalitions, influencing a prediction, and compares them with the literature. Our results show that these coalitional methods are more efficient than existing ones such as SHapley Additive exPlanations (SHAP). Computation time is shortened while an acceptable accuracy of individual prediction explanations is preserved. This therefore enables wider practical use of explanation methods to increase trust between developed ML models, end users, and anyone impacted by a decision in which these models played a role.
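
    The following sketch illustrates the general idea of coalition-based attribution under simplifying assumptions (correlation-based grouping and a mean-replacement baseline are illustrative choices, not the paper's algorithm):

        import pandas as pd

        def correlation_coalitions(X: pd.DataFrame, threshold: float = 0.6):
            """Greedy grouping: a feature joins a coalition if its absolute
            correlation with the coalition's seed feature exceeds the threshold."""
            corr = X.corr().abs()
            coalitions, assigned = [], set()
            for col in X.columns:
                if col in assigned:
                    continue
                group = [col] + [c for c in X.columns
                                 if c != col and c not in assigned and corr.loc[col, c] >= threshold]
                assigned.update(group)
                coalitions.append(group)
            return coalitions

        def coalition_influences(model, X: pd.DataFrame, instance: pd.Series, coalitions):
            """Influence of each coalition on one prediction, estimated as the drop in
            predicted probability when the whole group is replaced by its dataset mean."""
            base = model.predict_proba(instance.to_frame().T)[0, 1]
            influences = {}
            for group in coalitions:
                perturbed = instance.copy()
                perturbed[group] = X[group].mean()
                influences[tuple(group)] = base - model.predict_proba(perturbed.to_frame().T)[0, 1]
            return influences

    In this simplified version, explaining a prediction costs one model call per coalition rather than one per attribute subset, which is the kind of saving a coalitional approach targets.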

    How to make the most of local explanations: effective clustering based on influences

    No full text
    Machine Learning is now commonly used to model complex phenomena, providing robust predictions and data exploration analysis. However, the lack of explanations for predictions leads to a black-box effect, which the field of Explainability (XAI) attempts to overcome. In particular, XAI local attribution methods quantify the contribution of each attribute to each instance's prediction; these contributions are named influences. This type of explanation is the most precise, as it focuses on each instance of the dataset and allows the detection of individual differences. Moreover, all local explanations can be aggregated for further analysis of the underlying data. In this context, influences can be seen as a new data space in which to understand and reveal complex data patterns. We thus hypothesise that influences obtained through ML modelling are more informative than the original raw data, particularly for identifying homogeneous groups. The most efficient way to identify such groups is a clustering approach. We therefore compare clusters based on raw data against those based on influences (computed through several XAI local attribution methods). Our results indicate that clusters based on influences perform better than those based on raw data, even with low-accuracy models.
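
    A hedged sketch of the comparison described above, assuming SHAP influences and k-means clustering (the attribution methods, clustering algorithm, and evaluation metric used in the paper may differ):

        import shap
        from sklearn.datasets import load_breast_cancer
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.cluster import KMeans
        from sklearn.preprocessing import StandardScaler
        from sklearn.metrics import adjusted_rand_score

        X, y = load_breast_cancer(return_X_y=True)
        model = GradientBoostingClassifier(random_state=0).fit(X, y)

        # The influence space: one attribution vector per instance.
        influences = shap.TreeExplainer(model).shap_values(X)

        def clustering_agreement(data, labels, k=2, seed=0):
            """Cluster the data and measure agreement with known labels (ARI)."""
            scaled = StandardScaler().fit_transform(data)
            pred = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(scaled)
            return adjusted_rand_score(labels, pred)

        print("ARI, clusters on raw data:  ", clustering_agreement(X, y))
        print("ARI, clusters on influences:", clustering_agreement(influences, y))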

    Stratégies coalitionnelles pour une explication efficace des prédictions individuelles

    No full text
    This paper is a summary of the work published in the journal Information Systems Frontiers (Ferrettini et al., 2021). With machine learning (ML) now applied in many domains, the need to understand how black-box models work has become increasingly pressing, particularly among non-experts. Several methods exist that provide explanations of model predictions, but they involve long computation times or restrictive hypotheses about the interactions between attributes. This paper details methods based on the detection of relevant groups of attributes, called coalitions, that influence the prediction. Our results show that the coalitional methods outperform existing ones such as SHAP: execution time is reduced while the accuracy of the explanations is preserved. These methods broaden the range of practical use cases and thereby increase trust between ML models, their users, and anyone affected by a decision involving these models.

    A comparative study of additive local explanation methods based on feature influences

    No full text
    Local additive explanation methods are increasingly used to understand the predictions of complex Machine Learning (ML) models. The most widely used additive methods, SHAP and LIME, suffer from limitations that are rarely measured in the literature. This paper aims to measure these limitations on a wide range (304) of OpenML datasets, and also evaluates emergent coalition-based methods designed to tackle the weaknesses of the other methods. We illustrate and validate the results on a specific medical dataset, SA-Heart. Our findings reveal that LIME's and SHAP's approximations are particularly efficient in high dimensions and generate intelligible global explanations, but they suffer from a lack of precision regarding local explanations. Coalition-based methods are computationally expensive in high dimensions, but offer higher-quality local explanations. Finally, we present a roadmap summarizing this work by pointing out the most appropriate method depending on dataset dimensionality and the user's objectives.
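
    As a rough illustration of what is being compared (an assumed setup, not the paper's evaluation protocol), both methods can be asked to explain the same prediction:

        import shap
        from lime.lime_tabular import LimeTabularExplainer
        from sklearn.datasets import load_breast_cancer
        from sklearn.ensemble import GradientBoostingClassifier

        data = load_breast_cancer()
        X, y = data.data, data.target
        model = GradientBoostingClassifier(random_state=0).fit(X, y)
        i = 0  # instance to explain

        # SHAP: additive attributions derived from the tree ensemble itself.
        shap_values = shap.TreeExplainer(model).shap_values(X[i:i + 1])

        # LIME: a local surrogate model fitted around the instance.
        lime_explainer = LimeTabularExplainer(X, feature_names=list(data.feature_names),
                                              class_names=list(data.target_names),
                                              mode="classification")
        lime_exp = lime_explainer.explain_instance(X[i], model.predict_proba, num_features=10)

        print("SHAP attributions:", shap_values[0])
        print("LIME attributions:", lime_exp.as_list())

    Comparing the two attribution vectors instance by instance, across many datasets and parameter settings, is the kind of measurement the paper carries out at scale.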

    Une approche quantitative pour la comparaison des méthodes d'explications locales additives

    No full text
    Edited by Lukasz Golab and Kostas Stefanidis; DOLAP 2022: 24th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data, co-located with EDBT 2022, Edinburgh, UK, March 29, 2022.
    Local additive explanation methods are increasingly used to understand the predictions of complex Machine Learning (ML) models. The most widely used additive methods, SHAP and LIME, suffer from limitations that are rarely measured in the literature. This paper aims to measure these limitations on a wide range (304) of OpenML datasets using six quantitative metrics, and also evaluates emergent coalition-based methods designed to tackle the weaknesses of the other methods. We illustrate and validate the results on a specific medical dataset, SA-Heart. Our findings reveal that LIME's and SHAP's approximations are particularly efficient in high dimensions and generate intelligible global explanations, but they suffer from a lack of precision regarding local explanations and possibly unwanted behavior when the methods' parameters are changed. Coalition-based methods are computationally expensive in high dimensions, but offer higher-quality local explanations. Finally, we present a roadmap summarizing this work by pointing out the most appropriate method depending on dataset dimensionality and the user's objectives.

    Exploration de données basée sur les Explications locales attributives : un exemple d'utilisation médical

    No full text
    Exploratory data analysis makes it possible to discover knowledge and patterns and to test hypotheses. Predictive modelling tools combined with explainability have made it possible to explore increasingly complex relationships between attributes. This study presents a method that uses local explanations as a new data space from which to retrieve precise and pertinent information. We aim to apply this method to a medical dataset and to underline the benefit of using explanations to gain knowledge. In particular, we show that clusters based on local explanations, combined with decision rules, make it possible to better characterise patient subgroups.
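
    A minimal sketch of this combination, assuming SHAP influences, k-means clusters, and a shallow decision tree as the rule extractor (all illustrative choices; a public dataset stands in for the medical data):

        import shap
        from sklearn.datasets import load_breast_cancer   # stand-in for the medical dataset
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.cluster import KMeans
        from sklearn.tree import DecisionTreeClassifier, export_text

        data = load_breast_cancer()
        X, y = data.data, data.target

        model = GradientBoostingClassifier(random_state=0).fit(X, y)
        influences = shap.TreeExplainer(model).shap_values(X)

        # Subgroups discovered in the explanation space.
        clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(influences)

        # Decision rules describing each subgroup in terms of the original attributes.
        tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, clusters)
        print(export_text(tree, feature_names=list(data.feature_names)))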

    Evolution of hospitalized patient characteristics through the first three COVID-19 waves in Paris area using machine learning analysis

    No full text
    Characteristics of patients at risk of developing severe forms of COVID-19 have been widely described, but very few studies describe their evolution across subsequent waves. Data were collected retrospectively from a prospectively maintained database at a university hospital in the Paris area, over one year corresponding to the first three waves of COVID-19 in France. The evolution of patient characteristics between non-severe and severe cases across the waves was analyzed with classical multivariate logistic regression and a complementary machine-learning-based analysis using explainability methods. Of the 1076 hospitalized patients, severe forms concerned 29% (123/429), 31% (66/214), and 18% (79/433) of patients in each successive wave. Risk factors in the first wave included older age (≥ 70 years), male gender, diabetes, and obesity, while cardiovascular issues appeared to be a protective factor. The influence of age, gender, and comorbidities on the occurrence of severe COVID-19 was less marked in the third wave than in the first two, and the interactions between age and comorbidities were less important. The typology of hospitalized patients with severe forms evolved rapidly across the waves. This evolution may be due to changes in hospital practices and to the early vaccination campaign targeting people at high risk, such as the elderly and patients with comorbidities.
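
    A hedged sketch of such a two-pronged analysis (column names such as "severe", "wave", or "age_ge_70" are hypothetical placeholders, not the study's actual schema):

        import numpy as np
        import pandas as pd
        import shap
        from sklearn.linear_model import LogisticRegression
        from sklearn.ensemble import GradientBoostingClassifier

        def analyse_wave(df, features):
            """Per wave: odds ratios from a multivariate logistic regression and
            mean absolute SHAP influences from a complementary non-linear model."""
            X, y = df[features], df["severe"]
            odds = np.exp(LogisticRegression(max_iter=1000).fit(X, y).coef_[0])
            gbc = GradientBoostingClassifier(random_state=0).fit(X, y)
            mean_shap = np.abs(shap.TreeExplainer(gbc).shap_values(X)).mean(axis=0)
            return pd.DataFrame({"odds_ratio": odds, "mean_abs_shap": mean_shap}, index=features)

        # Hypothetical usage: compare risk-factor profiles across the three waves.
        # for wave, df_wave in patients.groupby("wave"):
        #     print(f"Wave {wave}:\n", analyse_wave(df_wave, ["age_ge_70", "male", "diabetes", "obesity"]))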