Interpretability of machine learning solutions in public healthcare: the CRISP-ML approach
Public healthcare has a history of cautious adoption of artificial intelligence (AI) systems. The rapid growth of data collection and linking capabilities, combined with the increasing diversity of data-driven AI techniques, including machine learning (ML), has brought both ubiquitous opportunities for data analytics projects and increased demands for the regulation and accountability of the outcomes of these projects. As a result, the area of interpretability and explainability of ML is gaining significant research momentum. While ML methods themselves have advanced, the methodological side has shown limited progress. This limits the practicality of using ML in the health domain: the difficulty of explaining the outcomes of ML algorithms to medical practitioners and policy makers in public health has been a recognized obstacle to the broader adoption of data science approaches in this domain. This study builds on earlier work that introduced CRISP-ML, a methodology that determines the interpretability level required by stakeholders for a successful real-world solution and then helps in achieving it. CRISP-ML was built on the strengths of CRISP-DM, addressing its gaps in handling interpretability. Its application in the public healthcare sector follows its successful deployment in a number of recent real-world projects across several industries and fields, including credit risk, insurance, utilities, and sport. This study elaborates on how the CRISP-ML methodology determines, measures, and achieves the necessary level of interpretability of ML solutions in the public healthcare sector. It demonstrates how CRISP-ML addressed the problems of data diversity, the unstructured nature of data, and the relatively low linkage between diverse data sets in the healthcare domain. The characteristics of the case study used here are typical of healthcare data, and CRISP-ML delivered on these issues, ensuring the required level of interpretability of the ML solutions discussed in the project. The approach ensured that interpretability requirements were met, taking into account public healthcare specifics, regulatory requirements, project stakeholders, project objectives, and data characteristics. The study concludes with three main directions for the further development of the presented cross-industry standard process.
A Review on Explainable Artificial Intelligence for Healthcare: Why, How, and When?
Artificial intelligence (AI) models are increasingly finding applications in
the field of medicine. Concerns have been raised about the explainability of
the decisions that are made by these AI models. In this article, we give a
systematic analysis of explainable artificial intelligence (XAI), with a
primary focus on models that are currently being used in the field of
healthcare. The literature search is conducted following the Preferred
Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) standards for
relevant work published from 1 January 2012 to 2 February 2022. The review
analyzes the prevailing trends in XAI and lays out the major directions in
which research is headed. We investigate the why, how, and when of the uses of
these XAI models and their implications. We present a comprehensive examination
of XAI methodologies as well as an explanation of how trustworthy AI can be
derived from describing AI models in healthcare settings. The discussion of this
work will contribute to the formalization of the XAI field.
Comment: 15 pages, 3 figures, accepted for publication in the IEEE
Transactions on Artificial Intelligence
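As an illustration of the kind of post-hoc, model-agnostic XAI technique surveyed in the review above, the following minimal sketch computes permutation feature importance for a classifier trained on a public clinical dataset. The dataset, model, and hyperparameters are illustrative assumptions, not choices made by the reviewed work.

```python
# Hedged sketch: permutation importance as a post-hoc, model-agnostic explanation.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# How much does held-out AUC drop when each feature's values are shuffled?
result = permutation_importance(
    model, X_test, y_test, scoring="roc_auc", n_repeats=20, random_state=0
)
for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{X.columns[idx]}: {result.importances_mean[idx]:.4f}")
```

Because it only needs the fitted model and held-out data, the same recipe applies to any black-box clinical model.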
Explainable AI for clinical risk prediction: a survey of concepts, methods, and modalities
Recent advancements in AI applications to healthcare have shown incredible
promise in surpassing human performance in diagnosis and disease prognosis.
With the increasing complexity of AI models, however, concerns have grown
regarding their opacity, potential biases, and the need for interpretability.
To ensure trust and reliability in AI systems, especially in clinical risk
prediction models, explainability becomes crucial. Explainability usually
refers to an AI system's ability to provide a robust interpretation of its
decision-making logic or the decisions themselves to human stakeholders. In
clinical risk
prediction, other aspects of explainability like fairness, bias, trust, and
transparency also represent important concepts beyond just interpretability. In
this review, we address the relationship between these concepts as they are
often used together or interchangeably. This review also discusses recent
progress in developing explainable models for clinical risk prediction,
highlighting the importance of quantitative and clinical evaluation and
validation across multiple common modalities in clinical practice. It
emphasizes the need for external validation and the combination of diverse
interpretability methods to enhance trust and fairness. Adopting rigorous
testing, such as using synthetic datasets with known generative factors, can
further improve the reliability of explainability methods. Open access and
code-sharing resources are essential for transparency and reproducibility,
enabling the growth and trustworthiness of explainable research. While
challenges exist, an end-to-end approach to explainability in clinical risk
prediction, incorporating stakeholders from clinicians to developers, is
essential for success.
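The rigorous testing on synthetic datasets with known generative factors suggested above can be prototyped directly: generate data in which only a known subset of features drives the label, then check whether an explanation method ranks those features highest. The model, feature counts, and importance method below are assumptions for illustration.

```python
# Hedged sketch: sanity-check an explanation method against known generative factors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

k = 4  # with shuffle=False, columns 0..k-1 are the truly informative features
X, y = make_classification(
    n_samples=2000, n_features=12, n_informative=k, n_redundant=0,
    shuffle=False, random_state=0,
)

model = GradientBoostingClassifier(random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Fraction of the known generative factors recovered among the top-k ranked features.
top_k = set(np.argsort(imp.importances_mean)[::-1][:k])
print("recovered:", len(top_k & set(range(k))) / k)
```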
GAM(e) changer or not? An evaluation of interpretable machine learning models based on additive model constraints
The number of information systems (IS) studies dealing with explainable artificial intelligence (XAI) is currently exploding as the field demands more transparency about the internal decision logic of machine learning (ML) models. However, most techniques subsumed under XAI provide post-hoc-analytical explanations, which have to be considered with caution as they only use approximations of the underlying ML model. Therefore, our paper investigates a series of intrinsically interpretable ML models and discusses their suitability for the IS community. More specifically, our focus is on advanced extensions of generalized additive models (GAM) in which predictors are modeled independently in a non-linear way to generate shape functions that can capture arbitrary patterns but remain fully interpretable. In our study, we evaluate the prediction qualities of five GAMs as compared to six traditional ML models and assess their visual outputs for model interpretability. On this basis, we investigate their merits and limitations and derive design implications for further improvements.
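As a concrete illustration of the additive-model idea described in this abstract, the sketch below builds a simple GAM-style classifier with scikit-learn: each predictor is expanded independently into spline basis functions, and the linear model fitted on top yields one interpretable shape function per feature. The dataset, knot settings, and use of SplineTransformer are illustrative assumptions; the specific GAM extensions evaluated in the paper are not reproduced here.

```python
# Hedged sketch of a spline-based additive classifier with per-feature shape functions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

X, y = load_breast_cancer(return_X_y=True)
n_knots, degree = 6, 3
n_splines = n_knots + degree - 1  # basis functions produced per feature

gam = make_pipeline(
    SplineTransformer(n_knots=n_knots, degree=degree),
    LogisticRegression(max_iter=5000),
)
gam.fit(X, y)

# Shape function of feature j on the log-odds scale: its spline basis values
# weighted by the corresponding coefficients. Because the model is additive,
# each feature's contribution can be inspected (or plotted) in isolation.
coefs = gam.named_steps["logisticregression"].coef_.ravel()
basis = gam.named_steps["splinetransformer"].transform(X)
j = 0
block = slice(j * n_splines, (j + 1) * n_splines)
shape_j = basis[:, block] @ coefs[block]
print("feature 0 contribution range:", shape_j.min(), shape_j.max())
```

More advanced GAM variants typically swap the spline smoother for a more flexible shape-function learner while preserving additivity, which is what keeps them fully interpretable.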
A Performance-Explainability-Fairness Framework For Benchmarking ML Models
Machine learning (ML) models have achieved remarkable success in various applications; however, ensuring their robustness and fairness remains a critical challenge. In this research, we present a comprehensive framework designed to evaluate and benchmark ML models through the lenses of performance, explainability, and fairness. This framework addresses the increasing need for a holistic assessment of ML models, considering not only their predictive power but also their interpretability and equitable deployment.
The proposed framework leverages a multi-faceted evaluation approach, integrating performance metrics with explainability and fairness assessments. Performance evaluation incorporates standard measures such as accuracy, precision, and recall, but extends to the overall balanced error rate and the overall area under the receiver operating characteristic (ROC) curve (AUC) to capture model behavior across different performance aspects. Explainability assessment employs state-of-the-art techniques to quantify the interpretability of model decisions, ensuring that model behavior can be understood and trusted by stakeholders. The fairness evaluation
examines model predictions in terms of demographic parity and equalized odds, thereby addressing concerns of bias and discrimination in the deployment of ML systems.
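A minimal sketch of the two fairness criteria just named, computed directly with NumPy; the predictions and protected-group labels below are synthetic placeholders rather than outputs of the proposed framework.

```python
# Hedged sketch: demographic parity and equalized odds differences on synthetic labels.
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)   # ground-truth outcomes (synthetic)
y_pred = rng.integers(0, 2, 1000)   # binary model predictions (synthetic)
group = rng.integers(0, 2, 1000)    # protected attribute with two groups

def selection_rate(pred, mask):
    """Share of positive predictions within a group."""
    return pred[mask].mean()

def group_rate(pred, true, mask, label):
    """Mean prediction among group members whose true label equals `label`
    (label=1 gives that group's TPR, label=0 gives its FPR)."""
    return pred[mask & (true == label)].mean()

# Demographic parity difference: gap in positive-prediction rates across groups.
dp_diff = abs(selection_rate(y_pred, group == 0) - selection_rate(y_pred, group == 1))

# Equalized odds difference: the larger of the TPR gap and the FPR gap.
tpr_gap = abs(group_rate(y_pred, y_true, group == 0, 1) - group_rate(y_pred, y_true, group == 1, 1))
fpr_gap = abs(group_rate(y_pred, y_true, group == 0, 0) - group_rate(y_pred, y_true, group == 1, 0))
eo_diff = max(tpr_gap, fpr_gap)

print(f"demographic parity difference: {dp_diff:.3f}")
print(f"equalized odds difference:     {eo_diff:.3f}")
```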
To demonstrate the practical utility of the framework, we apply it to a diverse set of ML algorithms across various functional domains, including finance, criminology, education, and healthcare prediction. The results showcase the importance of a balanced evaluation approach, revealing trade-offs between performance, explainability, and fairness that can inform model selection and deployment decisions. Furthermore, we provide insights into the analysis of tradeoffs in selecting the appropriate model for use cases where performance, interpretability and fairness are important.
In summary, the Performance-Explainability-Fairness Framework offers a unified methodology for evaluating and benchmarking ML models, enabling practitioners and researchers to make informed decisions about model suitability and ensuring responsible and equitable AI deployment. We believe that this framework represents a crucial step towards building trustworthy and accountable ML systems in an era where AI plays an increasingly prominent role in decision-making processes.
Exploring the Potential of Convolutional Neural Networks in Healthcare Engineering for Skin Disease Identification
Skin disorders affect millions of individuals worldwide, underscoring the need for swift and accurate detection to achieve optimal treatment outcomes. Convolutional Neural Networks (CNNs) have emerged as valuable tools for automating the identification of skin conditions. This paper provides a thorough examination of recent advances in CNN-based skin condition detection. In dermatological applications, CNNs analyze intricate visual patterns and extract distinctive features from skin imaging datasets. Trained on extensive data repositories, CNNs can classify a range of skin conditions such as melanoma, psoriasis, eczema, and acne. The paper highlights key developments in CNN-based skin disease diagnosis, covering CNN architectures, fine-tuning methodologies, and data augmentation techniques. Moreover, the integration of transfer learning and ensemble approaches has further improved the performance of CNN models. Despite their substantial potential, important challenges remain: comprehensive coverage of skin conditions and the mitigation of bias require access to large and varied data pools, and understanding the decision-making processes of CNN models is an ongoing endeavor. Ethical concerns such as algorithmic bias and data privacy also warrant careful consideration. By examining the advances, obstacles, and potential of CNN-based skin disorder diagnosis, this review provides valuable insights to researchers and medical professionals and underscores the importance of accurate and effective diagnostic tools for improving patient outcomes and curbing healthcare expenditures.
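As a sketch of the transfer-learning approach highlighted in this abstract, the snippet below adapts an ImageNet-pretrained ResNet-18 from torchvision to a hypothetical multi-class skin-condition task. The class count, dummy batch, and frozen-backbone strategy are illustrative assumptions, not a method taken from the reviewed literature.

```python
# Hedged sketch: transfer learning for skin-condition classification (torchvision >= 0.13).
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

NUM_CLASSES = 4  # hypothetical labels, e.g. melanoma, psoriasis, eczema, acne

model = resnet18(weights=ResNet18_Weights.DEFAULT)   # ImageNet-pretrained backbone
for param in model.parameters():
    param.requires_grad = False                      # freeze the feature extractor
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new trainable classifier head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch; in practice this loops over a DataLoader
# of dermatology images resized to 224x224 and normalized with ImageNet statistics.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"dummy-batch loss: {loss.item():.3f}")
```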