1,091 research outputs found
Analyzing Machine Learning Models for Credit Scoring with Explainable AI and Optimizing Investment Decisions
This paper examines two distinct yet related questions concerning
explainable AI (XAI) practices. Machine learning (ML) is increasingly important
in financial services, such as pre-approval, credit underwriting, investments,
and various front-end and back-end activities. Machine learning can
automatically detect non-linearities and interactions in training data,
facilitating faster and more accurate credit decisions. However, machine
learning models are opaque and hard to explain, whereas transparency and
explainability are critical for establishing a reliable technology. The study compares various
machine learning models, including single classifiers (logistic regression,
decision trees, LDA, QDA), heterogeneous ensembles (AdaBoost, Random Forest),
and sequential neural networks. The results indicate that ensemble classifiers
and neural networks outperform the single classifiers. In addition, two advanced
post-hoc model-agnostic explainability techniques, LIME and SHAP, are utilized to assess
ML-based credit scoring models using the open-access datasets offered by the
US-based P2P lending platform Lending Club. In this study, we also use
machine learning algorithms to develop new investment models and explore
portfolio strategies that can maximize profitability while minimizing risk.
Unwrapping black box models: a case study in credit risk
The past two decades have witnessed the rapid development of machine learning
techniques, which have proven to be powerful tools for the construction of predictive
models, such as those used in credit risk management. A considerable volume of
published work has looked at the utility of machine learning for this purpose, the
increased predictive capacities delivered and how new types of data can be
exploited. However, these benefits come at the cost of increased complexity, which
may render the models uninterpretable. To overcome this issue a new field has
emerged under the name of explainable artificial intelligence, with numerous tools
being proposed to gain an insight into the inner workings of these models. This type
of understanding is fundamental in credit risk in order to ensure compliance with the
existing regulatory requirements and to comprehend the factors driving the
predictions and their macro-economic implications. This paper studies the
effectiveness of some of the most widely-used interpretability techniques on a neural
network trained on real data. These techniques are found to be useful for
understanding the model, even though some limitations have been encountered.
Explainable AI for Interpretable Credit Scoring
With the ever-growing achievements in Artificial Intelligence (AI) and the
recent boosted enthusiasm in Financial Technology (FinTech), applications such
as credit scoring have gained substantial academic interest. Credit scoring
helps financial experts make better decisions regarding whether or not to
accept a loan application, such that loans with a high probability of default
are not accepted. Apart from the noisy and highly imbalanced data challenges
faced by such credit scoring models, recent regulations such as the 'right to
explanation' introduced by the General Data Protection Regulation (GDPR) and
the Equal Credit Opportunity Act (ECOA) have added the need for model
interpretability to ensure that algorithmic decisions are understandable and
coherent. An interesting concept that has been recently introduced is
eXplainable AI (XAI), which focuses on making black-box models more
interpretable. In this work, we present a credit scoring model that is both
accurate and interpretable. For classification, state-of-the-art performance on
the Home Equity Line of Credit (HELOC) and Lending Club (LC) Datasets is
achieved using the Extreme Gradient Boosting (XGBoost) model. The model is then
further enhanced with a 360-degree explanation framework, which provides
different explanations (i.e. global, local feature-based and local
instance-based) that are required by different people in different situations.
Evaluation through the use of functionally-grounded, application-grounded and
human-grounded analyses shows that the explanations provided are simple and
consistent, and that they satisfy the six predetermined hypotheses testing for
correctness, effectiveness, easy understanding, detail sufficiency and
trustworthiness.
Comment: 19 pages, David C. Wyld et al. (Eds): ACITY, DPPR, VLSI, WeST, DSA,
CNDC, IoTE, AIAA, NLPTA - 202
Explainable AI in Fintech and Insurtech
The growing application of black-box Artificial Intelligence algorithms in many real-world applications is raising the importance of understanding how models make their decisions. The research field that aims to look into the inner workings of the black box and to make predictions more interpretable is referred to as eXplainable Artificial Intelligence (XAI).
Over recent years, the research domain of XAI has seen important contributions and continuous developments, achieving great results with theoretically sound applied methodologies. These achievements enable both industry and regulators to improve on existing models and their supervision; this is done in terms of explainability, which is the main purpose of these models, but it also brings new possibilities, namely the employment of eXplainable AI models and their outputs as an intermediate step to new applications, greatly expanding their usefulness beyond explainability of model decisions.
This thesis is composed of six chapters: an introduction and a conclusion plus four self-contained sections reporting the corresponding papers. Chapter 1 proposes the use of Shapley values in similarity networks and clustering models in order to bring out new pieces of information, useful for classification and analysis of the customer base, in an insurtech setting. Chapter 2 compares SHAP and LIME, two of the most important XAI models, evaluating their parameter attribution methodologies and the information they are able to capture, in the estimation of the Probability of Default (PD) of Italian Small and Medium Enterprises (SMEs), with balance sheet data as inputs. Chapter 3 introduces the use of Shapley values in feature selection techniques, with the analysis of wrapper and embedded feature selection algorithms and their ability to select relevant features with both raw data and their Shapley values, again in the setting of SME PD estimation. In Chapter 4, a new methodology of model selection based on the Lorenz Zonoid is introduced, highlighting similarities with the game-theoretical concept of Shapley values and their decomposition of variability across independent variables, as well as some advantages in terms of model comparability and standardization. These properties are explored through both a simulated
example and the application to a real-world dataset, provided by EU-certified rating agency Modefinance.
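As a rough illustration of the game-theoretical Shapley values mentioned in the abstract above, the sketch below computes them exactly by enumerating all coalitions of features. It is not the thesis's method: the toy scoring function, the feature names, the baseline, and the instance are all hypothetical, and "absent" features are simply held at an assumed baseline value. Exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function over n_features players."""
    players = list(range(n_features))
    phi = [0.0] * n_features
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Weight given to every coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i when it joins coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model: a score built from three made-up balance-sheet ratios.
def toy_model(x):
    leverage, liquidity, profitability = x
    return 0.5 * leverage - 0.3 * liquidity + 0.2 * leverage * profitability


baseline = (0.0, 0.0, 0.0)  # assumed reference point for "absent" features
instance = (2.0, 1.0, 3.0)  # assumed firm whose score is being explained


def value_fn(coalition):
    # Features in the coalition take the instance's value; the rest stay at baseline.
    x = tuple(instance[j] if j in coalition else baseline[j]
              for j in range(len(instance)))
    return toy_model(x)


phi = shapley_values(value_fn, 3)
```

In this toy setting the attributions sum to the difference between the score of the instance and the score of the baseline, and the interaction term between the first and third features is split equally between them.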
A Survey on Explainable Anomaly Detection
In the past two decades, most research on anomaly detection has focused on
improving the accuracy of the detection, while largely ignoring the
explainability of the corresponding methods and thus leaving the explanation of
outcomes to practitioners. As anomaly detection algorithms are increasingly
used in safety-critical domains, providing explanations for the high-stakes
decisions made in those domains has become an ethical and regulatory
requirement. Therefore, this work provides a comprehensive and structured
survey on state-of-the-art explainable anomaly detection techniques. We propose
a taxonomy based on the main aspects that characterize each explainable anomaly
detection technique, aiming to help practitioners and researchers find the
explainable anomaly detection method that best suits their needs.
Comment: Paper accepted by the ACM Transactions on Knowledge Discovery from
Data (TKDD) for publication (preprint version)
Model Interpretation and Explainability: Towards Creating Transparency in Prediction Models
Explainable AI (XAI) has a counterpart in analytical modeling which we refer to as model explainability. We tackle the issue of model explainability in the context of prediction models. We analyze a dataset of loans from a credit card company in three stages: we execute and compare four different prediction methods; apply the best-known explainability techniques in the current literature to the model training sets to identify feature importance (FI) (static case); and finally cross-check whether the FI set holds up under “what if” prediction scenarios for continuous and categorical variables (dynamic case). We found inconsistency in FI identification between the static and dynamic cases. We summarize the “state of the art” in model explainability and suggest further research to advance the field.
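The static-versus-dynamic cross-check described in the abstract above could be sketched roughly as follows. This is an assumption-laden illustration, not the authors' procedure: the scoring function, the record, the static importance scores, and the what-if values are all hypothetical, and the "dynamic" effect is measured by changing one feature at a time.

```python
def what_if_effects(predict, record, scenarios):
    """Change in prediction when each feature alone is set to its what-if value."""
    base = predict(record)
    effects = {}
    for feature, new_value in scenarios.items():
        altered = dict(record, **{feature: new_value})  # copy with one feature changed
        effects[feature] = predict(altered) - base
    return effects


def rank_by_magnitude(scores):
    """Feature names ordered from largest to smallest absolute score."""
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)


# Hypothetical risk score with two continuous inputs and one categorical input.
def predict(r):
    score = 0.02 * r["utilization"] + 0.5 * r["late_payments"]
    if r["employment"] == "unemployed":
        score += 1.0
    return score


record = {"utilization": 40, "late_payments": 1, "employment": "employed"}

# Assumed output of a training-set explainability technique (static case).
static_fi = {"utilization": 0.6, "late_payments": 0.3, "employment": 0.1}

# Assumed what-if scenarios (dynamic case): one changed value per feature.
scenarios = {"utilization": 50, "late_payments": 2, "employment": "unemployed"}

dynamic_effects = what_if_effects(predict, record, scenarios)
static_rank = rank_by_magnitude(static_fi)
dynamic_rank = rank_by_magnitude(dynamic_effects)
consistent = static_rank == dynamic_rank
```

With these made-up numbers the two rankings disagree, which is the kind of inconsistency the abstract reports; with real models the comparison would need many records and scenarios rather than a single one.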