1,091 research outputs found

    Analyzing Machine Learning Models for Credit Scoring with Explainable AI and Optimizing Investment Decisions

    Full text link
    This paper examines two distinct but related questions concerning explainable AI (XAI) practice. Machine learning (ML) is increasingly important in financial services, including pre-approval, credit underwriting, investments, and various front-end and back-end activities. Machine learning can automatically detect non-linearities and interactions in training data, facilitating faster and more accurate credit decisions. However, machine learning models are opaque and hard to explain, and transparency and explainability are critical for establishing a reliable technology. The study compares various machine learning models, including single classifiers (logistic regression, decision trees, LDA, QDA), ensemble classifiers (AdaBoost, Random Forest), and sequential neural networks. The results indicate that the ensemble classifiers and neural networks outperform the single classifiers. In addition, two advanced post-hoc, model-agnostic explainability techniques, LIME and SHAP, are used to assess the ML-based credit scoring models on the open-access datasets offered by the US-based P2P lending platform Lending Club. The study also uses machine learning algorithms to develop new investment models and explore portfolio strategies that can maximize profitability while minimizing risk.
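
    As a rough illustration of the explanation step this abstract describes, the sketch below fits a tree-ensemble credit classifier and explains it with SHAP and LIME. The feature names and the synthetic data are hypothetical placeholders, not the Lending Club fields analysed in the paper.

```python
# Minimal sketch, assuming synthetic data and invented feature names
# (not the Lending Club fields used in the paper).
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feature_names = ["loan_amount", "annual_income", "dti", "fico_score"]  # hypothetical
X = rng.normal(size=(2000, len(feature_names)))
y = (X[:, 3] - 0.5 * X[:, 2] + rng.normal(scale=0.5, size=2000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Post-hoc attributions with SHAP (TreeExplainer handles tree ensembles);
# the values can be aggregated, e.g. mean |SHAP| per feature, for a global ranking.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Local explanation of a single applicant with LIME (class names are illustrative).
lime_explainer = LimeTabularExplainer(
    X_train, feature_names=feature_names,
    class_names=["default", "paid"], mode="classification",
)
lime_exp = lime_explainer.explain_instance(X_test[0], model.predict_proba, num_features=4)
print(lime_exp.as_list())
```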

    Unwrapping black box models: a case study in credit risk

    Get PDF
    The past two decades have witnessed the rapid development of machine learning techniques, which have proven to be powerful tools for the construction of predictive models, such as those used in credit risk management. A considerable volume of published work has looked at the utility of machine learning for this purpose, the increased predictive capacity it delivers, and how new types of data can be exploited. However, these benefits come at the cost of increased complexity, which may render the models uninterpretable. To overcome this issue, a new field has emerged under the name of explainable artificial intelligence, with numerous tools being proposed to gain insight into the inner workings of these models. This type of understanding is fundamental in credit risk in order to ensure compliance with existing regulatory requirements and to comprehend the factors driving the predictions and their macroeconomic implications. This paper studies the effectiveness of some of the most widely used interpretability techniques on a neural network trained on real data. These techniques are found to be useful for understanding the model, even though some limitations were encountered.
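
    One commonly used interpretability technique of the kind this paper evaluates can be sketched as follows: permutation feature importance applied to a small neural network. The abstract does not confirm which techniques were studied, and the confidential loan data is not reproduced; the snippet below is a sketch on synthetic data.

```python
# Minimal sketch, assuming synthetic credit-like data; permutation importance
# is one possible technique, not necessarily the one used in the paper.
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(3000, 5))
# Hypothetical default indicator driven mostly by the first two features.
y = (1.5 * X[:, 0] - X[:, 1] + rng.normal(scale=0.8, size=3000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=1).fit(X_tr, y_tr)

# Shuffle each feature in turn and measure the drop in test accuracy.
result = permutation_importance(net, X_te, y_te, n_repeats=10, random_state=1)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}")
```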

    Explainable AI for Interpretable Credit Scoring

    Full text link
    With the ever-growing achievements in Artificial Intelligence (AI) and the recent boosted enthusiasm in Financial Technology (FinTech), applications such as credit scoring have gained substantial academic interest. Credit scoring helps financial experts make better decisions regarding whether or not to accept a loan application, such that loans with a high probability of default are not accepted. Apart from the noisy and highly imbalanced data challenges faced by such credit scoring models, recent regulations such as the 'right to explanation' introduced by the General Data Protection Regulation (GDPR) and the Equal Credit Opportunity Act (ECOA) have added the need for model interpretability to ensure that algorithmic decisions are understandable and coherent. An interesting concept that has been recently introduced is eXplainable AI (XAI), which focuses on making black-box models more interpretable. In this work, we present a credit scoring model that is both accurate and interpretable. For classification, state-of-the-art performance on the Home Equity Line of Credit (HELOC) and Lending Club (LC) datasets is achieved using the Extreme Gradient Boosting (XGBoost) model. The model is then further enhanced with a 360-degree explanation framework, which provides the different explanations (i.e. global, local feature-based and local instance-based) required by different people in different situations. Evaluation through functionally-grounded, application-grounded and human-grounded analyses shows that the explanations provided are simple and consistent, and satisfy the six predetermined hypotheses testing for correctness, effectiveness, easy understanding, detail sufficiency and trustworthiness.
    Comment: 19 pages, David C. Wyld et al. (Eds): ACITY, DPPR, VLSI, WeST, DSA, CNDC, IoTE, AIAA, NLPTA - 202
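
    A minimal sketch of the kind of pipeline the abstract describes: an XGBoost classifier with one global explanation (gain-based feature importance) and one local, feature-based explanation (per-applicant contributions). The feature names and data below are invented stand-ins for the HELOC and Lending Club fields; the paper's full 360-degree framework is not reproduced.

```python
# Minimal sketch, assuming synthetic data and hypothetical feature names.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(2)
names = ["utilization", "delinquencies", "trade_lines", "inquiries"]  # hypothetical
X = rng.normal(size=(2000, len(names)))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.7, size=2000) > 0).astype(int)

dtrain = xgb.DMatrix(X, label=y, feature_names=names)
booster = xgb.train({"objective": "binary:logistic", "max_depth": 4}, dtrain, num_boost_round=100)

# Global view: which features the trees rely on overall.
print(booster.get_score(importance_type="gain"))

# Local view: additive contributions for one applicant (last column is the bias term).
one_row = xgb.DMatrix(X[:1], feature_names=names)
print(booster.predict(one_row, pred_contribs=True))
```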

    Explainable AI in Fintech and Insurtech

    Get PDF
    The growing application of black-box Artificial Intelligence algorithms in many real-world applications is raising the importance of understanding how models make their decisions. The research field that aims to look into the inner workings of the black box and to make predictions more interpretable is referred to as eXplainable Artificial Intelligence (XAI). Over recent years, the research domain of XAI has seen important contributions and continuous developments, achieving great results with theoretically sound applied methodologies. These achievements enable both industry and regulators to improve on existing models and their supervision in terms of explainability, which is the main purpose of these models; they also bring new possibilities, namely the employment of explainable AI models and their outputs as an intermediate step towards new applications, greatly expanding their usefulness beyond the explainability of model decisions. This thesis is composed of six chapters: an introduction and a conclusion, plus four self-contained chapters reporting the corresponding papers. Chapter 1 proposes the use of Shapley values in similarity networks and clustering models in order to bring out new pieces of information, useful for classification and analysis of the customer base, in an insurtech setting. Chapter 2 presents a comparison between SHAP and LIME, two of the most important XAI models, evaluating their attribution methodologies and the information they are able to convey, in the estimation of the Probability of Default (PD) of Italian Small and Medium Enterprises (SMEs), with balance sheet data as inputs. Chapter 3 introduces the use of Shapley values in feature selection techniques, analysing wrapper and embedded feature selection algorithms and their ability to select relevant features from both raw data and their Shapley values, again in the setting of SME PD estimation. In chapter 4, a new methodology of model selection based on the Lorenz Zonoid is introduced, highlighting similarities with the game-theoretical concept of Shapley values and their attribution of variability decomposition to independent variables, as well as some advantages in terms of model comparability and standardization. These properties are explored through both a simulated example and an application to a real-world dataset provided by the EU-certified rating agency Modefinance.
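
    The Shapley-value-based feature selection idea of chapter 3 can be sketched roughly as ranking features by mean absolute SHAP value and keeping the top-ranked ones. The snippet below does this on synthetic data standing in for SME balance-sheet inputs; it is a sketch under those assumptions, not the thesis's actual procedure.

```python
# Minimal sketch, assuming synthetic data and a simple "keep top-k by mean |SHAP|" rule.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(1500, 8))
y = (X[:, 0] - 0.7 * X[:, 3] + rng.normal(scale=0.6, size=1500) > 0).astype(int)

model = GradientBoostingClassifier(random_state=3).fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)

# Mean |SHAP| per feature as a relevance score; keep the k most relevant features.
relevance = np.abs(shap_values).mean(axis=0)
k = 3
selected = np.argsort(relevance)[::-1][:k]
print("selected feature indices:", selected)
```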

    A Survey on Explainable Anomaly Detection

    Full text link
    In the past two decades, most research on anomaly detection has focused on improving the accuracy of the detection, while largely ignoring the explainability of the corresponding methods and thus leaving the explanation of outcomes to practitioners. As anomaly detection algorithms are increasingly used in safety-critical domains, providing explanations for the high-stakes decisions made in those domains has become an ethical and regulatory requirement. Therefore, this work provides a comprehensive and structured survey on state-of-the-art explainable anomaly detection techniques. We propose a taxonomy based on the main aspects that characterize each explainable anomaly detection technique, aiming to help practitioners and researchers find the explainable anomaly detection method that best suits their needs.
    Comment: Paper accepted by the ACM Transactions on Knowledge Discovery from Data (TKDD) for publication (preprint version)
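
    As a hedged illustration of pairing anomaly detection with post-hoc explanation, the sketch below fits an Isolation Forest and attributes its anomaly scores to features with SHAP. This is one possible combination on synthetic data, not a method prescribed by the survey.

```python
# Minimal sketch, assuming synthetic data with a few injected outliers.
import numpy as np
import shap
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 4))
X[:10, 2] += 6.0  # inject a few obvious outliers in feature 2

iso = IsolationForest(random_state=4).fit(X)

# Explain the model's decision function so each feature receives a contribution
# to how anomalous a point looks (model-agnostic explainer over a background sample).
explainer = shap.Explainer(iso.decision_function, X[:100])
explanation = explainer(X[:10])
print(explanation.values)  # per-feature attributions for the injected outliers
```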

    Model Interpretation and Explainability: Towards Creating Transparency in Prediction Models

    Get PDF
    Explainable AI (XAI) has a counterpart in analytical modeling which we refer to as model explainability. We tackle the issue of model explainability in the context of prediction models. We analyze a dataset of loans from a credit card company in three stages: we execute and compare four different prediction methods; we apply the best-known explainability techniques in the current literature to the model training sets to identify feature importance (FI) (the static case); and we cross-check whether the FI set holds up under "what if" prediction scenarios for continuous and categorical variables (the dynamic case). We found inconsistency in FI identification between the static and dynamic cases. We summarize the state of the art in model explainability and suggest further research to advance the field.
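
    The static-versus-dynamic check described above can be sketched as comparing a global feature-importance ranking with the prediction shift observed when each feature is perturbed in a "what if" manner. The snippet below uses synthetic data and permutation importance as the static measure; the paper's exact methods and dataset are not reproduced.

```python
# Minimal sketch, assuming synthetic data, permutation importance as the static FI,
# and a one-standard-deviation shift as the "what if" perturbation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(5)
X = rng.normal(size=(2000, 4))
y = (X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0).astype(int)
model = RandomForestClassifier(random_state=5).fit(X, y)

# Static case: one global importance ranking.
static_fi = permutation_importance(model, X, y, n_repeats=5, random_state=5).importances_mean

# Dynamic case: shift each feature by one standard deviation and measure the
# average change in predicted default probability.
dynamic_fi = []
base = model.predict_proba(X)[:, 1]
for j in range(X.shape[1]):
    X_shift = X.copy()
    X_shift[:, j] += X[:, j].std()
    dynamic_fi.append(np.abs(model.predict_proba(X_shift)[:, 1] - base).mean())

print("static ranking :", np.argsort(static_fi)[::-1])
print("dynamic ranking:", np.argsort(dynamic_fi)[::-1])
```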

    Revista de Estabilidad Financiera. No. 43 (Autumn 2022)

    Get PDF