895 research outputs found

    Feature selection in credit risk modeling: an international evidence

    This paper aims to identify a suitable combination of contemporary feature selection techniques and robust prediction classifiers. To examine the impact of the feature selection method on classifier performance, we use two Chinese and three other real-world credit scoring datasets. The feature selection methods used are the least absolute shrinkage and selection operator (LASSO) and multivariate adaptive regression splines (MARS). The classifiers examined are classification and regression trees (CART), logistic regression (LR), artificial neural networks (ANN), and support vector machines (SVM). Empirical findings confirm that LASSO feature selection followed by the robust SVM classifier yields a remarkable improvement and outperforms the other competing classifiers. Moreover, ANN also offers improved accuracy with feature selection methods, whereas LR improves classification efficiency only when feature selection is performed via LASSO. CART, however, shows no improvement in any combination. The proposed credit scoring modeling strategy may be used to develop policy, progressive ideas, and operational guidelines for effective credit risk management at lending and other financial institutions. The findings of this study have practical value because, to date, there is no consensus on the combination of feature selection method and prediction classifier.

    Improved credit scoring model using XGBoost with Bayesian hyper-parameter optimization

    Several credit scoring models have been developed using ensemble classifiers in order to improve the accuracy of assessment. However, among the ensemble models, little attention has been paid to tuning the hyper-parameters of the base learners, although these are crucial to constructing ensemble models. This study proposes an improved credit scoring model based on the extreme gradient boosting (XGB) classifier using Bayesian hyper-parameter optimization (XGB-BO). The model comprises two steps. First, data pre-processing is used to handle missing values and scale the data. Second, Bayesian hyper-parameter optimization is applied to tune the hyper-parameters of the XGB classifier, which is then trained with the tuned values. The model is evaluated on four widely used public datasets, i.e., the German, Australian, Lending Club, and Polish datasets. Several state-of-the-art classification algorithms are implemented for predictive comparison with the proposed method. The proposed model showed promising results, with an improvement in accuracy of 4.10%, 3.03%, and 2.76% on the German, Lending Club, and Australian datasets, respectively. According to the evaluation results, the proposed model outperformed commonly used techniques, e.g., decision tree, support vector machine, neural network, logistic regression, random forest, and bagging. The experimental results confirmed that the XGB-BO model is suitable for assessing the creditworthiness of applicants.

    Forecasting Financial Distress With Machine Learning – A Review

    Purpose – Evaluate academic research offering multiple views on credit risk and artificial intelligence (AI), and the evolution of that research. Theoretical framework – The study is divided as follows: Section 1 introduces the article. Section 2 deals with credit risk and its relationship with computational models and techniques. Section 3 presents the methodology. Section 4 discusses the results and the challenges of the topic. Finally, Section 5 presents the conclusions. Design/methodology/approach – A systematic review of the literature was carried out, without restricting the time period, using the Web of Science and Scopus databases. Findings – The application of computational technology to credit risk analysis has drawn attention in a unique way. It was found that the demand for the identification and introduction of new variables, classifiers and more assertive methods is constant. The effort to improve the interpretation of data and models is intense. Research, Practical & Social implications – The study contributes to the verification of the theory by providing information on the most used methods and techniques, and it offers a broad analysis that deepens knowledge of the factors and variables involved. It categorizes the lines of research and provides a summary of the literature, which serves as a reference, in addition to suggesting future research. Originality/value – Research in the area of Artificial Intelligence and Machine Learning is recent and requires attention and investigation; this study therefore contributes to opening new views in order to deepen the work on this topic.

    Machine learning applied to banking supervision a literature review

    Guerra, P., & Castelli, M. (2021). Machine learning applied to banking supervision a literature review. Risks, 9(7), 1-24. [136]. https://doi.org/10.3390/risks9070136
    Machine learning (ML) has revolutionised data analysis over the past decade. Like numerous other industries heavily reliant on accurate information, banking supervision stands to benefit greatly from this technological advance. The objective of this review is to provide a comprehensive walk-through of how the most common ML techniques have been applied to risk assessment in banking, focusing on a supervisory perspective. We searched Google Scholar, Springer Link, and ScienceDirect databases for articles including the search terms “machine learning” and (“bank” or “banking” or “supervision”). No language, date, or journal filter was applied. Papers were then screened and selected according to their relevance. The final article base consisted of 41 papers and 2 book chapters, 53% of which were published in the top quartile journals in their field. Results are presented in a timeline according to the publication date and categorised by time slots. Credit risk assessment and stress testing are highlighted topics as well as other risk perspectives, with some references to ML application surveys. The most relevant ML techniques encompass k-nearest neighbours (KNN), support vector machines (SVM), tree-based models, ensembles, boosting techniques, and artificial neural networks (ANN). Recent trends include developing early warning systems (EWS) for bankruptcy and refining stress testing. One limitation of this study is the paucity of contributions using supervisory data, which justifies the need for additional investigation in this field. However, there is increasing evidence that ML techniques can enhance data analysis and decision making in the banking industry.

    Credit risk prediction in an imbalanced social lending environment

    © 2018, the Authors. Credit risk prediction is an effective way of evaluating whether a potential borrower will repay a loan, particularly in peer-to-peer lending where class imbalance problems are prevalent. However, few credit risk prediction models for social lending consider imbalanced data and, further, the best resampling technique to use with imbalanced data is still controversial. In an attempt to address these problems, this paper presents an empirical comparison of various combinations of classifiers and resampling techniques within a novel risk assessment methodology that incorporates imbalanced data. The credit predictions from each combination are evaluated with a G-mean measure to avoid bias towards the majority class, which has not been considered in similar studies. The results reveal that combining random forest and random under-sampling may be an effective strategy for calculating the credit risk associated with loan applicants in social lending markets.
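
    The abstract above names two techniques without defining them, so the following is only a minimal sketch under stated assumptions. It assumes the usual binary definition of G-mean (the square root of sensitivity times specificity) and the simplest form of random under-sampling (drop majority-class rows at random until the classes are equal in size). The function names `g_mean` and `random_under_sample`, the label coding (1 = default, 0 = repaid) and the toy data are hypothetical and do not come from the paper.

```python
import math
import random


def g_mean(y_true, y_pred):
    """Square root of sensitivity * specificity (assumed binary definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows until every class has as many rows as the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_keep = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, n_keep):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 2 defaults (1) among 10 loans.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_repaid = [0] * 10                       # majority-class guesser
accuracy = sum(t == p for t, p in zip(y_true, always_repaid)) / len(y_true)
score = g_mean(y_true, always_repaid)          # sensitivity is 0, so G-mean is 0

rows = [[i] for i in range(10)]                # one dummy feature per loan
balanced_rows, balanced_labels = random_under_sample(rows, y_true)
```

    On this toy data the majority-class guesser is right on 8 of 10 loans yet has a G-mean of zero, because it never identifies a default. That is the sense in which G-mean avoids rewarding bias towards the majority class.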

    A multimodal neuroimaging classifier for alcohol dependence

    With progress in magnetic resonance imaging technology and a broader dissemination of state-of-the-art imaging facilities, the acquisition of multiple neuroimaging modalities is becoming increasingly feasible. One particular hope associated with multimodal neuroimaging is the development of reliable data-driven diagnostic classifiers for psychiatric disorders, yet previous studies have often failed to find a benefit of combining multiple modalities. As a psychiatric disorder with established neurobiological effects at several levels of description, alcohol dependence is particularly well-suited for multimodal classification. To this end, we developed a multimodal classification scheme and applied it to a rich neuroimaging battery (structural, functional task-based and functional resting-state data) collected in a matched sample of alcohol-dependent patients (N = 119) and controls (N = 97). We found that our classification scheme yielded 79.3% diagnostic accuracy, which outperformed the strongest individual modality - grey-matter density - by 2.7%. We found that this moderate benefit of multimodal classification depended on a number of critical design choices: a procedure to select optimal modality-specific classifiers, a fine-grained ensemble prediction based on cross-modal weight matrices and continuous classifier decision values. We conclude that the combination of multiple neuroimaging modalities is able to moderately improve the accuracy of machine-learning-based diagnostic classification in alcohol dependence.
