14 research outputs found

    A Comparative Performance Analysis of Hybrid and Classical Machine Learning Method in Predicting Diabetes

    Diabetes mellitus is one of the most important research topics in medical science because of the disease’s severe consequences. It is characterized by high blood glucose levels. Machine learning techniques make early detection of diabetes possible, allowing the disease to be predicted accurately and its complications prevented. This study therefore aims to find a machine learning approach that predicts diabetes more accurately. It compares the performance of various classical machine learning models with a hybrid machine learning approach. The hybrid approach includes homogeneous models (Random Forest, AdaBoost, XGBoost, Extra Trees, and Gradient Boosting) and a heterogeneous model that uses a stacking ensemble. The stacking ensemble, or stacked generalization, approach is a meta-classifier in which multiple learners collaborate for prediction. The performance of the homogeneous hybrid models, Stacked Generalization, and classical machine learning methods such as Naive Bayes, Multilayer Perceptron, k-Nearest Neighbour, and support vector machine is compared. The experimental analysis using the Pima Indians and early-stage diabetes datasets demonstrates that the hybrid models achieve higher accuracy in diagnosing diabetes than the classical models. Among all the hybrid models, the heterogeneous model using the Stacked Generalization approach outperformed the others, achieving accuracies of 83.9% and 98.5%. DOI: 10.28991/ESJ-2023-07-01-08
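
As a rough illustration of stacked generalization as described in this abstract, the sketch below trains base learners on one split and a meta-learner on their predictions for a second split. Everything in it is an assumption for illustration: the one-feature threshold classifiers (`fit_stump`) stand in for the paper's heterogeneous base models, the lookup-table meta-learner (`fit_meta`) is a deliberately simple choice, and the toy data is invented. It is not the authors' implementation.

```python
from collections import Counter, defaultdict

def fit_stump(X, y, feature):
    """Fit a one-feature threshold classifier (hypothetical stand-in base learner)."""
    best = None
    for threshold in sorted({row[feature] for row in X}):
        for above_label in (0, 1):
            preds = [above_label if row[feature] >= threshold else 1 - above_label
                     for row in X]
            correct = sum(p == t for p, t in zip(preds, y))
            if best is None or correct > best[0]:
                best = (correct, threshold, above_label)
    _, threshold, above_label = best
    return lambda row: above_label if row[feature] >= threshold else 1 - above_label

def fit_meta(base_outputs, y):
    """Meta-learner: map each tuple of base predictions to its most common true label."""
    table = defaultdict(Counter)
    for outputs, label in zip(base_outputs, y):
        table[outputs][label] += 1
    default = Counter(y).most_common(1)[0][0]  # fallback for unseen combinations
    lookup = {key: counts.most_common(1)[0][0] for key, counts in table.items()}
    return lambda outputs: lookup.get(outputs, default)

def fit_stacking(X_base, y_base, X_meta, y_meta):
    """Train base learners on one split, then the meta-learner on their outputs."""
    n_features = len(X_base[0])
    base = [fit_stump(X_base, y_base, f) for f in range(n_features)]
    meta_inputs = [tuple(b(row) for b in base) for row in X_meta]
    meta = fit_meta(meta_inputs, y_meta)
    return lambda row: meta(tuple(b(row) for b in base))

# Toy data, invented for illustration only (binary labels 0 and 1).
X_base, y_base = [[1, 5], [2, 6], [8, 1], [9, 2]], [0, 0, 1, 1]
X_meta, y_meta = [[1, 6], [9, 1], [2, 5], [8, 2]], [0, 1, 0, 1]
model = fit_stacking(X_base, y_base, X_meta, y_meta)
```

Training the meta-learner on a split the base learners never saw keeps it from simply memorizing their training-set fit; cross-validated out-of-fold predictions are a common alternative to a single hold-out split.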

    Improved credit scoring model using XGBoost with Bayesian hyper-parameter optimization

    Several credit-scoring models have been developed using ensemble classifiers in order to improve the accuracy of assessment. However, among the ensemble models, little attention has been paid to tuning the hyper-parameters of the base learners, although these are crucial to constructing ensemble models. This study proposes an improved credit scoring model based on the extreme gradient boosting (XGB) classifier using Bayesian hyper-parameter optimization (XGB-BO). The model comprises two steps. Firstly, data pre-processing is used to handle missing values and scale the data. Secondly, Bayesian hyper-parameter optimization is applied to tune the hyper-parameters of the XGB classifier, which is then used to train the model. The model is evaluated on four widely used public datasets, i.e., the German, Australian, lending club, and Polish datasets. Several state-of-the-art classification algorithms are implemented for predictive comparison with the proposed method. The proposed model showed promising results, with an improvement in accuracy of 4.10%, 3.03%, and 2.76% on the German, lending club, and Australian datasets, respectively. The proposed model outperformed commonly used techniques, e.g., decision tree, support vector machine, neural network, logistic regression, random forest, and bagging, according to the evaluation results. The experimental results confirmed that the XGB-BO model is suitable for assessing the creditworthiness of applicants.
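
The pre-processing step mentioned in this abstract (handling missing values, then scaling) could look roughly like the sketch below. The abstract does not say which imputation or scaling method is used, so mean imputation and min-max scaling are assumptions here, and the function names are hypothetical.

```python
def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    known = [v for v in column if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in column]

def min_max_scale(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

# Example on an invented feature column with one missing value.
cleaned = min_max_scale(impute_mean([1.0, None, 3.0]))
```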

    EC3: Combining Clustering and Classification for Ensemble Learning

    Classification and clustering algorithms have been proved to be successful individually in different contexts. Both of them have their own advantages and limitations. For instance, although classification algorithms are more powerful than clustering methods in predicting class labels of objects, they do not perform well when there is a lack of sufficient manually labeled reliable data. On the other hand, although clustering algorithms do not produce label information for objects, they provide supplementary constraints (e.g., if two objects are clustered together, it is more likely that the same label is assigned to both of them) that one can leverage for label prediction of a set of unknown objects. Therefore, systematic utilization of both these types of algorithms together can lead to better prediction performance. In this paper, we propose a novel algorithm, called EC3, that merges classification and clustering together in order to support both binary and multi-class classification. EC3 is based on a principled combination of multiple classification and multiple clustering methods using an optimization function. We theoretically show the convexity and optimality of the problem and solve it by the block coordinate descent method. We additionally propose iEC3, a variant of EC3 that handles imbalanced training data. We perform an extensive experimental analysis by comparing EC3 and iEC3 with 14 baseline methods (7 well-known standalone classifiers, 5 ensemble classifiers, and 2 existing methods that merge classification and clustering) on 13 standard benchmark datasets. We show that our methods outperform other baselines for every single dataset, achieving up to 10% higher AUC. Moreover, our methods are faster (1.21 times faster than the best baseline) and more resilient to noise and class imbalance than the best baseline method. Comment: 14 pages, 7 figures, 11 tables

    Artificial Intelligence and Bank Soundness: Between the Devil and the Deep Blue Sea - Part 2

    Banks have experienced chronic weaknesses as well as frequent crises over the years. As bank failures are costly and affect global economies, banks are constantly under intense scrutiny by regulators. This makes banking the most highly regulated industry in the world today. As banks grow into the 21st century framework, they need to embrace Artificial Intelligence (AI) not only to provide personalized world-class service to their large database of customers but, most importantly, to survive. The chapter provides a taxonomy of bank soundness in the face of AI through the lens of CAMELS, where C (Capital), A (Asset), M (Management), E (Earnings), L (Liquidity), S (Sensitivity). The taxonomy partitions the challenges under each main strand of CAMELS into distinct AI categories, 1 (C), 4 (A), 17 (M), 8 (E), 1 (L), and 2 (S), that banks and regulatory teams need to consider in evaluating AI use in banks. Although AI offers numerous opportunities to enable banks to operate more efficiently and effectively, banks also need to give assurance that AI does ‘no harm’ to stakeholders. With many questions still unresolved, it seems that banks are trapped between the devil and the deep blue sea for now.

    A Clustering Based Classifier Ensemble Approach to Corporate Bankruptcy Prediction

    Corporate bankruptcy prediction is an important research direction in finance. Building a robust prediction scheme for bankruptcy can be beneficial to several stakeholders, including management organizations, government and stockholders. Ensemble learning is a well-known technique to improve the predictive performance of classification algorithms by decreasing the generalization error and enhancing the classification accuracy. It has been a well-established technique in bankruptcy prediction to enhance the predictive performance. Diversity plays an essential role in constructing robust ensemble classification schemes. In this paper, a clustering-based classifier ensemble approach is presented for corporate bankruptcy prediction. In this scheme, the k-means algorithm is utilized to obtain diversified training subsets. Each base learning algorithm is trained on these subsets, and the predictions of the base learning algorithms are combined by a majority voting scheme. In the empirical analysis, four classification algorithms (namely, the C4.5 algorithm, the k-nearest neighbour algorithm, support vector machines and logistic regression) and three ensemble learning methods (Bagging, AdaBoost and Random Subspace) are evaluated.
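
Two steps named in this abstract, partitioning the training data with k-means and combining base-learner predictions by majority vote, can be sketched as follows. The abstract does not specify how the clusters are assembled into training subsets or how k-means is initialized, so this sketch covers only the partitioning and voting steps; using the first k points as initial centroids and the function names are assumptions made for illustration.

```python
from collections import Counter

def kmeans_partition(points, k, iterations=10):
    """Split points into k groups of indices with plain k-means.

    Assumption: the first k points serve as the initial centroids.
    """
    centroids = [list(p) for p in points[:k]]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            groups[dists.index(min(dists))].append(i)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(groups):
            if members:
                centroids[j] = [sum(points[i][d] for i in members) / len(members)
                                for d in range(len(points[0]))]
    return groups

def majority_vote(votes):
    """Return the label predicted by the most base learners."""
    return Counter(votes).most_common(1)[0][0]

# Invented one-dimensional example: two well-separated groups.
subsets = kmeans_partition([[0.0], [10.0], [1.0], [11.0]], k=2)
label = majority_vote(["bankrupt", "healthy", "bankrupt"])
```

Each index group returned by `kmeans_partition` identifies one candidate training subset; a base learner trained per subset would then contribute one vote to `majority_vote` for each new sample.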

    Study of Banking Customers Credit Scoring Indicators Using Artificial Intelligence and Delphi Method

    Given the importance of lending in the banking industry, lending decisions should be based on the indicators that affect credit. The purpose of the present study is to identify and prioritize the effective features in customer accreditation using the viewpoints of bank experts in Kerman and to compare them with existing indicators in models extracted from Meta-Heuristic and Artificial Intelligence methods. The aim is to find out whether there is a match between the human views that arise from knowledge and experience and the views of artificial intelligence that look at the problem as black-box modeling. The required data were collected by questionnaire and by a Quantum Binary particle swarm optimization algorithm and analyzed with the Delphi method. The results show that the selected indices have 80% overlap between the two methods. Given the results of the research and the high accuracy of artificial intelligence techniques, it is suggested that banks and financial and credit institutions assign a higher weight to these indicators when granting credit to customers.

    Responsible access to credit for sole-traders and micro-organizations under unstable market conditions with psychometrics

    In a context of market volatility, the growing complexity of financial products, and a shift towards self-employment, there is an increasing demand for inclusive financial services for sole traders and micro-organizations. To address this need, we conducted a study using real-time data from a Fintech lender in the Czech Republic to assess the effectiveness of a new financial-literacy-based psychometric credit scoring model (PSM) in improving access to finance for the micro, small and medium-sized enterprise (MSME) sector, particularly sole traders and micro-organizations, during volatile market conditions. This study affirms that PSMs play a significant role in responsibly including this underserved sector. Specifically, we observed a 30% higher approval rate and a 23% lower default rate when utilizing the PSM versus the traditional credit scoring model (TCSM). Moreover, during a period of substantial market volatility and instability, such as the state of emergency during the COVID-19 pandemic, the PSM exhibited a 13% higher approval rate at a 20% lower default rate than the TCSM. This evidence supports the proposition that PSMs offer a viable option for promoting financial inclusion and targeted financial education among MSMEs in the face of unstable financial markets.