5 research outputs found

    IMPROVED SUPPORT VECTOR MACHINE PERFORMANCE USING PARTICLE SWARM OPTIMIZATION IN CREDIT RISK CLASSIFICATION

    In classification with Support Vector Machines (SVM), each kernel has parameters that affect classification accuracy. This study examines how much SVM performance improves when those parameters are selected with Particle Swarm Optimization (PSO) for credit risk classification, compared with SVM using randomly selected parameters. Classification performance is evaluated on the German Credit benchmark data set and on a private credit data set issued by a local bank in North Sumatra. Although it requires a longer execution time to reach optimal accuracy, the SVM+PSO combination is effective and more systematic than trial and error in finding SVM parameter values, and so produces better accuracy. In general, the test results show that the RBF kernel produces higher accuracy and F1-scores than the linear and polynomial kernels. SVM classification with PSO optimization produces better accuracy than SVM without optimization, that is, with randomly determined parameters. Credit data classification accuracy increased to 92.31%.
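
    The parameter search described in this abstract can be illustrated with a minimal global-best PSO loop. This is a sketch under stated assumptions, not the authors' implementation: `pso_minimize`, `toy_cv_error`, the search bounds, and the inertia and acceleration coefficients are all hypothetical choices made for illustration. The objective here is a toy surface standing in for cross-validated SVM error over (log10 C, log10 gamma); in practice it would be replaced by a function that trains and validates an SVM at those parameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimise objective(position) over a box with basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random starting positions inside the box, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (gbest_pos[d] - pos[i][d]))
                # Move and clamp the particle to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def toy_cv_error(log_params):
    """Hypothetical stand-in for cross-validated SVM error (assumption, not real data)."""
    log_c, log_gamma = log_params
    return 0.08 + 0.01 * (log_c - 1.0) ** 2 + 0.02 * (log_gamma + 2.0) ** 2


# Search log10(C) in [-3, 3] and log10(gamma) in [-5, 1] (assumed ranges).
best, err = pso_minimize(toy_cv_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
print("best (log10 C, log10 gamma):", best, "error:", err)
```

    Searching in log space is a common design choice for SVM parameters, since useful values of C and gamma typically span several orders of magnitude.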

    A mathematical programming approach to SVM-based classification with label noise

    The authors of this research acknowledge financial support by the Spanish Ministerio de Ciencia y Tecnologia, Agencia Estatal de Investigacion and Fondos Europeos de Desarrollo Regional (FEDER) via project PID2020114594GB-C21. The authors also acknowledge partial support from projects FEDER-US-1256951, Junta de Andalucía P18-FR-1422, CEI-3-FQM331, NetmeetData: Ayudas Fundación BBVA a equipos de investigación científica 2019. The first author was also supported by projects P18-FR-2369 (Junta de Andalucía) and IMAG-Maria de Maeztu grant CEX2020-001105-M /AEI /10.13039/501100011033. (Spanish Ministerio de Ciencia y Tecnologia). In this paper we propose novel methodologies to optimally construct Support Vector Machine-based classifiers that take into account that label noise occurs in the training sample. We propose different alternatives based on solving Mixed Integer Linear and Non Linear models that incorporate decisions on relabeling some of the observations in the training dataset. The first method incorporates relabeling directly in the SVM model, while a second family of methods combines clustering and classification at the same time, giving rise to a model that simultaneously applies similarity measures and SVM. Extensive computational experiments are reported on a battery of standard datasets taken from the UCI Machine Learning repository, showing the effectiveness of the proposed approaches.
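
    One way to picture "relabeling directly in the SVM model" is to add a binary variable per training point to the soft-margin SVM. The formulation below is only an illustrative sketch: the symbols $z_i$, $c_1$, $c_2$ and the choice of the squared $\ell_2$ norm are assumptions made here, not the paper's notation or its exact models.

```latex
\begin{align*}
\min_{w,\,b,\,\xi,\,z}\quad
  & \tfrac{1}{2}\lVert w\rVert_2^2 + c_1 \sum_{i=1}^{n} \xi_i + c_2 \sum_{i=1}^{n} z_i \\
\text{s.t.}\quad
  & (1 - 2 z_i)\, y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, && i = 1,\dots,n, \\
  & \xi_i \ge 0,\quad z_i \in \{0, 1\}, && i = 1,\dots,n.
\end{align*}
```

    Here $z_i = 1$ flips the label of observation $i$ (since $1 - 2z_i = -1$), and $c_2$ penalizes each flip. The product of $z_i$ with $w$ and $b$ makes the model mixed-integer non-linear as written; it would need a linearization, for example with big-M constraints, to become a mixed-integer linear or quadratic model.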

    A dynamic credit scoring model based on survival gradient boosting decision tree approach

    Credit scoring, which is typically framed as a classification problem, is a powerful tool for managing credit risk since it forecasts the probability of default (PD) of a loan application. However, there is a growing trend of integrating survival analysis into credit scoring to provide a dynamic prediction of PD over time and a clear treatment of censoring. A novel dynamic credit scoring model (i.e., SurvXGBoost) is proposed based on a survival gradient boosting decision tree (GBDT) approach. Our proposal, which combines survival analysis with the GBDT approach, is expected to enhance predictability relative to statistical survival models. The proposed method is compared with several common benchmark models on a real-world consumer loan dataset. The results of out-of-sample and out-of-time validation indicate that SurvXGBoost outperforms the benchmarks in terms of predictability and misclassification cost. The incorporation of macroeconomic variables can further enhance the performance of survival models. The proposed SurvXGBoost also maintains some interpretability since it provides information on feature importance. First published online 14 December 202

    Integrated framework for profit-based feature selection and SVM classification in credit scoring

    In this paper, we propose a profit-driven approach for classifier construction and simultaneous variable selection based on linear Support Vector Machines. The main goal is to incorporate business-related information, such as the variable acquisition costs, the Type I and II error costs, and the profit generated by correctly classified instances, into the modeling process. Our proposal incorporates a group penalty function in the SVM formulation in order to simultaneously penalize the variables that belong to the same group, assuming that companies often acquire groups of related variables for a given cost rather than acquiring them individually. The proposed framework was studied in a credit scoring problem for a Chilean bank, and led to superior performance with respect to business-related goals.
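
    A group penalty of the kind this abstract mentions can be sketched as follows. The abstract does not state the exact penalty, so the $\ell_\infty$ norm per group, the group weights $c_g$ (read here as acquisition costs), and the trade-off parameter $C$ are assumptions chosen for illustration, not the authors' formulation.

```latex
\begin{align*}
\min_{w,\,b,\,\xi}\quad
  & \sum_{g=1}^{G} c_g \,\lVert w_{(g)} \rVert_{\infty} + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad
  & y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0, && i = 1,\dots,n.
\end{align*}
```

    Here $w_{(g)}$ collects the weights of the variables in group $g$. Because the penalty depends only on the largest absolute weight in each group, driving it to zero removes the whole group at once, which matches the idea of paying once for a bundle of related variables.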