41 research outputs found

    Penerapan Ensemble Stacking untuk Klasifikasi Multi Kelas (Applying Ensemble Stacking to Multi-Class Classification)

    Classification is one of the main topics in machine learning research. Earlier studies produced base classifiers that are still in use today. Many base classifiers perform well on binary classification, but their performance degrades on multi-class classification. A previous study used a hybrid classifier for multi-class classification, and its results showed that the proposed hybrid classifier was more accurate than the base classifiers. This study applies the stacking ensemble method, with decision tree and naïve Bayes as the base classifiers. The experimental results show that, compared with the hybrid classifier, the stacking ensemble achieves better results on only some of the datasets

    Modelling bankruptcy prediction models in Slovak companies

    Academics and practitioners have carried out intensive research on models for bankruptcy prediction and credit risk management. Although numerous studies forecast bankruptcy with traditional statistical techniques (e.g. discriminant analysis and logistic regression) and early artificial intelligence models (e.g. artificial neural networks), there is a trend towards machine learning models (support vector machines, bagging, boosting, and random forest) for predicting bankruptcy one year prior to the event. When the performance of this unconventional approach is compared with results obtained by discriminant analysis, logistic regression, and neural networks, bagging, boosting, and random forest models have been found to outperform the other techniques, and prediction accuracy in the testing sample improves for all models when the additional variables are included. On the other hand, the prediction accuracy of old and well-known bankruptcy prediction models is quite high. Therefore, we aim to analyse these somewhat older models on a dataset of Slovak companies to validate their predictive ability under specific conditions. Furthermore, the models will be re-estimated in line with new trends by calculating how eliminating selected variables influences their overall predictive ability

    COMPARISON OF BANKRUPTCY PREDICTION MODELS WITH PUBLIC RECORDS AND FIRMOGRAPHICS

    Many business operations and strategies rely on bankruptcy prediction. In this paper, we study the impact of public records and firmographics and predict bankruptcy over a 12-month-ahead period using different classification models, adding value to traditionally used financial ratios. Univariate analysis shows the statistical association and significance of public records and firmographics indicators with bankruptcy. Further, seven statistical models and machine learning methods were developed: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine, Bayesian Network, and Neural Network. The performance of the models was evaluated and compared based on classification accuracy, Type I error, Type II error, and ROC curves on the hold-out dataset. Moreover, an experiment was set up to show the importance of oversampling for rare event prediction. The results also show that Bayesian Network is comparatively more robust than the other models without oversampling
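
    The evaluation measures named in this abstract can be illustrated with a minimal sketch. The abstract does not define which misclassification counts as Type I, so the code assumes the statistical convention (Type I = a healthy firm flagged as bankrupt, Type II = a bankrupt firm missed); some bankruptcy studies label them the other way round. The function name `error_rates` and the toy labels are hypothetical.

```python
def error_rates(y_true, y_pred):
    """Accuracy and Type I / Type II error rates for binary labels (1 = bankrupt)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Assumed convention: Type I = false positive rate among healthy firms.
        "type_i": fp / (fp + tn) if fp + tn else 0.0,
        # Assumed convention: Type II = false negative rate among bankrupt firms.
        "type_ii": fn / (fn + tp) if fn + tp else 0.0,
    }


# Toy hold-out set: 2 bankrupt firms, 8 healthy firms (made-up labels).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
rates = error_rates(y_true, y_pred)
print(rates)
```

    With rare events, accuracy alone can look high while the Type II error stays large, which is one reason such studies report both error types separately.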

    Company bankruptcy prediction framework based on the most influential features using XGBoost and stacking ensemble learning

    Bankruptcy is a serious problem for companies. It can cause losses for stakeholders such as owners, investors, employees, and consumers. One way to prevent bankruptcy is to predict its likelihood from the company's financial data. Therefore, this study aims to find the best predictive model or method for company bankruptcy using the Polish companies bankruptcy dataset. The prediction process combines feature selection with ensemble learning. Features are selected using XGBoost feature importance with a weight value filter of 10. The ensemble learning method used is stacking, which is composed of base models and a meta learner. The base models are K-nearest neighbor, decision tree, SVM, and random forest, while the meta learner is LightGBM. The stacking model outperforms the base models, reaching an accuracy of 97%
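
    The base-model and meta-learner structure described in this abstract can be sketched in a few lines of standard-library Python. This is an illustration of stacking in general, not the paper's pipeline: the base learners here are a nearest-centroid rule and a 1-nearest-neighbour rule, and the meta learner is a simple majority-vote lookup table, all swapped in for brevity in place of the models the abstract names. Every function name and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_centroid(X, y):
    """Base learner 1: predict the class whose mean point is closest."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    centroids = {c: [sum(col) / len(rows) for col in zip(*rows)]
                 for c, rows in groups.items()}
    return lambda row: min(centroids, key=lambda c: sq_dist(row, centroids[c]))


def fit_1nn(X, y):
    """Base learner 2: predict the label of the single nearest training point."""
    return lambda row: y[min(range(len(X)), key=lambda i: sq_dist(row, X[i]))]


def fit_meta(Z, y):
    """Meta learner: majority class seen for each tuple of base predictions."""
    table = defaultdict(Counter)
    for z, label in zip(Z, y):
        table[tuple(z)][label] += 1
    default = Counter(y).most_common(1)[0][0]

    def predict(z):
        votes = table.get(tuple(z))
        return votes.most_common(1)[0][0] if votes else default
    return predict


def fit_stacking(X, y, base_fitters, k=5, seed=0):
    """Train the meta learner on out-of-fold base predictions, then refit bases on all data."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    Z = [None] * len(X)
    for fold in folds:
        held = set(fold)
        X_tr = [X[i] for i in idx if i not in held]
        y_tr = [y[i] for i in idx if i not in held]
        models = [fit(X_tr, y_tr) for fit in base_fitters]
        for i in fold:
            # Each point is predicted by models that never saw it.
            Z[i] = [m(X[i]) for m in models]
    meta = fit_meta(Z, y)
    bases = [fit(X, y) for fit in base_fitters]
    return lambda row: meta([m(row) for m in bases])


# Toy three-class data: well-separated clusters (made-up, for illustration only).
rng = random.Random(1)
centers = {0: (0.0, 0.0), 1: (5.0, 5.0), 2: (0.0, 5.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(30):
        X.append([cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)])
        y.append(label)

model = fit_stacking(X, y, [fit_centroid, fit_1nn])
print(model([0.1, -0.2]), model([5.2, 4.9]), model([0.0, 5.1]))
```

    The key design point is that the meta learner is fitted on out-of-fold predictions, so it learns how far to trust each base model on data that model has not seen.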

    Influence of the Event Rate on Discrimination Abilities of Bankruptcy Prediction Models

    In bankruptcy prediction, the proportion of events is very low, and events are often oversampled to eliminate this bias. In this paper, we study the influence of the event rate on the discrimination abilities of bankruptcy prediction models. First, the statistical association and significance of public records and firmographics indicators with bankruptcy were explored. Then the event rate was oversampled from 0.12% to 10%, 20%, 30%, 40%, and 50%, respectively. Seven models were developed: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine, Bayesian Network, and Neural Network. Under different event rates, the models were comprehensively evaluated and compared based on the Kolmogorov-Smirnov statistic, accuracy, F1 score, Type I error, Type II error, and ROC curve on the hold-out dataset with their best probability cut-offs. Results show that Bayesian Network is the most insensitive to the event rate, while Support Vector Machine is the most sensitive
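
    The arithmetic behind raising an event rate from 0.12% to a target such as 10% can be sketched as follows. The abstract does not say which oversampling method was used, so this assumes plain random duplication of event rows with replacement; the function name `oversample_to_rate` and the 10,000-row toy sample are hypothetical. For m non-events and a target rate r, the number of event rows needed is r * m / (1 - r).

```python
import random


def oversample_to_rate(labels, target_rate, seed=0):
    """Return row indices in which events (label 1) are duplicated up to target_rate."""
    if not 0 < target_rate < 1:
        raise ValueError("target_rate must be strictly between 0 and 1")
    events = [i for i, v in enumerate(labels) if v == 1]
    non_events = [i for i, v in enumerate(labels) if v == 0]
    if not events:
        raise ValueError("no events to oversample")
    # Solve e / (e + m) = r for e, the number of event rows required.
    needed = round(target_rate * len(non_events) / (1 - target_rate))
    rng = random.Random(seed)
    extra = [rng.choice(events) for _ in range(max(0, needed - len(events)))]
    return non_events + events + extra


# Toy training sample with a 0.12% event rate: 12 events in 10,000 rows.
labels = [1] * 12 + [0] * 9988
resampled = {}
for rate in (0.10, 0.20, 0.30, 0.40, 0.50):
    rows = oversample_to_rate(labels, rate)
    achieved = sum(labels[i] for i in rows) / len(rows)
    resampled[rate] = (len(rows), achieved)
    print(rate, len(rows), round(achieved, 4))
```

    In a setup like the one described, only the training sample would be resampled, while the hold-out dataset keeps its original event rate so that the evaluation reflects real conditions.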