2,706 research outputs found

    Double Ensemble Approaches to Predicting Firms’ Credit Rating

    Several rating agencies, such as Standard & Poor's (S&P), Moody's, and Fitch Ratings, evaluate firms' credit ratings. Because the agencies charge substantial fees and their ratings do not always reflect a firm's default risk in a timely manner, stakeholders would benefit if credit ratings could be predicted before the agencies publish them. Accurate prediction is difficult, however, because credit ratings cover a wide range. This study therefore proposes two double ensemble approaches, 1) bagging-boosting and 2) boosting-bagging, to improve prediction accuracy. We first conducted feature selection, using Chi-Square and Gain-Ratio attribute evaluators with three classification algorithms (decision tree (DT), artificial neural network (ANN), and Naïve Bayes (NB)), to select relevant features and a base classifier for the ensemble models. We then integrated bagging and boosting by applying boosting to bagging (bagging-boosting) and bagging to boosting (boosting-bagging). Finally, we compared the prediction accuracy of the proposed models with that of benchmark models. The experimental results showed that the proposed models outperformed the benchmark models.
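One possible reading of the "bagging-boosting" combination is an outer bagging ensemble whose base learners are themselves boosted models. The sketch below illustrates that reading only; it is not the authors' implementation. It assumes binary labels in {-1, +1} instead of multi-class ratings, uses decision stumps as the weak learner, and all function names are hypothetical.

```python
import math
import random

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best decision stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                # The stump predicts `sign` when row[j] <= t, otherwise `-sign`.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] <= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] <= t else -sign

def fit_adaboost(X, y, rounds=5):
    """Inner ensemble: AdaBoost over decision stumps (labels must be -1 or +1)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of misclassified rows, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

def fit_bagged_boosting(X, y, n_bags=7, rounds=5, seed=0):
    """Outer ensemble: one boosted model per bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(X)
    bags = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bags.append(fit_adaboost([X[i] for i in idx], [y[i] for i in idx], rounds))
    return bags

def bagged_predict(bags, row):
    """Majority vote over the boosted models."""
    votes = sum(adaboost_predict(model, row) for model in bags)
    return 1 if votes >= 0 else -1

# Toy, made-up data: one feature, low values labelled -1 and high values +1.
X = [[float(v)] for v in range(10)]
y = [-1] * 5 + [1] * 5
bags = fit_bagged_boosting(X, y)
print(bagged_predict(bags, [0.0]), bagged_predict(bags, [9.0]))
```

Under the same reading, the reverse "boosting-bagging" combination would swap the two layers, with a bagged ensemble serving as the weak learner inside each boosting round.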

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet technologies, data volumes are doubling every two years, faster than predicted by Moore's Law. Big Data Analytics is becoming particularly important for enterprise business. Modern computational technologies provide effective tools for understanding the huge volumes of accumulated data and for leveraging this information to gain insights into the finance industry. Because the finance industry manufactures no physical products, data has become the most valuable asset of financial organisations and the basis for actionable business insights. Data mining techniques help here by giving access to the right information at the right time. The finance industry uses these techniques in areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing, and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared with loan prediction, money laundering, and time series prediction. Owing to the dynamics, uncertainty, and variety of the data, nonlinear mapping techniques have been studied more deeply than linear techniques. The review also finds that hybrid methods are the most accurate in prediction, closely followed by neural network techniques. This survey offers an overview of applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area. In particular, it offers a good introduction to data mining techniques for beginners who want to work in the field of computational finance.

    Towards ML-based Platforms in Finance Industry – An ML Approach to Generate Corporate Bankruptcy Probabilities based on Annual Financial Statements

    The increasing interest in Machine Learning (ML) based services and the need for more intelligent and automated processes in the finance industry bring new challenges and require practitioners and academics to design, develop, and maintain new ML approaches for financial services companies. The main objective of this paper is to provide a standardized procedure for dealing with imbalanced datasets. To this end, we propose design recommendations on how to test and combine multiple oversampling techniques, such as SMOTE, SMOTE-ENN, and SMOTE-Tomek, on such datasets with multiple ML models and an attribute-based structure to reach higher accuracies. Moreover, this paper seeks an appropriate structure for maintaining such systems when datasets change periodically, so that incoming datasets can be analyzed regularly via this procedure.
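The core idea of SMOTE, the oversampling technique named above, is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The following is a minimal standard-library sketch of that idea, not the paper's procedure; the function name `smote`, its parameters, and the toy points are assumptions made for illustration. The SMOTE-ENN and SMOTE-Tomek variants additionally apply a cleaning step after oversampling, which is not shown.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority-class feature vectors.

    Requires at least two minority points. Each synthetic point lies on the
    segment between a randomly chosen minority point and one of its k nearest
    minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Made-up minority class: four points on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority points, it stays inside the region the minority class already occupies.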

    Corporate Bankruptcy Prediction

    Bankruptcy prediction is one of the most important research areas in corporate finance. Bankruptcies are an indispensable element of the functioning of the market economy, and at the same time they generate significant losses for stakeholders. Hence, this book collects the results of research on the latest trends in predicting the bankruptcy of enterprises. It presents models developed for different countries using both traditional and more advanced methods. It addresses problems connected with predicting bankruptcy during periods of prosperity and recession, the selection of appropriate explanatory variables, and the dynamization of models. The reliability of financial data and the validity of the audit are also discussed. Thus, I hope that this book will inspire you to undertake new research in the field of forecasting the risk of bankruptcy.

    Will it fail and why? A large case study of company default prediction with highly interpretable machine learning models

    Predicting the default of a firm is a well-known problem in the financial and data science communities. The default prediction problem has been studied for over fifty years, but it remains a very hard task even today. Because of its considerable practical relevance, we aim to obtain the best possible prediction results, also in comparison with the reference literature. In our work we combine three large and important datasets to investigate both bankruptcy and bank default, a state of difficulty for companies that often precedes actual bankruptcy. We combine one dataset from the Italian Central Credit Register of the Bank of Italy, one with balance sheet information on Italian firms, and information from the AnaCredit dataset, a novel source of credit data from the European Central Bank. We try to go beyond the academic study and show how our model, based on some promising machine learning algorithms, outperforms the current default predictions made by credit institutions. At the same time, we try to provide insights into the reasons that lead to a particular outcome. Many modern approaches seek well-performing models to forecast the default of a company, but those models often act like a black box and do not give financial institutions the fundamental explanations they need for their choices. This project aims to find a robust predictive model using a tree-based machine learning algorithm which, flanked by a game-theoretic approach, can provide sound explanations of the model's output. Finally, we dedicated special effort to the analysis of predictions in highly unbalanced contexts. Imbalanced classes are a common problem in machine learning classification that is typically addressed by removing the imbalance in the training set. We conjecture that this is not always the best choice and propose the use of a slightly unbalanced training set, showing that this approach helps maximize performance.
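One simple way to build a "slightly unbalanced" training set is to undersample the majority class to a chosen majority-to-minority ratio instead of to full balance. The abstract does not say how the authors construct their training set, so the sketch below is an assumption: the function name, the `majority_per_minority` parameter, and the ratio of 2 are all hypothetical.

```python
import random

def undersample_to_ratio(rows, labels, minority_label=1,
                         majority_per_minority=2.0, seed=0):
    """Keep every minority row and a random subset of majority rows.

    majority_per_minority=1.0 gives a fully balanced set; values above 1.0
    leave the set slightly unbalanced (the ratio here is a free parameter,
    not a value taken from the paper).
    """
    rng = random.Random(seed)
    minority = [i for i, lab in enumerate(labels) if lab == minority_label]
    majority = [i for i, lab in enumerate(labels) if lab != minority_label]
    keep = min(len(majority), int(round(majority_per_minority * len(minority))))
    chosen = minority + rng.sample(majority, keep)
    rng.shuffle(chosen)
    return [rows[i] for i in chosen], [labels[i] for i in chosen]

# Made-up data: 10 defaults (label 1) among 1000 firms.
labels = [1] * 10 + [0] * 990
rows = list(range(1000))
train_rows, train_labels = undersample_to_ratio(rows, labels)
print(train_labels.count(1), train_labels.count(0))
```

The ratio would normally be tuned on a validation set that keeps the original class distribution, so that the measured performance reflects the real default rate.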

    Interpretable Binary and Multiclass Prediction Models for Insolvencies and Credit Ratings

    Insolvency forecasts and ratings are important tasks in the financial industry and serve to assess the creditworthiness of companies. One way to approach these tasks is machine learning, in which prediction models are built from example data. Methods from this field are advantageous because they can be automated. This makes human expertise unnecessary in most cases and thereby offers a higher degree of objectivity. However, these approaches are not perfect either and therefore cannot fully replace human expertise. They are nevertheless suitable as decision aids and can be used as such by experts, which is why interpretable models are desirable. Unfortunately, only a few learning algorithms produce interpretable models. Moreover, some tasks, such as rating, are often multiclass problems. Multiclass classification is frequently achieved through meta-algorithms that train several binary algorithms. Most of the commonly used meta-algorithms, however, eliminate any interpretability that may have been present. In this dissertation we investigate the prediction accuracy of interpretable models compared with non-interpretable models for insolvency forecasts and ratings. As interpretable models we use disjunctive normal forms and decision trees with thresholds on financial ratios. As non-interpretable models we use random forests, artificial neural networks, and support vector machines. In addition, we developed our own learning algorithm, Thresholder, which generates disjunctive normal forms and interpretable multiclass models. For the task of insolvency prediction we show that interpretable models are not inferior to non-interpretable models. To this end, a first case study uses a database from practice containing the annual financial statements of 5152 companies to measure the prediction accuracy of all the models named above. In a second case study on the prediction of ratings we demonstrate that interpretable models are even superior to non-interpretable models. The prediction accuracy of all models is determined on three datasets used in practice, each with three rating classes. In the case studies we compare different interpretable approaches with respect to their model sizes and the form of interpretability. We present exemplary models based on the respective datasets and offer approaches to interpreting them. Our results show that interpretable, threshold-based models are appropriate for the classification problems in the financial industry. In this domain they are not inferior to more complex models such as support vector machines. Our algorithm Thresholder produces the smallest models while its prediction accuracy remains comparable to that of the other interpretable models. In our rating case study the interpretable models deliver markedly better results than in the insolvency prediction case study (see above). A possible explanation is that ratings, unlike insolvencies, are man-made. That is, ratings rest on decisions by people who think in interpretable rules, e.g. logical combinations of thresholds. We therefore assume that interpretable models fit these problems and can recognize and reproduce these interpretable rules.
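A disjunctive normal form over thresholds on financial ratios can be read directly as a rule: a firm is flagged if it satisfies every threshold condition in at least one clause. The sketch below only shows how such a model is evaluated; it does not reproduce the Thresholder learning algorithm, and the ratio names and threshold values are invented for illustration.

```python
import operator

# Comparison operators allowed in a threshold literal.
OPS = {"<": operator.lt, ">=": operator.ge}

def dnf_predict(clauses, firm):
    """Return True if any clause (an AND of threshold literals) holds for the firm.

    clauses: list of clauses; each clause is a list of (ratio_name, op, threshold).
    firm: dict mapping ratio names to values.
    """
    return any(
        all(OPS[op](firm[name], threshold) for name, op, threshold in clause)
        for clause in clauses
    )

# Hypothetical rule: flag a firm as insolvency-prone if
# (equity_ratio < 0.10 AND interest_coverage < 1.0) OR (cash_flow_to_debt < 0.0).
RULE = [
    [("equity_ratio", "<", 0.10), ("interest_coverage", "<", 1.0)],
    [("cash_flow_to_debt", "<", 0.0)],
]

firm_a = {"equity_ratio": 0.05, "interest_coverage": 0.8, "cash_flow_to_debt": 0.20}
firm_b = {"equity_ratio": 0.30, "interest_coverage": 3.0, "cash_flow_to_debt": 0.15}
print(dnf_predict(RULE, firm_a), dnf_predict(RULE, firm_b))
```

Model size in this representation is simply the number of clauses and literals, which gives one concrete way to compare how compact different interpretable models are.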

    Forecasting Financial Distress With Machine Learning – A Review

    Purpose – To evaluate academic research offering multiple views on credit risk and artificial intelligence (AI), and the evolution of that research. Theoretical framework – The study is divided as follows: Section 1 introduces the article. Section 2 deals with credit risk and its relationship with computational models and techniques. Section 3 presents the methodology. Section 4 discusses the results and challenges of the topic. Finally, Section 5 presents the conclusions. Design/methodology/approach – A systematic review of the literature was carried out, without restricting the time period, using the Web of Science and Scopus databases. Findings – The application of computational technology to credit risk analysis has drawn particular attention. The review found a constant demand for the identification and introduction of new variables, classifiers, and more assertive methods. The effort to improve the interpretation of data and models is intense. Research, Practical & Social implications – The study contributes to the verification of the theory by providing information on the most used methods and techniques, and it offers a broad analysis that deepens knowledge of the factors and variables involved. It categorizes the lines of research and provides a summary of the literature that serves as a reference, in addition to suggesting future research. Originality/value – Research in the area of artificial intelligence and machine learning is recent and requires attention and investigation; this study therefore contributes to opening new views in order to deepen the work on this topic.

    Artificial Intelligence in Banking Industry: A Review on Fraud Detection, Credit Management, and Document Processing

    AI is likely to alter the banking industry during the next several years. Banks increasingly use it to analyze and process credit applications and to examine vast volumes of data. This helps to avoid fraud and enables resource-heavy, repetitive procedures and client operations to be automated without any sacrifice in quality. This study reviews how the three most promising AI applications can make the banking sector robust and efficient. Specifically, we review AI fraud detection and prevention, AI credit management, and intelligent document processing. Since the majority of transactions have become digital, there is a great need for enhanced fraud detection algorithms and fraud prevention systems in banking. We argue that the conventional strategy for identifying bank fraud may be inadequate to combat complex fraudulent activity, and that artificial intelligence algorithms might be very useful instead. Credit management is time-consuming and expensive in terms of resources. Furthermore, because of the number of phases involved, these processes require a significant amount of work involving many laborious tasks. By using strong AI/ML models to assess these large and varied data sets in real time, banks can assess new clients for credit services, calculate loan amounts and pricing, and decrease the risk of fraud. Documents perform critical functions in the financial system and have a substantial influence on day-to-day operations. Currently, a large percentage of this data is preserved in email messages, online forms, PDFs, scanned images, and other digital formats. Using such a massive dataset is a difficult undertaking for any bank. We discuss how artificial intelligence techniques can automatically pull critical data from all documents received by the bank, regardless of format, and feed it to the bank's existing portals and systems while maintaining consistency.

    Macroeconomics determinants of loss given default

    Master's in Finance. This dissertation models Moody's Ultimate Recovery Database to show that general macroeconomic conditions influence loss given default (LGD) and that loans' recovery rates are less susceptible to macroeconomic conditions than bonds'. The available data were studied with Ordinary Least Squares (OLS) regressions. Alternative methodologies are also discussed.
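As a reminder of what an OLS regression computes, the sketch below fits a single-regressor model y = a + b·x with the closed-form least-squares estimates. It is a simplified illustration, not the dissertation's specification: the variable names are hypothetical, the numbers are made up and do not come from Moody's Ultimate Recovery Database, and a real study would use several macroeconomic regressors.

```python
def ols_fit(x, y):
    """Closed-form OLS for y = intercept + slope * x with one regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx          # requires x to vary (s_xx > 0)
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up illustration: a macro indicator and a recovery rate that is
# exactly linear in it, so the fit recovers the generating coefficients.
gdp_growth = [0.0, 1.0, 2.0, 3.0]
recovery_rate = [1.0, 3.0, 5.0, 7.0]
intercept, slope = ols_fit(gdp_growth, recovery_rate)
print(intercept, slope)
```

In this setting a smaller estimated slope for loans than for bonds would correspond to the abstract's claim that loan recoveries are less sensitive to macroeconomic conditions.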