
    A Comprehensive Survey on Enterprise Financial Risk Analysis: Problems, Methods, Spotlights and Applications

    Enterprise financial risk analysis aims at predicting an enterprise's future financial risk. Due to its wide application, enterprise financial risk analysis has always been a core research issue in finance. Although there are already some valuable and impressive surveys on risk management, these surveys introduce approaches in a relatively isolated way and lack the recent advances in enterprise financial risk analysis. Due to the rapid expansion of enterprise financial risk analysis, especially from the computer science and big data perspective, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing research on enterprise financial risk, as well as to summarize and interpret the mechanisms and strategies of enterprise financial risk analysis in a comprehensive way, which may help readers gain a better understanding of the current research status and ideas. This paper provides a systematic literature review of over 300 articles published on enterprise risk analysis modelling over a 50-year period, 1968 to 2022. We first introduce the formal definition of enterprise risk as well as the related concepts. Then, we categorize the representative works in terms of risk type and summarize the three aspects of risk analysis. Finally, we compare the analysis methods used to model enterprise financial risk. Our goal is to clarify current cutting-edge research and its possible future directions for modelling enterprise risk, aiming to fully understand the mechanisms of enterprise risk communication and influence and their application to corporate governance, financial institutions, and government regulation.

    Forecasting Financial Distress With Machine Learning – A Review

    Purpose – To evaluate the various academic studies, with their multiple views, on credit risk and artificial intelligence (AI) and their evolution. Theoretical framework – The study is divided as follows: Section 1 introduces the article. Section 2 deals with credit risk and its relationship with computational models and techniques. Section 3 presents the methodology. Section 4 addresses a discussion of the results and challenges on the topic. Finally, Section 5 presents the conclusions. Design/methodology/approach – A systematic review of the literature was carried out without restricting the time period, using the Web of Science and Scopus databases. Findings – The application of computational technology to credit risk analysis has drawn attention in a unique way. It was found that the demand for identification and introduction of new variables, classifiers, and more assertive methods is constant, and the effort to improve the interpretation of data and models is intense. Research, practical & social implications – The study contributes to the verification of theory by providing information on the most used methods and techniques, offers a broad analysis that deepens knowledge of the factors and variables on the theme, categorizes the lines of research, and provides a summary of the literature that serves as a reference, in addition to suggesting future research. Originality/value – Research in the area of artificial intelligence and machine learning is recent and requires attention and investigation; thus, this study contributes to opening new views in order to deepen the work on this topic.

    Using Machine Learning Techniques to Predict a Risk Score for New Members of a Chit Fund Group

    Predicting the risk score of new and potential customers is used across the financial industry. By predicting risk scores for its customers, a chit fund company can improve its knowledge and customer understanding without relying solely on human judgement. Data is collected on each customer before they have taken out credit and during the time they contribute to a chit fund. Having collected the necessary data, the company can then decide whether modelling customer risk would benefit them. As the data is available historically, one aspect of risk score prediction will be the focus of this thesis: supervised machine learning. Supervised machine learning techniques use historic data to ‘learn a model of the relationship between a set of descriptive features and a target feature’ (Kelleher, Mac Namee, & D’Arcy, 2015). There are many supervised machine learning techniques; support vector machines (SVM), logistic regression, and decision trees will be the focal point of this thesis. The main objective of this project is to predict a risk score for new or potential subscribers of a chit fund company. The models generated would be suitable for use before a customer joins a chit fund group as well as while the customer is taking part in the group, measuring risk before becoming a subscriber and the behavioural risk while with the company. The objective is to extend research already carried out to predict a score from zero to one identifying the probability of default. Default, for the purpose of this project, is defined as being more than 90 days late with a payment. The data of real chit fund subscribers was used to train and test the models built for the project. A factor reduction technique was used to identify key variables, and multiple models were tested to determine which gives the best results. The second objective of this project looks at the subscriber network. This section of the project checks for links between subscribers and investigates a possible link between subscribers and their chance of default. Variables such as address and nominee are the focus in this section. The most successful supervised machine learning model was the random forest model, with precision of 59% and recall of 92%. Accuracy for this model was the highest of the models in the experiment at 85%; however, this is not the most trustworthy evaluation measure for this project as the dataset is unbalanced. An ensemble of 300 decision trees was applied in this model; the class predicted by the majority of trees was selected as the final prediction. This achieved high accuracy on the dataset from the chit fund company, Kyepot. Social network analysis found that there was no unusual relationship between subscribers that went into default with regard to the area in which they live or their nominees. Supervised machine learning techniques have been shown to be a useful tool in the financial industry. This project suggests that these techniques may also be useful tools for chit fund companies. This project evaluates four different techniques, suggesting the random forest technique is the most useful for this chit fund company.
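    The random forest and evaluation setup described above can be illustrated with a short, hypothetical sketch (not the thesis code); the synthetic data, feature count, and class imbalance below are assumptions chosen only to mirror the rare-default situation the abstract describes.

```python
# Minimal sketch (not the thesis code): a 300-tree random forest scored with
# precision and recall rather than raw accuracy, since the default class is rare.
# The synthetic data and feature count are illustrative assumptions only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Stand-in for subscriber features (payment history, tenure, ...); roughly 10%
# of subscribers default (more than 90 days late), mimicking class imbalance.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

# 300 trees; the forest aggregates the trees' votes (scikit-learn averages the
# trees' class probabilities) to produce the final prediction.
model = RandomForestClassifier(n_estimators=300, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

print("precision:", precision_score(y_test, pred))
print("recall:   ", recall_score(y_test, pred))
print("accuracy: ", accuracy_score(y_test, pred))  # inflated by the majority class
```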

    Machine learning applied to banking supervision a literature review

    Guerra, P., & Castelli, M. (2021). Machine learning applied to banking supervision a literature review. Risks, 9(7), 1-24. [136]. https://doi.org/10.3390/risks9070136. Machine learning (ML) has revolutionised data analysis over the past decade. Like numerous other industries heavily reliant on accurate information, banking supervision stands to benefit greatly from this technological advance. The objective of this review is to provide a comprehensive walk-through of how the most common ML techniques have been applied to risk assessment in banking, focusing on a supervisory perspective. We searched the Google Scholar, Springer Link, and ScienceDirect databases for articles including the search terms “machine learning” and (“bank” or “banking” or “supervision”). No language, date, or journal filter was applied. Papers were then screened and selected according to their relevance. The final article base consisted of 41 papers and 2 book chapters, 53% of which were published in top-quartile journals in their field. Results are presented in a timeline according to publication date and categorised by time slots. Credit risk assessment and stress testing are highlighted topics, as well as other risk perspectives, with some references to ML application surveys. The most relevant ML techniques encompass k-nearest neighbours (KNN), support vector machines (SVM), tree-based models, ensembles, boosting techniques, and artificial neural networks (ANN). Recent trends include developing early warning systems (EWS) for bankruptcy and refining stress testing. One limitation of this study is the paucity of contributions using supervisory data, which justifies the need for additional investigation in this field. However, there is increasing evidence that ML techniques can enhance data analysis and decision making in the banking industry.

    Corporate Bankruptcy Prediction

    Bankruptcy prediction is one of the most important research areas in corporate finance. Bankruptcies are an indispensable element of the functioning of the market economy, and at the same time generate significant losses for stakeholders. Hence, this book was compiled to collect the results of research on the latest trends in predicting the bankruptcy of enterprises. It presents models developed for different countries using both traditional and more advanced methods. Problems connected with predicting bankruptcy during periods of prosperity and recession, the selection of appropriate explanatory variables, and the dynamization of models are discussed. The reliability of financial data and the validity of the audit are also addressed. Thus, I hope that this book will inspire you to undertake new research in the field of forecasting the risk of bankruptcy.

    Interpretable Binary and Multiclass Prediction Models for Insolvencies and Credit Ratings

    Insolvency prediction and rating are important tasks in the financial industry and serve to assess the creditworthiness of companies. One way to approach this field is machine learning, in which prediction models are built from example data. Methods from this area are advantageous because they can be automated, which makes human expertise superfluous in most cases and thereby offers a higher degree of objectivity. However, these approaches are not perfect either and therefore cannot entirely replace human expertise. They do lend themselves as decision aids that experts can use, which is why interpretable models are desirable. Unfortunately, only few learning algorithms provide interpretable models. Moreover, some tasks, such as rating, are often multiclass problems. Multiclass classification is frequently achieved through meta-algorithms that train several binary classifiers; however, most of the commonly used meta-algorithms eliminate any interpretability that may be present. In this dissertation, we examine the predictive accuracy of interpretable models compared to non-interpretable models for insolvency prediction and ratings. As interpretable models, we use disjunctive normal forms and decision trees with thresholds on financial ratios; as non-interpretable models, we use random forests, artificial neural networks, and support vector machines. In addition, we developed our own learning algorithm, Thresholder, which generates disjunctive normal forms and interpretable multiclass models. For the task of insolvency prediction, we show that interpretable models are not inferior to the non-interpretable models. To this end, a first case study uses a database of annual financial statements of 5,152 companies, employed in practice, to measure the predictive accuracy of all of the above models. In a second case study on rating prediction, we demonstrate that interpretable models are even superior to the non-interpretable models. The predictive accuracy of all models is determined on three data sets used in practice, each with three rating classes. In the case studies, we compare the various interpretable approaches with respect to their model sizes and the form of their interpretability. We present exemplary models based on the respective data sets and offer approaches to interpreting them. Our results show that interpretable, threshold-based models are well suited to classification problems in the financial industry. In this domain, they are not inferior to more complex models such as support vector machines. Our algorithm Thresholder produces the smallest models while its predictive accuracy remains comparable to that of the other interpretable models. In our rating case study, the interpretable models deliver markedly better results than in the insolvency prediction case study (see above). A possible explanation for these results is the fact that ratings, unlike insolvencies, are man-made: ratings are based on decisions made by humans who think in interpretable rules, e.g., logical combinations of thresholds.
We therefore assume that interpretable models fit these problems and can recognise and reproduce such interpretable rules.
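    To give a feel for what a threshold-based, interpretable model of this kind looks like, the sketch below shows a hypothetical disjunctive-normal-form rule over financial ratios; the ratios, cut-offs, and function name are invented for illustration and are not the dissertation's learned models.

```python
# Hypothetical disjunctive-normal-form (DNF) threshold rule over financial
# ratios: the firm is flagged as at risk if any conjunction of thresholds fires.
# Ratios and cut-offs are invented, not the dissertation's learned values.
def insolvency_risk(equity_ratio: float,
                    ebit_to_interest: float,
                    current_ratio: float) -> bool:
    rule_1 = equity_ratio < 0.10 and current_ratio < 1.0     # thin equity and illiquid
    rule_2 = ebit_to_interest < 1.0 and equity_ratio < 0.25  # earnings cannot cover interest
    return rule_1 or rule_2                                  # DNF: an OR of ANDs

print(insolvency_risk(equity_ratio=0.05, ebit_to_interest=2.0, current_ratio=0.8))  # True
print(insolvency_risk(equity_ratio=0.40, ebit_to_interest=3.5, current_ratio=1.6))  # False
```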

    Deterministic and Probabilistic Risk Management Approaches in Construction Projects: A Systematic Literature Review and Comparative Analysis

    Risks and uncertainties are inevitable in construction projects and can drastically change the expected outcome, negatively impacting the project's success. However, risk management (RM) is still conducted in a manual, largely ineffective, and experience-based fashion, hindering automation and knowledge transfer in projects. The construction industry is benefitting from the recent Industry 4.0 revolution and the advancements in data science branches, such as artificial intelligence (AI), for the digitalization and optimization of processes. Data-driven methods, e.g., AI and machine learning algorithms, Bayesian inference, and fuzzy logic, are being widely explored as possible solutions to shortcomings in the RM domain. These methods use deterministic or probabilistic risk reasoning approaches: the former proposes a fixed predicted value, while the latter embraces the notion of uncertainty, causal dependencies, and inferences between the variables affecting projects' risk in the predicted value. This research used a systematic literature review method with the objective of investigating and comparatively analysing the main deterministic and probabilistic methods applied to construction RM in respect of scope, primary applications, advantages, disadvantages, limitations, and proven accuracy. The findings establish recommendations for optimum AI-based frameworks for different management levels (enterprise, project, and operational) and for large or small data sets.
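    As a rough illustration of the distinction drawn above, the hedged sketch below (with invented numbers, not taken from any reviewed study) contrasts a deterministic point estimate of a project's cost overrun with a probabilistic Monte Carlo estimate that propagates uncertainty in the risk drivers.

```python
# Invented numbers, purely illustrative: a deterministic model returns one fixed
# cost-overrun figure, while a probabilistic model propagates uncertainty in the
# risk drivers via Monte Carlo and returns a distribution.
import numpy as np

rng = np.random.default_rng(0)

# Deterministic reasoning: a single point estimate of cost overrun (% of budget).
deterministic_overrun = 0.12

# Probabilistic reasoning: uncertain risk drivers sampled 10,000 times.
delay_weeks = rng.gamma(shape=2.0, scale=1.5, size=10_000)    # schedule risk
price_shock = rng.normal(loc=0.03, scale=0.02, size=10_000)   # material price risk
overrun = 0.02 * delay_weeks + price_shock                    # toy risk model

print(f"deterministic: {deterministic_overrun:.1%}")
print(f"probabilistic: mean {overrun.mean():.1%}, "
      f"95th percentile {np.percentile(overrun, 95):.1%}")
```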

    Ennustemallin kehittäminen suomalaisten PK-yritysten konkurssiriskin määritykseen (Developing a Prediction Model for Assessing the Bankruptcy Risk of Finnish SMEs)

    Bankruptcy prediction is a subject of significant interest to both academics and practitioners because of its vast economic and societal impact. Academic research in the field is extensive and diverse; no consensus has formed regarding the superiority of different prediction methods or predictor variables. Most studies focus on large companies; small and medium-sized enterprises (SMEs) have received less attention, mainly due to data unavailability. Despite recent academic advances, simple statistical models are still favored in practical use, largely due to their understandability and interpretability. This study aims to construct a high-performing but user-friendly and interpretable bankruptcy prediction model for Finnish SMEs using financial statement data from 2008–2010. A literature review is conducted to explore the key aspects of bankruptcy prediction, and the findings are used to design an empirical study. Five prediction models are trained on different predictor subsets and training samples, and two models are chosen for detailed examination based on the findings. A prediction model using the random forest method, utilizing all available predictors and the unadjusted training data containing an imbalance of bankrupt and non-bankrupt firms, is found to perform best. Superior performance compared to a benchmark model is observed in terms of both key metrics, and the random forest model is deemed easy to use and interpretable; it is therefore recommended for practical application. Equity ratio and financial expenses to total assets consistently rank as the two best predictors across models; otherwise, the findings on predictor importance are mixed, but mainly in line with the prevalent views in the related literature. This study shows that constructing an accurate but practical bankruptcy prediction model is feasible, and it serves as a guideline for future scholars and practitioners seeking to achieve the same. Some further research avenues are recognized based on the empirical findings and the extant literature. In particular, this study raises an important question regarding the appropriateness of the most commonly used performance metrics in bankruptcy prediction. The area under the precision-recall curve (PR AUC), which is widely used in other fields of study, is deemed a suitable alternative and is recommended for measuring model performance in future bankruptcy prediction studies. Keywords: bankruptcy prediction, credit risk, machine learning
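    The metric recommendation above can be illustrated with a small, assumption-laden sketch (synthetic data, an arbitrary ~2% bankruptcy rate, and a generic random forest rather than the thesis model): on a rare-event problem, ROC AUC can look comfortable while average precision (PR AUC) exposes much weaker minority-class performance.

```python
# Synthetic illustration of the metric recommendation: with a rare bankruptcy
# class (~2% here, an arbitrary assumption), ROC AUC can look comfortable while
# PR AUC (average precision) exposes weaker minority-class performance.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20_000, n_features=15,
                           weights=[0.98, 0.02], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

model = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]   # predicted bankruptcy probabilities

print("ROC AUC:", round(roc_auc_score(y_te, scores), 3))
print("PR AUC :", round(average_precision_score(y_te, scores), 3))  # typically much lower on rare-event data
```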

    Machine learning methods for systemic risk analysis in financial sectors.

    Financial systemic risk is an important issue in economics and financial systems. To detect and respond to systemic risk using the growing amounts of data produced in financial markets and systems, researchers have increasingly employed machine learning methods. These methods are used to study the mechanisms of the outbreak and contagion of systemic risk in financial networks and to improve the current regulation of the financial market and industry. In this paper, we survey existing research and methodologies on the assessment and measurement of financial systemic risk combined with machine learning technologies, including big data analysis, network analysis, and sentiment analysis. In addition, we identify future challenges and suggest further research topics. The main purpose of this paper is to introduce current research on financial systemic risk with machine learning methods and to propose directions for future work. This research has been partially supported by grants from the National Natural Science Foundation of China (#U1811462, #71874023, #71771037, #71725001, and #71433001).
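    As a toy illustration of the network-analysis direction mentioned above (not a method from the surveyed papers), the sketch below models hypothetical interbank exposures as a weighted directed graph and uses PageRank centrality as a crude proxy for systemic importance; the bank names and exposure values are invented.

```python
# Toy sketch: interbank exposures as a weighted directed graph, with PageRank
# centrality as a rough proxy for systemic importance. All figures are invented.
import networkx as nx

exposures = [  # (lender, borrower, exposure in millions) - hypothetical
    ("Bank A", "Bank B", 120),
    ("Bank B", "Bank C", 80),
    ("Bank C", "Bank A", 60),
    ("Bank D", "Bank B", 200),
    ("Bank B", "Bank D", 40),
]

G = nx.DiGraph()
G.add_weighted_edges_from(exposures)

# Banks that many heavily exposed counterparties point to rank highly here.
centrality = nx.pagerank(G, weight="weight")
for bank, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{bank}: {score:.3f}")
```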

    Feature selection strategies for improving data-driven decision support in bank telemarketing

    The usage of data mining techniques to unveil previously undiscovered knowledge has been applied in recent years to a wide range of domains, including banking and marketing. Raw data is the basic ingredient for successfully detecting interesting patterns. A key aspect of raw data manipulation is feature engineering, which concerns the correct characterization or selection of relevant features (or variables) that conceal relations with the target goal. This study is particularly focused on feature engineering, aiming to unfold the features that best characterize the problem of selling long-term bank deposits through telemarketing campaigns. For the experimental setup, a case study from a Portuguese bank, spanning the 2008-2013 period and encompassing the recent global financial crisis, was addressed. To assess the relevance of the problem, a novel literature analysis using text mining and the latent Dirichlet allocation algorithm was conducted, confirming the existence of a research gap for bank telemarketing. Starting from a dataset containing typical telemarketing contacts and client information, the research followed three different and complementary strategies: first, enriching the dataset with social and economic context features; then, including customer lifetime value related features; finally, applying a divide-and-conquer strategy for splitting the problem into smaller fractions, leading to optimized sub-problems. Each of the three approaches improved previous results in terms of model metrics related to prediction performance. The relevance of the proposed features was evaluated, confirming the obtained models as credible and valuable for telemarketing campaign managers.
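    A minimal sketch of the first enrichment strategy described above, assuming a monthly join between per-contact records and macro-level context indicators; the column names and values below are illustrative placeholders, not the bank's actual schema or data.

```python
# Illustrative only: enrich per-contact telemarketing records with social and
# economic context features by joining on the contact month. Values are made up.
import pandas as pd

contacts = pd.DataFrame({
    "client_id":     [1, 2, 3],
    "contact_month": ["2008-06", "2011-03", "2013-01"],
    "subscribed":    [0, 1, 0],           # target: long-term deposit taken?
})

context = pd.DataFrame({
    "month":               ["2008-06", "2011-03", "2013-01"],
    "euribor_3m":          [4.95, 1.18, 0.19],    # hypothetical rates
    "consumer_confidence": [-30.1, -40.8, -34.6],
})

# Each contact now carries the economic climate of the month in which it happened.
enriched = (contacts
            .merge(context, left_on="contact_month", right_on="month")
            .drop(columns="month"))
print(enriched)
```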