
    Intelligent Financial Fraud Detection Practices: An Investigation

    Financial fraud is an issue with far-reaching consequences for the finance industry, government, corporate sectors, and ordinary consumers. Increasing dependence on new technologies such as cloud and mobile computing has compounded the problem in recent years. Traditional methods of detection involve extensive use of auditing, in which a trained individual manually observes reports or transactions in an attempt to discover fraudulent behaviour. This method is not only time-consuming, expensive and inaccurate, but in the age of big data it is also impractical. Not surprisingly, financial institutions have turned to automated processes using statistical and computational methods. This paper presents a comprehensive investigation of financial fraud detection practices using such data mining methods, with a particular focus on computational intelligence-based techniques. The practices are classified by key aspects such as the detection algorithm used, the fraud type investigated, and the success rate. Issues and challenges associated with current practices, as well as potential future directions of research, are also identified. Comment: Proceedings of the 10th International Conference on Security and Privacy in Communication Networks (SecureComm 2014).

    Estimating bankruptcy using neural networks trained with hidden layer learning vector quantization

    Hidden Layer Learning Vector Quantization (HLVQ), a recent algorithm for training neural networks, is used to correct the output of traditional Multilayer Perceptrons (MLP) in estimating the probability of company bankruptcy. It is shown that this method improves the results of traditional neural networks and substantially outperforms discriminant analysis in predicting bankruptcy one year in advance. We also studied the effect of using unbalanced samples of healthy and bankrupt firms. The database used was Diane, which contains the financial accounts of French firms. The sample comprises all 583 industrial bankruptcies in the database, among firms with more than 35 employees, that occurred in the 1999-2000 period. For the classification models we considered the 30 financial ratios published by Coface and available from the Diane database, plus the Beaver (1966) ratio of Cash Earnings to Total Debt, the 5 ratios Altman (1968) used in his Z-model, and firm size measured by the logarithm of sales. Attention was given to variable selection, data pre-processing and feature selection to reduce the dimensionality of the problem.

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet technologies, data volumes are doubling every two years, faster than predicted by Moore’s Law, and Big Data analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools to help understand this hugely accumulated data and leverage it to gain insights into the finance industry. Because there are no physical products to manufacture in the finance industry, data has become the most valuable asset of financial organisations for obtaining actionable business insights. This is where data mining techniques come to the rescue, by allowing access to the right information at the right time. These techniques are used by the finance industry in areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering detection, marketing, and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of the data, nonlinear mapping techniques have been studied more deeply than linear techniques. The reviewed work also indicates that hybrid methods are the most accurate in prediction, closely followed by neural network techniques. This survey provides an overview of the applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area. In particular, it offers a good introduction to data mining techniques in computational finance for beginners who want to work in the field.

    MCDM approach to evaluating bank loan default models

    Banks and financial institutions rely on loan default prediction models in credit risk management. An important yet challenging task in developing and applying default classification models is model evaluation and selection. This study proposes an evaluation approach for bank loan default classification models based on multiple criteria decision making (MCDM) methods. A large real-life Chinese bank loan dataset is used to validate the proposed approach. Specifically, a set of performance metrics is used to measure a selection of statistical and machine-learning default models. The technique for order preference by similarity to ideal solution (TOPSIS), an MCDM method, takes the performances of the default classification models on multiple performance metrics as inputs and generates a ranking of the default risk models. In addition, feature selection and sampling techniques are applied in the data pre-processing step to handle the high dimensionality and class imbalance of bank loan default data. The results indicate that the K-Nearest Neighbor algorithm has good potential in bank loan default prediction.
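The TOPSIS ranking step described in this abstract can be sketched as follows. The model names, metrics, weights, and metric values below are invented for illustration only; the paper's actual metric set and results are not reproduced here.

```python
import math

def topsis(matrix, weights, benefit):
    """Rank alternatives (rows) scored on several criteria (columns).

    weights: criterion weights; benefit[j] is True when larger is better."""
    n_alt, n_crit = len(matrix), len(matrix[0])
    # 1. Vector-normalise each column and apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n_crit)]
         for i in range(n_alt)]
    # 2. Ideal (best) and anti-ideal (worst) point per criterion.
    ideal = [max(v[i][j] for i in range(n_alt)) if benefit[j]
             else min(v[i][j] for i in range(n_alt)) for j in range(n_crit)]
    anti = [min(v[i][j] for i in range(n_alt)) if benefit[j]
            else max(v[i][j] for i in range(n_alt)) for j in range(n_crit)]
    # 3. Closeness coefficient: relative distance to the anti-ideal point.
    scores = []
    for i in range(n_alt):
        d_pos = math.sqrt(sum((v[i][j] - ideal[j]) ** 2 for j in range(n_crit)))
        d_neg = math.sqrt(sum((v[i][j] - anti[j]) ** 2 for j in range(n_crit)))
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Three hypothetical default models scored on (AUC, accuracy, Brier score);
# Brier score is a cost-type criterion, so smaller is better.
metrics = [[0.82, 0.78, 0.15],   # k-NN
           [0.79, 0.80, 0.18],   # logistic regression
           [0.75, 0.74, 0.21]]   # decision tree
scores = topsis(metrics, weights=[0.5, 0.3, 0.2], benefit=[True, True, False])
best = max(range(len(scores)), key=scores.__getitem__)  # index of top-ranked model
```

A higher closeness coefficient means the model sits nearer the ideal point across all metrics at once, which is what lets TOPSIS aggregate several performance measures into a single ranking.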

    A Comprehensive Survey on Enterprise Financial Risk Analysis: Problems, Methods, Spotlights and Applications

    Enterprise financial risk analysis aims at predicting an enterprise's future financial risk. Because of its wide application, enterprise financial risk analysis has always been a core research issue in finance. Although there are already some valuable and impressive surveys on risk management, these surveys introduce approaches in a relatively isolated way and lack the recent advances in enterprise financial risk analysis. Due to the rapid expansion of the field, especially from the computer science and big data perspective, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing enterprise financial risk research, and to summarize and interpret the mechanisms and strategies of enterprise financial risk analysis in a comprehensive way, which may help readers gain a better understanding of the current research status and ideas. This paper provides a systematic literature review of over 300 articles published on enterprise risk analysis modelling over a 50-year period, 1968 to 2022. We first introduce the formal definition of enterprise risk as well as the related concepts. Then, we categorize the representative works in terms of risk type and summarize the three aspects of risk analysis. Finally, we compare the analysis methods used to model enterprise financial risk. Our goal is to clarify current cutting-edge research and its possible future directions for modelling enterprise risk, aiming at a full understanding of the mechanisms of enterprise risk communication and influence and their application in corporate governance, financial institutions and government regulation.

    The use of genetic programming for detecting the incorrect predictions of classification models

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Companies around the world use advanced analytics to support their decision-making process. Traditionally they used statistics and business intelligence for this, but as technology advances, more complex models are gaining popularity. The main reason for the increasing interest in Machine Learning and Deep Learning models is that they reach high prediction accuracy. On the other hand, with good performance comes increasing complexity of the programs. Therefore a new area, called Explainable AI, was introduced: the idea is to create models that can be understood by business users, or models that explain other models' predictions. We propose a study in which we create a separate model that serves as a verifier of a machine learning model's predictions. This work falls into the area of post-processing of model outputs. For this purpose we select Genetic Programming (GP), which has proven successful in various applications. In the scope of this research we investigate whether GP can evaluate the predictions of other models. This area of application has not been explored yet, so in this study we explore the possibility of evolving an individual to validate another model. We focus on classification problems and select four machine learning models (logistic regression, decision tree, random forest, and perceptron) and three different datasets. This setup is used to verify that the presented idea generalises across different problems. The performance of 12 Genetic Programming experiments indicates that in some cases it is possible to create a successful model for error prediction. During the study we discovered that the performance of GP programs is mostly connected to the dataset on which the experiment is conducted. The type of predictive model does not influence the performance of GP. Although we managed to create good classifiers of errors, during the evolution process we faced the problem of overfitting, which is common in problems with imbalanced datasets. The results of the study confirm that GP can be used for this new type of problem and can successfully predict the errors of machine learning models.
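The post-processing setup in this abstract can be sketched in a few lines: a second "error model" is trained on a meta-dataset whose target marks where the base classifier was wrong. The dissertation evolves the error model with Genetic Programming; the toy base model, the data, and the hand-written rule below are illustrative stand-ins, not the thesis's actual models.

```python
def base_model(x):
    # Toy base classifier: predicts class 1 when the first feature is large.
    return 1 if x[0] > 0.5 else 0

# Labelled data: (features, true_label). Values are invented for illustration.
data = [((0.9, 0.1), 1), ((0.8, 0.9), 1), ((0.2, 0.3), 0),
        ((0.6, 0.95), 0), ((0.1, 0.8), 0), ((0.7, 0.2), 1)]

# Build the meta-dataset: the new target is 1 wherever the base model errs.
meta = [(x, int(base_model(x) != y)) for x, y in data]

def error_model(x):
    # Hand-written rule standing in for an evolved GP individual: flag a
    # prediction as suspect in the region where the toy base model errs.
    return 1 if x[1] > 0.85 and x[0] > 0.5 else 0

# Agreement of the error model with the actual error labels.
accuracy = sum(int(error_model(x) == e) for x, e in meta) / len(meta)
```

The key point is that the second model never sees the original labels directly: its training signal is the base model's correctness, so at prediction time it can warn a business user when an individual output is likely to be wrong.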

    Nonstatistical Factors Influencing Predictions of Financial Distress and Managerial Implications in the All-Cargo Airline Industry

    All-cargo airlines carry over 50% of global airfreight, yet they are prone to bankruptcy. Many financial models are designed to predict a firm's financial health, but they do not assess many nonstatistical factors that influence the prediction capability of these models. In this study, a qualitative grounded theory design was used to identify nonstatistical factors and explore how they influence bankruptcy prediction models in the all-cargo airline industry. In the first phase of the study, financial data from 2005 to 2009 for 17 all-cargo U.S. airlines were used to determine the bankruptcy prediction ability of the Kroeze financial bankruptcy model. A sample of six all-cargo airlines (ABX Air, Arrow Air, Atlas Air, Cargo 360, Gemini Air Cargo, and Kitty Hawk Air Cargo) was selected, containing a mixture of airlines for which the Kroeze model correctly and incorrectly predicted bankruptcy. The sample was used as the starting point to explore the nonstatistical factors using grounded theory. Data on the six airlines were obtained from company annual reports, SEC 10-K annual reports, professional journals such as Air Transport Intelligence and Traffic World, news reports, and company press releases. The data were coded and grouped into conceptual categories, which were used in theory generation to support the emerging theory. Six categories that relate to the financial stability of an all-cargo airline emerged during the research: management, risk, operations, competitive advantage, financial, and external factors. Three themes emerged that may improve current quantitative bankruptcy prediction models: airline fleet type, type of aircraft flown, and aircraft utilization. These themes relate to the type, use, and make-up of an airline's fleet; they influence bankruptcy prediction models and should be incorporated into failure prediction models to improve their overall accuracy. 
    Future research should verify these findings on a larger population, such as all-cargo airlines that operate outside the United States, since these airlines operate under different financial regimes that may affect the prediction models differently.

    Will it fail and why? A large case study of company default prediction with highly interpretable machine learning models

    Finding a model to predict the default of a firm is a well-known topic in the financial and data science communities. The default prediction problem has been studied for over fifty years, but it remains a very hard task even today. Given its remarkable practical relevance, we strive to obtain the best possible prediction results, also in comparison with the reference literature. In our work we combine three large and important datasets in order to investigate both bankruptcy and bank default, a state of difficulty for companies that often anticipates actual bankruptcy: one dataset from the Italian Central Credit Register of the Bank of Italy, one of balance sheet information on Italian firms, and the AnaCredit dataset, a novel source of credit data from the European Central Bank. We try to go beyond an academic study and show how our model, based on some promising machine learning algorithms, outperforms the current default predictions made by credit institutions. At the same time, we try to provide insights into the reasons that lead to a particular outcome. Many modern approaches seek well-performing models to forecast the default of a company, but those models often act like a black box and do not give financial institutions the fundamental explanations they need for their decisions. This project aims to find a robust predictive model using a tree-based machine learning algorithm which, flanked by a game-theoretic approach, can provide sound explanations of the model's output. Finally, we dedicate a special effort to the analysis of predictions in highly unbalanced contexts. Imbalanced classes are a common problem in machine learning classification that is typically addressed by removing the imbalance in the training set. 
    We conjecture that this is not always the best choice and propose the use of a slightly unbalanced training set, showing that this approach helps maximize performance.
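The sampling idea in this abstract — undersampling the majority class to a chosen ratio rather than forcing a fully balanced 1:1 split — can be sketched as below. The function name, the 2:1 ratio, and the toy data are assumptions for illustration; the paper does not specify this exact procedure.

```python
import random

def undersample(samples, labels, majority_ratio=2.0, seed=0):
    """Keep every minority (label 1) example and randomly keep
    majority_ratio times as many majority (label 0) examples."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    pos = [i for i, y in enumerate(labels) if y == 1]   # minority: defaults
    neg = [i for i, y in enumerate(labels) if y == 0]   # majority: healthy firms
    keep_neg = rng.sample(neg, min(len(neg), int(majority_ratio * len(pos))))
    idx = pos + keep_neg
    rng.shuffle(idx)
    return [samples[i] for i in idx], [labels[i] for i in idx]

# Toy dataset: 3 defaults among 100 firms. With a 2:1 ratio we keep all
# 3 defaults plus 6 randomly chosen healthy firms.
X = [[float(i)] for i in range(100)]
y = [1, 1, 1] + [0] * 97
Xs, ys = undersample(X, y, majority_ratio=2.0)
```

Keeping the training set slightly unbalanced, rather than exactly balanced, preserves more of the majority-class variety while still preventing the minority class from being swamped, which is the trade-off the conjecture above is about.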