Application of support vector machines on the basis of the first Hungarian bankruptcy model
In our study we apply a data mining procedure known as the support vector machine (SVM) to the database of the first Hungarian bankruptcy model. The models constructed are then contrasted with earlier bankruptcy models using classification accuracy and the area under the ROC curve. In applying the SVM technique, in addition to conventional kernel functions, we also examine the applicability of the ANOVA kernel function and take a detailed look at the data preparation tasks recommended for the SVM method (handling of outliers). The results of the models assembled suggest that a significant improvement in classification accuracy can be achieved on the database of the first Hungarian bankruptcy model when using the SVM method as opposed to neural networks.
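The SVM-with-ROC evaluation described above can be sketched as follows. This is a minimal illustration using scikit-learn on synthetic data (the Hungarian bankruptcy database is not public); the RBF kernel stands in here, since the ANOVA kernel would require a custom kernel callable, and scaling stands in for the outlier-handling preparation the abstract mentions.

```python
# Minimal sketch: SVM classifier evaluated by classification accuracy and
# ROC AUC, on synthetic data standing in for the bankruptcy database.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Feature scaling limits the influence of outliers on the margin-based SVM;
# the abstract stresses this kind of data preparation step.
model = make_pipeline(StandardScaler(),
                      SVC(kernel="rbf", probability=True, random_state=0))
model.fit(X_tr, y_tr)

acc = accuracy_score(y_te, model.predict(X_te))
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(round(acc, 2), round(auc, 2))
```

Both metrics are reported because, as in the study, accuracy alone can hide poor ranking of the minority (bankrupt) class, which AUC captures.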
Hybrid model using logit and nonparametric methods for predicting micro-entity failure
Following calls in the bankruptcy literature, a parsimonious hybrid bankruptcy model is developed in this paper
by combining parametric and non-parametric approaches. To this end, the variables with the highest predictive power to
detect bankruptcy are selected using logistic regression (LR). Subsequently, alternative non-parametric methods
(Multilayer Perceptron, Rough Set, and Classification-Regression Trees) are applied, in turn, to firms classified as
either 'bankrupt' or 'not bankrupt'. Our findings show that hybrid models, particularly those combining LR and the
Multilayer Perceptron, offer better accuracy and interpretability and converge faster than each method
implemented in isolation. Moreover, the authors demonstrate that introducing non-financial and macroeconomic
variables complements financial ratios for bankruptcy prediction.
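The hybrid idea, parametric variable selection feeding a non-parametric classifier, can be sketched as below. This is an illustrative scikit-learn pipeline on synthetic data, not the paper's implementation: an L1-penalized logistic regression selects the most predictive features, and a Multilayer Perceptron is then trained on the reduced set.

```python
# Hedged sketch of the LR + MLP hybrid: logistic regression selects features,
# a Multilayer Perceptron classifies on the reduced feature set.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectFromModel
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           random_state=1)

# Step 1: L1-penalized LR keeps only variables with real predictive power
selector = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5))

# Step 2: an MLP is trained on the selected variables only
hybrid = make_pipeline(StandardScaler(), selector,
                       MLPClassifier(hidden_layer_sizes=(16,),
                                     max_iter=1000, random_state=1))

score = cross_val_score(hybrid, X, y, cv=5).mean()
print(round(score, 2))
```

Pruning uninformative variables first is what makes the combined model both more interpretable and faster to converge than the MLP alone, which is the paper's central claim.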
Predicting financial distress: A comparison of survival analysis and decision tree techniques
Financial distress and the consequent failure of a business are usually extremely costly and disruptive events. Statistical financial distress prediction models attempt to predict whether a business will experience financial distress in the future. Discriminant analysis and logistic regression have been the most popular approaches, but there are also many alternative cutting-edge data mining techniques that can be used. In this paper, a semi-parametric Cox survival analysis model and non-parametric CART decision trees are applied to financial distress prediction and compared with each other as well as with the most popular approaches. The analysis is done over a variety of cost ratios (Type I error cost : Type II error cost) and prediction intervals, as these differ depending on the situation. The results show that decision trees and survival analysis models have good prediction accuracy that justifies their use and supports further investigation.
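The cost-ratio evaluation described in this abstract can be illustrated with a CART-style decision tree. The sketch below, on synthetic imbalanced data, uses scikit-learn's `class_weight` to shift the tree toward the costlier error type and then totals the misclassification cost at several Type I : Type II ratios; the data and cost values are illustrative assumptions, not the paper's.

```python
# Hedged sketch: a decision tree evaluated under asymmetric error costs,
# where a Type I error (missing a distressed firm) costs more than a
# Type II error (a false alarm on a healthy firm).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

X, y = make_classification(n_samples=600, n_features=8,
                           weights=[0.8, 0.2], random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=2)

for c1 in (1, 5, 10):  # Type I error cost relative to Type II
    # class_weight biases tree splits toward catching the costly class
    tree = DecisionTreeClassifier(class_weight={0: 1, 1: c1},
                                  random_state=2).fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, tree.predict(X_te)).ravel()
    cost = c1 * fn + fp  # total misclassification cost on the holdout set
    print(c1, cost)
```

Repeating this over a grid of ratios is what lets the comparison reflect situations where missing a failing firm is far more expensive than a false alarm.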
Neural Networks in Bankruptcy Prediction - A Comparative Study on the Basis of the First Hungarian Bankruptcy Model
The article attempts to answer the question whether or not the latest bankruptcy prediction techniques are more reliable than traditional mathematical-statistical ones in Hungary. Simulation experiments carried out on the database of the first Hungarian bankruptcy prediction model clearly
prove that bankruptcy models built using artificial neural networks have higher classification accuracy than models created in the 1990s based on discriminant analysis and logistic regression analysis.
The article presents the main results, analyses the reasons for the differences, and offers constructive proposals for the further development of Hungarian bankruptcy prediction.
Clear Visual Separation of Temporal Event Sequences
Extracting and visualizing informative insights from temporal event sequences
becomes increasingly difficult when data volume and variety increase. Besides
dealing with high event type cardinality and many distinct sequences, it can be
difficult to tell whether it is appropriate to combine multiple events into one
or utilize additional information about event attributes. Existing approaches
often make use of frequent sequential patterns extracted from the dataset;
however, these patterns are limited in terms of interpretability and utility.
In addition, it is difficult to assess the role of absolute and relative time
when using pattern mining techniques.
In this paper, we present methods that address these challenges by
automatically learning composite events, which enables better aggregation of
multiple event sequences. By leveraging event sequence outcomes, we present
appropriate linked visualizations that allow domain experts to identify
critical flows, to assess validity and to understand the role of time.
Furthermore, we explore information gain and visual complexity metrics to
identify the most relevant visual patterns. We compare composite event learning
with two approaches for extracting event patterns using real world company
event data from an ongoing project with the Danish Business Authority.
Comment: In Proceedings of the 3rd IEEE Symposium on Visualization in Data Science (VDS), 201
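The information-gain metric the abstract uses to rank visual patterns can be sketched concretely. The snippet below computes how much a candidate (composite) event reduces uncertainty about a binary sequence outcome; the toy outcomes and event flags are illustrative assumptions, not the paper's data.

```python
# Hedged sketch: information gain of an event with respect to sequence
# outcomes -- the kind of relevance metric used to rank visual patterns.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(outcomes, has_event):
    """Reduction in outcome entropy from splitting on event presence."""
    n = len(outcomes)
    gain = entropy(outcomes)
    for flag in (True, False):
        subset = [o for o, e in zip(outcomes, has_event) if e is flag]
        if subset:
            gain -= len(subset) / n * entropy(subset)
    return gain

# Toy data: the event perfectly separates failing from succeeding sequences
outcomes = ["fail", "fail", "ok", "ok", "fail", "ok"]
has_event = [True, True, False, False, True, False]
print(round(information_gain(outcomes, has_event), 3))  # 1.0
```

Patterns with higher gain are better candidates to surface visually, while a visual complexity penalty (not shown) would trade that relevance off against readability.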
An Overview of the Use of Neural Networks for Data Mining Tasks
In recent years the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research investigation, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.
Statistical modelling to predict corporate default for Brazilian companies in the context of Basel II using a new set of financial ratios
This paper deals with statistical modelling to predict failure of Brazilian companies in the light of the Basel II definition of default using a new set of explanatory variables. A rearrangement in the official format of the Balance Sheet is put forward. From this rearrangement a framework of complementary non-conventional ratios is proposed. Initially, a model using 22 traditional ratios is constructed. Problems associated with multicollinearity were found in this model. Adding a group of 6 non-conventional ratios alongside traditional ratios improves the model substantially. The main findings in this study are: (a) logistic regression performs well in the context of Basel II, yielding a sound model applicable in the decision making process; (b) the complementary list of financial ratios plays a critical role in the model proposed; (c) the variables selected in the model show that when current assets and current liabilities are split into two sub-groups - financial and operational - they are more effective in explaining default than the traditional ratios associated with liquidity; and (d) those variables also indicate that high interest rates in Brazil adversely affect the performance of those companies which have a higher dependency on borrowing
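The multicollinearity problem this abstract reports among traditional ratios is commonly diagnosed with variance inflation factors (VIFs). The sketch below computes VIFs with plain least squares on synthetic ratio-like data; the variable names and the collinear relationship are illustrative assumptions, not the paper's dataset.

```python
# Hedged sketch: variance inflation factors to flag multicollinearity
# among candidate financial ratios (synthetic, illustrative data).
import numpy as np

rng = np.random.default_rng(4)
liquidity = rng.normal(size=200)
quick_ratio = liquidity * 0.95 + rng.normal(scale=0.1, size=200)  # nearly collinear
leverage = rng.normal(size=200)
X = np.column_stack([liquidity, quick_ratio, leverage])

def vif(X, j):
    """VIF of column j: 1 / (1 - R^2) from regressing it on the others."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])  # add intercept
    coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ coef
    r2 = 1 - resid.var() / X[:, j].var()
    return 1.0 / (1.0 - r2)

vifs = [round(vif(X, j), 1) for j in range(X.shape[1])]
print(vifs)  # a VIF above ~10 is a common multicollinearity warning sign
```

Dropping or replacing high-VIF ratios, here the near-duplicate liquidity measures, mirrors the paper's move from overlapping traditional ratios to a complementary non-conventional set.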
A critical assessment of imbalanced class distribution problem: the case of predicting freshmen student attrition
Predicting student attrition is an intriguing yet challenging problem for any academic institution. Class-imbalanced data is common in the field of student retention, mainly because many students enroll while comparatively few drop out. Classification techniques applied to imbalanced datasets can yield deceivingly high prediction accuracy, where the overall predictive accuracy is driven by the majority class at the expense of very poor performance on the crucial minority class. In this study, we compared different data balancing techniques to improve the predictive accuracy for the minority class while maintaining satisfactory overall classification performance. Specifically, we tested three balancing techniques (oversampling, under-sampling, and synthetic minority over-sampling, SMOTE) along with four popular classification methods (logistic regression, decision trees, neural networks, and support vector machines). We used a large and feature-rich institutional student dataset (covering the years 2005 to 2011) to assess the efficacy of the balancing techniques as well as the prediction methods. The results indicated that the support vector machine combined with the SMOTE data-balancing technique achieved the best classification performance, with a 90.24% overall accuracy on the 10-fold holdout sample. All three data-balancing techniques improved the prediction accuracy for the minority class. Applying sensitivity analyses to the developed models, we also identified the most important variables for accurate prediction of student attrition. Application of these models has the potential to accurately predict at-risk students and help reduce student dropout rates.
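The balancing-plus-classifier pattern from this study can be sketched with scikit-learn alone. The snippet below applies simple random oversampling of the minority class (one of the three techniques tested) before an SVM and reports minority-class recall; SMOTE, via the separate imbalanced-learn package, would replace the resample step. Data and parameters are illustrative.

```python
# Hedged sketch: random oversampling of the minority class, then an SVM,
# scored by recall on the minority class (the metric imbalance distorts).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.utils import resample
from sklearn.metrics import recall_score

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=3)

# Oversample the minority class in the training split only -- balancing
# the holdout set would leak and inflate the reported performance.
X_min, y_min = X_tr[y_tr == 1], y_tr[y_tr == 1]
X_up, y_up = resample(X_min, y_min, n_samples=int((y_tr == 0).sum()),
                      random_state=3)
X_bal = np.vstack([X_tr[y_tr == 0], X_up])
y_bal = np.concatenate([y_tr[y_tr == 0], y_up])

clf = SVC(random_state=3).fit(X_bal, y_bal)
minority_recall = recall_score(y_te, clf.predict(X_te))
print(round(minority_recall, 2))
```

Reporting minority-class recall alongside overall accuracy is the point of the study's comparison: a 90% accuracy means little if the rare dropout class is mostly missed.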
Intelligent Financial Fraud Detection Practices: An Investigation
Financial fraud is an issue with far reaching consequences in the finance
industry, government, corporate sectors, and for ordinary consumers. Increasing
dependence on new technologies such as cloud and mobile computing in recent
years has compounded the problem. Traditional methods of detection involve
extensive use of auditing, where a trained individual manually observes reports
or transactions in an attempt to discover fraudulent behaviour. This method is
not only time consuming, expensive and inaccurate, but in the age of big data
it is also impractical. Not surprisingly, financial institutions have turned to
automated processes using statistical and computational methods. This paper
presents a comprehensive investigation on financial fraud detection practices
using such data mining methods, with a particular focus on computational
intelligence-based techniques. A classification of the practices based on key
aspects such as the detection algorithm used, the fraud type investigated, and the success
rate is provided. Issues and challenges associated with current
practices, and potential future directions of research, are also identified.
Comment: Proceedings of the 10th International Conference on Security and Privacy in Communication Networks (SecureComm 2014)