3,330 research outputs found

    Bankruptcy prediction for credit risk using neural networks: A survey and new results

    The prediction of corporate bankruptcies is an important and widely studied topic since it can have a significant impact on bank lending decisions and profitability. This work presents two contributions. First, we review the topic of bankruptcy prediction, with emphasis on neural-network (NN) models. Second, we develop an NN bankruptcy prediction model. Inspired by one of the traditional credit risk models developed by Merton (1974), we propose novel indicators for the NN system. We show that the use of these indicators in addition to traditional financial ratio indicators provides a significant improvement in the (out-of-sample) prediction accuracy (from 81.46% to 85.5% for a three-year-ahead forecast).
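
    The abstract does not spell out the Merton-inspired indicators. As a hedged illustration, assuming they resemble the standard distance-to-default derived from the Merton (1974) structural model, the core computation is short; the firm values below are made-up placeholders.

        import math

        def distance_to_default(asset_value, debt_face_value, drift, sigma, horizon=1.0):
            """Merton-style distance to default: how many asset-volatility standard
            deviations the expected log asset value lies above the default point
            (the face value of debt) at the given horizon."""
            numerator = math.log(asset_value / debt_face_value) + (drift - 0.5 * sigma ** 2) * horizon
            return numerator / (sigma * math.sqrt(horizon))

        # Hypothetical firm: assets 120, debt 100, 8% drift, 25% asset volatility, 1-year horizon.
        print(distance_to_default(120.0, 100.0, 0.08, 0.25))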

    Three essays on the use of neural networks for financial prediction

    The number of studies trying to explain the causes and consequences of economic and financial crises usually rises considerably after a banking crisis occurs. The dramatic effects of the most recent financial crisis on the real economy around the world call for a better understanding of previous crises as a way to anticipate future crisis episodes. It is precisely this objective, preventing future crises, that is the main motivation of this PhD dissertation. We identify two important mechanisms that have failed in recent years and that are closely related to the onset of the financial crisis: the assessment of the solvency of banks, together with systemic risk over time, and the detection of macroeconomic imbalances in some countries, especially in Europe, which made the financial crisis evolve into a sovereign crisis. The dissertation is made up of three essays that try to take a step forward in the knowledge of these mechanisms.

    Trustworthiness and metrics in visualizing similarity of gene expression

    BACKGROUND: Conventionally, the first step in analyzing the large and high-dimensional data sets measured by microarrays is visual exploration. Dendrograms of hierarchical clustering, self-organizing maps (SOMs), and multidimensional scaling have been used to visualize similarity relationships of data samples. We address two central properties of the methods: (i) Are the visualizations trustworthy, i.e., if two samples are visualized to be similar, are they really similar? (ii) The metric. The measure of similarity determines the result; we propose using a new learning metrics principle to derive a metric from interrelationships among data sets. RESULTS: The trustworthiness of hierarchical clustering, multidimensional scaling, and the self-organizing map were compared in visualizing similarity relationships among gene expression profiles. The self-organizing map was the best, except that hierarchical clustering was the most trustworthy for the most similar profiles. Trustworthiness can be further increased by treating separately those genes for which the visualization is least trustworthy. We then proceed to improve the metric. The distance measure between the expression profiles is adjusted to measure differences relevant to functional classes of the genes. The genes for which the new metric is the most different from the usual correlation metric are listed and visualized with one of the visualization methods, the self-organizing map, computed in the new metric. CONCLUSIONS: The conjecture from the methodological results is that the self-organizing map can be recommended to complement the usual hierarchical clustering for visualizing and exploring gene expression data. Discarding the least trustworthy samples and improving the metric improve the result further.
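
    The trustworthiness measure itself is not defined in the abstract. A minimal sketch, assuming the common rank-based formulation (points that enter a sample's visualized k-neighborhood without being among its k nearest neighbors in the original space are penalized by how far down the original ranking they sit), might look as follows; the function name and the brute-force distance computation are illustrative.

        import numpy as np

        def trustworthiness(X_orig, X_vis, k=5):
            """Rank-based trustworthiness of a visualization: penalizes samples that
            appear in a point's k-neighborhood in the visualization but are not among
            its k nearest neighbors in the original data space."""
            n = X_orig.shape[0]
            # Pairwise squared Euclidean distances in both spaces; exclude self-distances.
            d_orig = ((X_orig[:, None, :] - X_orig[None, :, :]) ** 2).sum(-1)
            d_vis = ((X_vis[:, None, :] - X_vis[None, :, :]) ** 2).sum(-1)
            np.fill_diagonal(d_orig, np.inf)
            np.fill_diagonal(d_vis, np.inf)
            # ranks_orig[i, j] = rank of j among i's neighbors in the original space (1 = nearest).
            ranks_orig = d_orig.argsort(axis=1).argsort(axis=1) + 1
            penalty = 0.0
            for i in range(n):
                vis_neighbors = np.argsort(d_vis[i])[:k]
                for j in vis_neighbors:
                    if ranks_orig[i, j] > k:  # j intrudes into the visualized neighborhood
                        penalty += ranks_orig[i, j] - k
            return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))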

    Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Although enormous amounts of data are now generated every second, there are situations in which the target category is represented extremely unequally, giving rise to imbalanced datasets; analyzing them correctly can lead to relevant decisions and appropriate business strategies. Fraud modeling is one example of this situation: fraudulent transactions are expected to be far less frequent than reliable ones, and predicting them can be crucial for improving a company's decisions and processes. However, class imbalance has a negative effect on traditional classification techniques. Many techniques have been proposed to deal with this problem, and oversampling is one of them. This work analyzes the behavior of different oversampling techniques, such as random oversampling, SOMO, and SMOTE, across different classifiers and evaluation metrics. The exercise is carried out with real data from an insurance company in Colombia, predicting fraudulent claims for its compulsory auto product. The conclusions of this research demonstrate the advantages of using oversampling in imbalanced settings, but also the importance of comparing different evaluation metrics and classifiers to obtain appropriate conclusions and comparable results.
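
    As a hedged sketch of the kind of comparison described here, the snippet below contrasts no resampling, random oversampling, and SMOTE on a synthetic imbalanced problem using the imbalanced-learn package. SOMO is distributed in separate third-party packages and is omitted, and the dataset, classifier, and metric choices are illustrative rather than those of the dissertation.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import f1_score
        from sklearn.model_selection import train_test_split
        from imblearn.over_sampling import RandomOverSampler, SMOTE

        # Synthetic placeholder data: roughly 3% positive (fraud-like) class.
        X, y = make_classification(n_samples=5000, weights=[0.97, 0.03], random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        for name, sampler in [("none", None),
                              ("random", RandomOverSampler(random_state=0)),
                              ("smote", SMOTE(random_state=0))]:
            X_res, y_res = (X_tr, y_tr) if sampler is None else sampler.fit_resample(X_tr, y_tr)
            clf = RandomForestClassifier(random_state=0).fit(X_res, y_res)
            print(name, f1_score(y_te, clf.predict(X_te)))  # F1 on the minority (fraud) class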

    Enhanced default risk models with SVM+

    Default risk models have lately attracted great interest due to the recent world economic crisis. In spite of the many advanced techniques that have been proposed, no comprehensive method incorporating a holistic perspective has hitherto been considered. Thus, the existing models for bankruptcy prediction lack full coverage of contextual knowledge, which may prevent decision makers such as investors and financial analysts from taking the right decisions. The recently proposed SVM+ provides a formal way to incorporate additional information (not only training data) into the learning model, improving generalization. In financial settings, examples of such non-financial (though relevant) information are marketing reports, the competitor landscape, the economic environment, customer screening, industry trends, etc. By exploiting additional information to improve classical inductive learning, we propose a prediction model in which the data are naturally separated into several structured groups clustered by the size and annual turnover of the firms. Experimental results on a heterogeneous data set of French companies demonstrate that the proposed default risk model shows better predictive performance than the baseline SVM and multi-task learning with SVM.
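
    The abstract does not reproduce the SVM+ formulation. Assuming the standard SVM+ (learning using privileged information) objective, the idea is roughly the following, where x_i are the ordinary features (e.g., financial ratios), x_i* the privileged contextual or group information available only during training, and the correcting function plays the role of the standard SVM's slack variables:

        \min_{w,\,b,\,w^{*},\,b^{*}} \; \tfrac{1}{2}\lVert w\rVert^{2} + \tfrac{\gamma}{2}\lVert w^{*}\rVert^{2} + C \sum_{i=1}^{n} \bigl(\langle w^{*}, x_i^{*}\rangle + b^{*}\bigr)
        \text{subject to } \; y_i\bigl(\langle w, x_i\rangle + b\bigr) \ge 1 - \bigl(\langle w^{*}, x_i^{*}\rangle + b^{*}\bigr), \quad \langle w^{*}, x_i^{*}\rangle + b^{*} \ge 0, \quad i = 1, \dots, n.

    At test time only the ordinary features x are needed, which is what makes the extra information usable even though it is unavailable for new firms.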

    Learning metrics and discriminative clustering

    In this work methods have been developed to extract relevant information from large, multivariate data sets in a flexible, nonlinear way. The techniques are applicable especially at the initial, explorative phase of data analysis, in cases where an explicit indicator of relevance is available as part of the data set. The unsupervised learning methods popular in data exploration often rely on a distance measure defined for data items. Selection of the distance measure, part of which is feature selection, is therefore fundamentally important. The learning metrics principle is introduced to complement manual feature selection by enabling automatic modification of a distance measure on the basis of available relevance information. Two applications of the principle are developed. The first emphasizes relevant aspects of the data by directly modifying distances between data items, and is usable, for example, in information visualization with self-organizing maps. The other method, discriminative clustering, finds clusters that are internally homogeneous with respect to the interesting variation of the data. The techniques have been applied to text document analysis, gene expression clustering, and charting the bankruptcy sensitivity of companies. In the first, more straightforward approach, a new local metric of the data space measures changes in the conditional distribution of the relevance-indicating data by the Fisher information matrix, a local approximation of the Kullback-Leibler distance. Discriminative clustering, on the other hand, directly minimizes a Kullback-Leibler based distortion measure within the clusters, or equivalently maximizes the mutual information between the clusters and the relevance indicator. A finite-data algorithm for discriminative clustering is also presented. It maximizes a partially marginalized posterior probability of the model and is asymptotically equivalent to maximizing mutual information.
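
    As a hedged rendering of the local metric sketched in the abstract (constants may be absorbed into the metric, and the thesis's notation may differ), the squared learning-metric distance between nearby points is obtained from the change in the conditional distribution of the relevance variable c:

        d_L^{2}(x, x + dx) \;=\; D_{\mathrm{KL}}\bigl(p(c \mid x)\,\big\Vert\, p(c \mid x + dx)\bigr) \;\approx\; \tfrac{1}{2}\, dx^{\top} J(x)\, dx,
        \qquad
        J(x) \;=\; \mathbb{E}_{p(c \mid x)}\!\bigl[\nabla_{x} \log p(c \mid x)\, \nabla_{x} \log p(c \mid x)^{\top}\bigr],

    where J(x) is the Fisher information matrix of the conditional distribution, matching the abstract's description of the Fisher information as a local approximation of the Kullback-Leibler distance.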

    Predicting financial distress using metaheuristic models

    Investors need to assess and analyze financial statements in order to make sound decisions, and using financial ratios is one of the most common methods for doing so. The main purpose of this research is to predict financial crises using liquidity ratios, considered over the period 2011–2015. Four models are compared: support vector machine, back-propagation neural network, decision tree, and Adaptive Neuro-Fuzzy Inference System (ANFIS). The research method is both qualitative and quantitative, of the causal-comparative type. The results show that the neural network, decision tree, and ANFIS models are significant at the 0.000 and 0.005 levels, which is stronger than the support vector machine result; the support vector machine is significant at the 0.001 level. The neural network is able to predict correctly two years before bankruptcy. All four models are therefore statistically significant, with no substantial differences among them: all have the precision needed to predict the financial crisis.
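
    A hedged sketch of the kind of comparison the study describes is shown below: a cross-validated evaluation of three of the four models on liquidity-ratio features using scikit-learn. ANFIS has no standard scikit-learn implementation and is omitted, and the features and labels are synthetic placeholders rather than the study's data.

        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.neural_network import MLPClassifier
        from sklearn.svm import SVC
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(0)
        X = rng.normal(size=(300, 4))  # e.g. current, quick, cash and working-capital ratios (placeholders)
        y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) < 0).astype(int)  # 1 = distressed

        models = {"svm": SVC(),
                  "neural_net": MLPClassifier(max_iter=2000),
                  "decision_tree": DecisionTreeClassifier()}
        for name, model in models.items():
            scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
            print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")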

    Prediction of Banks Financial Distress

    In this research we conduct a comprehensive review of the existing literature on prediction techniques that have been used to assist in predicting bank distress. We categorize the reviewed works into groups according to the prediction technique used. The categorization starts from the time factor of the surveyed literature: we label the work published in the period 1990-2010 as the history of prediction techniques and the work from then until 2013 as recent prediction techniques, and we present the strengths and weaknesses of both. We find that no single type of technique fits all bank distress problems, although intelligent hybrid techniques are considered the strongest candidates in terms of accuracy and reputation.

    Measuring, Monitoring and Managing Legal Complexity

    The American legal system is often accused of being “too complex.” For example, most Americans believe the Tax Code is too complex. But what does that mean, and how would one prove the Tax Code is too complex? Both the descriptive claim that an element of law is complex and the normative claim that it is too complex should be empirically testable hypotheses. Yet, in fact, very little is known about how to measure legal complexity, much less how to monitor and manage it. Legal scholars have begun to employ the science of complex adaptive systems, also known as complexity science, to probe these kinds of descriptive and normative questions about the legal system. This body of work has focused primarily on developing theories of legal complexity and positing reasons for, and ways of, managing it. Legal scholars thus have skipped the hard part—developing quantitative metrics and methods for measuring and monitoring law’s complexity. But the theory of legal complexity will remain stuck in theory until it moves to the empirical phase of study. Thinking about ways of managing legal complexity is pointless if there is no yardstick for deciding how complex the law should be. In short, the theory of legal complexity cannot be put to work without more robust empirical tools for identifying and tracking complexity in legal systems. This Article explores legal complexity at a depth not previously undertaken in legal scholarship. First, the Article orients the discussion by briefly reviewing complexity science scholarship to develop descriptive, prescriptive, and ethical theories of legal complexity. The Article then shifts to the empirical front, identifying potentially useful metrics and methods for studying legal complexity. It draws from complexity science to develop methods that have been or might be applied to measure different features of legal complexity. Next, the Article proposes methods for monitoring legal complexity over time, in particular by conceptualizing what we call Legal Maps—a multi-layered, active representation of the legal system network at work. Finally, the Article concludes with a preliminary examination of how the measurement and monitoring techniques could inform interventions designed to manage legal complexity by using currently available machine learning and user interface design technologies
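
    The abstract does not commit to particular metrics. As one hedged illustration of the kind of network measurement complexity science suggests, a body of law can be treated as a cross-reference graph of sections and summarized with simple structural statistics; the sections and citations below are hypothetical placeholders, not real Tax Code structure.

        import networkx as nx

        G = nx.DiGraph()
        G.add_edges_from([
            ("sec_61", "sec_1"), ("sec_62", "sec_61"), ("sec_63", "sec_62"),
            ("sec_151", "sec_63"), ("sec_152", "sec_151"), ("sec_1", "sec_63"),
        ])  # edge A -> B means section A cites section B (illustrative only)

        print("sections:", G.number_of_nodes())
        print("cross-references:", G.number_of_edges())
        print("average citations per section:", G.number_of_edges() / G.number_of_nodes())
        # Centrality as a rough proxy for how much other law depends on a section.
        print("in-degree centrality:", nx.in_degree_centrality(G))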