APPLICATION OF RECURSIVE PARTITIONING TO AGRICULTURAL CREDIT SCORING
The Recursive Partitioning Algorithm (RPA) is introduced as a technique for credit scoring analysis that allows direct incorporation of misclassification costs. This study corroborates nonagricultural credit studies indicating that RPA outperforms logistic regression on within-sample observations. However, validation on the more appropriate out-of-sample observations indicates that logistic regression is superior under some conditions. Incorporation of misclassification costs can influence the creditworthiness decision.
Keywords: finance, credit scoring, misclassification, recursive partitioning algorithm, agricultural finance
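The core idea, recursive partitioning with asymmetric misclassification costs compared out-of-sample against a logit model, can be sketched as follows. This is an illustrative reconstruction, not the study's method: the data is synthetic, and costs are encoded via scikit-learn's class weights.

```python
# Hypothetical sketch: cost-sensitive recursive partitioning vs. logistic
# regression, evaluated out-of-sample as the abstract recommends.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for loan data; class 1 plays the role of "default".
X, y = make_classification(n_samples=2000, n_features=8, weights=[0.8, 0.2],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Assume misclassifying a defaulter as creditworthy costs 5x the reverse:
# weight the classes accordingly when growing the tree.
tree = DecisionTreeClassifier(class_weight={0: 1, 1: 5}, max_depth=4,
                              random_state=0).fit(X_tr, y_tr)
logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Out-of-sample comparison on the held-out half.
print("tree  accuracy:", tree.score(X_te, y_te))
print("logit accuracy:", logit.score(X_te, y_te))
```

The cost ratio of 5:1 is an assumption for illustration; as the abstract notes, the chosen costs can themselves change the creditworthiness decision.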
Hybrid model using logit and nonparametric methods for predicting micro-entity failure
Following calls in the bankruptcy literature, a parsimonious hybrid bankruptcy model is developed in this paper
by combining parametric and non-parametric approaches. To this end, the variables with the highest predictive power to
detect bankruptcy are selected using logistic regression (LR). Subsequently, alternative non-parametric methods
(Multilayer Perceptron, Rough Set, and Classification-Regression Trees) are applied, in turn, to firms classified as
either “bankrupt” or “not bankrupt”. Our findings show that hybrid models, particularly those combining LR and
Multilayer Perceptron, offer better accuracy and interpretability and converge faster than each method
implemented in isolation. Moreover, the authors demonstrate that the introduction of non-financial and macroeconomic
variables complements financial ratios for bankruptcy prediction.
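The two-stage hybrid described above can be sketched in a few lines. This is an assumed reconstruction on synthetic data: an L1-penalised logistic regression screens the variables, and a Multilayer Perceptron is then fit on the retained ones.

```python
# Illustrative sketch of the LR -> MLP hybrid (details assumed, data synthetic).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=0)

# Stage 1: keep only predictors with nonzero L1-penalised LR coefficients.
selector = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1))

# Stage 2: fit the non-parametric model on the selected variables.
hybrid = make_pipeline(StandardScaler(), selector,
                       MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                     random_state=0))
hybrid.fit(X, y)
print("in-sample accuracy:", hybrid.score(X, y))
```

The screening stage keeps the final model parsimonious and interpretable, which is the motivation the abstract gives for the hybrid design.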
Improving bankruptcy prediction in micro-entities by using nonlinear effects and non-financial variables
The use of non-parametric methodologies, the introduction of non-financial variables,
and the development of models geared towards the homogeneous characteristics of
corporate sub-populations have recently experienced a surge of interest in the bankruptcy
literature. However, no research on default prediction has yet focused on micro-entities
(MEs), despite such firms’ importance in the global economy. This paper builds the first
bankruptcy model especially designed for MEs by using a wide set of accounts from 1999
to 2008 and applying artificial neural networks (ANNs). Our findings show that ANNs
outperform the traditional logistic regression (LR) models. In addition, we report that
introducing non-financial predictors (firm age, delays in filing accounts, legal action by
creditors to recover unpaid debts, and the company's ownership features) improves performance
over the use of solely financial information by 3.6%, which is even larger than the gain
obtained from using the best ANN (2.6%).
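The comparison the abstract reports, a model on financial ratios alone versus one augmented with non-financial variables, follows this pattern. The sketch below is purely illustrative: the data is synthetic and the "financial" versus "non-financial" split of columns is hypothetical.

```python
# Toy illustration of feature augmentation (not the study's data or results).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1500, n_features=12, n_informative=8,
                           random_state=1)
# Pretend the first 6 columns are financial ratios and the rest are
# non-financial predictors (age, filing delays, creditor actions, ownership).
X_fin, X_all = X[:, :6], X

ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=1)
auc_fin = cross_val_score(ann, X_fin, y, cv=5, scoring="roc_auc").mean()
auc_all = cross_val_score(ann, X_all, y, cv=5, scoring="roc_auc").mean()
print(f"financial only AUC: {auc_fin:.3f}  augmented AUC: {auc_all:.3f}")
```

On real micro-entity data the paper finds the augmented feature set helps more than switching model class; on synthetic data the gap will of course vary.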
Comment: Boosting Algorithms: Regularization, Prediction and Model Fitting
The authors are doing the readers of Statistical Science a true service with
a well-written and up-to-date overview of boosting that originated with the
seminal algorithms of Freund and Schapire. Equally, we are grateful for
high-level software that will permit a larger readership to experiment with, or
simply apply, boosting-inspired model fitting. The authors show us a world of
methodology that illustrates how a fundamental innovation can penetrate every
nook and cranny of statistical thinking and practice. They introduce the reader
to one particular interpretation of boosting and then give a display of its
potential with extensions from classification (where it all started) to least
squares, exponential family models, survival analysis, to base-learners other
than trees such as smoothing splines, to degrees of freedom and regularization,
and to fascinating recent work in model selection. The uninitiated reader will
find that the authors did a nice job of presenting a certain coherent and
useful interpretation of boosting. The other reader, though, who has watched
the business of boosting for a while, may have quibbles with the authors over
details of the historic record and, more importantly, over their optimism about
the current state of theoretical knowledge. In fact, as much as "the
statistical view" has proven fruitful, it has also resulted in some ideas
about why boosting works that may be misconceived, and in some recommendations
that may be misguided. [arXiv:0804.2752]
Comment: Published at http://dx.doi.org/10.1214/07-STS242B in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
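One of the extensions the discussed overview covers, boosting carried from classification to least squares with shrinkage as regularization, can be shown in miniature. This is a generic L2-boosting sketch with tree stumps, not the authors' software; the step size and depth are illustrative choices.

```python
# Minimal L2 (least-squares) boosting: repeatedly fit a stump to the
# residuals and take a small step; shrinkage nu acts as regularization.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=300)

nu, learners = 0.1, []          # shrinkage and the list of base learners
resid = y.copy()
for _ in range(200):
    stump = DecisionTreeRegressor(max_depth=1).fit(X, resid)
    learners.append(stump)
    resid -= nu * stump.predict(X)   # update residuals after each step

pred = nu * sum(s.predict(X) for s in learners)
print("training MSE:", np.mean((y - pred) ** 2))
```

The number of boosting iterations plays the role of a complexity parameter, which is exactly where the degrees-of-freedom and model-selection questions raised in the comment enter.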
Random Forests for Big Data
Big Data is one of the major challenges of statistical science and has
numerous consequences from algorithmic and theoretical viewpoints. Big Data
always involves massive data, but it also often includes online data and data
heterogeneity. Recently, some statistical methods have been adapted to process
Big Data, such as linear regression models, clustering methods, and bootstrapping
schemes. Based on decision trees combined with aggregation and bootstrap ideas,
random forests were introduced by Breiman in 2001. They are a powerful
nonparametric statistical method that handles, in a single and versatile
framework, regression problems as well as two-class and multi-class
classification problems. Focusing on classification problems, this paper
proposes a selective review of available proposals for scaling
random forests to Big Data problems. These proposals rely on parallel
environments or on online adaptations of random forests. We also describe how
related quantities, such as the out-of-bag error and variable importance, are
addressed in these methods. We then formulate various remarks on random
forests in the Big Data context. Finally, we experiment with five variants on two
massive datasets (15 and 120 million observations), one simulated and one drawn
from real-world data. One variant relies on subsampling, while three others are
parallel implementations of random forests and involve either
various adaptations of the bootstrap to Big Data or "divide-and-conquer"
approaches. The fifth variant relies on online learning of random forests.
These numerical experiments highlight the relative performance of the
different variants, as well as some of their limitations.
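The "divide-and-conquer" flavour discussed above can be sketched as: grow a small forest on each data shard, then pool the trees into one ensemble. The pooling below (concatenating the fitted `estimators_` lists) is an informal scikit-learn idiom used here for illustration, not an official API, and the chunking merely simulates data distributed across machines.

```python
# Hedged sketch of divide-and-conquer random forests on chunked data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=6000, n_features=10, random_state=0)
chunks = np.array_split(np.arange(len(y)), 3)   # stand-in for data shards

# Grow one small forest per shard (in practice, in parallel on each machine).
forests = [RandomForestClassifier(n_estimators=20, random_state=0)
           .fit(X[idx], y[idx]) for idx in chunks]

# Pool every tree into the first forest to obtain a single predictor.
pooled = forests[0]
for rf in forests[1:]:
    pooled.estimators_ += rf.estimators_
pooled.n_estimators = len(pooled.estimators_)

print("pooled trees:", pooled.n_estimators)
print("accuracy:", pooled.score(X, y))
```

As the review notes, quantities like the out-of-bag error do not carry over directly under such schemes, since each tree's bootstrap is confined to its own shard.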
Evaluating Misclassification Probability Using Empirical Risk
* This work was supported by RFBR, grant 04-01-00858-a.
The goal of the paper is to estimate the misclassification probability of a decision function from a training
sample. We present results of an investigation of the empirical risk bias for nearest-neighbour, linear, and
decision tree classifiers, compared with exact bias estimates for the discrete (multinomial) case. This allows one to
determine how far off the Vapnik–Chervonenkis risk estimates are for the considered classes of decision functions and to
choose optimal complexity parameters for the constructed decision functions. A comparison of the capacities of linear
classifiers and decision trees is also performed.
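The quantity studied here, the bias of the empirical risk, is the gap between training error and true error. A simple Monte Carlo stand-in (assumed setup, synthetic data, held-out error as a proxy for the true risk) for the three classifier families compared in the abstract:

```python
# Sketch: empirical-risk bias (training error minus held-out error) for
# nearest-neighbour, linear, and decision tree classifiers.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for name, clf in [("1-NN", KNeighborsClassifier(n_neighbors=1)),
                  ("linear", LogisticRegression(max_iter=1000)),
                  ("tree", DecisionTreeClassifier(random_state=0))]:
    clf.fit(X_tr, y_tr)
    # Negative bias = optimism: training error understates the true risk.
    bias = (1 - clf.score(X_tr, y_tr)) - (1 - clf.score(X_te, y_te))
    print(f"{name}: empirical-risk bias = {bias:.3f}")
```

High-capacity rules such as 1-NN and unpruned trees drive the training error toward zero, so their bias is the most optimistic; this is the gap that Vapnik-Chervonenkis bounds attempt to control and that the paper measures exactly in the multinomial case.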