21,771 research outputs found

    A survey of cost-sensitive decision tree induction algorithms

    The past decade has seen significant interest in the problem of inducing decision trees that take account of costs of misclassification and costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, and provides a useful taxonomy and a historical timeline of how the field has developed. It should serve as a useful reference point for future research in this field.

    CSNL: A cost-sensitive non-linear decision tree algorithm

    This article presents a new decision tree learning algorithm called CSNL that induces Cost-Sensitive Non-Linear decision trees. The algorithm is based on the hypothesis that nonlinear decision nodes provide a better basis than axis-parallel decision nodes and utilizes discriminant analysis to construct nonlinear decision trees that take account of costs of misclassification. The performance of the algorithm is evaluated by applying it to seventeen datasets and the results are compared with those obtained by two well-known cost-sensitive algorithms, ICET and MetaCost, which generate multiple trees to obtain some of the best results to date. The results show that CSNL performs at least as well as, if not better than, these algorithms in more than twelve of the datasets and is considerably faster. The use of bagging with CSNL further enhances its performance, showing the significant benefits of using nonlinear decision nodes.

    Inducing safer oblique trees without costs

    Decision tree induction has been widely studied and applied. In safety applications, such as determining whether a chemical process is safe or whether a person has a medical condition, the cost of misclassification in one of the classes is significantly higher than in the other class. Several authors have tackled this problem by developing cost-sensitive decision tree learning algorithms or have suggested ways of changing the distribution of training examples to bias the decision tree learning process so as to take account of costs. A prerequisite for applying such algorithms is the availability of costs of misclassification. Although this may be possible for some applications, obtaining reasonable estimates of costs of misclassification is not easy in the area of safety. This paper presents a new algorithm for applications where the cost of misclassifications cannot be quantified, although the cost of misclassification in one class is known to be significantly higher than in another class. The algorithm utilizes linear discriminant analysis to identify oblique relationships between continuous attributes and then carries out an appropriate modification to ensure that the resulting tree errs on the side of safety. The algorithm is evaluated with respect to one of the best-known cost-sensitive algorithms (ICET), a well-known oblique decision tree algorithm (OC1) and an algorithm that utilizes robust linear programming.

    Improving bankruptcy prediction in micro-entities by using nonlinear effects and non-financial variables

    The use of non-parametric methodologies, the introduction of non-financial variables, and the development of models geared towards the homogeneous characteristics of corporate sub-populations have recently experienced a surge of interest in the bankruptcy literature. However, no research on default prediction has yet focused on micro-entities (MEs), despite such firms’ importance in the global economy. This paper builds the first bankruptcy model especially designed for MEs by using a wide set of accounts from 1999 to 2008 and applying artificial neural networks (ANNs). Our findings show that ANNs outperform the traditional logistic regression (LR) models. We also report that, thanks to the introduction of non-financial predictors related to age, the delay in filing accounts, legal action by creditors to recover unpaid debts, and the ownership features of the company, the improvement with respect to the use of solely financial information is 3.6%, which is even higher than the improvement that involves the use of the best ANN (2.6%).

    Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm

    This paper introduces ICET, a new algorithm for cost-sensitive classification. ICET uses a genetic algorithm to evolve a population of biases for a decision tree induction algorithm. The fitness function of the genetic algorithm is the average cost of classification when using the decision tree, including both the costs of tests (features, measurements) and the costs of classification errors. ICET is compared here with three other algorithms for cost-sensitive classification (EG2, CS-ID3, and IDX) and also with C4.5, which classifies without regard to cost. The five algorithms are evaluated empirically on five real-world medical datasets. Three sets of experiments are performed. The first set examines the baseline performance of the five algorithms on the five datasets and establishes that ICET performs significantly better than its competitors. The second set tests the robustness of ICET under a variety of conditions and shows that ICET maintains its advantage. The third set looks at ICET's search in bias space and discovers a way to improve the search. Comment: See http://www.jair.org/ for any accompanying file.
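
    The fitness measure described in this abstract, the average cost of classification including both test costs and error costs, can be illustrated with a minimal sketch. This is not ICET itself: the function `average_cost`, the hand-written `toy_tree`, the test names, and all cost values are hypothetical, and charging each test at most once per case is an assumption made here for simplicity.

```python
from typing import Callable, Dict, List, Tuple

Example = Dict[str, float]
# A classifier returns its predicted label plus the names of the tests it performed.
Classifier = Callable[[Example], Tuple[str, List[str]]]


def average_cost(
    classify: Classifier,
    cases: List[Tuple[Example, str]],
    test_costs: Dict[str, float],
    error_costs: Dict[Tuple[str, str], float],
) -> float:
    """Mean cost per case: cost of the tests used plus cost of any error."""
    total = 0.0
    for features, actual in cases:
        predicted, tests_used = classify(features)
        # Assumption: a test is paid for once per case, however often it is consulted.
        total += sum(test_costs[name] for name in set(tests_used))
        if predicted != actual:
            # error_costs is keyed by (actual label, predicted label).
            total += error_costs[(actual, predicted)]
    return total / len(cases)


def toy_tree(x: Example) -> Tuple[str, List[str]]:
    """Hypothetical two-level tree: cheap test first, expensive test only if needed."""
    tests = ["temperature"]
    if x["temperature"] <= 38.0:
        return "healthy", tests
    tests.append("blood_test")
    return ("sick" if x["blood_test"] > 0.5 else "healthy"), tests


# Illustrative costs only; they are not taken from the paper.
TEST_COSTS = {"temperature": 1.0, "blood_test": 5.0}
ERROR_COSTS = {("sick", "healthy"): 50.0, ("healthy", "sick"): 10.0}
CASES = [
    ({"temperature": 39.0, "blood_test": 0.9}, "sick"),
    ({"temperature": 37.0, "blood_test": 0.8}, "sick"),
    ({"temperature": 36.5, "blood_test": 0.1}, "healthy"),
    ({"temperature": 38.5, "blood_test": 0.2}, "healthy"),
]

print(average_cost(toy_tree, CASES, TEST_COSTS, ERROR_COSTS))  # prints 16.0
```

    In this toy run the single missed "sick" case accounts for 50 of the 64 total cost units, which shows why a cost-based measure can rank trees differently from plain accuracy.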

    The Affordable Care Act raises the stakes on worker classification; what does this mean for the Voluntary Classification Settlement Program

    This research considers worker classification and the many implications an employer must consider when classifying a worker as employee or independent contractor. One implication relates to healthcare benefits and healthcare taxes. As such, this research will evaluate the new healthcare taxes and implications resulting from the Affordable Care Act. Furthermore, this research will relate and explain worker classification with regard to the Voluntary Classification Settlement Program. This is a program offered by the Internal Revenue Service allowing employers to prospectively classify workers as employees with tax relief for past misclassification. The healthcare implications from the Affordable Care Act have raised the stakes on worker classification. This research will confirm whether this will provide greater incentive for employers to classify workers as employees or independent contractors.

    Tax and Policy Implications of Changes to Reporting Requirements for Construction Services

    [Excerpt] New York and other states could increase revenue and improve their tax systems by requiring information reporting for all payments by businesses for construction services, utilizing a form similar to the Federal form 1099. The state could also advance other important policy goals including an increase in the fairness of the tax system and a reduction of the misclassification of workers as independent contractors.