A survey of cost-sensitive decision tree induction algorithms
The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, or utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, and provides a useful taxonomy, a historical timeline of how the field has developed, and a reference point for future research in this field.
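To make the idea of misclassification costs concrete, here is a minimal sketch of one common ingredient of cost-sensitive trees: labelling a leaf with the class that minimizes expected cost rather than with the majority class. It is not taken from the survey; the function name `min_cost_label`, the class counts, and the cost matrix are illustrative assumptions.

```python
def min_cost_label(class_counts, cost):
    """Return (label, expected_cost) for the cheapest leaf label.

    class_counts[i] is the number of training examples of class i in the leaf.
    cost[i][j] is the (assumed) cost of predicting class j when the truth is i.
    """
    total = sum(class_counts)
    n_classes = len(class_counts)
    best_label, best_cost = None, float("inf")
    for j in range(n_classes):
        # Average cost per example if every example in the leaf is labelled j.
        expected = sum(class_counts[i] * cost[i][j] for i in range(n_classes)) / total
        if expected < best_cost:
            best_label, best_cost = j, expected
    return best_label, best_cost


# Hypothetical leaf: 8 examples of class 0, 2 of class 1.
counts = [8, 2]
# Hypothetical costs: missing a class-1 case is ten times worse than a false alarm.
cost = [[0, 1],
        [10, 0]]
label, expected = min_cost_label(counts, cost)
```

With these assumed numbers the majority class is 0, but the cheaper label is 1, which shows how a cost matrix can change a tree's predictions without changing its structure.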
A System for Induction of Oblique Decision Trees
This article describes a new system for induction of oblique decision trees.
This system, OC1, combines deterministic hill-climbing with two forms of
randomization to find a good oblique split (in the form of a hyperplane) at
each node of a decision tree. Oblique decision tree methods are tuned
especially for domains in which the attributes are numeric, although they can
be adapted to symbolic or mixed symbolic/numeric attributes. We present
extensive empirical studies, using both real and artificial data, that analyze
OC1's ability to construct oblique trees that are smaller and more accurate
than their axis-parallel counterparts. We also examine the benefits of
randomization for the construction of oblique decision trees.
Comment: See http://www.jair.org/ for an online appendix and other files
accompanying this article.
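The difference between an axis-parallel split and an oblique (hyperplane) split can be sketched as follows. This is only an illustration of how a candidate hyperplane might be scored with Gini impurity; it does not reproduce OC1's hill-climbing or randomization, and the data points, weights, and function names are assumptions made for the example.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def split_gini(points, labels, weights, bias):
    """Weighted Gini impurity of the split defined by weights . x + bias > 0."""
    left, right = [], []
    for point, label in zip(points, labels):
        side = sum(w * v for w, v in zip(weights, point)) + bias
        (right if side > 0 else left).append(label)
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical 2-D data whose classes are separated by the line x + y = 1.
points = [(0.2, 0.2), (0.9, 0.9), (0.1, 0.8), (0.8, 0.1), (0.3, 0.9), (0.9, 0.3)]
labels = [0, 1, 0, 0, 1, 1]

oblique = split_gini(points, labels, weights=(1.0, 1.0), bias=-1.0)    # x + y > 1
axis_parallel = split_gini(points, labels, weights=(1.0, 0.0), bias=-0.5)  # x > 0.5
```

On this toy data the oblique split separates the classes perfectly, while the axis-parallel split on the first attribute leaves both sides impure; a tree restricted to axis-parallel tests would need several nodes to express the same boundary.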
Rule-based Machine Learning Methods for Functional Prediction
We describe a machine learning method for predicting the value of a
real-valued function, given the values of multiple input variables. The method
induces solutions from samples in the form of ordered disjunctive normal form
(DNF) decision rules. A central objective of the method and representation is
the induction of compact, easily interpretable solutions. This rule-based
decision model can be extended to search efficiently for similar cases prior to
approximating function values. Experimental results on real-world data
demonstrate that the new techniques are competitive with existing machine
learning and statistical methods and can sometimes yield superior regression
performance.
Comment: See http://www.jair.org/ for any accompanying files.
Inducing safer oblique trees without costs
Decision tree induction has been widely studied and applied. In safety applications, such as determining whether a chemical process is safe or whether a person has a medical condition, the cost of misclassification in one of the classes is significantly higher than in the other class. Several authors have tackled this problem by developing cost-sensitive decision tree learning algorithms or have suggested ways of changing the
distribution of training examples to bias the decision tree learning process so as to take account of costs. A prerequisite for applying such algorithms is the availability of costs of misclassification.
Although this may be possible for some applications, obtaining reasonable estimates of costs of misclassification is not easy in the area of safety.
This paper presents a new algorithm for applications where the cost of misclassifications cannot be quantified, although the cost of misclassification in one class is known to be significantly higher than in another class. The algorithm utilizes linear discriminant analysis to identify oblique relationships between continuous attributes and then carries out an appropriate modification to ensure that the resulting tree errs on the side of safety. The algorithm is evaluated with respect to one of the best-known cost-sensitive algorithms (ICET), a well-known oblique decision tree algorithm (OC1), and an algorithm that utilizes robust linear programming.
CSNL: A cost-sensitive non-linear decision tree algorithm
This article presents a new decision tree learning algorithm called CSNL that induces Cost-Sensitive Non-Linear decision trees. The algorithm is based on the hypothesis that nonlinear decision nodes provide a better basis than axis-parallel decision nodes and utilizes discriminant analysis to construct nonlinear decision trees that take account of costs of misclassification.
The performance of the algorithm is evaluated by applying it to seventeen datasets, and the results are compared with those obtained by two well-known cost-sensitive algorithms, ICET and MetaCost, which generate multiple trees to obtain some of the best results to date. The results show that CSNL performs at least as well as, if not better than, these algorithms in more than twelve of the datasets and is considerably faster. The use of bagging with CSNL further enhances its performance, showing the significant benefits of using nonlinear decision nodes.
Local Rule-Based Explanations of Black Box Decision Systems
Recent years have witnessed the rise of accurate but obscure decision
systems that hide the logic of their internal decision processes from users.
The lack of explanations for the decisions of black box systems is a key
ethical issue, and a limitation to the adoption of machine learning components
in socially sensitive and safety-critical contexts.
In this paper we focus on the problem of black box outcome explanation, i.e.,
explaining the reasons for the decision taken on a specific instance. We propose
LORE, an agnostic method able to provide interpretable and faithful
explanations. LORE first learns a local interpretable predictor on a synthetic
neighborhood generated by a genetic algorithm. Then it derives from the logic
of the local interpretable predictor a meaningful explanation consisting of: a
decision rule, which explains the reasons for the decision; and a set of
counterfactual rules, suggesting the changes in the instance's features that
lead to a different outcome. Extensive experiments show that LORE outperforms
existing methods and baselines both in the quality of explanations and in the
accuracy in mimicking the black box.
On The Stability of Interpretable Models
Interpretable classification models are built with the purpose of providing a
comprehensible description of the decision logic to an external oversight
agent. When considered in isolation, a decision tree, a set of classification
rules, or a linear model, are widely recognized as human-interpretable.
However, such models are generated as part of a larger analytical process. Bias
in data collection and preparation, or in model construction, may severely
affect the accountability of the design process. We conduct an experimental
study of the stability of interpretable models with respect to feature
selection, instance selection, and model selection. Our conclusions should
raise the scientific community's awareness of the need for a
stability impact assessment of interpretable models.
Fitting Prediction Rule Ensembles with R Package pre
Prediction rule ensembles (PREs) are sparse collections of rules, offering
highly interpretable regression and classification models. This paper presents
the R package pre, which derives PREs through the methodology of Friedman and
Popescu (2008). The implementation and functionality of package pre are
described and illustrated through application to a dataset on the prediction of
depression. Furthermore, the accuracy and sparsity of PREs are compared with those of
single trees, random forest and lasso regression in four benchmark datasets.
Results indicate that pre derives ensembles with predictive accuracy comparable
to that of random forests, while using a smaller number of variables for
prediction.
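As a rough illustration of what a prediction rule ensemble computes, the sketch below evaluates a model of the general form used by Friedman and Popescu: an intercept plus a weighted sum of binary rules. It does not use or imitate the R package pre; the rules, coefficients, and variable names `a` and `b` are hypothetical values chosen only for the example.

```python
def pre_predict(x, intercept, rules):
    """Predict with a rule ensemble: intercept plus coefficients of rules that fire.

    rules is a list of (condition, coefficient) pairs, where condition is a
    function mapping an observation (a dict) to True or False.
    """
    return intercept + sum(coef for condition, coef in rules if condition(x))


# Hypothetical ensemble with two rules over made-up variables "a" and "b".
rules = [
    (lambda x: x["a"] > 3 and x["b"] <= 1, 0.5),   # rule 1 raises the prediction
    (lambda x: x["b"] > 2, -0.2),                  # rule 2 lowers it
]
intercept = 1.0

first = pre_predict({"a": 4, "b": 1}, intercept, rules)   # only rule 1 fires
second = pre_predict({"a": 0, "b": 3}, intercept, rules)  # only rule 2 fires
third = pre_predict({"a": 0, "b": 0}, intercept, rules)   # no rule fires
```

Because each rule is a short conjunction of conditions with a single coefficient, a sparse ensemble of this kind can be read rule by rule, which is the source of the interpretability the abstract refers to.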
Can k-NN imputation improve the performance of C4.5 with small software project data sets? A comparative evaluation
Missing data is a widespread problem that can affect the ability to use data to construct effective prediction systems. We investigate a common machine learning technique that can tolerate missing values, namely C4.5, to predict cost using six real-world software project databases. We analyze the predictive performance after using the k-NN missing data imputation technique to see whether it is better to tolerate missing data or to impute missing values and then apply the C4.5 algorithm. For the investigation, we simulated three missingness mechanisms, three missing data patterns, and five missing data percentages. We found that k-NN imputation can improve the prediction accuracy of C4.5. At the same time, both C4.5 and k-NN are little affected by the missingness mechanism, but the missing data pattern and the missing data percentage have a strong negative impact upon prediction (or imputation) accuracy, particularly if the missing data percentage exceeds 40%.
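The general idea of k-NN imputation can be sketched as below: a missing value is replaced by the mean of that feature over the k most similar rows, with similarity measured on the features both rows have observed. This is a generic sketch, not the exact procedure evaluated in the paper; the function name `knn_impute`, the distance definition, and the sample data are assumptions.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows with None entries filled by a k-NN mean.

    Distance between two rows is the root mean squared difference over the
    features that are observed (not None) in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # nothing to compare, treat as maximally distant
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have feature j observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Hypothetical data: the second row is missing its second feature.
data = [[1.0, 2.0], [1.1, None], [0.9, 1.8], [5.0, 9.0]]
filled = knn_impute(data, k=2)
```

Here the missing value is filled from the two rows closest on the first feature, so the distant row `[5.0, 9.0]` does not influence the result. The completed table could then be passed to any learner, which is the "impute, then apply C4.5" arm of the comparison described above.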