CSNL: A cost-sensitive non-linear decision tree algorithm
This article presents a new decision tree learning algorithm called CSNL that induces Cost-Sensitive Non-Linear decision trees. The algorithm is based on the hypothesis that nonlinear decision nodes provide a better basis than axis-parallel decision nodes and utilizes discriminant analysis to construct nonlinear decision trees that take account of costs of misclassification.
The performance of the algorithm is evaluated by applying it to seventeen datasets, and the results are compared with those obtained by two well-known cost-sensitive algorithms, ICET and MetaCost, which generate multiple trees to obtain some of the best results to date. The results show that CSNL performs at least as well as, if not better than, these algorithms on more than twelve of the datasets and is considerably faster. Using bagging with CSNL further enhances its performance, showing the significant benefit of nonlinear decision nodes.
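To make the idea of a cost-sensitive nonlinear decision node concrete, here is a minimal Python sketch. It assumes a two-class problem, fits per-class Gaussian statistics as in quadratic discriminant analysis, and reweights the class priors by misclassification costs; the function names, cost-weighting scheme, and regularization constant are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch (not the paper's exact algorithm): a single
# cost-sensitive nonlinear decision node. Discriminant analysis yields a
# quadratic boundary, and misclassification costs reweight the class
# priors so the split is biased away from the costlier error.
import numpy as np

def quadratic_discriminant_split(X, y, cost_0=1.0, cost_1=5.0):
    """Fit a two-class quadratic discriminant whose priors are scaled by
    the cost of mislabeling each class. Returns a function mapping
    samples to {0, 1} branch labels."""
    params = {}
    costs = {0: cost_0, 1: cost_1}  # assumed cost of mislabeling each class
    for c in (0, 1):
        Xc = X[y == c]
        prior = len(Xc) / len(X) * costs[c]  # cost-weighted prior
        mean = Xc.mean(axis=0)
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        params[c] = (prior, mean, np.linalg.inv(cov), np.linalg.slogdet(cov)[1])

    def branch(Xq):
        scores = []
        for c in (0, 1):
            prior, mean, cov_inv, logdet = params[c]
            d = Xq - mean
            # Gaussian log-density plus log of the cost-weighted prior
            g = (-0.5 * np.einsum('ij,jk,ik->i', d, cov_inv, d)
                 - 0.5 * logdet + np.log(prior))
            scores.append(g)
        return (scores[1] > scores[0]).astype(int)

    return branch
```

In a full tree-induction algorithm, such a split would be applied recursively to the partitions it induces; only the node-level construction is shown here.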
A survey of cost-sensitive decision tree induction algorithms
The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a taxonomy and a historical timeline of how the field has developed, and should serve as a useful reference point for future research in this field.
Data Imputation through the Identification of Local Anomalies
We introduce a comprehensive statistical framework in a model-free setting for the complete treatment of localized data corruptions due to severe noise sources, e.g., an occluder in the case of a visual recording. Within this framework, we propose i) a novel algorithm to efficiently separate, i.e., detect and localize, possible corruptions from a given suspicious data instance, and ii) a Maximum A Posteriori (MAP) estimator to impute the corrupted data. As a generalization of the Euclidean distance, we also propose a novel distance measure, which is based on the ranked deviations among the data attributes and is empirically shown to be superior in separating the corruptions. Our algorithm first splits the suspicious instance into parts through a binary partitioning tree in the space of data attributes and iteratively tests those parts to detect local anomalies using the nominal statistics extracted from an uncorrupted (clean) reference data set. Once each part is labeled as anomalous vs. normal, the corresponding binary patterns over this tree that characterize corruptions are identified and the affected attributes are imputed. Under a certain conditional independence structure assumed for the binary patterns, we analytically show that the false alarm rate of the introduced algorithm in detecting the corruptions is independent of the data and can be set directly without any parameter tuning. The proposed framework is tested over several well-known machine learning data sets with synthetically generated corruptions and is experimentally shown to produce remarkable improvements in classification performance, with strong corruption-separation capability. Our experiments also indicate that the proposed algorithms outperform typical approaches and are robust to varying training-phase conditions.
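The ranked-deviation distance and the per-attribute anomaly test can be illustrated with a short sketch. The exact definitions in the paper may differ; here the per-attribute deviations are standardized against the clean reference statistics and then ranked, and simple mean imputation stands in for the paper's MAP estimator. All names, the `top_k` parameter, and the threshold are hypothetical.

```python
# Hedged sketch of the ranked-deviation idea: per-attribute deviations
# from the clean reference are standardized and sorted, and only the
# largest ranks contribute, so a few grossly corrupted attributes
# dominate the distance. The paper's exact weighting may differ.
import numpy as np

def ranked_deviation_distance(x, reference, top_k=3):
    """Distance between instance x and a clean reference set, driven by
    the top_k largest standardized per-attribute deviations."""
    mu = reference.mean(axis=0)
    sigma = reference.std(axis=0) + 1e-12   # nominal statistics
    dev = np.abs(x - mu) / sigma            # standardized deviations
    ranked = np.sort(dev)[::-1]             # largest deviations first
    return ranked[:top_k].sum()

def flag_and_impute(x, reference, threshold=4.0):
    """Label attributes whose standardized deviation exceeds the
    threshold as corrupted, then impute them with the reference mean
    (a simple stand-in for the paper's MAP estimator)."""
    mu = reference.mean(axis=0)
    sigma = reference.std(axis=0) + 1e-12
    corrupted = np.abs(x - mu) / sigma > threshold
    return corrupted, np.where(corrupted, mu, x)
```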
Rates of convergence in active learning
We study the rates of convergence in generalization error achievable by active learning under various types of label noise. Additionally, we study the general problem of model selection for active learning with a nested hierarchy of hypothesis classes and propose an algorithm whose error rate provably converges to the best achievable error among classifiers in the hierarchy at a rate adaptive to both the complexity of the optimal classifier and the noise conditions. In particular, we state sufficient conditions for these rates to be dramatically faster than those achievable by passive learning.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org), DOI: 10.1214/10-AOS843
Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms
Many different machine learning algorithms exist; taking into account each algorithm's hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that addresses these issues in isolation. We show that this problem can be addressed by a fully automated approach, leveraging recent innovations in Bayesian optimization. Specifically, we consider a wide range of feature selection techniques (combining 3 search and 8 evaluator methods) and all classification approaches implemented in WEKA, spanning 2 ensemble methods, 10 meta-methods, 27 base classifiers, and hyperparameter settings for each classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup 09, variants of the MNIST dataset, and CIFAR-10, we show classification performance often much better than that of standard selection/hyperparameter optimization methods. We hope that our approach will help non-expert users to more effectively identify machine learning algorithms and hyperparameter settings appropriate to their applications, and hence to achieve improved performance.
Comment: 9 pages, 3 figures
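The combined algorithm selection and hyperparameter optimization problem that Auto-WEKA addresses can be sketched compactly. Auto-WEKA itself searches WEKA's classifier space with the SMAC Bayesian optimizer; the sketch below substitutes plain random search over a few scikit-learn classifiers to stay self-contained, so the search space, trial budget, and function names are illustrative assumptions rather than Auto-WEKA's actual interface.

```python
# Illustrative sketch of the joint (algorithm, hyperparameter) search that
# Auto-WEKA automates. Random search stands in for Bayesian optimization;
# the essential point is that the algorithm choice and its hyperparameters
# are sampled together and scored by cross-validation.
import random
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

SEARCH_SPACE = [
    (DecisionTreeClassifier, {"max_depth": [2, 5, 10, None]}),
    (RandomForestClassifier, {"n_estimators": [10, 100], "max_depth": [5, None]}),
    (SVC, {"C": [0.1, 1.0, 10.0], "gamma": ["scale", "auto"]}),
]

def cash_random_search(X, y, n_trials=30, seed=0):
    """Jointly sample an algorithm and its hyperparameters, keeping the
    best cross-validated configuration found within the trial budget."""
    rng = random.Random(seed)
    best = (-np.inf, None)
    for _ in range(n_trials):
        algo, grid = rng.choice(SEARCH_SPACE)
        params = {k: rng.choice(v) for k, v in grid.items()}
        score = cross_val_score(algo(**params), X, y, cv=3).mean()
        if score > best[0]:
            best = (score, (algo.__name__, params))
    return best
```

A Bayesian optimizer such as SMAC replaces the uniform sampling above with a model of which regions of the joint space look promising, which is what makes searching WEKA's much larger space tractable.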
- …