A survey of cost-sensitive decision tree induction algorithms
The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, or utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, and provides a useful taxonomy, a historical timeline of how the field has developed, and a reference point for future research in this field.
Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm
This paper introduces ICET, a new algorithm for cost-sensitive
classification. ICET uses a genetic algorithm to evolve a population of biases
for a decision tree induction algorithm. The fitness function of the genetic
algorithm is the average cost of classification when using the decision tree,
including both the costs of tests (features, measurements) and the costs of
classification errors. ICET is compared here with three other algorithms for
cost-sensitive classification - EG2, CS-ID3, and IDX - and also with C4.5,
which classifies without regard to cost. The five algorithms are evaluated
empirically on five real-world medical datasets. Three sets of experiments are
performed. The first set examines the baseline performance of the five
algorithms on the five datasets and establishes that ICET performs
significantly better than its competitors. The second set tests the robustness
of ICET under a variety of conditions and shows that ICET maintains its
advantage. The third set looks at ICET's search in bias space and discovers a
way to improve the search. Comment: See http://www.jair.org/ for any accompanying file
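The fitness measure described in this abstract, the average cost of classification including both test costs and error costs, can be illustrated with a short sketch. This is not ICET's implementation: the function name, the `classify` callback, the toy tree, and all cost values are hypothetical, and the sketch assumes each distinct test is paid for once per case.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Case = Tuple[Dict[str, float], Hashable]  # (feature values, true label)


def average_classification_cost(
    cases: Sequence[Case],
    classify: Callable[[Dict[str, float]], Tuple[Hashable, List[str]]],
    test_costs: Dict[str, float],
    error_costs: Dict[Tuple[Hashable, Hashable], float],
) -> float:
    """Mean per-case cost: tests performed plus any misclassification penalty."""
    if not cases:
        raise ValueError("cases must be non-empty")
    total = 0.0
    for features, true_label in cases:
        predicted, tests_used = classify(features)
        # Assumption: each distinct test is charged once per case.
        total += sum(test_costs[t] for t in set(tests_used))
        if predicted != true_label:
            # error_costs is keyed by (true label, predicted label).
            total += error_costs[(true_label, predicted)]
    return total / len(cases)


# Hypothetical one-node tree: run the "blood" test, then threshold it.
def toy_tree(features: Dict[str, float]) -> Tuple[str, List[str]]:
    label = "sick" if features["blood"] > 0.5 else "healthy"
    return label, ["blood"]


cases = [
    ({"blood": 0.9}, "sick"),
    ({"blood": 0.2}, "sick"),      # misclassified as healthy
    ({"blood": 0.1}, "healthy"),
]
test_costs = {"blood": 5.0}
error_costs = {("sick", "healthy"): 50.0, ("healthy", "sick"): 10.0}

avg = average_classification_cost(cases, toy_tree, test_costs, error_costs)
```

In a genetic search of the kind the abstract describes, a value like `avg` would be the quantity to minimize for each candidate bias.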
Dynamic low-level context for the detection of mild traumatic brain injury.
Mild traumatic brain injury (mTBI) appears as low-contrast lesions in magnetic resonance (MR) imaging. Standard automated detection approaches cannot detect the subtle changes caused by the lesions. The use of context has become integral to the detection of low-contrast objects in images. Context is any information that can be used for object detection but is not directly due to the physical appearance of an object in an image. In this paper, new low-level static and dynamic context features are proposed and integrated into a discriminative voxel-level classifier to improve the detection of mTBI lesions. Visual features, including multiple texture measures, are used to give an initial estimate of a lesion. From the initial estimate, novel proximity and directional distance contextual features are calculated and used as features for another classifier. These features take advantage of the spatial information given by the initial lesion estimate, which uses only the visual features. Dynamic context is captured by the proposed posterior marginal edge distance context feature, which measures the distance from a hard estimate of the lesion at a previous time point. The approach is validated on a temporal mTBI rat model dataset and shown to have an improved Dice score and convergence compared to other state-of-the-art approaches. Analysis of feature importance and of the versatility of the approach on other datasets is also provided.
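The Dice score used for validation above measures the overlap between a predicted lesion mask and a reference mask. A minimal sketch follows, assuming binary masks flattened to sequences of 0/1 values; the function name and the convention of returning 1.0 for two empty masks are choices made for this illustration, not details taken from the paper.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / size


# Hypothetical five-voxel masks: two voxels overlap, each mask has three.
predicted_mask = [1, 1, 0, 0, 1]
reference_mask = [1, 0, 0, 1, 1]
score = dice_score(predicted_mask, reference_mask)
```

A score of 1.0 means the masks coincide and 0.0 means they share no lesion voxels.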
A new analysis strategy for detection of faint gamma-ray sources with Imaging Atmospheric Cherenkov Telescopes
A new background rejection strategy for gamma-ray astrophysics with
stereoscopic Imaging Atmospheric Cherenkov Telescopes (IACT), based on Monte
Carlo (MC) simulations and real background data from the H.E.S.S. (High Energy
Stereoscopic System, see [1]) experiment, is described. The analysis is based
on a multivariate combination of both previously-known and newly-derived
discriminant variables using the physical shower properties, as well as its
multiple images, for a total of eight variables. Two of these new variables are
defined thanks to a new energy evaluation procedure, which is also presented
here. The method allows an enhanced sensitivity with the current generation of
ground-based Cherenkov telescopes to be achieved, and at the same time its main
features of rapidity and flexibility allow an easy generalization to any type
of IACT. The robustness against Night Sky Background (NSB) variations of this
approach is tested with MC simulated events. The overall consistency of the
analysis chain has been checked by comparison of the real gamma-ray signal
obtained from H.E.S.S. observations with MC simulations and through
reconstruction of known source spectra. Finally, the performance has been
evaluated by application to faint H.E.S.S. sources. The gain in sensitivity as
compared to the best standard Hillas analysis ranges approximately from 1.2 to
1.8 depending on the source characteristics, which corresponds to a saving in
observation time of a factor of 1.4 to 3.2. Comment: 26 pages, 13 figure
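The relation between the quoted sensitivity gain and the saving in observation time can be checked with a few lines of arithmetic. The sketch below assumes the usual background-limited scaling, in which detection significance grows as the square root of observation time, so that a sensitivity gain g saves a factor of g squared in time. That scaling is an assumption of this example rather than a statement from the abstract, and the function name is hypothetical.

```python
def observation_time_factor(sensitivity_gain: float) -> float:
    """Time saving for a given sensitivity gain, assuming significance ~ sqrt(time)."""
    if sensitivity_gain <= 0:
        raise ValueError("sensitivity_gain must be positive")
    return sensitivity_gain ** 2


# Gains quoted in the abstract: approximately 1.2 to 1.8.
low = observation_time_factor(1.2)   # about 1.44
high = observation_time_factor(1.8)  # about 3.24
```

Under this assumption the results are consistent with the quoted range of 1.4 to 3.2.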