126,403 research outputs found
Soft Methodology for Cost-and-error Sensitive Classification
Many real-world data mining applications need varying cost for different
types of classification errors and thus call for cost-sensitive classification
algorithms. Existing algorithms for cost-sensitive classification are
successful in terms of minimizing the cost, but can result in a high error rate
as the trade-off. The high error rate holds back the practical use of those
algorithms. In this paper, we propose a novel cost-sensitive classification
methodology that takes both the cost and the error rate into account. The
methodology, called soft cost-sensitive classification, is established from a
multicriteria optimization problem of the cost and the error rate, and can be
viewed as regularizing cost-sensitive classification with the error rate. The
simple methodology allows immediate improvements of existing cost-sensitive
classification algorithms. Experiments on the benchmark and the real-world data
sets show that our proposed methodology indeed achieves lower test error rates
and similar (sometimes lower) test costs than existing cost-sensitive
classification algorithms. We also demonstrate that the methodology can be
extended for considering the weighted error rate instead of the original error
rate. This extension is useful for tackling unbalanced classification problems.Comment: A shorter version appeared in KDD '1
Weakly supervised segment annotation via expectation kernel density estimation
Since the labelling for the positive images/videos is ambiguous in weakly
supervised segment annotation, negative mining based methods that only use the
intra-class information emerge. In these methods, negative instances are
utilized to penalize unknown instances to rank their likelihood of being an
object, which can be considered as a voting in terms of similarity. However,
these methods 1) ignore the information contained in positive bags, 2) only
rank the likelihood but cannot generate an explicit decision function. In this
paper, we propose a voting scheme involving not only the definite negative
instances but also the ambiguous positive instances to make use of the extra
useful information in the weakly labelled positive bags. In the scheme, each
instance votes for its label with a magnitude arising from the similarity, and
the ambiguous positive instances are assigned soft labels that are iteratively
updated during the voting. It overcomes the limitations of voting using only
the negative bags. We also propose an expectation kernel density estimation
(eKDE) algorithm to gain further insight into the voting mechanism.
Experimental results demonstrate the superiority of our scheme beyond the
baselines.Comment: 9 pages, 2 figure
A methodology for the generation of efficient error detection mechanisms
A dependable software system must contain error detection mechanisms and error recovery mechanisms. Software components for the detection of errors are typically designed based on a system specification or the experience of software engineers, with their efficiency typically being measured using fault injection and metrics such as coverage and latency. In this paper, we introduce a methodology for the design of highly efficient error detection mechanisms. The proposed methodology combines fault injection analysis and data mining techniques in order to generate predicates for efficient error detection mechanisms. The results presented demonstrate the viability of the methodology as an approach for the development of efficient error detection mechanisms, as the predicates generated yield a true positive rate of almost 100% and a false positive rate very close to 0% for the detection of failure-inducing states. The main advantage of the proposed methodology over current state-of-the-art approaches is that efficient detectors are obtained by design, rather than by using specification-based detector design or the experience of software engineers
Rough sets theory for travel demand analysis in Malaysia
This study integrates the rough sets theory into tourism demand analysis. Originated from the area of Artificial Intelligence, the rough sets theory was introduced to disclose important structures and to classify objects. The Rough Sets methodology provides definitions and methods for finding which attributes separates one class or classification from another. Based on this theory can propose a formal framework for the automated transformation of data into knowledge. This makes the rough sets approach a useful classification and pattern recognition technique. This study introduces a new rough sets approach for deriving rules from information table of tourist in Malaysia. The induced rules were able to forecast change in demand with certain accuracy
- …