Search CORE

4,363 research outputs found

A Decision tree-based attribute weighting filter for naive Bayes

Author: Hall Mark A.
Publication venue
Publication date: 01/05/2006
Field of study

The naive Bayes classifier continues to be a popular learning algorithm for data mining applications due to its simplicity and linear run-time. Many enhancements to the basic algorithm have been proposed to help mitigate its primary weakness--the assumption that attributes are independent given the class. All of them improve the performance of naïve Bayes at the expense (to a greater or lesser degree) of execution time and/or simplicity of the final model. In this paper we present a simple filter method for setting attribute weights for use with naive Bayes. Experimental results show that naive Bayes with attribute weights rarely degrades the quality of the model compared to standard naive Bayes and, in many cases, improves it dramatically. The main advantages of this method compared to other approaches for improving naive Bayes is its run-time complexity and the fact that it maintains the simplicity of the final model

Research Commons@Waikato

Naive Bayes and Exemplar-Based approaches to Word Sense Disambiguation Revisited

Author: Escudero Gerard
Marquez Lluis
Rigau German
Publication venue
Publication date: 01/01/2000
Field of study

This paper describes an experimental comparison between two standard supervised learning methods, namely Naive Bayes and Exemplar-based classification, on the Word Sense Disambiguation (WSD) problem. The aim of the work is twofold. Firstly, it attempts to contribute to clarify some confusing information about the comparison between both methods appearing in the related literature. In doing so, several directions have been explored, including: testing several modifications of the basic learning algorithms and varying the feature space. Secondly, an improvement of both algorithms is proposed, in order to deal with large attribute sets. This modification, which basically consists in using only the positive information appearing in the examples, allows to improve greatly the efficiency of the methods, with no loss in accuracy. The experiments have been performed on the largest sense-tagged corpus available containing the most frequent and ambiguous English words. Results show that the Exemplar-based approach to WSD is generally superior to the Bayesian approach, especially when a specific metric for dealing with symbolic attributes is used.Comment: 5 page

arXiv.org e-Print Archive

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Absolute Correlation Weighted Naïve Bayes for Software Defect Prediction

Author: Asmono R. T. (Rizky)
Syukur A. (Abdul)
Wahono R. S. (Romi)
Publication venue: IlmuKomputer.com
Publication date: 01/01/2015
Field of study

The maintenance phase of the software project can be very expensive for the developer team and harmful to the users because some flawed software modules. It can be avoided by detecting defects as early as possible. Software defect prediction will provide an opportunity for the developer team to test modules or files that have a high probability defect. Naïve Bayes has been used to predict software defects. However, Naive Bayes assumes all attributes are equally important and are not related each other while, in fact, this assumption is not true in many cases. Absolute value of correlation coefficient has been proposed as weighting method to overcome Naïve Bayes assumptions. In this study, Absolute Correlation Weighted Naïve Bayes have been proposed. The results of parametric test on experiment results show that the proposed method improve the performance of Naïve Bayes for classifying defect-prone on software defect prediction

Neliti

Self-adaptive attribute weighting for Naive Bayes classification

Author: Cai Z
Pan S
Wu J
Zhang C
Zhang P
Zhu X
Publication venue: 'Elsevier BV'
Publication date: 15/02/2015
Field of study

©2014 Elsevier Ltd. All rights reserved. Naive Bayes (NB) is a popular machine learning tool for classification, due to its simplicity, high computational efficiency, and good classification accuracy, especially for high dimensional data such as texts. In reality, the pronounced advantage of NB is often challenged by the strong conditional independence assumption between attributes, which may deteriorate the classification performance. Accordingly, numerous efforts have been made to improve NB, by using approaches such as structure extension, attribute selection, attribute weighting, instance weighting, local learning and so on. In this paper, we propose a new Artificial Immune System (AIS) based self-adaptive attribute weighting method for Naive Bayes classification. The proposed method, namely AISWNB, uses immunity theory in Artificial Immune Systems to search optimal attribute weight values, where self-adjusted weight values will alleviate the conditional independence assumption and help calculate the conditional probability in an accurate way. One noticeable advantage of AISWNB is that the unique immune system based evolutionary computation process, including initialization, clone, section, and mutation, ensures that AISWNB can adjust itself to the data without explicit specification of functional or distributional forms of the underlying model. As a result, AISWNB can obtain good attribute weight values during the learning process. Experiments and comparisons on 36 machine learning benchmark data sets and six image classification data sets demonstrate that AISWNB significantly outperforms its peers in classification accuracy, class probability estimation, and class ranking performance

OPUS - University of Technology Sydney

Seir immune strategy for instance weighted naive bayes classification

Author: DW Aha
GI Webb
J Wu
J Wu
J Wu
J Wu
JR Quinlan
L Jiang
L Jiang
N Friedman
SB Kim
T Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015
Field of study

© Springer International Publishing Switzerland 2015. Naive Bayes (NB) has been popularly applied in many classification tasks. However, in real-world applications, the pronounced advantage of NB is often challenged by insufficient training samples. Specifically, the high variance may occur with respect to the limited number of training samples. The estimated class distribution of a NB classier is inaccurate if the number of training instances is small. To handle this issue, in this paper, we proposed a SEIR (Susceptible, Exposed, Infectious and Recovered) immune-strategy-based instance weighting algorithm for naive Bayes classification, namely SWNB. The immune instance weighting allows the SWNB algorithm adjust itself to the data without explicit specification of functional or distributional forms of the underlying model. Experiments and comparisons on 20 benchmark datasets demonstrated that the proposed SWNB algorithm outperformed existing state-of-the-art instance weighted NB algorithm and other related computational intelligence methods

Crossref

OPUS - University of Technology Sydney

SODE: Self-Adaptive One-Dependence Estimators for classification

Author: Pan S
Wu J
Zhang C
Zhang P
Zhu X
Publication venue: 'Elsevier BV'
Publication date: 01/03/2016
Field of study

© 2015 Elsevier Ltd. SuperParent-One-Dependence Estimators (SPODEs) represent a family of semi-naive Bayesian classifiers which relax the attribute independence assumption of Naive Bayes (NB) to allow each attribute to depend on a common single attribute (superparent). SPODEs can effectively handle data with attribute dependency but still inherent NB's key advantages such as computational efficiency and robustness for high dimensional data. In reality, determining an optimal superparent for SPODEs is difficult. One common approach is to use weighted combinations of multiple SPODEs, each having a different superparent with a properly assigned weight value (i.e., a weight value is assigned to each attribute). In this paper, we propose a self-adaptive SPODEs, namely SODE, which uses immunity theory in artificial immune systems to automatically and self-adaptively select the weight for each single SPODE. SODE does not need to know the importance of individual SPODE nor the relevance among SPODEs, and can flexibly and efficiently search optimal weight values for each SPODE during the learning process. Extensive experiments and comparisons on 56 benchmark data sets, and validations on image and text classification, demonstrate that SODE outperforms state-of-the-art weighted SPODE algorithms and is suitable for a wide range of learning tasks. Results also confirm that SODE provides an appropriate balance between runtime efficiency and accuracy

OPUS - University of Technology Sydney