4,009 research outputs found

    Absolute Correlation Weighted Naïve Bayes for Software Defect Prediction

    Full text link
    The maintenance phase of the software project can be very expensive for the developer team and harmful to the users because some flawed software modules. It can be avoided by detecting defects as early as possible. Software defect prediction will provide an opportunity for the developer team to test modules or files that have a high probability defect. Naïve Bayes has been used to predict software defects. However, Naive Bayes assumes all attributes are equally important and are not related each other while, in fact, this assumption is not true in many cases. Absolute value of correlation coefficient has been proposed as weighting method to overcome Naïve Bayes assumptions. In this study, Absolute Correlation Weighted Naïve Bayes have been proposed. The results of parametric test on experiment results show that the proposed method improve the performance of Naïve Bayes for classifying defect-prone on software defect prediction

    Optimized Naïve Bayesian Algorithm for Efficient Performance

    Get PDF
    Naïve Bayesian algorithm is a data mining algorithm that depicts relationship between data objects using probabilistic method. Classification using Bayesian algorithm is usually done by finding the class that has the highest probability value. Data mining is a popular research area that consists of algorithm development and pattern extraction from database using different algorithms. Classification is one of the major tasks of data mining which aimed at building a model (classifier) that can be used to predict unknown class labels. There are so many algorithms for classification such as decision tree classifier, neural network, rule induction and naïve Bayesian. This paper is focused on naïve Bayesian algorithm which is a classical algorithm for classifying categorical data. It easily converged at local optima. Particle Swarm Optimization (PSO) algorithm has gained recognition in many fields of human endeavours and has been applied to enhance efficiency and accuracy in different problem domain. This paper proposed an optimized naïve Bayesian classifier using particle swarm optimization to overcome the problem of premature convergence and to improve the efficiency of the naïve Bayesian algorithm. The classification result from the optimized naïve Bayesian when compared with the traditional algorithm showed a better performance Keywords: Data Mining, Classification, Particle Swarm Optimization, Naïve Bayesian

    Alleviating Naive Bayes attribute independence assumption by attribute weighting

    Get PDF
    Despite the simplicity of the Naive Bayes classifier, it has continued to perform well against more sophisticated newcomers and has remained, therefore, of great interest to the machine learning community. Of numerous approaches to refining the naive Bayes classifier, attribute weighting has received less attention than it warrants. Most approaches, perhaps influenced by attribute weighting in other machine learning algorithms, use weighting to place more emphasis on highly predictive attributes than those that are less predictive. In this paper, we argue that for naive Bayes attribute weighting should instead be used to alleviate the conditional independence assumption. Based on this premise, we propose a weighted naive Bayes algorithm, called WANBIA, that selects weights to minimize either the negative conditional log likelihood or the mean squared error objective functions. We perform extensive evaluations and find that WANBIA is a competitive alternative to state of the art classifiers like Random Forest, Logistic Regression and A1DE. © 2013 Nayyar A. Zaidi, Jesus Cerquides, Mark J. Carman and Geoffrey I. Webb.This research has been supported by the Australian Research Council under grant DP110101427 and Asian Office of Aerospace Research and Development, Air Force Office of Scientific Research under contract FA23861214030. The authors would like to thank Mark Hall for providing the code for CFS and MH. The authors would also like to thank anonymous reviewers for their insightful comments that helped improving the paper tremendously.Peer Reviewe

    What attracts vehicle consumers’ buying:A Saaty scale-based VIKOR (SSC-VIKOR) approach from after-sales textual perspective?

    Get PDF
    Purpose: The increasingly booming e-commerce development has stimulated vehicle consumers to express individual reviews through online forum. The purpose of this paper is to probe into the vehicle consumer consumption behavior and make recommendations for potential consumers from textual comments viewpoint. Design/methodology/approach: A big data analytic-based approach is designed to discover vehicle consumer consumption behavior from online perspective. To reduce subjectivity of expert-based approaches, a parallel Naïve Bayes approach is designed to analyze the sentiment analysis, and the Saaty scale-based (SSC) scoring rule is employed to obtain specific sentimental value of attribute class, contributing to the multi-grade sentiment classification. To achieve the intelligent recommendation for potential vehicle customers, a novel SSC-VIKOR approach is developed to prioritize vehicle brand candidates from a big data analytical viewpoint. Findings: The big data analytics argue that “cost-effectiveness” characteristic is the most important factor that vehicle consumers care, and the data mining results enable automakers to better understand consumer consumption behavior. Research limitations/implications: The case study illustrates the effectiveness of the integrated method, contributing to much more precise operations management on marketing strategy, quality improvement and intelligent recommendation. Originality/value: Researches of consumer consumption behavior are usually based on survey-based methods, and mostly previous studies about comments analysis focus on binary analysis. The hybrid SSC-VIKOR approach is developed to fill the gap from the big data perspective

    Self-adaptive attribute weighting for Naive Bayes classification

    Full text link
    ©2014 Elsevier Ltd. All rights reserved. Naive Bayes (NB) is a popular machine learning tool for classification, due to its simplicity, high computational efficiency, and good classification accuracy, especially for high dimensional data such as texts. In reality, the pronounced advantage of NB is often challenged by the strong conditional independence assumption between attributes, which may deteriorate the classification performance. Accordingly, numerous efforts have been made to improve NB, by using approaches such as structure extension, attribute selection, attribute weighting, instance weighting, local learning and so on. In this paper, we propose a new Artificial Immune System (AIS) based self-adaptive attribute weighting method for Naive Bayes classification. The proposed method, namely AISWNB, uses immunity theory in Artificial Immune Systems to search optimal attribute weight values, where self-adjusted weight values will alleviate the conditional independence assumption and help calculate the conditional probability in an accurate way. One noticeable advantage of AISWNB is that the unique immune system based evolutionary computation process, including initialization, clone, section, and mutation, ensures that AISWNB can adjust itself to the data without explicit specification of functional or distributional forms of the underlying model. As a result, AISWNB can obtain good attribute weight values during the learning process. Experiments and comparisons on 36 machine learning benchmark data sets and six image classification data sets demonstrate that AISWNB significantly outperforms its peers in classification accuracy, class probability estimation, and class ranking performance

    Multiple instance learning for sequence data with across bag dependencies

    Full text link
    In Multiple Instance Learning (MIL) problem for sequence data, the instances inside the bags are sequences. In some real world applications such as bioinformatics, comparing a random couple of sequences makes no sense. In fact, each instance may have structural and/or functional relations with instances of other bags. Thus, the classification task should take into account this across bag relation. In this work, we present two novel MIL approaches for sequence data classification named ABClass and ABSim. ABClass extracts motifs from related instances and use them to encode sequences. A discriminative classifier is then applied to compute a partial classification result for each set of related sequences. ABSim uses a similarity measure to discriminate the related instances and to compute a scores matrix. For both approaches, an aggregation method is applied in order to generate the final classification result. We applied both approaches to solve the problem of bacterial Ionizing Radiation Resistance prediction. The experimental results of the presented approaches are satisfactory

    Evolutionary lazy learning for Naive Bayes classification

    Full text link
    © 2016 IEEE. Most improvements for Naive Bayes (NB) have a common yet important flaw - these algorithms split the modeling of the classifier into two separate stages - the stage of preprocessing (e.g., feature selection and data expansion) and the stage of building the NB classifier. The first stage does not take the NB's objective function into consideration, so the performance of the classification cannot be guaranteed. Motivated by these facts and aiming to improve NB with accurate classification, we present a new learning algorithm called Evolutionary Local Instance Weighted Naive Bayes or ELWNB, to extend NB for classification. ELWNB combines local NB, instance weighted dataset extension and evolutionary algorithms seamlessly. Experiments on 20 UCI benchmark datasets demonstrate that ELWNB significantly outperforms NB and several other improved NB algorithms

    Water filtration by using apple and banana peels as activated carbon

    Get PDF
    Water filter is an important devices for reducing the contaminants in raw water. Activated from charcoal is used to absorb the contaminants. Fruit peels are some of the suitable alternative carbon to substitute the charcoal. Determining the role of fruit peels which were apple and banana peels powder as activated carbon in water filter is the main goal. Drying and blending the peels till they become powder is the way to allow them to absorb the contaminants. Comparing the results for raw water before and after filtering is the observation. After filtering the raw water, the reading for pH was 6.8 which is in normal pH and turbidity reading recorded was 658 NTU. As for the colour, the water becomes more clear compared to the raw water. This study has found that fruit peels such as banana and apple are an effective substitute to charcoal as natural absorbent

    Local feature weighting in nearest prototype classification

    Get PDF
    The distance metric is the corner stone of nearest neighbor (NN)-based methods, and therefore, of nearest prototype (NP) algorithms. That is because they classify depending on the similarity of the data. When the data is characterized by a set of features which may contribute to the classification task in different levels, feature weighting or selection is required, sometimes in a local sense. However, local weighting is typically restricted to NN approaches. In this paper, we introduce local feature weighting (LFW) in NP classification. LFW provides each prototype its own weight vector, opposite to typical global weighting methods found in the NP literature, where all the prototypes share the same one. Providing each prototype its own weight vector has a novel effect in the borders of the Voronoi regions generated: They become nonlinear. We have integrated LFW with a previously developed evolutionary nearest prototype classifier (ENPC). The experiments performed both in artificial and real data sets demonstrate that the resulting algorithm that we call LFW in nearest prototype classification (LFW-NPC) avoids overfitting on training data in domains where the features may have different contribution to the classification task in different areas of the feature space. This generalization capability is also reflected in automatically obtaining an accurate and reduced set of prototypes.Publicad
    corecore