175 research outputs found

    Evaluation of the Performance of the Markov Blanket Bayesian Classifier Algorithm

    The Markov Blanket Bayesian Classifier (MBBC) is a recently proposed algorithm for the construction of probabilistic classifiers. This paper presents an empirical comparison of the MBBC algorithm with three other Bayesian classifiers: Naive Bayes, Tree-Augmented Naive Bayes and a general Bayesian network. All of these are implemented using the K2 framework of Cooper and Herskovits. The classifiers are compared in terms of their performance (using simple accuracy measures and ROC curves) and speed, on a range of standard benchmark data sets. It is concluded that MBBC is competitive in terms of speed and accuracy with the other algorithms considered. Comment: 9 pages; Technical Report No. NUIG-IT-011002, Department of Information Technology, National University of Ireland, Galway (2002)

    On the use of Bayesian network classifiers to classify patients with peptic ulcer among upper gastrointestinal bleeding patients

    A Bayesian network classifier is a type of probabilistic graphical model capable of representing relationships between variables in a given domain under study. We consider naive Bayes, tree augmented naive Bayes (TAN) and boosted augmented naive Bayes (BAN) to classify patients with peptic ulcer disease among upper gastrointestinal bleeding patients. We compare their performance with IBk and C4.5. To identify relevant variables for peptic ulcer disease, we use several methods for attribute subset selection. Results show that blood urea nitrogen, hemoglobin and gastric malignancy are important for classification. Among the Bayesian classifiers, BAN achieves the best accuracy of 77.3 and AUC of 0.81, followed by TAN with 72.4 and 0.76 respectively. While the accuracy of TAN improves with attribute selection, BAN and IBk are better off without it.

    Object Classification Techniques using Tree Based Classifiers

    Object recognition is presently one of the most active research areas in computer vision, pattern recognition, artificial intelligence and human activity analysis. In object detection and classification, attention habitually focuses on changes in the location of an object with respect to time, since appearance information can sensibly describe the object category. In this paper, a feature set is obtained from Gray Level Co-Occurrence Matrices (GLCM), representing different stages of statistical variation of the object category. The experiments are carried out using the Caltech 101 dataset, considering seven objects (airplanes, camera, chair, elephant, laptop, motorbike and bonsai tree), and the extracted GLCM feature set is modeled by tree-based classifiers such as Naive Bayes Tree and Random Forest. In the experimental results, the Random Forest classifier demonstrates the accuracy and effectiveness of the proposed method with an overall accuracy rate of 89.62%, outperforming the Naive Bayes classifier.

    Bayesian Approach For Early Stage Event Prediction In Survival Data

    Predicting event occurrence at an early stage in longitudinal studies is an important and challenging problem with high practical value. Unlike standard classification and regression problems, where a domain expert can provide the labels for the data in a reasonably short period of time, training data in such longitudinal studies can be obtained only by waiting for a sufficient number of events to occur. Survival analysis, on the other hand, aims at finding the underlying distribution for data that measure the length of time until the occurrence of an event. However, it cannot answer the open question of "how to forecast whether a subject will experience an event by the end of the study, given event occurrence information at an early stage of survival data?". This problem exhibits two major challenges: 1) absence of complete information about event occurrence (censoring) and 2) availability of only a partial set of events that occurred during the initial phase of the study. Thus, the main objective of this work is to predict for which subjects in the study an event will occur in the future, based on the few events observed at the initial stages of a longitudinal study. In this thesis, we propose a novel approach to address the first challenge by introducing a new method for handling censored data using the Kaplan-Meier estimator. The second challenge is tackled by effectively integrating Bayesian methods with an Accelerated Failure Time (AFT) model by adapting the prior probability of event occurrence for future time points. In other words, we propose a novel Early Stage Prediction (ESP) framework for building event prediction models which are trained at early stages of longitudinal studies.
More specifically, we extended the Naive Bayes, Tree-Augmented Naive Bayes (TAN) and Bayesian Network methods based on the proposed framework, and developed three algorithms, namely ESP-NB, ESP-TAN and ESP-BN, to effectively predict event occurrence using the training data obtained at an early stage of the study. The proposed framework is evaluated using a wide range of synthetic and real-world benchmark datasets. Our extensive set of experiments shows that the proposed ESP framework predicts future event occurrences more accurately than the alternative prediction methods while using only a limited amount of training data.

    Negative Price Forecasting in Australian Energy Markets using gradient-boosted Machines: Predictive and Probabilistic Analysis

    With the integration of distributed energy resources such as rooftop solar panels and wind turbines into the grid, power generation can surpass demand, giving rise to negative pricing, especially in the summer months. In this regard, a scientific case study is conducted in this paper to analyse and predict the increasing instances of negative energy prices against demand-generation in Australian energy markets (AEMs) using real-time energy data from the Hornsdale Power Reserve, South Australia. A robust machine learning method, the Light Gradient Boosting Machine (LightGBM), is utilised to detect and predict negative prices at different quantiles in order to quantify the outliers in the pricing data. The implementation results demonstrate that predicting the prices at different quantiles can tackle outliers (negative prices) effectively with the help of upper and lower bounds extracted using a quantile regression-based approach. The case study is further extended to learn the complex statistical relationships between different data features using the Naive-Bayes Tree Augmented (NB-TAN) algorithm, considering ‘price’ as the dependent feature against independent features such as demand-generation, battery charging/discharging, and frequency control ancillary services.

    Recognition of traffic generated by WebRTC communication

    Network traffic recognition is a basic prerequisite for network operators to differentiate and prioritize traffic for a number of purposes, from guaranteeing Quality of Service (QoS) to monitoring safety and detecting anomalies. Web Real-Time Communication (WebRTC) is an open-source project that enables real-time audio, video, and text communication among browsers. Since WebRTC does not include any characteristic pattern for semantically based traffic recognition, this paper proposes models for recognizing traffic generated during WebRTC audio and video communication based on statistical characteristics and machine learning in the Weka tool. Five classification algorithms were used for model development: Naive Bayes, J48, Random Forest, REP Tree, and Bayes Net. The results show that J48 and Bayes Net perform best in this experimental case of WebRTC traffic recognition. Future work will focus on comparing a wider range of machine learning algorithms using a dataset large enough to improve the significance of the results.