    Bayesian classification theory

    The task of inferring a set of classes and class descriptions most likely to explain a given data set can be placed on a firm theoretical foundation using Bayesian statistics. Within this framework and using various mathematical and algorithmic approximations, the AutoClass system searches for the most probable classifications, automatically choosing the number of classes and complexity of class descriptions. A simpler version of AutoClass has been applied to many large real data sets, has discovered new independently-verified phenomena, and has been released as a robust software package. Recent extensions allow attributes to be selectively correlated within particular classes, and allow classes to inherit or share model parameters through a class hierarchy. We summarize the mathematical foundations of AutoClass.

    Bayesian classification in a time-varying environment

    The problem of classifying a pattern based on multiple observations made in a time-varying environment is analyzed. The identity of the pattern may itself change. A Bayesian solution is derived, after which the conditions of the physical situation are invoked to produce a cascade classifier model. Experimental results based on remote sensing data demonstrate the effectiveness of the classifier.
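
As a rough illustration of classifying from several observations when the class identity may change between them, the sketch below runs a recursive Bayes update over class labels. It is not the cascade classifier derived in the paper: the Gaussian likelihoods, the transition matrix, and all names (`sequential_posterior`, `class_params`) are assumptions chosen for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sequential_posterior(observations, prior, transition, class_params):
    """Recursive Bayesian class posterior for a sequence of scalar observations.

    prior[j]          : P(class j) before the first observation
    transition[i][j]  : assumed P(class j at t+1 | class i at t)
    class_params[j]   : assumed (mean, std) of the Gaussian likelihood for class j
    Returns the posterior over classes after the last observation.
    """
    n = len(prior)
    belief = list(prior)
    for t, x in enumerate(observations):
        if t > 0:
            # Prediction step: the identity may have changed since the last look.
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Update step: Bayes' rule with the new observation.
        unnormalized = [
            belief[j] * gaussian_pdf(x, *class_params[j]) for j in range(n)
        ]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief


if __name__ == "__main__":
    params = [(0.0, 1.0), (3.0, 1.0)]      # hypothetical class models
    stay_or_switch = [[0.9, 0.1], [0.1, 0.9]]  # hypothetical transition matrix
    print(sequential_posterior([0.0, 0.0, 3.0, 3.0], [0.5, 0.5],
                               stay_or_switch, params))
```

With a transition matrix close to the identity, the posterior behaves like ordinary evidence accumulation; larger off-diagonal entries let recent observations outweigh older ones.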

    Analysis of Bayesian classification-based approaches for Android malware detection

    Mobile malware has been growing in scale and complexity, spurred by the unabated uptake of smartphones worldwide. Android is fast becoming the most popular mobile platform, resulting in a sharp increase in malware targeting the platform. Additionally, Android malware is evolving rapidly to evade detection by traditional signature-based scanning. Despite current detection measures in place, timely discovery of new malware is still a critical issue. This calls for novel approaches to mitigate the growing threat of zero-day Android malware. Hence, the authors develop and analyse proactive machine-learning approaches based on Bayesian classification aimed at uncovering unknown Android malware via static analysis. The study, which is based on a large malware sample set covering the majority of the existing families, demonstrates detection capabilities with high accuracy. Empirical results and comparative analysis are presented, offering useful insight towards development of effective static-analytic Bayesian classification-based solutions for detecting unknown Android malware.

    Classifying Electrocardiogram with Machine Learning Techniques

    Classifying the electrocardiogram is of clinical importance because classification can be used to diagnose patients with cardiac arrhythmias. Many industries utilize machine learning techniques that consist of feature extraction methods followed by Naive-Bayesian classification in order to detect faults within machinery. Machine learning techniques that analyze vibrational machine data in a mechanical application may be used to analyze electrical data in a physiological application. Three of the most common feature extraction methods used to prepare machine vibration data for Naive-Bayesian classification are the Fourier transform, the Hilbert transform, and the Wavelet Packet transform. Each machine learning technique consists of a different feature extraction method to prepare the data for Naive-Bayesian classification. The effectiveness of the different machine learning techniques, when applied to electrocardiograms, is assessed by measuring the sensitivity and specificity of the classifications. Comparing the sensitivity and specificity of each machine learning technique to the other techniques revealed that the Wavelet Packet transform, followed by Naive-Bayesian classification, is the most effective machine learning technique.
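
The two evaluation measures named in this abstract can be computed directly from true and predicted labels. The sketch below is a minimal illustration; the function name and the toy labels are assumptions and do not reproduce the study's data or results.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): fraction of positive cases that were detected
    specificity = TN / (TN + FP): fraction of negative cases correctly rejected
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Return NaN when a rate is undefined (no positives or no negatives present).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical labels: 1 = arrhythmia, 0 = normal beat.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 0, 1, 0, 0, 1, 0, 1]
    print(sensitivity_specificity(truth, predicted))
```

Reporting both numbers matters because a classifier can score highly on one simply by predicting the same class for every input.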

    Bayesian classification with Gaussian processes

    We consider the problem of assigning an input vector to one of m classes by predicting P(c|x) for c = 1, ..., m. For a two-class problem, the probability of class one given x is estimated by s(y(x)), where s(y) = 1/(1 + e^{-y}). A Gaussian process prior is placed on y(x), and is combined with the training data to obtain predictions for new x points. We provide a Bayesian treatment, integrating over uncertainty in y and in the parameters that control the Gaussian process prior; the necessary integration over y is carried out using Laplace's approximation. The method is generalized to multiclass problems (m > 2) using the softmax function. We demonstrate the effectiveness of the method on a number of datasets.
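
The two link functions mentioned here, the logistic sigmoid for two classes and the softmax for m > 2, can be sketched as below. This only evaluates the mapping from latent values y to class probabilities; it does not implement the Gaussian process prior or Laplace's approximation, and the function names are illustrative.

```python
import math


def sigmoid(y):
    """Logistic function s(y) = 1 / (1 + e^{-y}), written to avoid overflow."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)  # safe: y < 0, so exp(y) is at most 1
    return e / (1.0 + e)


def softmax(ys):
    """Map latent values (one per class) to probabilities that sum to 1."""
    m = max(ys)  # subtracting the max leaves the result unchanged but avoids overflow
    exps = [math.exp(y - m) for y in ys]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(sigmoid(0.0))              # class-one probability at y = 0
    print(softmax([2.0, 0.0, -1.0]))  # three-class probabilities
```

For two classes with latent values (y, 0), the first softmax output equals sigmoid(y), which is why the softmax is the natural multiclass extension.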

    SEQUENTIAL BAYESIAN CLASSIFICATION: DNA BARCODES

    DNA barcodes are short strands of nucleotide bases taken from the cytochrome c oxidase subunit 1 (COI) of the mitochondrial DNA (mtDNA). A single barcode may have the form C C G G C A T A G T A G G C A C T G and typically ranges in length from 255 to around 700 nucleotide bases. Unlike nuclear DNA (nDNA), mtDNA remains largely unchanged as it is passed from mother to offspring. It has been proposed that these barcodes may be used as a method of differentiating between biological species (Hebert, Ratnasingham, and deWaard 2003). While this proposal is sharply debated among some taxonomists (Will and Rubino 2004), it has gained much momentum and attention from biologists. One issue at the heart of the controversy is the use of genetic distance measures as a tool for species differentiation. Current methods of species classification utilize these distance measures that are heavily dependent on both evolutionary model assumptions as well as a clearly defined gap between intra- and interspecies variation (Meyer and Paulay 2005). We point out the limitations of such distance measures and propose a character-based method of species classification which utilizes an application of Bayes' rule to overcome these deficiencies. The proposed method is shown to provide accurate species-level classification. The proposed methods also provide answers to important questions not addressable with current methods.
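
One simple way to apply Bayes' rule to characters rather than distances is to estimate, for each species, the frequency of each base at each aligned position and multiply those probabilities for a query barcode. The sketch below does this under assumptions that are not taken from the paper: positions are treated as independent, counts are smoothed with a pseudocount, sequences are pre-aligned and of equal length, and the species names and toy sequences are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: no gaps or ambiguity codes in the input


def fit_position_models(training, pseudocount=1.0):
    """Estimate P(base | species, position) from aligned training barcodes.

    training: dict mapping species name -> list of equal-length sequences.
    Returns: dict mapping species -> list (one dict of base probabilities per position).
    """
    models = {}
    for species, seqs in training.items():
        denom = len(seqs) + pseudocount * len(ALPHABET)
        per_position = []
        for pos in range(len(seqs[0])):
            counts = Counter(seq[pos] for seq in seqs)
            per_position.append(
                {b: (counts[b] + pseudocount) / denom for b in ALPHABET}
            )
        models[species] = per_position
    return models


def species_posterior(models, barcode, prior=None):
    """Posterior P(species | barcode) via Bayes' rule, computed in log space."""
    names = list(models)
    if prior is None:
        prior = {s: 1.0 / len(names) for s in names}  # uniform prior by default
    log_joint = {}
    for s in names:
        total = math.log(prior[s])
        for pos, base in enumerate(barcode):
            total += math.log(models[s][pos][base])
        log_joint[s] = total
    peak = max(log_joint.values())  # subtract the max before exponentiating
    unnorm = {s: math.exp(v - peak) for s, v in log_joint.items()}
    z = sum(unnorm.values())
    return {s: u / z for s, u in unnorm.items()}


if __name__ == "__main__":
    toy = {
        "species_a": ["ACGT", "ACGT", "ACGA"],
        "species_b": ["TTGT", "TTGA", "TTGT"],
    }
    print(species_posterior(fit_position_models(toy), "ACGT"))
```

Unlike a distance threshold, the output is a probability for every candidate species, so an uncertain assignment shows up as a flat posterior rather than a forced label.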