
    The classification performance of Bayesian Networks Classifiers: a case study of detecting Denial of Service (DoS) attacks in cloud computing environments

    In this research we propose a Bayesian networks approach as a promising classification technique for detecting malicious traffic due to Denial of Service (DoS) attacks. Bayesian networks have been applied in numerous fields fraught with uncertainty and have proven successful; they have excelled in classification tasks such as text analysis, medical diagnosis, and environmental modeling and management. The detection of DoS attacks has received tremendous attention in the field of network security, as DoS attacks have proved detrimental and are the bane of cloud computing environments. Large business enterprises have been, or still are, unwilling to outsource their businesses to the cloud due to the intrusive tendencies that cloud platforms are prone to. To make use of Bayesian networks it is imperative to understand the "ecosystem" of factors that are external to modeling the Bayesian algorithm itself. Understanding these factors has proven to yield improvements in classification performance comparable to those gained by augmenting the algorithms themselves. The literature discusses the factors that impact classification capability, but their effects are not universal; they tend to be unique to each domain problem. This study investigates the effects of modeling parameters on the classification performance of Bayesian network classifiers in detecting DoS attacks on cloud platforms. We analyzed how structural complexity, training sample size, the choice of discretization method, and the choice of score function, both individually and collectively, impact the performance of classifying between normal traffic and DoS attacks on the cloud. To study these factors, we conducted a series of experiments detecting live DoS attacks launched against a deployed cloud and then examined the classification accuracy of different classes of Bayesian networks. The NSL-KDD dataset was used as our training set. We used ownCloud software to deploy our cloud platform and the hping3 utility to launch DoS attacks. A live packet capture was used as our test set, and WEKA version 3.7.12 was used for our experiments. Our results show that increasing model complexity improves classification performance, which we attribute to the larger number of attribute correlations captured. Increasing the training sample size also improved classification ability. Our findings indicate that the choice of discretization algorithm matters in the quest for optimal classification performance, whereas the choice of scoring function does not affect the classification performance of Bayesian networks. Conclusions drawn from this research are prescriptive, particularly for a novice machine learning researcher, with valuable recommendations that help ensure optimal classification performance of Bayesian network classifiers.
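
    The study's experiments were run in WEKA; as a hedged illustration of the kind of comparison it describes (how the choice of discretization method can shift a Bayesian classifier's accuracy), the Python sketch below uses scikit-learn with naive Bayes, the simplest Bayesian network classifier. The random feature matrix and labels are stand-ins for NSL-KDD features, not the study's setup.

```python
# Minimal sketch: how the discretization method can change the accuracy of a
# simple Bayesian network classifier (naive Bayes). The data below is a
# synthetic placeholder; loading NSL-KDD is omitted and left as an assumption.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.naive_bayes import CategoricalNB

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                  # placeholder numeric features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # placeholder normal/DoS labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for strategy in ("uniform", "quantile", "kmeans"):  # three discretization methods
    disc = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy=strategy)
    clf = CategoricalNB(min_categories=5)           # avoid unseen-bin errors
    clf.fit(disc.fit_transform(X_tr).astype(int), y_tr)
    acc = clf.score(disc.transform(X_te).astype(int), y_te)
    print(f"{strategy:>8}: accuracy = {acc:.3f}")
```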

    Applying MDL to Learning Best Model Granularity

    The Minimum Description Length (MDL) principle is solidly based on a provably ideal method of inference using Kolmogorov complexity. We test how the theory behaves in practice on a general problem in model selection: that of learning the best model granularity. The performance of a model depends critically on the granularity, for example the choice of precision of the parameters. Too high a precision generally involves modeling accidental noise, and too low a precision may lead to confusion of models that should be distinguished. This precision is often determined ad hoc. In MDL the best model is the one that most compresses a two-part code of the data set: this embodies "Occam's Razor." In two quite different experimental settings the theoretical value determined using MDL coincides with the best value found experimentally. In the first experiment the task is to recognize isolated handwritten characters in one subject's handwriting, irrespective of size and orientation. Based on a new modification of elastic matching, using multiple prototypes per character, the optimal prediction rate is predicted for the learned parameter (length of sampling interval) considered most likely by MDL, which is shown to coincide with the best value found experimentally. In the second experiment the task is to model a robot arm with two degrees of freedom using a three-layer feed-forward neural network, where we need to determine the number of nodes in the hidden layer giving the best modeling performance. The optimal model (the one that extrapolates best on unseen examples) is predicted for the number of hidden-layer nodes considered most likely by MDL, which again is found to coincide with the best value found experimentally.

    Comment: LaTeX, 32 pages, 5 figures. Artificial Intelligence journal, to appear.
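
    For reference, the selection rule the paper instantiates is the generic two-part MDL criterion; the notation below is generic, not quoted from the paper:

```latex
% Two-part MDL: choose the model M (here, the granularity setting) that
% minimizes the total code length of the model plus the data given the model.
M^{*} = \arg\min_{M \in \mathcal{M}} \left[ L(M) + L(D \mid M) \right]
% For a model with k real-valued parameters fitted to n data points, L(M) is
% commonly approximated by (k/2) \log n, recovering the BIC form of the rule.
```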

    Application of an efficient Bayesian discretization method to biomedical data

    Background: Several data mining methods require data that are discrete, and other methods often perform better with discrete data. We introduce an efficient Bayesian discretization (EBD) method for optimal discretization of variables that runs efficiently on high-dimensional biomedical datasets. The EBD method consists of two components: a Bayesian score to evaluate discretizations and a dynamic programming search procedure to efficiently search the space of possible discretizations. We compared the performance of EBD to Fayyad and Irani's (FI) discretization method, which is commonly used for discretization.

    Results: On 24 biomedical datasets obtained from high-throughput transcriptomic and proteomic studies, the classification performance of the C4.5 classifier and the naïve Bayes classifier was statistically significantly better when the predictor variables were discretized using EBD rather than FI. EBD was statistically significantly more stable to the variability of the datasets than FI. However, EBD was less robust than FI, though not statistically significantly so, and produced slightly more complex discretizations.

    Conclusions: On a range of biomedical datasets, a Bayesian discretization method (EBD) yielded better classification performance and stability but was less robust than the widely used FI discretization method. The EBD discretization method is easy to implement, permits the incorporation of prior knowledge and belief, and is sufficiently fast for application to high-dimensional data.
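
    EBD's search component is a dynamic program over candidate cut points of a sorted variable. The Python sketch below shows that generic optimal-partition recurrence; the interval score used here is a penalized log-likelihood stand-in, not the paper's Bayesian score, and all names are illustrative.

```python
# Generic dynamic program for optimal discretization of a sorted variable:
# best[j] = max over i < j of best[i] + score(i, j), where score(i, j)
# evaluates the interval covering positions i..j-1. EBD plugs a Bayesian
# score into this O(n^2) recurrence; we use a stand-in score instead.
import math
from collections import Counter

def optimal_discretization(values, score):
    n = len(values)
    best = [0.0] * (n + 1)   # best[0] = 0 for the empty prefix
    back = [0] * (n + 1)     # backpointers to recover the cut points
    for j in range(1, n + 1):
        best[j], back[j] = max((best[i] + score(i, j), i) for i in range(j))
    cuts, j = [], n          # walk the backpointers to list the intervals
    while j > 0:
        cuts.append((back[j], j))
        j = back[j]
    return cuts[::-1], best[n]

def penalized_loglik(labels, penalty=1.0):
    # Stand-in interval score: within-interval log-likelihood of the class
    # labels, minus a fixed penalty per interval (NOT the EBD Bayesian score).
    def score(i, j):
        counts = Counter(labels[i:j])
        total = j - i
        ll = sum(c * math.log(c / total) for c in counts.values())
        return ll - penalty
    return score

values = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]   # assumed pre-sorted
labels = [0, 0, 0, 1, 1, 1]
cuts, total = optimal_discretization(values, penalized_loglik(labels))
print(cuts, total)   # -> [(0, 3), (3, 6)]: the two pure intervals recovered
```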

    Scalable Population Synthesis with Deep Generative Modeling

    Population synthesis is concerned with the generation of synthetic yet realistic representations of populations. It is a fundamental problem in the modeling of transport, where synthetic populations of micro-agents represent a key input to most agent-based models. In this paper, a new methodological framework for how to 'grow' pools of micro-agents is presented. The framework adopts a deep generative modeling approach from machine learning based on a Variational Autoencoder (VAE). Compared to previous population synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs sampling, and traditional generative models such as Bayesian Networks or Hidden Markov Models, the proposed method allows fitting the full joint distribution in high dimensions. The proposed methodology is compared with a conventional Gibbs sampler and a Bayesian Network using a large-scale Danish trip diary. It is shown that, while these two methods outperform the VAE in the low-dimensional case, they both suffer from scalability issues when the number of modeled attributes increases. It is also shown that the Gibbs sampler essentially replicates the agents from the original sample when the required conditional distributions are estimated as frequency tables. In contrast, the VAE addresses the problem of sampling zeros by generating agents that are virtually different from those in the original data but have similar statistical properties. The presented approach can support agent-based modeling at all levels by enabling richer synthetic populations with smaller zones and more detailed individual characteristics.

    Comment: 27 pages, 15 figures, 4 tables.
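
    For readers new to the model class, here is a minimal VAE sketch in Python/PyTorch. The layer sizes, the binary encoding of agent attributes, and the prior-sampling step are illustrative assumptions, not the paper's architecture.

```python
# Minimal Variational Autoencoder sketch (PyTorch). Agent attributes are
# assumed one-hot/binary encoded; all sizes are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, n_features=64, n_latent=8, n_hidden=128):
        super().__init__()
        self.enc = nn.Linear(n_features, n_hidden)
        self.mu = nn.Linear(n_hidden, n_latent)
        self.logvar = nn.Linear(n_hidden, n_latent)
        self.dec1 = nn.Linear(n_latent, n_hidden)
        self.dec2 = nn.Linear(n_hidden, n_features)

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.mu(h), self.logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)   # z ~ N(mu, sigma^2)

    def decode(self, z):
        return torch.sigmoid(self.dec2(F.relu(self.dec1(z))))

    def forward(self, x):
        mu, logvar = self.encode(x)
        return self.decode(self.reparameterize(mu, logvar)), mu, logvar

def elbo_loss(x_hat, x, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# Growing synthetic agents: draw z from the prior and decode (untrained here).
model = VAE()
with torch.no_grad():
    z = torch.randn(10, 8)
    synthetic = model.decode(z)   # per-attribute probabilities for 10 agents
```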

    Empirical evaluation of scoring functions for Bayesian network model selection

    In this work, we empirically evaluate the capability of various scoring functions of Bayesian networks for recovering true underlying structures. Similar investigations have been carried out before, but they typically relied on approximate learning algorithms to learn the network structures. The suboptimal structures found by the approximation methods have unknown quality and may affect the reliability of their conclusions. Our study uses an optimal algorithm to learn Bayesian network structures from datasets generated from a set of gold-standard Bayesian networks. Because optimal algorithms always learn equivalent networks, this ensures that only the choice of scoring function affects the learned networks. Another shortcoming of the previous studies stems from their use of random synthetic networks as test cases, with no guarantee that these networks reflect real-world data. We use real-world data to generate our gold-standard structures, so our experimental design more closely approximates real-world situations. A major finding of our study is that, in contrast to results reported by several prior works, the Minimum Description Length (MDL) score (or equivalently, the Bayesian information criterion, BIC) consistently outperforms other scoring functions such as Akaike's information criterion (AIC), the Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML) in recovering the underlying Bayesian network structures. We believe this finding results from using datasets generated from real-world applications rather than from random processes, and from using learning algorithms that select high-scoring structures rather than random models. Other findings of our study support existing work, e.g., large sample sizes result in learning structures closer to the true underlying structure; the BDeu score is sensitive to its parameter settings; and fNML performs well on small datasets. We also tested a greedy hill-climbing algorithm and observed results similar to those of the optimal algorithm.
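
    The BIC/MDL score that came out on top decomposes over nodes, which is what makes exact structure search feasible. Below is a hedged Python sketch of the per-node BIC computation from count data; the DataFrame and variable names are hypothetical.

```python
# BIC score (MDL up to sign convention) for one node of a discrete Bayesian
# network: score(X | Pa) = LL - (log N / 2) * q * (r - 1), where q is the
# number of observed parent configurations and r the number of states of X.
# The full network score is the sum of these per-node terms.
import numpy as np
import pandas as pd

def bic_node_score(df, node, parents):
    N = len(df)
    r = df[node].nunique()
    if parents:
        groups = df.groupby(parents)[node]
        q = groups.ngroups
        ll = 0.0
        for _, col in groups:                       # one parent configuration
            counts = col.value_counts().to_numpy()
            ll += np.sum(counts * np.log(counts / len(col)))
    else:
        q = 1
        counts = df[node].value_counts().to_numpy()
        ll = np.sum(counts * np.log(counts / N))
    penalty = 0.5 * np.log(N) * q * (r - 1)         # free parameters of X | Pa
    return ll - penalty

# Hypothetical usage: score X3 with parents {X1, X2} on random binary data.
df = pd.DataFrame(np.random.randint(0, 2, size=(500, 3)),
                  columns=["X1", "X2", "X3"])
print(bic_node_score(df, "X3", ["X1", "X2"]))
```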

    Proceedings of the Fifth Workshop on Information Theoretic Methods in Science and Engineering

    These are the online proceedings of the Fifth Workshop on Information Theoretic Methods in Science and Engineering (WITMSE), which was held in the Trippenhuis, Amsterdam, in August 2012.