25,672 research outputs found

    Enhancing Automated Test Selection in Probabilistic Networks

    In diagnostic decision-support systems, test selection amounts to selecting, in a sequential manner, a test that is expected to yield the largest decrease in the uncertainty about a patient's diagnosis. For capturing this uncertainty, often an information measure is used. In this paper, we study the Shannon entropy, the Gini index, and the misclassification error for this purpose. We argue that the Gini index can be regarded as an approximation of the Shannon entropy and that the misclassification error can be looked upon as an approximation of the Gini index. We further argue that the differences between the first derivatives of the three functions can explain different test sequences in practice. Experimental results from using the measures with a real-life probabilistic network in oncology support our observations.
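    As an illustrative aside, not taken from the paper: for a binary diagnosis variable with probability p, the three uncertainty measures discussed above, and the first derivatives whose differences are argued to explain diverging test sequences, can be compared numerically with a few lines of Python (a minimal sketch under that binary assumption).

        import numpy as np

        # Probability of one of two diagnoses; endpoints excluded to avoid log(0).
        p = np.linspace(0.001, 0.999, 999)

        entropy = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))  # Shannon entropy (bits)
        gini    = 2 * p * (1 - p)                               # Gini index: 1 - p**2 - (1 - p)**2
        miscls  = np.minimum(p, 1 - p)                          # misclassification error

        # Numerical first derivatives; the paper argues that differences between
        # these explain why the measures can yield different test sequences.
        d_entropy, d_gini, d_miscls = (np.gradient(f, p) for f in (entropy, gini, miscls))

        for name, f in [("Shannon entropy", entropy), ("Gini index", gini),
                        ("misclassification error", miscls)]:
            print(f"{name:25s} max = {f.max():.3f} at p = {p[np.argmax(f)]:.3f}")

    All three measures peak at p = 0.5, but their shapes and derivatives differ, which is the property the paper exploits.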

    Reinforcement learning for efficient network penetration testing

    Penetration testing (also known as pentesting or PT) is a common practice for actively assessing the defenses of a computer network by planning and executing all possible attacks to discover and exploit existing vulnerabilities. Current penetration testing methods are increasingly becoming non-standard, composite and resource-consuming, despite the use of evolving tools. In this paper, we propose and evaluate an AI-based pentesting system which makes use of machine learning techniques, namely reinforcement learning (RL), to learn and reproduce average and complex pentesting activities. The proposed system, named the Intelligent Automated Penetration Testing System (IAPTS), consists of a module that integrates with industrial PT frameworks to enable them to capture information, learn from experience, and reproduce tests in future similar testing cases. IAPTS aims to save human resources while producing much-enhanced results in terms of time consumption, reliability and frequency of testing. IAPTS takes the approach of modeling PT environments and tasks as a partially observed Markov decision process (POMDP), which is solved with a POMDP solver. Although the scope of this paper is limited to PT planning for network infrastructures and not the entire practice, the obtained results support the hypothesis that RL can enhance PT beyond the capabilities of any human PT expert in terms of time consumed, covered attack vectors, accuracy and reliability of the outputs. In addition, this work tackles the complex problem of expertise capturing and re-use by allowing the IAPTS learning module to store and re-use PT policies in the same way that a human PT expert would learn, but in a more efficient way.
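    The abstract gives no implementation details, but the general idea of learning an attack policy can be illustrated, very loosely, with tabular Q-learning on a toy, fully observable attack graph. The paper itself models the problem as a POMDP and uses a dedicated POMDP solver; all state and action names below are hypothetical.

        import random

        # Toy stand-in for learning a pentest policy: states are hypothetical footholds,
        # actions are exploit attempts mapping to (next state, reward). This is a fully
        # observable simplification, not IAPTS's POMDP formulation.
        ACTIONS = {
            "external":      {"exploit_web": ("dmz_host", -1)},
            "dmz_host":      {"pivot_smb": ("internal_host", -1), "rescan": ("dmz_host", -1)},
            "internal_host": {"dump_creds": ("domain_admin", 10), "rescan": ("internal_host", -1)},
            "domain_admin":  {},                              # goal state: nothing left to do
        }

        Q = {s: {a: 0.0 for a in acts} for s, acts in ACTIONS.items()}
        alpha, gamma, eps = 0.5, 0.95, 0.2                    # learning rate, discount, exploration

        random.seed(0)
        for _ in range(2000):                                 # episodes of simulated pentests
            s = "external"
            while ACTIONS[s]:
                a = (random.choice(list(ACTIONS[s])) if random.random() < eps
                     else max(Q[s], key=Q[s].get))
                s2, r = ACTIONS[s][a]
                target = r + gamma * (max(Q[s2].values()) if Q[s2] else 0.0)
                Q[s][a] += alpha * (target - Q[s][a])         # standard Q-learning update
                s = s2

        print({s: max(acts, key=acts.get) for s, acts in Q.items() if acts})  # learned policy

    The learned greedy policy chains the exploits toward the goal while avoiding the wasteful "rescan" actions, which is the kind of behaviour the paper aims to capture and reuse at a much larger scale.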

    Novel Bayesian Networks for Genomic Prediction of Developmental Traits in Biomass Sorghum.

    The ability to connect genetic information between traits over time allows Bayesian networks to offer a powerful probabilistic framework to construct genomic prediction models. In this study, we phenotyped a diversity panel of 869 biomass sorghum (Sorghum bicolor (L.) Moench) lines, which had been genotyped with 100,435 SNP markers, for plant height (PH) with biweekly measurements from 30 to 120 days after planting (DAP) and for end-of-season dry biomass yield (DBY) in four environments. We evaluated five genomic prediction models: Bayesian network (BN), Pleiotropic Bayesian network (PBN), Dynamic Bayesian network (DBN), multi-trait GBLUP (MTr-GBLUP), and multi-time GBLUP (MTi-GBLUP). In fivefold cross-validation, prediction accuracies ranged from 0.46 (PBN) to 0.49 (MTr-GBLUP) for DBY and from 0.47 (DBN, DAP120) to 0.75 (MTi-GBLUP, DAP60) for PH. Forward-chaining cross-validation further improved prediction accuracies of the DBN, MTi-GBLUP and MTr-GBLUP models for PH (training slice: 30-45 DAP) by 36.4-52.4% relative to the BN and PBN models. Coincidence indices (target: biomass, secondary: PH) and a coincidence index based on lines (PH time series) showed that the ranking of lines by PH changed minimally after 45 DAP. These results suggest that a two-level indirect selection method for PH at harvest (first-level target trait) and DBY (second-level target trait) could be conducted earlier in the season based on the ranking of lines by PH at 45 DAP (secondary trait). With the advance of high-throughput phenotyping technologies, our proposed two-level indirect selection framework could be valuable for enhancing genetic gain per unit of time when selecting on developmental traits.
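    None of the five models is spelled out in the abstract, but the evaluation protocol (fivefold cross-validation with prediction accuracy measured as the correlation between predicted and observed phenotypes) can be sketched with a GBLUP-style ridge regression on simulated marker data. The data dimensions and the ridge penalty below are placeholders, not the study's.

        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import KFold

        # Simulated stand-in: 200 lines x 1,000 SNP markers (the study used 869 lines
        # and 100,435 markers); phenotype = additive marker effects + noise.
        rng = np.random.default_rng(0)
        X = rng.integers(0, 3, size=(200, 1000)).astype(float)   # genotypes coded 0/1/2
        y = X @ rng.normal(0, 0.1, size=1000) + rng.normal(0, 1.0, size=200)

        # Fivefold cross-validation; accuracy = correlation between predicted and
        # observed phenotypes, as in the study's evaluation.
        accs = []
        for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
            model = Ridge(alpha=100.0).fit(X[train], y[train])   # ridge as a GBLUP-like marker model
            accs.append(np.corrcoef(model.predict(X[test]), y[test])[0, 1])

        print(f"mean fivefold prediction accuracy: {np.mean(accs):.2f}")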

    Rapid design of tool-wear condition monitoring systems for turning processes using novelty detection

    Condition monitoring systems for manufacturing processes have been recognised in recent years as one of the key technologies that provide a competitive advantage in many manufacturing environments. They provide an essential means to reduce cost, increase productivity, improve quality and prevent damage to the machine or workpiece. Turning operations are among the most common manufacturing processes in industry, used to manufacture round objects such as shafts, spindles and pins. Despite recent developments and intensive engineering research, the development of tool-wear monitoring systems for turning remains an ongoing challenge. In this paper, force signals are used for monitoring tool wear in a feature fusion model. A novel approach to the design of condition monitoring systems for turning operations using a novelty detection algorithm is presented. The results show that the developed system can be used for the rapid design of condition monitoring systems for turning operations to predict tool wear.
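    The abstract does not name the novelty detection algorithm, so the sketch below uses a one-class SVM as a stand-in: it is fitted only on force-signal features from a fresh tool (the "normal" condition) and flags departures from that behaviour as possible tool wear. The feature values are simulated placeholders, not measurements from the paper.

        import numpy as np
        from sklearn.svm import OneClassSVM

        # Simulated force-signal features per machining pass (e.g. mean, RMS and peak
        # cutting force); values are placeholders.
        rng = np.random.default_rng(1)
        fresh_tool = rng.normal(loc=[100.0, 110.0, 150.0], scale=5.0, size=(200, 3))
        worn_tool  = rng.normal(loc=[130.0, 145.0, 200.0], scale=8.0, size=(50, 3))

        # Train only on the normal (fresh-tool) condition, then treat anything the
        # model rejects (-1) as novel, i.e. a possible sign of tool wear.
        detector = OneClassSVM(nu=0.05, gamma="scale").fit(fresh_tool)
        print("fresh passes flagged as novel:", float(np.mean(detector.predict(fresh_tool) == -1)))
        print("worn passes flagged as novel: ", float(np.mean(detector.predict(worn_tool) == -1)))

    Training on normal data only is what makes the approach attractive for rapid system design: no worn-tool examples are needed before deployment.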

    Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems

    We propose a simple, powerful, and flexible machine learning framework for (i) reducing the search space of computationally difficult enumeration variants of subset problems and (ii) augmenting existing state-of-the-art solvers with informative cues arising from the input distribution. We instantiate our framework for the problem of listing all maximum cliques in a graph, a central problem in network analysis, data mining, and computational biology. We demonstrate the practicality of our approach on real-world networks with millions of vertices and edges by not only retaining all optimal solutions, but also aggressively pruning the input instance size, resulting in several-fold speedups of state-of-the-art algorithms. Finally, we explore the limits of scalability and robustness of our proposed framework, suggesting that supervised learning is viable for tackling NP-hard problems in practice.
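    The paper's exact features and learner are not given in the abstract, so the sketch below is only a schematic of the prune-then-enumerate idea: score vertices with simple structural features, drop the lowest-scoring ones, and run an off-the-shelf clique enumerator on the reduced graph. For brevity it trains and prunes on the same random graph, whereas the framework learns from the input distribution; the check at the end simply reports whether the optimum survived the pruning.

        import networkx as nx
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        G = nx.gnm_random_graph(300, 2000, seed=0)             # toy input instance

        def vertex_features(g):
            core, clus = nx.core_number(g), nx.clustering(g)
            return np.array([[g.degree(v), core[v], clus[v]] for v in g])

        cliques = list(nx.find_cliques(G))                     # maximal cliques
        omega = max(len(c) for c in cliques)                   # maximum clique size
        in_max = {v for c in cliques if len(c) == omega for v in c}
        labels = np.array([int(v in in_max) for v in G])       # "belongs to a maximum clique"

        clf = LogisticRegression(max_iter=1000, class_weight="balanced")
        scores = clf.fit(vertex_features(G), labels).predict_proba(vertex_features(G))[:, 1]

        keep = [v for v, s in zip(G, scores) if s >= np.quantile(scores, 0.5)]  # prune lower half
        H = G.subgraph(keep)
        print(f"{G.number_of_nodes()} -> {H.number_of_nodes()} vertices; "
              f"optimum preserved: {max(len(c) for c in nx.find_cliques(H)) == omega}")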