
    The use of data-mining for the automatic formation of tactics

    This paper discusses the use of data-mining for the automatic formation of tactics. It was presented at the Workshop on Computer-Supported Mathematical Theory Development held at IJCAR in 2004. The aim of this project is to evaluate the applicability of data-mining techniques to the automatic formation of tactics from large corpora of proofs. We data-mine information from large proof corpora to find commonly occurring patterns. These patterns are then evolved into tactics using genetic programming techniques.

    Detecting Irrelevant subtrees to improve probabilistic learning from tree-structured data

    Faced with the large increase in the amount of available structured data (such as XML documents), many algorithms have emerged for dealing with tree-structured data. In this article, we present a probabilistic approach that aims at a posteriori pruning of noisy or irrelevant subtrees in a set of trees. The originality of this approach, in comparison with classic data reduction techniques, comes from the fact that only a part of a tree (i.e. a subtree) can be deleted, rather than the whole tree itself. Our method is based on the use of confidence intervals, on a partition of subtrees, computed according to a given probability distribution. We propose an original approach to assess these intervals on tree-structured data and we experimentally show its benefit in the presence of noise.

    Mining XML Documents

    XML documents are becoming ubiquitous because of their rich and flexible format, which can be used for a variety of applications. Given the increasing size of XML collections as information sources, mining techniques that traditionally exist for text collections or databases need to be adapted, and new methods need to be invented, to exploit the particular structure of XML documents. XML documents can be seen as trees, which are well known to be complex structures. This chapter describes various ways of using and simplifying this tree structure to model documents and support efficient mining algorithms. We focus on three mining tasks: classification and clustering, which are standard for text collections, and the discovery of frequent tree structures, which is especially important for heterogeneous collections. This chapter presents some recent approaches and algorithms to support these tasks, together with experimental evaluation on a variety of large XML collections.

    Machine learning and its applications in reliability analysis systems

    In this thesis, we explore some aspects of Machine Learning (ML) and its application in Reliability Analysis systems (RAs). We begin by investigating some ML paradigms and their techniques, go on to discuss the possible applications of ML in improving RAs performance, and lastly give guidelines for the architecture of learning RAs. Our survey of ML covers both Neural Network learning and Symbolic learning. In symbolic process learning, five types of learning and their applications are discussed: rote learning, learning from instruction, learning from analogy, learning from examples, and learning from observation and discovery. The Reliability Analysis systems (RAs) presented in this thesis are mainly designed for maintaining plant safety, supported by two functions: a risk analysis function, i.e., failure mode effect analysis (FMEA); and a diagnosis function, i.e., real-time fault location (RTFL). Three approaches to creating the RAs are discussed. According to the result of our survey, we suggest that the best current design of RAs is to embed model-based RAs, i.e., MORA (as software), in a neural network based computer system (as hardware). However, there are still some improvements that can be made through the application of Machine Learning. By implanting the 'learning element', the MORA will become the learning MORA (La MORA) system, a learning Reliability Analysis system with the power of automatic knowledge acquisition, inconsistency checking, and more. To conclude our thesis, we propose an architecture of La MORA.

    Foundations of Software Science and Computation Structures

    This open access book constitutes the proceedings of the 22nd International Conference on Foundations of Software Science and Computational Structures, FOSSACS 2019, which took place in Prague, Czech Republic, in April 2019, and was held as part of the European Joint Conference on Theory and Practice of Software, ETAPS 2019. The 29 papers presented in this volume were carefully reviewed and selected from 85 submissions. They deal with foundational research with a clear significance for software science.

    Synthesizing Program Input Grammars

    We present an algorithm for synthesizing a context-free grammar encoding the language of valid program inputs from a set of input examples and blackbox access to the program. Our algorithm addresses shortcomings of existing grammar inference algorithms, which both severely overgeneralize and are prohibitively slow. Our implementation, GLADE, leverages the grammar synthesized by our algorithm to fuzz test programs with structured inputs. We show that GLADE substantially increases the incremental coverage on valid inputs compared to two baseline fuzzers.

    Inferring Different Types of Lindenmayer Systems Using Artificial Intelligence

    Lindenmayer systems (L-systems) are a formal grammar system which consists of a set of rewriting rules. Each rewriting rule comprises a symbol to replace (predecessor), a replacement string (successor), and an optional condition that is necessary for replacement. Starting with an initial string, every symbol in the string is replaced in parallel in accordance with the conditions on the rewriting rules, to produce a new string. The replacement process iterates as needed to produce a sequence of strings. There are different types of L-systems, which allow for different types of conditions, and methods of selecting the rules to apply. Some symbols of the alphabet can be interpreted as instructions for simulation software towards process modelling, where each string describes another step of the simulated process. Typically, creating an L-system for a specific process is done by experts by making meticulous measurements and using a priori knowledge about the process. It would be desirable to have a method to automatically learn the L-systems (the simulation program) from data, such as from a temporal sequence of images. This thesis presents a suite of tools, collectively called the Plant Model Inference Tools or PMIT (despite the name, the tools are domain agnostic), for inferring different types of L-systems using only a sequence of strings describing the process over some initial time period. Variants of PMIT are created for deterministic context-free L-systems, stochastic L-systems, and parametric L-systems. They are each evaluated using existing known deterministic and parametric L-systems from the literature, and procedurally generated stochastic L-systems. Accuracy can be assessed in various ways, such as checking whether the inferred L-system is equal to the original one. PMIT is able to correctly infer deterministic L-systems with up to 31 symbols in the alphabet compared to the previous state-of-the-art algorithm's limit of 2 symbols.
Stochastic L-systems allow symbols in the alphabet to have multiple rewriting rules, each with an associated probability of being selected. Evaluating stochastic L-system inference with 960 procedurally generated L-systems with multiple sequences of strings as input found the following: 1) when 3 input sequences are used, the inferred successors always matched the original successors for systems with up to 9 rewriting rules, 2) when 6 sequences of strings are used, the difference between the associated probabilities of the inferred and the original L-system is approximately 1%. Parametric L-systems allow symbols to have multiple rewriting rules with parameters that get passed during rewriting. Rule selection is based on an associated Boolean condition over the parameters that gets evaluated to choose the rule to be applied. Inference is done in two steps. In the first step, the successors are inferred, and in the second step, appropriate Boolean conditions are found. Parametric L-system inference was evaluated on 20 known parametric L-systems. For 18 of the 20 L-systems where all successors were non-empty, the successors were correctly identified, but the time taken was up to 26 days on a single-core CPU for the largest L-system. The second step, inferring the Boolean conditions, was successful for all 20 systems in the test set. No previous algorithm from the literature had implemented stochastic or parametric L-system inference. Inferring L-systems of greater complexity algorithmically can save considerable time and effort versus constructing them manually; however, perhaps more importantly, rather than relying on existing knowledge, inferring a simulation of a process from data can help reveal the underlying scientific principles of the process.
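
The parallel rewriting step described in this abstract can be illustrated with a minimal Python sketch. It only covers the forward direction for a deterministic context-free L-system (generating the sequence of strings from known rules) and is not the PMIT inference method. The function names `rewrite` and `derive` are hypothetical, and the rule set is Lindenmayer's classic algae example rather than a system taken from the thesis.

```python
def rewrite(string, rules):
    """Apply one parallel rewriting step: every symbol is replaced at once.

    Symbols without a rule are copied unchanged (treated as constants).
    """
    return "".join(rules.get(symbol, symbol) for symbol in string)


def derive(axiom, rules, steps):
    """Return the sequence of strings produced from the axiom over `steps` iterations."""
    strings = [axiom]
    for _ in range(steps):
        strings.append(rewrite(strings[-1], rules))
    return strings


# Illustrative rule set (Lindenmayer's algae system): A -> AB, B -> A.
ALGAE_RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    for step, string in enumerate(derive("A", ALGAE_RULES, 4)):
        print(step, string)
```

A sequence such as the one returned by `derive` is the kind of input the abstract describes for inference: the task there is to recover the rules given only the strings.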

    Decision Trees and Transient Stability of Electric Power Systems

    An inductive inference method for the automatic building of decision trees is investigated. Among its various tasks, the splitting and the stop-splitting criteria, successively applied to the nodes of a grown tree, are found to play a crucial role in its overall shape and performance. The application of this general method to transient stability is systematically explored. Parameters related to the stop-splitting criterion, to the learning set and to the tree classes are thus considered, and their influence on the tree features is scrutinized. Evaluation criteria appropriate to assess accuracy are also compared. Various tradeoffs are further examined, such as complexity vs number of classes, or misclassification rate vs type of misclassification errors. Possible uses of the trees are also envisaged. Computational issues relating to the building and the use of trees are finally discussed.
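
To make the roles of a splitting criterion and a stop-splitting criterion concrete, here is a minimal Python sketch for a single numeric attribute. The abstract does not say which criteria the paper uses, so everything here is an assumption for illustration: entropy-based information gain as the splitting score, and a stop rule based on node purity, a minimum node size, and a minimum gain. The names `entropy`, `best_split`, `should_stop`, `min_gain`, and `min_size` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the class labels at a node."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, gain) for the best binary split `value <= threshold`.

    Returns (None, 0.0) when no split yields a positive information gain.
    """
    base = entropy(labels)
    best = (None, 0.0)
    # The largest value is skipped so that the right-hand side is never empty.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for val, lab in zip(values, labels) if val <= threshold]
        right = [lab for val, lab in zip(values, labels) if val > threshold]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - weighted
        if gain > best[1]:
            best = (threshold, gain)
    return best


def should_stop(labels, gain, min_gain=0.05, min_size=4):
    """Assumed stop rule: pure node, too few samples, or too little gain."""
    return len(set(labels)) == 1 or len(labels) < min_size or gain < min_gain


if __name__ == "__main__":
    # Made-up toy data, not taken from the paper.
    values = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
    labels = ["stable"] * 3 + ["unstable"] * 3
    threshold, gain = best_split(values, labels)
    print(threshold, gain, should_stop(labels, gain))
```

Raising `min_gain` or `min_size` in this sketch stops splitting earlier and yields smaller trees, which is the kind of influence on tree shape that the abstract says the paper examines.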

    Foundations of Software Science and Computation Structures

    This open access book constitutes the proceedings of the 23rd International Conference on Foundations of Software Science and Computational Structures, FOSSACS 2020, which took place in Dublin, Ireland, in April 2020, and was held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The 31 regular papers presented in this volume were carefully reviewed and selected from 98 submissions. The papers cover topics such as categorical models and logics; language theory, automata, and games; modal, spatial, and temporal logics; type theory and proof theory; concurrency theory and process calculi; rewriting theory; semantics of programming languages; program analysis, correctness, transformation, and verification; logics of programming; software specification and refinement; models of concurrent, reactive, stochastic, distributed, hybrid, and mobile systems; emerging models of computation; logical aspects of computational complexity; models of software security; and logical foundations of databases.