874 research outputs found

    Graph-based Modelling of Concurrent Sequential Patterns

    Structural relation patterns have been introduced recently to extend the search for complex patterns often hidden behind large sequences of data. This has motivated a novel approach to sequential patterns post-processing, and a corresponding data mining method was proposed for Concurrent Sequential Patterns (ConSP). This article refines the approach in the context of ConSP modelling, where a companion graph-based model is devised as an extension of previous work. Two new modelling methods are presented here together with a construction algorithm, to complete the transformation of concurrent sequential patterns to a ConSP-Graph representation. Customer orders data is used to demonstrate the effectiveness of ConSP mining, while synthetic sample data highlights the strength of the modelling technique, illuminating the theories developed.

    An efficient parallel method for mining frequent closed sequential patterns

    Mining frequent closed sequential patterns (FCSPs) has attracted a great deal of research attention because it is an important task in sequence mining. Recently, many studies have focused on mining frequent closed sequential patterns because such patterns have proved to be more efficient and compact than frequent sequential patterns. Information can be fully extracted from frequent closed sequential patterns. In this paper, we propose an efficient parallel approach called parallel dynamic bit vector frequent closed sequential patterns (pDBV-FCSP) using multi-core processor architecture for mining FCSPs from large databases. The pDBV-FCSP divides the search space to reduce the required storage space and performs closure checking of prefix sequences early to reduce execution time for mining frequent closed sequential patterns. This approach overcomes the problems of parallel mining such as overhead of communication, synchronization, and data replication. It also solves the load balance issues of the workload between the processors with a dynamic mechanism that redistributes the work when some processes are out of work, to minimize idle CPU time.

    Formal Languages and Compilation

    This textbook describes the essential principles and methods used for defining the syntax of artificial languages, and for designing efficient parsing algorithms and syntax-directed translators with semantic attributes. A comprehensive selection of topics is presented within a rigorous, unified framework, illustrated by numerous practical examples. Features and topics: presents a novel conceptual approach to parsing algorithms that applies to extended BNF grammars, together with a parallel parsing algorithm; supplies supplementary teaching tools, including course slides and exercises with solutions, at an associated website; unifies the concepts and notations used in different approaches, enabling an extended coverage of methods with a reduced number of definitions; systematically discusses ambiguous forms, allowing readers to avoid pitfalls when designing grammars; describes all algorithms in pseudocode, so that detailed knowledge of a specific programming language is not necessary; makes extensive use of theoretical models of automata, transducers and formal grammars; includes concise coverage of algorithms for processing regular expressions and finite automata; and introduces static program analysis based on flow equations. This clearly written, classroom-tested textbook is an ideal guide to the fundamentals of this field for advanced undergraduate and graduate students in computer science and computer engineering. Some background in programming is required, and readers should also be familiar with basic set theory, algebra and logic.

    From sequential patterns to concurrent branch patterns: a new post sequential patterns mining approach

    A thesis submitted for the degree of Doctor of Philosophy of the University of Bedfordshire. Sequential patterns mining is an important pattern discovery technique used to identify frequently observed sequential occurrences of items across ordered transactions over time. It has been intensively studied and there exists a great diversity of algorithms. However, there is a major problem associated with conventional sequential patterns mining in that the patterns derived are often large and not very easy to understand or use. In addition, more complex relations among events are often hidden behind sequences. A novel model for sequential patterns called Sequential Patterns Graph (SPG) is proposed. The construction algorithm of SPG is presented with experimental results to substantiate the concept. The thesis then sets out to define some new structural patterns, such as concurrent branch patterns, exclusive patterns and iterative patterns, which are generally hidden behind sequential patterns. Finally, an integrative framework, named Post Sequential Patterns Mining (PSPM), which is based on sequential patterns mining, is also proposed for the discovery and visualisation of structural patterns. This thesis is intended to prove that discrete sequential patterns derived from traditional sequential patterns mining can be modelled graphically using SPG. It is concluded from experiments and theoretical studies that SPG is not only a minimal representation of sequential patterns mining, but it also represents the interrelation among patterns and establishes further the foundation for mining structural knowledge (i.e. concurrent branch patterns, exclusive patterns and iterative patterns). From experiments conducted on both synthetic and real datasets, it is shown that Concurrent Branch Patterns (CBP) mining is an effective and efficient mining algorithm suitable for concurrent branch patterns.
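
    As a rough illustration of what "frequently observed sequential occurrence" means in sequential patterns mining, the sketch below counts how often a candidate pattern appears as an ordered subsequence across a small database. It is not the thesis's SPG or CBP algorithm; the names `is_subsequence`, `support` and the toy `database` are hypothetical, and each transaction is simplified to a single item rather than an itemset.

```python
from typing import List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if the items of `pattern` occur in `sequence` in the same order."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is respected.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> float:
    """Fraction of sequences in `database` that contain `pattern` (assumed definition)."""
    hits = sum(1 for sequence in database if is_subsequence(pattern, sequence))
    return hits / len(database)


# Toy data, invented for illustration only.
database = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a"],
]

ac_support = support(["a", "c"], database)  # contained in the first two sequences
ba_support = support(["b", "a"], database)  # contained only in the last sequence
```

    A pattern would then count as frequent when its support reaches a user-chosen minimum threshold, which is an assumption of this sketch rather than a value given in the abstract.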

    The Design & Implementation of an Abstract Semantic Graph for Statement-Level Dynamic Analysis of C++ Applications

    In this thesis, we describe our system, Hylian, for statement-level analysis, both static and dynamic, of a C++ application. We begin by extending the GNU gcc parser to generate parse trees in XML format for each of the compilation units in a C++ application. We then provide verification that the generated parse trees are structurally equivalent to the code in the original C++ application. We use the generated parse trees, together with an augmented version of the gcc test suite, to recover a grammar for the C++ dialect that we parse. We use the recovered grammar to generate a schema for further verification of the parse trees and evaluate the coverage provided by our C++ test suite. We then extend the parse tree, for each compilation unit, with semantic information to form an abstract semantic graph (ASG), and then link the ASGs for all of the compilation units into a unified ASG for the entire application under study. In addition, to relieve the cognitive burden of information that may inundate a developer, we describe our development of extensions to Hylian to build abbreviated abstract semantic graphs, which incorporate information about user code, but not about compiler-provided library code. Finally, we describe the various approaches that we adopted to provide assurance for the developer that the ASGs that Hylian builds correctly represent the program under study.

    The paradigm compiler: Mapping a functional language for the connection machine

    The Paradigm Compiler implements a new approach to compiling programs written in high-level languages for execution on highly parallel computers. The general approach is to identify the principal data structures constructed by the program and to map these structures onto the processing elements of the target machine. The mapping is chosen to maximize performance as determined through compile-time global analysis of the source program. The source language is Sisal, a functional language designed for scientific computations, and the target language is Paris, the published low-level interface to the Connection Machine. The data structures considered are multidimensional arrays whose dimensions are known at compile time. Computations that build such arrays usually offer opportunities for highly parallel execution; they are data parallel. The Connection Machine is an attractive target for these computations, and the parallel for construct of the Sisal language is a convenient high-level notation for data parallel algorithms. The principles and organization of the Paradigm Compiler are discussed.

    The Gremlin Graph Traversal Machine and Language

    Gremlin is a graph traversal machine and language designed, developed, and distributed by the Apache TinkerPop project. Gremlin, as a graph traversal machine, is composed of three interacting components: a graph G, a traversal Ψ, and a set of traversers T. The traversers move about the graph according to the instructions specified in the traversal, where the result of the computation is the ultimate locations of all halted traversers. A Gremlin machine can be executed over any supporting graph computing system such as an OLTP graph database and/or an OLAP graph processor. Gremlin, as a graph traversal language, is a functional language implemented in the user's native programming language and is used to define the Ψ of a Gremlin machine. This article provides a mathematical description of Gremlin and details its automaton and functional properties. These properties enable Gremlin to naturally support imperative and declarative querying, host language agnosticism, user-defined domain specific languages, an extensible compiler/optimizer, single- and multi-machine execution models, hybrid depth- and breadth-first evaluation, as well as the existence of a Universal Gremlin Machine and its respective entailments. Comment: To appear in the Proceedings of the 2015 ACM Database Programming Languages Conference.
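
    The three components named in this abstract (a graph G, a traversal Ψ, and a set of traversers T) can be pictured with a toy model. The sketch below is not the Apache TinkerPop API and does not reproduce the article's formal definitions; `Graph`, `Step`, `out`, `run_traversal` and the sample graph are hypothetical names chosen for illustration, and steps are evaluated one whole frontier at a time for simplicity.

```python
from typing import Callable, Dict, Iterable, List

# G: a directed graph stored as vertex -> list of outgoing neighbours (assumed shape).
Graph = Dict[str, List[str]]
# One instruction of the traversal: maps a traverser's location to its next locations.
Step = Callable[[Graph, str], Iterable[str]]


def out(graph: Graph, vertex: str) -> Iterable[str]:
    """Hypothetical step: move a traverser along every outgoing edge."""
    return graph.get(vertex, [])


def run_traversal(graph: Graph, traversal: List[Step], starts: List[str]) -> List[str]:
    """Apply each step of the traversal to every traverser and return where they halt."""
    traversers = list(starts)  # T: the current locations of all traversers
    for step in traversal:     # the traversal: an ordered list of instructions
        # A traverser may split into several or die out if the step yields nothing.
        traversers = [nxt for location in traversers for nxt in step(graph, location)]
    return traversers


# Toy graph, invented for illustration only.
graph: Graph = {"a": ["b", "c"], "b": ["c"], "c": []}

one_hop = run_traversal(graph, [out], ["a"])
two_hops = run_traversal(graph, [out, out], ["a"])
```

    Here the result of the computation is simply the list of locations at which the surviving traversers halt, mirroring the description in the abstract.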

    Test Programming by Program Composition and Symbolic Simulation

    Classical test generation techniques rely on search through gate-level circuit descriptions, which results in long runtimes. In some instances, classical techniques cannot be used because they would take longer than the lifetime of the product to generate tests which are needed when the first devices come off the assembly line. Despite these difficulties, human experts often succeed in writing test programs for very complex circuits. How can we account for their success? We take a knowledge engineering approach to this problem by trying to capture in a program techniques gleaned from working with experienced test programmers. From these talks, we conjecture that expert test programming performance relies in part on two aspects of human problem solving. First, the experts remember many cliched solutions to test programming problems. The difficulty lies in formalizing the notion of a cliche for this domain. For test programming, we propose that cliches contain goal to subgoal expansions, fragments of test program code, and constraints describing how program fragments fit together. We present an algorithm which uses testing cliches to generate test programs. Second, experts can simulate a circuit at various levels of abstraction and recognize patterns of activity in the circuit which are useful for solving test problems. We argue that symbolic simulation coupled with recognition of which simulated events solve our goals is an effective planning strategy in certain cases. We present a second algorithm which simulates circuit behavior on symbolic inputs at roughly the register transfer level and generates fragments of test programs suitable for use by our first algorithm. MIT Artificial Intelligence Laboratory