
    Compiling Regular Patterns to Sequential Machines

    Pattern matching combined with regular expressions has many applications, including text and XML processing, lexical analysis, classification of DNA segments, and content-based routing. Patterns contain variables that refer to parts of the matching input. But regular patterns pose the problem of ambiguity: words can be matched against 'overlapping' sections of the pattern in several ways, yielding different variable bindings. A match policy such as shortest or longest match disambiguates the outcome of matching. In order to implement the longest/shortest match policies, we propose to compile regular patterns to sequential machines. This intuitive approach to resolving ambiguities by means of shortest/longest match policies (and the slightly different ungreedy/greedy match) lets us derive a compilation scheme with linear runtime complexity. The main contributions of this paper are, firstly, a decision procedure for unambiguous regular patterns, which can be matched with a single traversal of the input, and secondly, algorithms to obtain deterministic sequential machines from ambiguous patterns. These produce the shortest (longest) match in two consecutive runs. The first run produces an intermediary result from which all possible variable bindings can be reproduced. The second run then chooses the unique binding that adheres to the given match policy. In the general case, this approach is optimal.
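
    The ambiguity described in this abstract can be illustrated with a small sketch. It is not the paper's sequential-machine construction and it is not linear-time: it enumerates every binding by brute force and then picks one. The pattern representation (a list of variable/sub-expression pairs), the function names `all_bindings` and `choose`, and the reading of "longest" as "earlier variables take as much input as possible" are all assumptions made for illustration.

```python
import re
from itertools import combinations_with_replacement

def all_bindings(pattern, word):
    """Enumerate every way to split `word` into consecutive segments so that
    segment i fully matches the sub-expression of variable i.
    `pattern` is a list of (variable name, regex) pairs (hypothetical format)."""
    compiled = [(name, re.compile(rx)) for name, rx in pattern]
    n, k = len(word), len(compiled)
    results = []
    # Choose k-1 non-decreasing cut points; equal cuts give empty segments.
    for cuts in combinations_with_replacement(range(n + 1), k - 1):
        bounds = (0, *cuts, n)
        segments = [word[bounds[i]:bounds[i + 1]] for i in range(k)]
        if all(rx.fullmatch(seg) for (_, rx), seg in zip(compiled, segments)):
            results.append({name: seg for (name, _), seg in zip(compiled, segments)})
    return results

def choose(bindings, policy):
    """Pick one binding. Assumed policy: compare segment lengths variable by
    variable, left to right; 'longest' maximises, 'shortest' minimises."""
    key = lambda b: tuple(len(v) for v in b.values())
    return max(bindings, key=key) if policy == "longest" else min(bindings, key=key)

# The pattern (x := a*)(y := a*) is ambiguous on the word "aaa".
pattern = [("x", r"a*"), ("y", r"a*")]
bindings = all_bindings(pattern, "aaa")
print(len(bindings))                 # number of distinct bindings
print(choose(bindings, "longest"))
print(choose(bindings, "shortest"))
```

    Here the word "aaa" admits four bindings, and the policy alone decides which one is reported. The two-run scheme in the abstract reaches the same kind of decision without enumerating the candidates.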

    DESIGN AND IMPLEMENTATION OF A LANGUAGE SCANNER GENERATOR (KnitAutoGen)

    The need for a fast, efficient, and simple scanner generator gave rise to this paper. New technologies arise daily and bring great improvements to the design of computer architecture. Attention was therefore given to the speed, run time, and resource availability of the target machine, since lexical analysis affects how the compiler works. This paper seeks to generate a lexical analyzer (scanner) automatically by specifying the lexeme patterns to a lexical analyzer generator and compiling those patterns into code that functions as a lexical analyzer. The scanner accepts characters as input and groups them into tokens without deviating from the specification. The project employs one of the available methods of lexical analyzer generation to perform pattern matching on text using regular expressions over a global character set. The paper shows how input is matched and specifies what to do when a pattern is matched. This was achieved with regular expressions (RE), which were converted to non-deterministic finite automata (NFA) or deterministic finite automata (DFA). The regular expression and the regular grammar were thus joined together mathematically. Various results are presented, and further work on micro compilers is proposed. Keywords: Scanner generator, Regular expression, KnitAutoGen, DFA, NFA. DOI: 10.7176/CTI/8-0
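
    The idea of turning a list of lexeme patterns into a scanner can be sketched as follows. This is not KnitAutoGen and it does not build an explicit NFA or DFA; it leans on Python's `re` module instead. The token names, the `TOKEN_SPEC` table, and the `make_scanner` function are hypothetical, and the tie-breaking rule (longest lexeme wins, earlier rule wins on equal length) is the conventional one, assumed here rather than taken from the paper.

```python
import re

# Hypothetical token specification: (token name, regular expression).
TOKEN_SPEC = [
    ("IF", r"if"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"[ \t\n]+"),
]

def make_scanner(spec):
    """Compile the patterns once and return a function that tokenizes text."""
    compiled = [(name, re.compile(rx)) for name, rx in spec]

    def scan(text):
        pos, tokens = 0, []
        while pos < len(text):
            best_name, best_end = None, pos
            # Try every rule at the current position and keep the longest
            # lexeme; the strict '>' lets the earlier rule win a tie.
            for name, rx in compiled:
                m = rx.match(text, pos)
                if m and m.end() > best_end:
                    best_name, best_end = name, m.end()
            if best_name is None:
                raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
            if best_name != "SKIP":  # whitespace is matched but not emitted
                tokens.append((best_name, text[pos:best_end]))
            pos = best_end
        return tokens

    return scan

scan = make_scanner(TOKEN_SPEC)
print(scan("if x1 = 42"))
print(scan("iffy"))
```

    The second call shows why the longest-lexeme rule matters: "iffy" is scanned as one identifier rather than the keyword "if" followed by "fy".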

    Compiling global name-space programs for distributed execution

    Distributed memory machines do not provide hardware support for a global address space. Thus programmers are forced to partition the data across the memories of the architecture and use explicit message passing to communicate data between processors. The compiler support required to allow programmers to express their algorithms using a global name-space is examined. A general method is presented for the analysis of a high-level source program along with its translation to a set of independently executing tasks communicating via messages. If the compiler has enough information, this translation can be carried out at compile time. Otherwise, run-time code is generated to implement the required data movement. The analysis required in both situations is described, and the performance of the generated code on the Intel iPSC/2 is presented.

    MatchPy: A Pattern Matching Library

    Pattern matching is a powerful tool for symbolic computations, based on the well-defined theory of term rewriting systems. Application domains include algebraic expressions, abstract syntax trees, and XML and JSON data. Unfortunately, no lightweight implementation of pattern matching as general and flexible as Mathematica exists for Python (cf. Mathics, MacroPy, patterns, PyPatt). Therefore, we created the open source module MatchPy, which offers similar pattern matching functionality in Python using a novel algorithm that finds matches for large pattern sets more efficiently by exploiting similarities between patterns. Comment: arXiv admin note: substantial text overlap with arXiv:1710.0007

    A Fast Compiler for NetKAT

    High-level programming languages play a key role in a growing number of networking platforms, streamlining application development and enabling precise formal reasoning about network behavior. Unfortunately, current compilers only handle "local" programs that specify behavior in terms of hop-by-hop forwarding behavior, or modest extensions such as simple paths. To encode richer "global" behaviors, programmers must add extra state -- something that is tricky to get right and makes programs harder to write and maintain. Making matters worse, existing compilers can take tens of minutes to generate the forwarding state for the network, even on relatively small inputs. This forces programmers to waste time working around performance issues or even revert to using hardware-level APIs. This paper presents a new compiler for the NetKAT language that handles rich features including regular paths and virtual networks, and yet is several orders of magnitude faster than previous compilers. The compiler uses symbolic automata to calculate the extra state needed to implement "global" programs, and an intermediate representation based on binary decision diagrams to dramatically improve performance. We describe the design and implementation of three essential compiler stages: from virtual programs (which specify behavior in terms of virtual topologies) to global programs (which specify network-wide behavior in terms of physical topologies), from global programs to local programs (which specify behavior in terms of single-switch behavior), and from local programs to hardware-level forwarding tables. We present results from experiments on real-world benchmarks that quantify performance in terms of compilation time and forwarding table size

    Dataplane Specialization for High-performance OpenFlow Software Switching

    OpenFlow is an amazingly expressive dataplane programming language, but this expressiveness comes at a severe performance price, as switches must do excessive packet classification in the fast path. The prevalent OpenFlow software switch architecture is therefore built on flow caching, but this imposes intricate limitations on the workloads that can be supported efficiently and may even open the door to malicious cache overflow attacks. In this paper we argue that instead of enforcing the same universal flow cache semantics on all OpenFlow applications and optimizing for the common case, a switch should rather automatically specialize its dataplane piecemeal with respect to the configured workload. We introduce ESWITCH, a novel switch architecture that uses on-the-fly template-based code generation to compile any OpenFlow pipeline into efficient machine code, which can then be readily used as the fast path. We present a proof-of-concept prototype and demonstrate on illustrative use cases that ESWITCH yields a simpler architecture, superior packet processing speed, improved latency and CPU scalability, and predictable performance. Our prototype can easily scale beyond 100 Gbps on a single Intel blade even with complex OpenFlow pipelines.