    An Overlay Architecture for Pattern Matching

    Deterministic and Non-deterministic Finite Automata (DFA and NFA) comprise the fundamental unit of work for many emerging big data applications, motivating recent efforts to develop Domain-Specific Architectures (DSAs) that exploit the fine-grain parallelism available in automata workloads. This dissertation presents NAPOLY (Non-Deterministic Automata Processor OverLaY), an overlay architecture and associated software that attempt to maximally exploit on-chip memory parallelism for NFA evaluation. To avoid the upper bound on NFA size that commonly affects prior efforts, NAPOLY is optimized for runtime reconfiguration, allowing full reconfiguration in tens of microseconds. NAPOLY is also parameterizable, allowing offline generation of a repertoire of overlay configurations with various trade-offs between state capacity and transition capacity. In this dissertation, we evaluate NAPOLY on automata applications packaged in the ANMLZoo benchmarks using our proposed state mapping heuristic and an off-the-shelf SAT solver. We compare NAPOLY’s performance against existing CPU and GPU implementations. The results show that NAPOLY performs best for larger benchmarks with more active states and high report frequency. NAPOLY outperforms the best state-of-the-art CPU and GPU implementations on 10 of the 12 benchmarks. To the best of our knowledge, this is the first example of a runtime-reprogrammable FPGA-based automata processor overlay.

    Merlin: A Language for Provisioning Network Resources

    This paper presents Merlin, a new framework for managing resources in software-defined networks. With Merlin, administrators express high-level policies using programs in a declarative language. The language includes logical predicates to identify sets of packets, regular expressions to encode forwarding paths, and arithmetic formulas to specify bandwidth constraints. The Merlin compiler uses a combination of advanced techniques to translate these policies into code that can be executed on network elements, including a constraint solver that allocates bandwidth using parameterizable heuristics. To facilitate dynamic adaptation, Merlin provides mechanisms for delegating control of sub-policies and for verifying that modifications made to sub-policies do not violate global constraints. Experiments demonstrate the expressiveness and scalability of Merlin on real-world topologies and applications. Overall, Merlin simplifies network administration by providing high-level abstractions for specifying network policies and scalable infrastructure for enforcing them.

    Faster Compression of Deterministic Finite Automata

    Deterministic finite automata (DFAs) are a classic tool for high-throughput matching of regular expressions, both in theory and in practice. Due to their high space consumption, extensive research has been devoted to compressed representations of DFAs that still support efficient pattern matching queries. Kumar et al. [SIGCOMM 2006] introduced the delayed deterministic finite automaton (D²FA), which exploits the large redundancy between inter-state transitions in the automaton. They showed that it obtains up to two orders of magnitude compression of real-world DFAs, and their work formed the basis of numerous subsequent results. Their algorithm and later algorithms based on their idea have an inherent quadratic-time bottleneck, as they consider every pair of states to compute the optimal compression. In this work we present a simple, general framework based on locality-sensitive hashing for speeding up these algorithms to achieve sub-quadratic construction times for D²FAs. We apply the framework to speed up several algorithms to near-linear time, and experimentally evaluate their performance on real-world regular expression sets extracted from modern intrusion detection systems. We find an order of magnitude improvement in compression times, with either little or no loss of compression, or even significantly better compression in some cases.

    New Regular Expressions on Old Accelerators

    High performance stride-based network payload inspection

    There are two main drivers for network payload inspection: the detection of malicious data, attacks, and viruses in Network Intrusion Detection Systems (NIDS), and content detection in Data Leakage Prevention Systems (DLPS) or Copyright Infringement Detection Systems (CIDS). Network attacks are becoming increasingly prevalent. Traditional network firewalls check only the packet header and fail to detect attacks hidden in the packet payload. Therefore, NIDS with a Deep Packet Inspection (DPI) function have been developed and widely deployed. By checking each byte of a packet against the pattern set, a process called pattern matching, a NIDS is able to detect attack code hidden in the payload. The pattern set is usually organized as a Deterministic Finite Automaton (DFA). The processing time of a DFA is proportional to the length of the input string, but its memory cost is quite large. Meanwhile, the link bandwidth and the traffic of the Internet are rapidly increasing, and the size of the attack signature database is also growing due to the diversification of attacks. Consequently, there is a strong demand for NIDS with high performance and low storage cost. Traditional software-based and hardware-based pattern matching algorithms have difficulty satisfying the processing speed requirement, so high performance network payload inspection methods are needed to enable deep packet inspection at line rate. In this thesis, Stride Finite Automata (StriFA), a novel finite automata family that accelerates both string matching and regular expression matching, is presented. Compared with conventional finite automata, which scan the entire traffic stream to locate malicious information, StriFA only needs to scan samples of the traffic stream to find suspicious information, thus increasing the matching speed and reducing memory requirements.
Technologies such as instant messaging software (Skype, MSN) and BitTorrent file sharing allow convenient sharing of information between managers, employees, customers, and partners. This, however, leads to two major kinds of security risk when exchanging data between different people: first, leakage of sensitive data from a company and, second, distribution of copyright-infringing products in Peer-to-Peer (P2P) networks. Traditional DFA-based DPI solutions cannot be used to inspect file distribution in P2P networks because the data may be delivered out of order. To address this problem, a hybrid finite automaton called Skip-Stride-Neighbor Finite Automaton (S2NFA) is proposed. It combines the benefits of the following three structures: 1) Skip-FA, which is used to solve the out-of-order data scanning problem; 2) Stride-DFA, which is introduced to reduce the memory usage of Skip-FA; 3) Neighbor-DFA, which is based on the characteristics of Stride-DFA and achieves a low false positive rate at the additional cost of a small increase in memory consumption.

    Big Data Computing for Geospatial Applications

    The convergence of big data and geospatial computing has brought forth challenges and opportunities to Geographic Information Science with regard to geospatial data management, processing, analysis, modeling, and visualization. This book highlights recent advancements in integrating new computing approaches, spatial methods, and data management strategies to tackle geospatial big data challenges and meanwhile demonstrates opportunities for using big data for geospatial applications. Crucial to the advancements highlighted in this book is the integration of computational thinking and spatial thinking and the transformation of abstract ideas and models to concrete data structures and algorithms

    A Formal Approach to Combining Prospective and Retrospective Security

    The major goal of this dissertation is to enhance software security by provably correct enforcement of in-depth policies. In-depth security policies allude to heterogeneous specification of security strategies that are required to be followed before and after sensitive operations. Prospective security is the enforcement of security, or detection of security violations, before the execution of sensitive operations, e.g., in authorization, authentication, and information flow. Retrospective security refers to security checks after the execution of sensitive operations, which is accomplished through accountability and deterrence. Retrospective security frameworks are built upon auditing in order to provide sufficient evidence to hold users accountable for their actions and potentially support other remediation actions. Correctness and efficiency of audit logs play significant roles in reaching the accountability goals that are required by retrospective, and consequently, in-depth security policies. This dissertation addresses correct audit logging in a formal framework. Leveraging retrospective controls alongside the existing prospective measures enhances security in numerous applications. This dissertation focuses on two major application spaces for in-depth enforcement. The first is to enhance prospective security through surveillance and accountability. For example, authorization mechanisms could be improved by guaranteed retrospective checks in environments where there is a high cost of access denial, e.g., healthcare systems. The second application space is the amelioration of potentially flawed prospective measures through retrospective checks. For instance, erroneous implementations of input sanitization methods expose vulnerabilities in taint analysis tools that enforce direct flow of data integrity policies. In this regard, we propose an in-depth enforcement framework to mitigate such problems.
We also propose a general semantic notion of explicit flow of information integrity in a high-level language with sanitization. This dissertation studies the ways by which prospective and retrospective security could be enforced uniformly in a provably correct manner to handle security challenges in legacy systems. Provable correctness of our results relies on the formal Programming Languages-based approach that we have taken in order to provide software security assurance. Moreover, this dissertation includes the implementation of such in-depth enforcement mechanisms for a medical records web application

    PPL: a Packet Processing Language

    Any computing device or system that uses the internet needs to analyze and identify the contents of network packets. Code that does this is often written in C, but reading, identifying, and manipulating network packets in C requires writing tricky and tedious code. Previous work has offered specification languages for describing the format of network packets, which allow packet type identification without the hassle of doing this task in C. For example, McCann and Chandra's Packet Types [3] system allows the programmer to define arbitrary packet types and generates C functions that match given data against a specified packet type. This paper presents a packet processing language named PPL, which extends McCann and Chandra's Packet Types to allow the programmer not only to describe arbitrary packet types, but also to control when and how a match is attempted, with ML-style pattern matching. PPL is intended for multiple applications, such as intrusion detection systems, quick prototypes of new protocols, and IP de-multiplexing code.