7 research outputs found

    Window Expressions for Stream Data Processing

    Full text link
    Traditional ways of storing and querying data do not work well in scenarios where data is being generated continuously and quick decisions need to be taken. For example, in hospital intensive care units, signals from multiple devices need to be monitored and the occurrence of any anomaly should raise alarms immediately. A typical design would take the average from a window of say 10 seconds (time-based) or 10 successive (count-based) readings and look for sudden deviations. Existing stream processing systems either restrict the windows to time or count-based windows or let users define customized windows in imperative programming languages. These are subject to the implementers' interpretation of what is desired and hard to understand for others. We introduce a formalism for specifying windows based on Monadic Second Order logic. It offers several advantages over ad-hoc definitions written in imperative languages. We demonstrate four such advantages. First, we illustrate how practical streaming data queries can be easily written with precise semantics. Second, we can get different but expressively equivalent formalisms for defining windows. We use one of them (regular expressions) to design an end-user-friendly language for defining windows. Third, we use another expressively equivalent formalism (automata) to design a processor that automatically generates windows according to specifications. The fourth advantage we demonstrate is more sophisticated. Some window definitions have the problem of too many windows overlapping with each other, overwhelming the processing engine. This is handled in different ways by different engines, but all the options are about what to do when this happens at runtime. We study this as a static analysis question and prove that it is undecidable to check whether such a scenario can ever arise for a given window definition. We identify a decidable fragment..

    Register automata with linear arithmetic

    Full text link
    We propose a novel automata model over the alphabet of rational numbers, which we call register automata over the rationals (RA-Q). It reads a sequence of rational numbers and outputs another rational number. RA-Q is an extension of the well-known register automata (RA) over infinite alphabets, which are finite automata equipped with a finite number of registers/variables for storing values. Like in the standard RA, the RA-Q model allows both equality and ordering tests between values. It, moreover, allows to perform linear arithmetic between certain variables. The model is quite expressive: in addition to the standard RA, it also generalizes other well-known models such as affine programs and arithmetic circuits. The main feature of RA-Q is that despite the use of linear arithmetic, the so-called invariant problem---a generalization of the standard non-emptiness problem---is decidable. We also investigate other natural decision problems, namely, commutativity, equivalence, and reachability. For deterministic RA-Q, commutativity and equivalence are polynomial-time inter-reducible with the invariant problem

    Symbolic Register Automata

    Full text link
    Symbolic Finite Automata and Register Automata are two orthogonal extensions of finite automata motivated by real-world problems where data may have unbounded domains. These automata address a demand for a model over large or infinite alphabets, respectively. Both automata models have interesting applications and have been successful in their own right. In this paper, we introduce Symbolic Register Automata, a new model that combines features from both symbolic and register automata, with a view on applications that were previously out of reach. We study their properties and provide algorithms for emptiness, inclusion and equivalence checking, together with experimental results

    Completeness in Approximate Transduction

    Get PDF
    Symbolic finite automata (SFA) allow the representation of regular languages of strings over an infinite alphabet of symbols. Recently these automata have been studied in the context of abstract interpretation, showing their extreme flexibility in representing languages at different levels of abstraction. Therefore, SFAs can naturally approximate sets of strings by the language they recognise, providing a suitable abstract domain for the analysis of symbolic data structures. In this scenario, transducers model SFA transformations. We characterise the properties of transduction of SFAs that guarantee soundness and completeness of the abstract interpretation of operations manipulating strings. We apply our model to the derivation of sanitisers for preventing cross site scripting attacks in web application security. In this case we extract the code sanitiser directly from the backward (transduction) analysis of the program given the specification of the expected attack in terms of SFA

    Automata-based Model Counting String Constraint Solver for Vulnerability Analysis

    Get PDF
    Most common vulnerabilities in modern software applications are due to errors in string manipulation code. String constraint solvers are essential components of program analysis techniques for detecting and repairing vulnerabilities that are due to string manipulation errors. In this dissertation, we present an automata-based string constraint solver for vulnerability analysis of string manipulating programs.Given a string constraint, we generate an automaton that accepts all solutions that satisfy the constraint. Our string constraint solver can also map linear arithmetic constraints to automata in order to handle constraints on string lengths. By integrating our string constraint solver to a symbolic execution tool, we can check for string manipulation errors in programs. Recently, quantitative and probabilistic program analyses techniques have been proposed which require counting the number of solutions to string constraints. We extend our string constraint solver with model counting capability based on the observation that, using an automata-based constraint representation, model counting reduces to path counting, which can be solved precisely. Our approach is parameterized in the sense that, we do notassume a finite domain size during automata construction, resulting in a potentially infinite set of solutions, and our model counting approach works for arbitrarily large bounds.We have implemented our approach in a tool called ABC (Automata-Based model Counter) using a constraint language that is compatible with the SMTLIB language specification used by satifiabilty-modula-theories solvers. This SMTLIB interface facilitates integration of our constraint solver with existing symbolic execution tools. We demonstrate the effectiveness of ABC on a large set of string constraints extracted from real-world web applications.We also present automata-based testing techniques for string manipulating programs. A vulnerability signature is a characterization of all user inputs that can be used to exploit a vulnerability. Automata-based static string analysis techniques allow automated computation of vulnerability signatures represented as automata. Given a vulnerability signature represented as an automaton, we present algorithms for test case generation based on state, transition, and path coverage. These automaticallygenerated test cases can be used to test applications that are not analyzable statically, and to discover attack strings that demonstrate how the vulnerabilities can be exploited. We experimentally comparedifferent coverage criteria and demonstrate the effectiveness of our test generation approach

    Computer Aided Verification

    Get PDF
    This open access two-volume set LNCS 11561 and 11562 constitutes the refereed proceedings of the 31st International Conference on Computer Aided Verification, CAV 2019, held in New York City, USA, in July 2019. The 52 full papers presented together with 13 tool papers and 2 case studies, were carefully reviewed and selected from 258 submissions. The papers were organized in the following topical sections: Part I: automata and timed systems; security and hyperproperties; synthesis; model checking; cyber-physical systems and machine learning; probabilistic systems, runtime techniques; dynamical, hybrid, and reactive systems; Part II: logics, decision procedures; and solvers; numerical programs; verification; distributed systems and networks; verification and invariants; and concurrency
    corecore