355 research outputs found

Regular Expressions and Transducers over Alphabet-invariant and User-defined Labels

Author: A Demaille
BG Mirkin
C Allauzen
HJ Shyr
J Brzozowski
J Sakarovitch
JA Brzozowski
JM Champarnaud
K Thompson
M Veanes
M-P Béal
P Caron
R Bastos
S Broda
S Konstantinidis
S Lombardy
VM Antimirov
Y Sheng
Publication date: 04/05/2018

We are interested in regular expressions and transducers that represent word relations in an alphabet-invariant way, for example, the set of all word pairs u,v where v is a prefix of u independently of what the alphabet is. Current software systems of formal language objects do not have a mechanism to define such objects. We define transducers in which transition labels involve what we call set specifications, some of which are alphabet invariant. In fact, we give a broader definition of automata-type objects, called labelled graphs, where each transition label can be any string, as long as that string represents a subset of a certain monoid. Then, the behaviour of the labelled graph is a subset of that monoid. We do the same for regular expressions. We obtain extensions of a few classic algorithmic constructions on ordinary regular expressions and transducers at the broad level of labelled graphs and in such a way that the computational efficiency of the extended constructions is not sacrificed. For regular expressions with set specs we obtain the corresponding partial derivative automata. For transducers with set specs we obtain further algorithms that can be applied to questions about independent regular languages, in particular the witness version of the independent property satisfaction question.
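The prefix example in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's construction or notation: the label names `ANY`, `SAME` and `EPS`, the two-state graph, and the function `accepts` are all hypothetical, chosen only to show how a transition label can describe a set of symbol pairs without naming any concrete alphabet.

```python
from collections import deque

# Hypothetical label names (assumptions, not the paper's notation):
#   ("ANY", "SAME"): read any symbol x from u and the same symbol x from v
#   ("ANY", "EPS"):  read any symbol from u and nothing from v
TRANSITIONS = {
    0: [(("ANY", "SAME"), 0), (("ANY", "EPS"), 1)],
    1: [(("ANY", "EPS"), 1)],
}
START, FINAL = 0, {0, 1}


def accepts(u, v):
    """Return True if the pair (u, v) labels an accepting path,
    i.e. if v is a prefix of u. Works for any sequences of comparable symbols."""
    seen = set()
    queue = deque([(START, 0, 0)])  # (state, position in u, position in v)
    while queue:
        state, i, j = queue.popleft()
        if (state, i, j) in seen:
            continue
        seen.add((state, i, j))
        if i == len(u) and j == len(v) and state in FINAL:
            return True
        if i >= len(u):
            continue  # both label kinds consume one symbol of u
        for (_inp, out), target in TRANSITIONS[state]:
            if out == "SAME":
                if j < len(v) and u[i] == v[j]:
                    queue.append((target, i + 1, j + 1))
            else:  # out == "EPS": skip a symbol of u, leave v untouched
                queue.append((target, i + 1, j))
    return False
```

Because the labels never mention concrete symbols, the same two-state graph handles `accepts("abc", "ab")` and `accepts([1, 2, 3], [1, 2])` alike, which is the sense of "alphabet invariant" this sketch tries to convey.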

arXiv.org e-Print Archive

Crossref

Liveness of Randomised Parameterised Systems under Arbitrary Schedulers (Technical Report)

Author: Lin Anthony W.
Ruemmer Philipp
Publication date: 01/01/2016

We consider the problem of verifying liveness for systems with a finite, but unbounded, number of processes, commonly known as parameterised systems. Typical examples of such systems include distributed protocols (e.g. for the dining philosopher problem). Unlike the case of verifying safety, proving liveness is still considered extremely challenging, especially in the presence of randomness in the system. In this paper we consider liveness under arbitrary (including unfair) schedulers, which is often considered a desirable property in the literature of self-stabilising systems. We introduce an automatic method of proving liveness for randomised parameterised systems under arbitrary schedulers. Viewing liveness as a two-player reachability game (between Scheduler and Process), our method is a CEGAR approach that synthesises a progress relation for Process that can be symbolically represented as a finite-state automaton. The method is incremental and exploits both Angluin-style L*-learning and SAT-solvers. Our experiments show that our algorithm is able to prove liveness automatically for well-known randomised distributed protocols, including Lehmann-Rabin Randomised Dining Philosopher Protocol and randomised self-stabilising protocols (such as the Israeli-Jalfon Protocol). To the best of our knowledge, this is the first fully-automatic method that can prove liveness for randomised protocols. Comment: Full version of CAV'16 paper

arXiv.org e-Print Archive

Crossref

Publikationer från Uppsala Universitet

Oxford University Research Archive

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Stream Processing using Grammars and Regular Expressions

Author: Rasmussen Ulrik Terp
Publication date: 01/01/2016

In this dissertation we study regular expression based parsing and the use of grammatical specifications for the synthesis of fast, streaming string-processing programs. In the first part we develop two linear-time algorithms for regular expression based parsing with Perl-style greedy disambiguation. The first algorithm operates in two passes in a semi-streaming fashion, using a constant amount of working memory and an auxiliary tape storage which is written in the first pass and consumed by the second. The second algorithm is a single-pass and optimally streaming algorithm which outputs as much of the parse tree as is semantically possible based on the input prefix read so far, and resorts to buffering as many symbols as are required to resolve the next choice. Optimality is obtained by performing a PSPACE-complete pre-analysis on the regular expression. In the second part we present Kleenex, a language for expressing high-performance streaming string processing programs as regular grammars with embedded semantic actions, and its compilation to streaming string transducers with worst-case linear-time performance. Its underlying theory is based on transducer decomposition into oracle and action machines, and a finite-state specialization of the streaming parsing algorithm presented in the first part. In the second part we also develop a new linear-time streaming parsing algorithm for parsing expression grammars (PEG) which generalizes the regular grammars of Kleenex. The algorithm is based on a bottom-up tabulation algorithm reformulated using least fixed points and evaluated using an instance of the chaotic iteration scheme by Cousot and Cousot.

arXiv.org e-Print Archive

Copenhagen University Research Information System

Verifying Programs with Dynamic 1-Selector-Linked Structures in Regular Model Checking

Author: A. Bouajjani
A. Deutsch
B. Jonsson
M. Bozga
N. Immerman
N. Klarlund
P. Wolper
P.A. Abdulla
R. Manevich
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2005

We address the problem of automatic verification of programs with dynamic data structures. We consider the case of sequential, non-recursive programs manipulating 1-selector-linked structures such as traditional linked lists (possibly sharing their tails) and circular lists. We propose an automata-based approach for a symbolic verification of such programs using the regular model checking framework. Given a program, the configurations of the memory are systematically encoded as words over a suitable finite alphabet, potentially infinite sets of configurations are represented by finite-state automata, and statements of the program are automatically translated into finite-state transducers defining regular relations between configurations. Then, abstract regular model checking techniques are applied in order to automatically check safety properties concerning the shape of the computed configurations or relating the input and output configurations. For that, we introduce new techniques for the computation of abstractions of the set of reachable configurations, and to refine these abstractions if spurious counterexamples are detected. Finally, we present experimental results showing the applicability of the approach and its efficiency.

Crossref

HAL Descartes

Hal-Diderot

Symbolic Model Learning: New Algorithms and Applications

Author: Argyros Georgios
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2019

In this thesis, we study algorithms which can be used to extract, or learn, formal mathematical models from software systems and then use these models to test whether the given software systems satisfy certain security properties such as robustness against code injection attacks. Specifically, we focus on studying learning algorithms for automata and transducers and the symbolic extensions of these models, namely symbolic finite automata (SFAs). At a high level, this thesis contributes the following results: 1. In the first part of the thesis, we present a unified treatment of many common variations of the seminal L* algorithm for learning deterministic finite automata (DFAs) as a congruence learning algorithm for the underlying Nerode congruence which forms the basis of automata theory. Under this formulation the basic data structures used by different variations are unified as different ways to implement the Nerode congruence using queries. 2. Next, building on the new formulation of L*-style algorithms we proceed to develop new algorithms for learning transducer models. Firstly, we present the first algorithm for learning deterministic partial transducers. Furthermore, we extend our algorithm into non-deterministic models by introducing a novel, generalized congruence relation over string transformations which is able to capture a subclass of string transformations with regular lookahead. We demonstrate that this class is able to capture many practical string transformations from the domain of string sanitizers in Web applications. 3. Classical learning algorithms for automata and transducers operate over finite alphabets and have a query complexity that scales linearly with the size of the alphabet. However, in practice, this dependence on the alphabet size hinders the performance of the algorithms. To address this issue, we develop the MAT* algorithm for learning symbolic finite state automata (SFAs) which operate over infinite alphabets. In practice, the MAT* learning algorithm allows us to plug in custom transition learning algorithms which will efficiently infer the predicates in the transitions of the SFA without querying the whole alphabet set. 4. Finally, we use our learning algorithm toolbox as the basis for the development of a set of black-box testing algorithms. More specifically, we present Grammar Oriented Filter Auditing (GOFA), a novel technique which allows one to utilize our learning algorithms to evaluate the robustness of a string sanitizer or filter against a set of attack strings given as a context-free grammar. Furthermore, because such grammars are often unavailable, we developed sfadiff, a differential testing technique based on symbolic automata learning which can be used to perform differential testing of two different parser implementations using SFA learning algorithms, and we demonstrate how our algorithm can be used to develop program fingerprints. We evaluate our algorithms against state-of-the-art Web Application Firewalls and discover over 15 previously unknown vulnerabilities which result in evading the firewalls and performing code injection attacks in the backend Web application. Finally, we show how our learning algorithms can uncover vulnerabilities which are missed by other black-box methods such as fuzzing and grammar-based testing.
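To make the notion of a symbolic finite automaton concrete, here is a minimal Python sketch of an SFA whose transitions carry predicates instead of individual symbols. It does not implement MAT* or any learning algorithm from the thesis. The state names, the example language (identifiers that start with a letter), and the function `sfa_accepts` are assumptions made purely for illustration, and the predicates leaving each state are assumed to be disjoint so that the run is deterministic.

```python
# Hypothetical SFA: each transition is (predicate over one character, target state).
# The predicates stand in for the "set of symbols" a classical automaton would list
# explicitly, so the automaton works over the whole Unicode alphabet.
SFA = {
    "start": [(str.isalpha, "ident")],
    "ident": [(str.isalnum, "ident")],
}
SFA_START = "start"
SFA_FINAL = frozenset({"ident"})


def sfa_accepts(word):
    """Run the SFA on `word`; return True if it ends in a final state."""
    state = SFA_START
    for ch in word:
        for predicate, target in SFA.get(state, []):
            if predicate(ch):
                state = target
                break
        else:
            return False  # no transition predicate matches this character
    return state in SFA_FINAL
```

With two predicate-labelled transitions the automaton covers every alphabetic and alphanumeric code point, which a classical automaton would have to enumerate one symbol at a time; this is the alphabet-size dependence that the abstract says symbolic learning is meant to avoid.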

Columbia University Academic Commons

On the Spectrum of Hecke Type Operators related to some Fractal Groups

Author: Bartholdi Laurent
Grigorchuk Rostislav I.
Publication date: 01/01/1999

We give the first example of a connected 4-regular graph whose Laplace operator's spectrum is a Cantor set, as well as several other computations of spectra following a common "finite approximation" method. These spectra are simple transforms of the Julia sets associated to some quadratic maps. The graphs involved are Schreier graphs of fractal groups of intermediate growth, and are also "substitutional graphs". We also formulate our results in terms of Hecke type operators related to some irreducible quasi-regular representations of fractal groups and in terms of the Markovian operator associated to noncommutative dynamical systems via which these fractal groups were originally defined. In the computations we performed, the self-similarity of the groups is reflected in the self-similarity of some operators; they are approximated by finite counterparts whose spectrum is computed by an ad hoc factorization process. Comment: 1 color figure, 2 color diagrams, many figures

arXiv.org e-Print Archive

CiteSeerX

MPG.PuRe

Archive ouverte UNIGE

Logics for Unranked Trees: An Overview

Author: A. Arnold
B. Courcelle
B.-H. Schlingloff
E. Clarke
F. Moller
F. Neven
G. Gottlob
I. Walukiewicz
J. Niehren
J.R. Büchi
J.W. Thatcher
L. Cardelli
L. Libkin
L. Stockmeyer
M. Benedikt
M. Grohe
M. Rabin
M.Y. Vardi
P. Bouyer
S. Abiteboul
T. Hafer
W. Thomas
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 01/01/2005

Labeled unranked trees are used as a model of XML documents, and logical languages for them have been studied actively over the past several years. Such logics have different purposes: some are better suited for extracting data, some for expressing navigational properties, and some make it easy to relate complex properties of trees to the existence of tree automata for those properties. Furthermore, logics differ significantly in their model-checking properties, their automata models, and their behavior on ordered and unordered trees. In this paper we present a survey of logics for unranked trees.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Episciences.org

Directory of Open Access Journals

Edinburgh Research Explorer

Programming Using Automata and Transducers

Author: D'Antoni Loris
Publication venue: ScholarlyCommons
Publication date: 01/01/2015

Automata, the simplest model of computation, have proven to be an effective tool in reasoning about programs that operate over strings. Transducers augment automata to produce outputs and have been used to model string and tree transformations such as natural language translations. The success of these models is primarily due to their closure properties and decidable procedures, but good properties come at the price of limited expressiveness. Concretely, most models only support finite alphabets and can only represent small classes of languages and transformations. We focus on addressing these limitations and bridge the gap between the theory of automata and transducers and complex real-world applications: Can we extend automata and transducer models to operate over structured and infinite alphabets? Can we design languages that hide the complexity of these formalisms? Can we define executable models that can process the input efficiently? First, we introduce succinct models of transducers that can operate over large alphabets and design BEX, a language for analysing string coders. We use BEX to prove the correctness of UTF and BASE64 encoders and decoders. Next, we develop a theory of tree transducers over infinite alphabets and design FAST, a language for analysing tree-manipulating programs. We use FAST to detect vulnerabilities in HTML sanitizers, check whether augmented reality taggers conflict, and optimize and analyze functional programs that operate over lists and trees. Finally, we focus on laying the foundations of stream processing of hierarchical data such as XML files and program traces. We introduce two new efficient and executable models that can process the input in a left-to-right linear pass: symbolic visibly pushdown automata and streaming tree transducers. Symbolic visibly pushdown automata are closed under Boolean operations and can specify and efficiently monitor complex properties for hierarchical structures over infinite alphabets. Streaming tree transducers can express and efficiently process complex XML transformations while enjoying decidable procedures.

CiteSeerX

ScholarlyCommons@Penn

Concatenation of graphs

Author: Engelfriet J.
Vereijken J.J.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/1994

Repository TU/e

Pure OAI Repository

Chain-Free String Constraints (Technical Report)

Author: Abdulla Parosh Aziz
Atig Mohamed Faouzi
Diep Bui Phi
Holík Lukáš
Janků Petr
Publication date: 08/07/2023

We address the satisfiability problem for string constraints that combine relational constraints represented by transducers, word equations, and string length constraints. This problem is undecidable in general. Therefore, we propose a new decidable fragment of string constraints, called weakly chaining string constraints, for which we show that the satisfiability problem is decidable. This fragment pushes the borders of decidability of string constraints by generalising the existing straight-line as well as the acyclic fragment of the string logic. We have developed a prototype implementation of our new decision procedure, and integrated it into an existing framework that uses CEGAR with under-approximation of string constraints based on flattening. Our experimental results show the competitiveness and accuracy of the new framework.

arXiv.org e-Print Archive