620 research outputs found
Relational Constraint Driven Test Case Synthesis for Web Applications
This paper proposes a relational constraint driven technique that synthesizes
test cases automatically for web applications. Using a static analysis,
servlets can be modeled as relational transducers, which manipulate backend
databases. We present a synthesis algorithm that generates a sequence of HTTP
requests for simulating a user session. The algorithm relies on backward
symbolic image computation for reaching a certain database state, given a code
coverage objective. With a slight adaptation, the technique can be used for
discovering workflow attacks on web applications.Comment: In Proceedings TAV-WEB 2010, arXiv:1009.330
Tree Transducers and Formal Methods (Dagstuhl Seminar 13192)
The aim of this Dagstuhl Seminar was to bring together researchers from various research areas related to the theory and application of tree transducers. Recently, interest in tree transducers has been revived due to surprising new applications in areas such as XML databases, security verification, programming language theory, and linguistics. This seminar therefore aimed to inspire the exchange of theoretical results and information regarding the practical requirements related to tree transducers
Backward Reachability of Array-based Systems by SMT solving: Termination and Invariant Synthesis
The safety of infinite state systems can be checked by a backward
reachability procedure. For certain classes of systems, it is possible to prove
the termination of the procedure and hence conclude the decidability of the
safety problem. Although backward reachability is property-directed, it can
unnecessarily explore (large) portions of the state space of a system which are
not required to verify the safety property under consideration. To avoid this,
invariants can be used to dramatically prune the search space. Indeed, the
problem is to guess such appropriate invariants. In this paper, we present a
fully declarative and symbolic approach to the mechanization of backward
reachability of infinite state systems manipulating arrays by Satisfiability
Modulo Theories solving. Theories are used to specify the topology and the data
manipulated by the system. We identify sufficient conditions on the theories to
ensure the termination of backward reachability and we show the completeness of
a method for invariant synthesis (obtained as the dual of backward
reachability), again, under suitable hypotheses on the theories. We also
present a pragmatic approach to interleave invariant synthesis and backward
reachability so that a fix-point for the set of backward reachable states is
more easily obtained. Finally, we discuss heuristics that allow us to derive an
implementation of the techniques in the model checker MCMT, showing remarkable
speed-ups on a significant set of safety problems extracted from a variety of
sources.Comment: Accepted for publication in Logical Methods in Computer Scienc
Stream Processing using Grammars and Regular Expressions
In this dissertation we study regular expression based parsing and the use of
grammatical specifications for the synthesis of fast, streaming
string-processing programs.
In the first part we develop two linear-time algorithms for regular
expression based parsing with Perl-style greedy disambiguation. The first
algorithm operates in two passes in a semi-streaming fashion, using a constant
amount of working memory and an auxiliary tape storage which is written in the
first pass and consumed by the second. The second algorithm is a single-pass
and optimally streaming algorithm which outputs as much of the parse tree as is
semantically possible based on the input prefix read so far, and resorts to
buffering as many symbols as is required to resolve the next choice. Optimality
is obtained by performing a PSPACE-complete pre-analysis on the regular
expression.
In the second part we present Kleenex, a language for expressing
high-performance streaming string processing programs as regular grammars with
embedded semantic actions, and its compilation to streaming string transducers
with worst-case linear-time performance. Its underlying theory is based on
transducer decomposition into oracle and action machines, and a finite-state
specialization of the streaming parsing algorithm presented in the first part.
In the second part we also develop a new linear-time streaming parsing
algorithm for parsing expression grammars (PEG) which generalizes the regular
grammars of Kleenex. The algorithm is based on a bottom-up tabulation algorithm
reformulated using least fixed points and evaluated using an instance of the
chaotic iteration scheme by Cousot and Cousot
An Arabic CCG approach for determining constituent types from Arabic Treebank
AbstractConverting a treebank into a CCGbank opens the respective language to the sophisticated tools developed for Combinatory Categorial Grammar (CCG) and enriches cross-linguistic development. The conversion is primarily a three-step process: determining constituents’ types, binarization, and category conversion. Usually, this process involves a preprocessing step to the Treebank of choice for correcting brackets and normalizing tags for any changes that were introduced during the manual annotation, as well as extracting morpho-syntactic information that is necessary for determining constituents’ types. In this article, we describe the required preprocessing step on the Arabic Treebank, as well as how to determine Arabic constituents’ types. We conducted an experiment on parts 1 and 2 of the Penn Arabic Treebank (PATB) aimed at converting the PATB into an Arabic CCGbank. The performance of our algorithm when applied to ATB1v2.0 & ATB2v2.0 was 99% identification of head nodes and 100% coverage over the Treebank data
- …