Light On String Solving: Approaches to Efficiently and Correctly Solving String Constraints
Widespread use of string solvers in the formal analysis of string-heavy programs has led to a growing demand for more efficient and reliable techniques that can be applied in this context, especially for real-world cases. Designing an algorithm for the (generally undecidable) satisfiability problem for systems of string constraints requires a thorough understanding of the structure of the constraints present in the targeted cases. We approach this problem from several perspectives. First, we present an algorithm that reformulates the satisfiability of bounded word equations as a reachability problem for non-deterministic finite automata. Secondly, we present a transformation-system-based technique for solving string constraints. Thirdly, we investigate benchmarks presented in the literature containing regular expression membership predicates and design a decision procedure for a PSPACE-complete sub-theory. Additionally, we introduce a new benchmarking framework for string solvers and use it to showcase the power of our algorithms via an extensive empirical evaluation over a diverse set of benchmarks.
Efficient Multistriding of Large Non-deterministic Finite State Automata for Deep Packet Inspection
Multistride automata speed up input matching because each multistriding transformation halves the length of the input string, leading to a potential 2x speedup. However, little effort has so far been spent on optimizing the construction of multistride automata, with the result that current algorithms cannot be applied to real-life, large automata such as those used in commercial IDSs, because the time and memory needed to create the new automaton quickly become infeasible. In this paper, new algorithms for efficiently building multistride NFAs for packet inspection are presented, together with an explanation of how these techniques can outperform previous algorithms in terms of required time and memory usage.
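The halving effect described in this abstract can be illustrated with a naive 2-stride construction. This is only a minimal sketch of the basic idea, not the efficient algorithms the paper proposes; the function names (`stride2`, `pairs`, `run`), the dictionary encoding of the NFA, and the restriction to even-length inputs are all assumptions made for illustration.

```python
from collections import defaultdict

def stride2(delta):
    """Naively build a 2-stride transition map from a 1-stride NFA.

    delta maps (state, symbol) -> set of successor states.
    The result maps (state, (sym1, sym2)) -> states reached after both symbols.
    """
    delta2 = defaultdict(set)
    for (q, a), mids in delta.items():
        for r in mids:
            # Follow every transition that leaves the intermediate state r.
            for (r2, b), targets in delta.items():
                if r2 == r:
                    delta2[(q, (a, b))] |= targets
    return dict(delta2)

def pairs(text):
    """Group an even-length input into 2-symbol strides (assumed even length)."""
    return [(text[i], text[i + 1]) for i in range(0, len(text), 2)]

def run(delta, start, accepting, symbols):
    """Simulate the NFA on a sequence of symbols (or symbol pairs)."""
    current = {start}
    for s in symbols:
        current = set().union(*(delta.get((q, s), set()) for q in current))
    return bool(current & accepting)

# Hypothetical example NFA: accepts strings over {a, b} that end in "ab".
delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
delta2 = stride2(delta)
```

Running `run(delta2, 0, {2}, pairs("abab"))` takes two steps instead of four. The nested loop over `delta` is what makes this naive version expensive on large automata.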
Random Generation of Nondeterministic Finite-State Tree Automata
Algorithms for (nondeterministic) finite-state tree automata (FTAs) are often
tested on random FTAs, in which all internal transitions are equiprobable. The
run-time results obtained in this manner are usually overly optimistic as most
such generated random FTAs are trivial in the sense that the number of states
of an equivalent minimal deterministic FTA is extremely small. It is
demonstrated that nontrivial random FTAs are obtained only for a narrow band of
transition probabilities. Moreover, an analytical treatment yields a formula to
approximate the transition probability that yields the most complex random
FTAs, which should be used in experiments. Comment: In Proceedings TTATT 2013, arXiv:1311.5058. Andreas Maletti and
Daniel Quernheim were financially supported by the German Research Foundation
(DFG) grant MA/4959/1-
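One simple way to generate random FTAs with equiprobable transitions is sketched below. The model is an assumption (every possible bottom-up transition is included independently with the same probability `p`), and the function name `random_fta` and the encoding of the ranked alphabet are hypothetical; the paper's exact generation procedure and its formula for the best `p` are not given in the abstract.

```python
import itertools
import random

def random_fta(num_states, ranked_alphabet, p, seed=None):
    """Return a random set of bottom-up transitions (symbol, children, target).

    ranked_alphabet maps each symbol to its rank (number of children).
    Each candidate transition is kept independently with probability p.
    """
    rng = random.Random(seed)
    states = range(num_states)
    transitions = set()
    for symbol, rank in ranked_alphabet.items():
        # Every tuple of child states of the right length is a candidate.
        for children in itertools.product(states, repeat=rank):
            for target in states:
                if rng.random() < p:
                    transitions.add((symbol, children, target))
    return transitions
```

With `p` close to 0 or 1 the result is nearly empty or nearly complete, which fits the abstract's observation that nontrivial automata arise only in a narrow band of transition probabilities.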
Joining Extractions of Regular Expressions
Regular expressions with capture variables, also known as "regex formulas,"
extract relations of spans (interval positions) from text. These relations can
be further manipulated via Relational Algebra as studied in the context of
document spanners, Fagin et al.'s formal framework for information extraction.
We investigate the complexity of querying text by Conjunctive Queries (CQs) and
Unions of CQs (UCQs) on top of regex formulas. We show that the lower bounds
(NP-completeness and W[1]-hardness) from the relational world also hold in our
setting; in particular, hardness hits already single-character text! Yet, the
upper bounds from the relational world do not carry over. Unlike the relational
world, acyclic CQs, and even gamma-acyclic CQs, are hard to compute. The source
of hardness is that it may be intractable to instantiate the relation defined
by a regex formula, simply because it has an exponential number of tuples. Yet,
we are able to establish general upper bounds. In particular, UCQs can be
evaluated with polynomial delay, provided that every CQ has a bounded number of
atoms (while unions and projection can be arbitrary). Furthermore, UCQ
evaluation is solvable with FPT (Fixed-Parameter Tractable) delay when the
parameter is the size of the UCQ.
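To make the notion of a span relation concrete, the sketch below enumerates every span of a text whose content matches a pattern, which corresponds to a regex formula with a single capture variable surrounded by arbitrary context. The helper name `spans_matching` and the 0-based, half-open offsets are assumptions for illustration, and the brute-force enumeration is not the evaluation strategy studied in the paper.

```python
import re

def spans_matching(pattern, text):
    """List all spans (i, j) such that text[i:j] fully matches the pattern."""
    compiled = re.compile(pattern)
    return [(i, j)
            for i in range(len(text) + 1)
            for j in range(i, len(text) + 1)
            if compiled.fullmatch(text, i, j)]

print(spans_matching("a+", "aab"))  # [(0, 1), (0, 2), (1, 2)]
```

A single variable already yields up to quadratically many spans; combining several variables multiplies the possibilities, which gives some intuition for why materializing the relation of a regex formula can be intractable.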
Should We Learn Probabilistic Models for Model Checking? A New Approach and An Empirical Study
Many automated system analysis techniques (e.g., model checking, model-based
testing) rely on first obtaining a model of the system under analysis. System
modeling is often done manually, which is widely considered a hindrance to
adopting model-based system analysis and development techniques. To overcome this
problem, researchers have proposed to automatically "learn" models based on
sample system executions and shown that the learned models can be useful
sometimes. There are however many questions to be answered. For instance, how
much shall we generalize from the observed samples and how fast would learning
converge? Or, would the analysis result based on the learned model be more
accurate than the estimation we could have obtained by sampling many system
executions within the same amount of time? In this work, we investigate
existing algorithms for learning probabilistic models for model checking,
propose an evolution-based approach for better controlling the degree of
generalization and conduct an empirical study in order to answer the questions.
One of our findings is that the effectiveness of learning may sometimes be
limited. Comment: 15 pages, plus 2 reference pages, accepted by FASE 2017 in ETAP