Regular Expression Matching and Operational Semantics
Many programming languages and tools, ranging from grep to the Java String
library, contain regular expression matchers. Rather than first translating a
regular expression into a deterministic finite automaton, such implementations
typically match the regular expression on the fly. Thus they can be seen as
virtual machines interpreting the regular expression much as if it were a
program with some non-deterministic constructs such as the Kleene star. We
formalize this implementation technique for regular expression matching using
operational semantics. Specifically, we derive a series of abstract machines,
moving from the abstract definition of matching to increasingly realistic
machines. First, a continuation is added to the operational semantics to
describe what remains to be matched after the current expression. Next, we
represent the expression as a data structure using pointers, which enables
redundant searches to be eliminated via testing for pointer equality. From
there, we arrive both at Thompson's lockstep construction and a machine that
performs some operations in parallel, suitable for implementation on a large
number of cores, such as a GPU. We formalize the parallel machine using process
algebra and report some preliminary experiments with an implementation on a
graphics processor using CUDA.
Comment: In Proceedings SOS 2011, arXiv:1108.279
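The continuation idea in this abstract can be illustrated with a minimal backtracking matcher. This is a sketch under stated assumptions, not the paper's machines: the class names (`Eps`, `Char`, `Seq`, `Alt`, `Star`) and the functions `match` and `accepts` are hypothetical, and the progress check `j > i` in the star case is a simple stand-in for the pointer-equality test the abstract mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eps:
    """Matches the empty string."""


@dataclass(frozen=True)
class Char:
    c: str


@dataclass(frozen=True)
class Seq:
    left: object
    right: object


@dataclass(frozen=True)
class Alt:
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    inner: object


def match(e, s, i, k):
    """Match expression e against s starting at index i.

    k is the continuation: it receives the index reached after e and
    decides whether the rest of the match succeeds.
    """
    if isinstance(e, Eps):
        return k(i)
    if isinstance(e, Char):
        return i < len(s) and s[i] == e.c and k(i + 1)
    if isinstance(e, Seq):
        # Match the left part; its continuation matches the right part.
        return match(e.left, s, i, lambda j: match(e.right, s, j, k))
    if isinstance(e, Alt):
        # Non-deterministic choice, resolved here by backtracking.
        return match(e.left, s, i, k) or match(e.right, s, i, k)
    if isinstance(e, Star):
        # Either stop iterating, or match the body once and loop again.
        # Requiring j > i avoids looping forever on bodies that match "".
        return k(i) or match(
            e.inner, s, i, lambda j: j > i and match(e, s, j, k)
        )
    raise TypeError(f"unknown expression: {e!r}")


def accepts(e, s):
    """True if e matches the whole string s."""
    return match(e, s, 0, lambda j: j == len(s))
```

For example, `accepts(Seq(Star(Alt(Char("a"), Char("b"))), Char("c")), "abac")` holds, while the same expression rejects `"abab"`.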
Simultaneous Finite Automata: An Efficient Data-Parallel Model for Regular Expression Matching
Automata play important roles in a wide area of computing, and the growth of
multicores calls for their efficient parallel implementation. Though it is
known in theory that we can perform the computation of a finite automaton in
parallel by simulating transitions, its implementation has a large overhead due
to the simulation. In this paper we propose a new automaton called simultaneous
finite automaton (SFA) for efficient parallel computation of an automaton. The
key idea is to extend an automaton so that it involves the simulation of
transitions. Since an SFA itself has a good property of parallelism, we can
easily develop a parallel implementation without overheads. We have implemented
a regular expression matcher based on SFA, and it has achieved speedups of over
10 times in a typical case on an environment with dual hexa-core CPUs.
Comment: This paper has been accepted at the following conference: 2013
International Conference on Parallel Processing (ICPP-2013), October 1-4,
2013, Ecole Normale Supérieure de Lyon, Lyon, Franc
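The underlying idea of running a finite automaton in parallel by simulating transitions from every state can be sketched as follows. This is not the SFA construction itself, only an illustration of the state-to-state mappings it builds on; the example DFA (an even number of `a`s) and the names `chunk_mapping`, `compose`, and `run_chunked` are assumptions made for this sketch. The chunks are processed sequentially here, but each call to `chunk_mapping` is independent and could run on its own core.

```python
from functools import reduce

# Hypothetical example DFA over {a, b}: accepts strings with an even
# number of 'a's. States are numbered 0..n-1 so they can index tuples.
DELTA = {0: {"a": 1, "b": 0}, 1: {"a": 0, "b": 1}}
STATES = (0, 1)
START = 0
ACCEPTING = {0}


def chunk_mapping(chunk):
    """Return m where m[q] is the state reached from q after reading chunk."""
    mapping = list(STATES)  # identity mapping: no input read yet
    for ch in chunk:
        mapping = [DELTA[q][ch] for q in mapping]
    return tuple(mapping)


def compose(f, g):
    """Mapping for 'read f's chunk, then g's chunk' (associative)."""
    return tuple(g[q] for q in f)


def run_chunked(text, n_chunks):
    """Decide acceptance by combining per-chunk mappings."""
    size = max(1, -(-len(text) // n_chunks))  # ceiling division
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    mappings = [chunk_mapping(c) for c in chunks]  # independent per chunk
    total = reduce(compose, mappings, tuple(STATES))
    return total[START] in ACCEPTING
```

Because `compose` is associative, the per-chunk mappings can be combined in any grouping, which is what makes a parallel reduction possible.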
From Finite Automata to Regular Expressions and Back--A Summary on Descriptional Complexity
The equivalence of finite automata and regular expressions dates back to the
seminal paper of Kleene on events in nerve nets and finite automata from 1956.
In the present paper we tour a fragment of the literature and summarize results
on upper and lower bounds on the conversion of finite automata to regular
expressions and vice versa. We also briefly recall the known bounds for the
removal of spontaneous transitions (epsilon-transitions) on non-epsilon-free
nondeterministic devices. Moreover, we report on recent results on the average
case descriptional complexity bounds for the conversion of regular expressions
to finite automata and brand new developments on the state elimination
algorithm that converts finite automata to regular expressions.
Comment: In Proceedings AFL 2014, arXiv:1405.527
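The state elimination algorithm mentioned in this abstract can be sketched in a few lines. This is a minimal textbook-style version, assuming a DFA given as a dictionary from `(state, symbol)` to state; the function names are hypothetical, the output uses Python `re` syntax, and no attempt is made to keep the resulting expression small, which is the concern of the descriptional complexity results surveyed.

```python
import re


def union(a, b):
    """Regex for 'a or b'; None stands for 'no edge'."""
    if a is None:
        return b
    if b is None:
        return a
    return f"(?:{a}|{b})"


def concat(a, b):
    """Regex for 'a then b'; safe because unions are always grouped."""
    if a is None or b is None:
        return None
    return a + b


def star(a):
    """Regex for zero or more repetitions of a."""
    if a is None or a == "":
        return ""
    return f"(?:{a})*"


def dfa_to_regex(states, delta, start, accepting):
    """Eliminate states one by one; return a regex, or None if nothing is accepted."""
    s_new, f_new = object(), object()  # fresh start and final states
    edge = {}
    for (p, ch), q in delta.items():
        edge[p, q] = union(edge.get((p, q)), re.escape(ch))
    edge[s_new, start] = ""  # empty-string edges into and out of the DFA
    for q in accepting:
        edge[q, f_new] = ""
    remaining = list(states)
    for k in states:
        remaining.remove(k)
        loop = star(edge.get((k, k)))
        for p in [s_new] + remaining:
            for q in remaining + [f_new]:
                # Reroute every path p -> k -> q around the removed state k.
                via = concat(concat(edge.get((p, k)), loop), edge.get((k, q)))
                new = union(edge.get((p, q)), via)
                if new is not None:
                    edge[p, q] = new
    return edge.get((s_new, f_new))
```

The order in which states are eliminated does not affect correctness but can change the size of the result considerably.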
A parallel grid-based implementation for real time processing of event log data in collaborative applications
Collaborative applications usually register user interaction in the form of semi-structured plain text event log data. Extracting and structuring this data is a prerequisite for later key processes such as the analysis of interactions, assessment of group activity, or the provision of awareness and feedback. Yet, in real situations of online collaborative activity, the processing of log data is usually done offline since structuring event log data is, in general, a computationally costly process and the amount of log data tends to be very large. Techniques to speed and scale up the structuring and processing of log data with minimal impact on the performance of the collaborative application are thus desirable to be able to process log data in real time. In this paper, we present a parallel grid-based implementation for processing in real time the event log data generated in collaborative applications. Our results show the feasibility of using grid middleware to speed and scale up the process of structuring and processing semi-structured event log data. The Grid prototype follows the Master-Worker (MW) paradigm. It is implemented using the Globus Toolkit (GT) and is tested on the PlanetLab platform.
Logical and Algebraic Characterizations of Rational Transductions
Rational word languages can be defined by several equivalent means: finite
state automata, rational expressions, finite congruences, or monadic
second-order (MSO) logic. The robust subclass of aperiodic languages is defined
by: counter-free automata, star-free expressions, aperiodic (finite)
congruences, or first-order (FO) logic. In particular, their algebraic
characterization by aperiodic congruences allows one to decide whether a regular
language is aperiodic.
We lift this decidability result to rational transductions, i.e.,
word-to-word functions defined by finite state transducers. In this context,
logical and algebraic characterizations have also been proposed. Our main
result is that one can decide if a rational transduction (given as a
transducer) is in a given decidable congruence class. We also establish a
transfer result from logic-algebra equivalences over languages to equivalences
over transductions. As a consequence, it is decidable if a rational
transduction is first-order definable, and we show that this problem is
PSPACE-complete.
On the descriptional complexity of iterative arrays
The descriptional complexity of iterative arrays (IAs) is studied. Iterative arrays are a parallel computational model with a sequential processing of the input. It is shown that IAs when compared to deterministic finite automata or pushdown automata may provide savings in size which are not bounded by any recursive function, so-called non-recursive trade-offs. Additional non-recursive trade-offs are proven to exist between IAs working in linear time and IAs working in real time. Furthermore, the descriptional complexity of IAs is compared with cellular automata (CAs) and non-recursive trade-offs are proven between two restricted classes. Finally, it is shown that many decidability questions for IAs are undecidable and not semidecidable.