An adaptive finite-state automata application to the problem of reducing the number of states in approximate string matching
This paper presents an alternative way to use finite-state automata to deal with approximate string matching. By exploring adaptive features that enable any finite-state automaton model to change configuration during computational steps, dynamically deleting or creating transitions, we can control the behavior and the topology of the automaton. We use these features in an application to approximate string matching, trying to reduce the number of states required.
Track: VI Workshop de Agentes y Sistemas Inteligentes (WASI)
Red de Universidades con Carreras en Informática (RedUNCI)
Stream Processing using Grammars and Regular Expressions
In this dissertation we study regular expression based parsing and the use of
grammatical specifications for the synthesis of fast, streaming
string-processing programs.
In the first part we develop two linear-time algorithms for regular
expression based parsing with Perl-style greedy disambiguation. The first
algorithm operates in two passes in a semi-streaming fashion, using a constant
amount of working memory and an auxiliary tape storage which is written in the
first pass and consumed by the second. The second algorithm is a single-pass
and optimally streaming algorithm which outputs as much of the parse tree as is
semantically possible based on the input prefix read so far, and resorts to
buffering as many symbols as is required to resolve the next choice. Optimality
is obtained by performing a PSPACE-complete pre-analysis on the regular
expression.
In the second part we present Kleenex, a language for expressing
high-performance streaming string processing programs as regular grammars with
embedded semantic actions, and its compilation to streaming string transducers
with worst-case linear-time performance. Its underlying theory is based on
transducer decomposition into oracle and action machines, and a finite-state
specialization of the streaming parsing algorithm presented in the first part.
In the second part we also develop a new linear-time streaming parsing
algorithm for parsing expression grammars (PEG) which generalizes the regular
grammars of Kleenex. The algorithm is based on a bottom-up tabulation algorithm
reformulated using least fixed points and evaluated using an instance of the
chaotic iteration scheme by Cousot and Cousot.