6,885 research outputs found
From Regular Expression Matching to Parsing
Given a regular expression and a string , the regular expression
parsing problem is to determine if matches and if so, determine how it
matches, e.g., by a mapping of the characters of to the characters in .
Regular expression parsing makes finding matches of a regular expression even
more useful by allowing us to directly extract subpatterns of the match, e.g.,
for extracting IP-addresses from internet traffic analysis or extracting
subparts of genomes from genetic data bases. We present a new general
techniques for efficiently converting a large class of algorithms that
determine if a string matches regular expression into algorithms that
can construct a corresponding mapping. As a consequence, we obtain the first
efficient linear space solutions for regular expression parsing
Stream Processing using Grammars and Regular Expressions
In this dissertation we study regular expression based parsing and the use of
grammatical specifications for the synthesis of fast, streaming
string-processing programs.
In the first part we develop two linear-time algorithms for regular
expression based parsing with Perl-style greedy disambiguation. The first
algorithm operates in two passes in a semi-streaming fashion, using a constant
amount of working memory and an auxiliary tape storage which is written in the
first pass and consumed by the second. The second algorithm is a single-pass
and optimally streaming algorithm which outputs as much of the parse tree as is
semantically possible based on the input prefix read so far, and resorts to
buffering as many symbols as is required to resolve the next choice. Optimality
is obtained by performing a PSPACE-complete pre-analysis on the regular
expression.
In the second part we present Kleenex, a language for expressing
high-performance streaming string processing programs as regular grammars with
embedded semantic actions, and its compilation to streaming string transducers
with worst-case linear-time performance. Its underlying theory is based on
transducer decomposition into oracle and action machines, and a finite-state
specialization of the streaming parsing algorithm presented in the first part.
In the second part we also develop a new linear-time streaming parsing
algorithm for parsing expression grammars (PEG) which generalizes the regular
grammars of Kleenex. The algorithm is based on a bottom-up tabulation algorithm
reformulated using least fixed points and evaluated using an instance of the
chaotic iteration scheme by Cousot and Cousot
A Neural Network Architecture for Syntax Analysis
Artificial neural networks (ANNs), due to their inherent parallelism and potential fault tolerance, offer an attractive paradigm for robust and efficient implementations of syntax analyzers. This paper proposes a modular neural network architecture for syntax analysis on continuous input stream of characters. The components of the proposed architecture include neural network designs for a stack, a lexical analyzer, a grammar parser and a parse tree construction module. The proposed NN stack allows simulation of a stack of large depth, needs no training, and hence is not application-specific. The proposed NN lexical analyzer provides a relatively efficient and high performance alternative to current computer systems for lexical analysis especially in natural language processing applications. The proposed NN parser generates parse trees by parsing strings from widely used subsets of deterministic context-free languages (generated by LR grammars). The estimated performance of the proposed neural network architecture (based on current CMOS VLSI technology) for syntax analysis is compared with that of commonly used approaches to syntax analysis in current computer systems. The results of this performance comparison suggest that the proposed neural network architecture offers an attractive approach for syntax analysis in a wide range of practical applications such as programming language compilation and natural language processing
Compiler Front-end for the IEC 61131-3 v3 Languages
Análises LĂ©xica e Sintática concluĂdas. Abstract Syntax Treequase completa. Falta validar o trabalho. Falta concluir o documento
A Survey on Array Storage, Query Languages, and Systems
Since scientific investigation is one of the most important providers of
massive amounts of ordered data, there is a renewed interest in array data
processing in the context of Big Data. To the best of our knowledge, a unified
resource that summarizes and analyzes array processing research over its long
existence is currently missing. In this survey, we provide a guide for past,
present, and future research in array processing. The survey is organized along
three main topics. Array storage discusses all the aspects related to array
partitioning into chunks. The identification of a reduced set of array
operators to form the foundation for an array query language is analyzed across
multiple such proposals. Lastly, we survey real systems for array processing.
The result is a thorough survey on array data storage and processing that
should be consulted by anyone interested in this research topic, independent of
experience level. The survey is not complete though. We greatly appreciate
pointers towards any work we might have forgotten to mention.Comment: 44 page
- …