
12 research outputs found

The PAPAGENO Parallel-Parser Generator

Author: A. Barenghi
C. Ghezzi
D. Grune
D. Sarkar
K. De Bosschere
M.D. Mickunas
R.W. Floyd
S. Crespi Reghizzi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The increasing use of multicore processors has deeply transformed computing paradigms and applications. The wide availability of multicore systems has also had an impact on compiler technology, but research on deterministic parsing has not proved effective in exploiting these architectural advantages, the main impediment being the inherently sequential nature of traditional LL and LR algorithms. We present PAPAGENO, an automated parser generator relying on operator precedence grammars. We complemented the PAPAGENO-generated parallel parsers with parallel lexing techniques, obtaining near-linear speedups on multicore machines and the same speed as Bison parsers in sequential execution.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Parallel parsing made practical

Author: Barenghi Alessandro
CRESPI REGHIZZI Stefano
Mandrioli Dino
Panella Federica
Pradella Matteo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

The property of local parsability allows inputs to be parsed by inspecting only a bounded-length string around the current token. This in turn enables the construction of a scalable, data-parallel parsing algorithm, which is presented in this work. The algorithm lends itself to automatic generation by a parser generator tool, which we implemented and also present here. Furthermore, to complete the framework for parallel input analysis, a parallel scanner can also be combined with the parser. To prove the practicality of a parallel lexing and parsing approach, we report the results of adapting JSON and Lua to a form fit for parallel parsing (i.e., an operator-precedence grammar) through simple grammar changes and scanning transformations. The approach is validated with performance figures from both high-performance and embedded multicore platforms, obtained by analyzing real-world inputs as a test bench. The results show that our approach matches or exceeds the performance of production-grade LR parsers in sequential execution, and achieves significant speedups and good scaling on multicore machines. The work concludes with a broad and critical survey of past work on parallel parsing and of future directions on integration with semantic analysis and incremental parsing.
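To make the idea of local parsability concrete, here is a minimal sequential sketch of a textbook operator-precedence parser for integer expressions with `+`, `*`, and parentheses. It is not the algorithm or tool from the paper: the grammar, the precedence table `ROWS`, and all function names are assumptions chosen for illustration. The point it shows is that every shift/reduce decision looks only at two terminals (the topmost terminal on the stack and the next token), which is the kind of bounded context the abstract refers to.

```python
import re

# Hypothetical precedence table for: E -> E + T | T, T -> T * F | F, F -> n | ( E )
# Row = topmost terminal on the stack, column = next input terminal.
# '<' and '=' mean shift, '>' means reduce, ' ' means syntax error.
TERMINALS = ["+", "*", "(", ")", "n", "#"]
ROWS = {
    "+": "><<><>",
    "*": ">><><>",
    "(": "<<<=< ",
    ")": ">> > >",
    "n": ">> > >",
    "#": "<<< < ",
}
PREC = {(a, b): rel
        for a, row in ROWS.items()
        for b, rel in zip(TERMINALS, row) if rel != " "}

def kind(token):
    """Map a token to its terminal class: every number is 'n'."""
    return "n" if token.isdigit() else token

def top_terminal(stack):
    """Terminals are stored as str, reduced values (nonterminals) as int."""
    for item in reversed(stack):
        if isinstance(item, str):
            return kind(item)

def apply_rule(handle):
    """Evaluate one handle; anything else is a syntax error."""
    if len(handle) == 1 and isinstance(handle[0], str) and handle[0].isdigit():
        return int(handle[0])
    if len(handle) == 3:
        left, mid, right = handle
        if isinstance(left, int) and isinstance(right, int):
            if mid == "+":
                return left + right
            if mid == "*":
                return left * right
        if left == "(" and right == ")" and isinstance(mid, int):
            return mid
    raise ValueError(f"no rule matches handle {handle!r}")

def reduce(stack):
    """Pop the handle delimited by a '<' relation and push its value."""
    handle = []
    if isinstance(stack[-1], int):
        handle.append(stack.pop())
    while True:
        last = stack.pop()                      # a terminal
        handle.append(last)
        if isinstance(stack[-1], int):          # nonterminal left of it
            handle.append(stack.pop())
        if PREC.get((top_terminal(stack), kind(last))) == "<":
            break
    handle.reverse()
    stack.append(apply_rule(handle))

def evaluate(text):
    tokens = re.findall(r"[0-9]+|[+*()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise ValueError("unexpected character in input")
    tokens.append("#")                          # end marker
    stack, pos = ["#"], 0
    while True:
        a, b = top_terminal(stack), kind(tokens[pos])
        if a == "#" and b == "#":
            break
        rel = PREC.get((a, b))
        if rel is None:
            raise ValueError(f"syntax error between {a!r} and {b!r}")
        if rel in "<=":
            stack.append(tokens[pos])           # shift
            pos += 1
        else:
            reduce(stack)
    if len(stack) != 2 or not isinstance(stack[1], int):
        raise ValueError("empty or incomplete expression")
    return stack[1]
```

Because the decision table is indexed only by pairs of adjacent terminals, the same shift/reduce logic could in principle be started in the middle of the input; how partial results from separate chunks are then recombined is the subject of the paper and is not attempted in this sketch.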

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

A Parallel Data Processing System for Large Text Data based on OPG

Author: Liu Qiheng
劉啓恒
Publication venue: Graduate School of Information Science and Technology, Department of Information and Communication Engineering
Publication date: 23/03/2020

Degree type: Master's. University of Tokyo (東京大学)

Toward a theory of input-driven locally parsable languages

Author: CRESPI REGHIZZI Stefano
Lonati Violetta
Mandrioli Dino
Pradella Matteo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

If a context-free language enjoys the local parsability property then, no matter how the source string is segmented, each segment can be parsed independently, and an efficient parallel parsing algorithm becomes possible. The new class of locally chain parsable languages (LCPLs), included in the deterministic context-free language family, is defined here by means of the chain-driven automaton and characterized by decidable properties of grammar derivations. This automaton decides whether or not to reduce a substring in a way driven purely by the terminal characters, thus extending the well-known concept of input-driven (ID), also known as visibly pushdown, machines. The LCPL family extends and improves on the practically relevant Floyd's operator-precedence (OP) languages, which are known to strictly include the ID languages and for which a parallel-parser generator exists.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Generalizing input-driven languages: theoretical and practical benefits

Author: Mandrioli Dino
Pradella Matteo
Publication date: 02/05/2017

Regular languages (RL) are the simplest family in Chomsky's hierarchy. Thanks to their simplicity they enjoy various nice algebraic and logic properties that have been successfully exploited in many application fields. Practically all of their related problems are decidable, so that they support automatic verification algorithms. Also, they can be recognized in real time. Context-free languages (CFL) are another major family, well suited to formalizing programming, natural, and many other classes of languages; their increased generative power w.r.t. RL, however, causes the loss of several closure properties and of the decidability of important problems; furthermore, they need complex parsing algorithms. Thus, various subclasses thereof have been defined with different goals, spanning from efficient, deterministic parsing to closure properties, logic characterization and automatic verification techniques. Among CFL subclasses, so-called structured ones, i.e., those where the typical tree structure is visible in the sentences, exhibit many of the algebraic and logic properties of RL, whereas deterministic CFL have been thoroughly exploited in compiler construction and other application fields. After surveying and comparing the main properties of those various language families, we go back to operator precedence languages (OPL), an old family through which R. Floyd pioneered deterministic parsing, and we show that they offer unexpected properties in two fields so far investigated in totally independent ways: they enable parsing parallelization in a more effective way than traditional sequential parsers, and exhibit the same algebraic and logic properties so far obtained only for less expressive language families.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Logic Characterization of Invisibly Structured Languages: The Case of Floyd Languages

Author: D. Grune
J. Berstel
R.W. Floyd
S. Crespi Reghizzi
V. Lonati
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Operator precedence grammars define a classical Boolean and deterministic context-free language family (called Floyd languages or FLs). FLs have been shown to strictly include the well-known Visibly Pushdown Languages, and enjoy the same nice closure properties. In this paper we provide a complete characterization of FLs in terms of a suitable Monadic Second-Order Logic. Traditional approaches to logic characterization of formal languages refer explicitly to the structures over which they are interpreted (e.g., trees or graphs) or to strings that are isomorphic to the structure, as in parenthesis languages. In the case of FLs, instead, the syntactic structure of input strings is “invisible” and must be reconstructed through parsing. This requires that logic formulae encode some typical context-free parsing actions, such as shift-reduce ones.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

AIR Universita degli studi di Milano

Applying Front End Compiler Process to Parse Polynomials in Parallel

Author: Tsegaye Amha W
Publication venue: Scholarship@Western
Publication date: 16/12/2020

Parsing large expressions, in particular large polynomial expressions, is an important task for computer algebra systems. Despite the apparent simplicity of the problem, its efficient software implementation brings various challenges. Among them is the fact that this is a memory-bound application, for which a multi-threaded implementation is necessarily limited by the characteristics of the memory organization of the supporting hardware. In this thesis, we design, implement and experiment with a multi-threaded parser for large polynomial expressions. We extract parallelism by splitting the input character string into meaningful sub-strings that can be parsed concurrently before being merged into a single polynomial. Our implementation targeting multi-core processors is realized with the Basic Polynomial Algebra Subprograms (BPAS). Experimental results show that the approach is promising both in terms of speedup factors and memory consumption.
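The split, parse, and merge structure described in this abstract can be sketched roughly as follows. This is only an illustration and not the thesis's BPAS-based implementation: the input format (a univariate polynomial in `x` with integer coefficients, such as `3*x^2 + 2*x - 5`), the regular expressions, and the function names are all assumptions. A thread pool is used to show the shape of the computation; in CPython, threads would not be expected to speed up this CPU-bound work, so a process pool or a native implementation would be needed for real gains.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical term format: optional sign, optional integer coefficient,
# optional '*', optional 'x', optional '^exponent' (e.g. "-4*x^3", "x", "+7").
TERM = re.compile(r"([+-]?)([0-9]*)\*?(x?)(?:\^([0-9]+))?$")

def split_terms(text):
    """Split at every '+' or '-' so each chunk is one signed term."""
    compact = "".join(text.split())
    return re.findall(r"[+-]?[^+-]+", compact)

def parse_term(term):
    """Parse one signed term into an (exponent, coefficient) pair."""
    match = TERM.match(term)
    if match is None:
        raise ValueError(f"cannot parse term: {term!r}")
    sign, digits, var, exp = match.groups()
    if not (digits or var) or (exp and not var):
        raise ValueError(f"cannot parse term: {term!r}")
    coeff = int(digits) if digits else 1
    if sign == "-":
        coeff = -coeff
    exponent = int(exp) if exp else (1 if var else 0)
    return exponent, coeff

def parse_polynomial(text, workers=4):
    """Parse terms concurrently, then merge them into {exponent: coefficient}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parsed = list(pool.map(parse_term, split_terms(text)))
    merged = Counter()
    for exponent, coeff in parsed:
        merged[exponent] += coeff          # combine like terms
    return {e: c for e, c in merged.items() if c != 0}
```

Splitting at signs works here only because the assumed format has no parentheses or negative exponents; a realistic splitter would have to find boundaries that respect nesting, which is where most of the difficulty lies.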

Scholarship@Western

GPU-based JSON data processing using structural indexes

Author: Vlaswinkel Koen R.
Publication date: 05/08/2021

Pure OAI Repository

First-Order Logic Definability of Free Languages

Author: A Barenghi
A D’Ulizia
C de la Higuera
C Lautemann
D Grune
F Panella
R Alur
R McNaughton
S Crespi Reghizzi
V Lonati
WS Brainerd
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Operator Precedence Languages: Their Automata-Theoretic and Logic Characterization

Author: Lonati Violetta
Mandrioli Dino
Panella Federica
Pradella Matteo
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2015

Operator precedence languages were introduced half a century ago by Robert Floyd to support deterministic and efficient parsing of context-free languages. Recently, we renewed our interest in this class of languages thanks to a few distinguishing properties that make them attractive for exploiting various modern technologies. Precisely, their local parsability enables parallel and incremental parsing, whereas their closure properties make them amenable to automatic verification techniques, including model checking. In this paper we provide a fairly complete theory of this class of languages: we introduce a class of automata with the same recognizing power as the generative power of their grammars; we provide a characterization of their sentences in terms of monadic second-order logic as has been done in previous literature for more restricted language classes such as regular, parenthesis, and input-driven ones; we investigate preserved and lost properties when extending the language sentences from finite length to infinite length (ω-languages). As a result, we obtain a class of languages that enjoys many of the nice properties of regular languages (closure and decidability properties, logic characterization) but is considerably larger than other families with the same properties (typically parenthesis and input-driven ones), covering “almost” all deterministic languages.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

AIR Universita degli studi di Milano