
    On the Complexity and Performance of Parsing with Derivatives

    Current algorithms for context-free parsing inflict a trade-off between ease of understanding, ease of implementation, theoretical complexity, and practical performance. No algorithm achieves all of these properties simultaneously. Might et al. (2011) introduced parsing with derivatives, which handles arbitrary context-free grammars while being both easy to understand and simple to implement. Despite much initial enthusiasm and a multitude of independent implementations, its worst-case complexity has never been proven to be better than exponential. In fact, high-level arguments claiming it is fundamentally exponential have been advanced and even accepted as part of the folklore. Performance ended up being sluggish in practice, and this sluggishness was taken as informal evidence of exponentiality. In this paper, we reexamine the performance of parsing with derivatives. We have discovered that it is not exponential but, in fact, cubic. Moreover, simple (though perhaps not obvious) modifications to the implementation by Might et al. (2011) lead to an implementation that is not only easy to understand but also highly performant in practice.
    Comment: 13 pages; 12 figures; implementation at http://bitbucket.org/ucombinator/parsing-with-derivatives/; published in PLDI '16, Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 13-17, 2016, Santa Barbara, CA, US.
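    The core of the technique is easiest to see where Brzozowski's derivative was originally defined: on regular expressions. (Might et al. extend it to arbitrary context-free grammars using laziness, memoization, and fixed points, all of which are omitted here.) Below is a minimal, hedged sketch of the derivative operation and the resulting matcher for regular expressions only; the AST constructor names are illustrative, not the paper's.

```python
# Minimal Brzozowski-derivative matcher for regular expressions.
# Illustrative sketch only (Python 3.10+ for `match`); the paper's
# technique generalizes this idea to context-free grammars.
from dataclasses import dataclass

class Re: pass                        # base class for regex AST nodes

@dataclass
class Empty(Re): pass                 # the empty language (no strings)

@dataclass
class Eps(Re): pass                   # the language containing only ""

@dataclass
class Chr(Re):
    c: str                            # a single character

@dataclass
class Alt(Re):
    l: Re
    r: Re                             # union

@dataclass
class Seq(Re):
    l: Re
    r: Re                             # concatenation

@dataclass
class Star(Re):
    e: Re                             # Kleene star

def nullable(e: Re) -> bool:
    """Does the language of e contain the empty string?"""
    match e:
        case Empty() | Chr(_): return False
        case Eps() | Star(_):  return True
        case Alt(l, r):        return nullable(l) or nullable(r)
        case Seq(l, r):        return nullable(l) and nullable(r)

def derive(e: Re, c: str) -> Re:
    """The derivative of e w.r.t. c: the suffixes of strings in e
    that begin with c."""
    match e:
        case Empty() | Eps():  return Empty()
        case Chr(d):           return Eps() if c == d else Empty()
        case Alt(l, r):        return Alt(derive(l, c), derive(r, c))
        case Star(inner):      return Seq(derive(inner, c), e)
        case Seq(l, r):
            left = Seq(derive(l, c), r)
            return Alt(left, derive(r, c)) if nullable(l) else left

def matches(e: Re, s: str) -> bool:
    """A string matches e iff deriving e by each character in turn
    leaves a nullable expression."""
    for c in s:
        e = derive(e, c)
    return nullable(e)

# (ab)*a matches "aba" but not "ab"
pat = Seq(Star(Seq(Chr("a"), Chr("b"))), Chr("a"))
assert matches(pat, "aba") and not matches(pat, "ab")
```

    Note that, as written, the derived expressions can grow with the input; the cubic bound and practical speed reported in the paper depend on the memoization, compaction, and fixed-point machinery this sketch leaves out.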

    LL(1) Parsing with Derivatives and Zippers

    In this paper, we present an efficient, functional, and formally verified parsing algorithm for LL(1) context-free expressions based on the concept of derivatives of formal languages. Parsing with derivatives is an elegant parsing technique which, in the general case, suffers from cubic worst-case time complexity and slow performance in practice. We specialise the parsing with derivatives algorithm to LL(1) context-free expressions, where alternatives can be chosen given a single token of lookahead. We formalise the notion of LL(1) expressions and show how to efficiently check the LL(1) property. Next, we present a novel linear-time parsing with derivatives algorithm for LL(1) expressions operating on a zipper-inspired data structure. We prove the algorithm correct in Coq and present an implementation as a parser combinators framework in Scala, with enumeration and pretty printing capabilities.
    Comment: Appeared at PLDI '20 under the title "Zippy LL(1) Parsing with Derivatives".
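    To make the LL(1) property concrete, here is a hedged sketch of the classical check that a single token of lookahead suffices: at every choice point, the FIRST sets of the alternatives must be pairwise disjoint, and at most one alternative may be nullable. The grammar encoding and names below are illustrative, and a complete check would also consult FOLLOW sets for nullable alternatives; this is not the paper's Coq-verified algorithm, only an illustration of the property it checks.

```python
# Sketch of an LL(1) check via NULLABLE and FIRST sets (illustrative).
# Grammar: nonterminal -> tuple of alternatives; an alternative is a
# tuple of symbols, and any symbol that is not a key of `grammar` is
# treated as a terminal.
grammar = {
    "S": (("a", "S"), ("b",)),   # S -> a S | b   (an LL(1) grammar)
}
terminals = {"a", "b"}

def fixed_point(step, init):
    """Iterate `step` until the value stops changing."""
    cur = init
    while (nxt := step(cur)) != cur:
        cur = nxt
    return cur

def nullable_step(nullable):
    """A nonterminal is nullable if some alternative consists solely
    of nullable symbols."""
    return frozenset(nullable | {
        nt for nt, alts in grammar.items()
        if any(all(s in nullable for s in alt) for alt in alts)
    })

NULLABLE = fixed_point(nullable_step, frozenset())

def first_of_seq(seq, first):
    """FIRST set of a symbol sequence, given the current FIRST sets."""
    out = set()
    for s in seq:
        out |= {s} if s in terminals else first[s]
        if s in terminals or s not in NULLABLE:
            break                # later symbols cannot contribute
    return out

def first_step(first):
    return {nt: frozenset(t for alt in alts
                          for t in first_of_seq(alt, first))
            for nt, alts in grammar.items()}

FIRST = fixed_point(first_step, {nt: frozenset() for nt in grammar})

def is_ll1(nt):
    """Alternatives must have pairwise disjoint FIRST sets, with at
    most one nullable alternative (a full check also uses FOLLOW)."""
    firsts = [first_of_seq(alt, FIRST) for alt in grammar[nt]]
    disjoint = all(a.isdisjoint(b)
                   for i, a in enumerate(firsts) for b in firsts[i + 1:])
    nullable_alts = sum(all(s in NULLABLE for s in alt)
                        for alt in grammar[nt])
    return disjoint and nullable_alts <= 1

assert all(is_ll1(nt) for nt in grammar)
```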

    Efficient Multi-Template Learning for Structured Prediction

    Conditional random fields (CRFs) and structural support vector machines (structural SVMs) are two state-of-the-art methods for structured prediction, both of which capture the interdependencies among output variables. The success of these methods is attributed to the fact that their discriminative models can account for overlapping features over the whole input observation. These features are usually generated by applying a given set of templates to labeled data, but improper templates may degrade performance. To alleviate this issue, in this paper we propose a novel multiple-template learning paradigm that learns the structured prediction model and the importance of each template simultaneously, so that hundreds of arbitrary templates can be added to the learning model without concern. This paradigm can be formulated as a special multiple kernel learning problem with an exponential number of constraints. We then introduce an efficient cutting-plane algorithm that solves this problem in the primal, and we establish its convergence. We evaluate the proposed paradigm on two widely studied structured prediction tasks, i.e., sequence labeling and dependency parsing. Extensive experimental results show that the proposed method outperforms CRFs and structural SVMs by exploiting the importance of each template. Our complexity analysis and empirical results also show that the proposed method is more efficient than OnlineMKL on very sparse and high-dimensional data. We further extend this paradigm to structured prediction with generalized p-block norm regularization for p > 1; experiments show competitive performance when p ∈ [1, 2).
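    For readers unfamiliar with the templates in question: in CRF-style sequence labeling, a feature template is a rule that extracts a feature from a window around each token position. The sketch below uses an illustrative encoding (the names and window choices are hypothetical, not the paper's); the multi-template paradigm described above would additionally learn one importance weight per template via multiple kernel learning, so uninformative templates are down-weighted automatically.

```python
# Illustrative feature templates for sequence labeling (hypothetical
# names and windows; not the paper's template set).
templates = {
    "w0":    lambda sent, i: f"w0={sent[i]}",
    "w-1":   lambda sent, i: f"w-1={sent[i-1] if i > 0 else '<BOS>'}",
    "w+1":   lambda sent, i: f"w+1={sent[i+1] if i+1 < len(sent) else '<EOS>'}",
    "pref3": lambda sent, i: f"pref3={sent[i][:3]}",
}

def featurize(sent):
    """Apply every template at every position; a learned per-template
    weight would then scale each template's contribution."""
    return [[tmpl(sent, i) for tmpl in templates.values()]
            for i in range(len(sent))]

print(featurize(["the", "dog", "barks"])[0])
# ['w0=the', 'w-1=<BOS>', 'w+1=dog', 'pref3=the']
```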