The ModelCC Model-Driven Parser Generator
Syntax-directed translation tools require the specification of a language by
means of a formal grammar. The grammar must conform to the specific
requirements of the parser generator to be used, and it is then annotated
with semantic actions so that the resulting system performs its desired function.
In this paper, we introduce ModelCC, a model-based parser generator that
decouples language specification from language processing, avoiding some of the
problems caused by grammar-driven parser generators. ModelCC receives a
conceptual model as input, along with constraints that annotate it. From these,
it creates a parser for the desired textual syntax, and the generated parser
fully automates the instantiation of the language's conceptual model. ModelCC
also includes a reference resolution mechanism, so it can instantiate abstract
syntax graphs rather than mere abstract syntax trees.
Comment: In Proceedings PROLE 2014, arXiv:1501.0169
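To make the model-driven idea concrete, here is a minimal Python sketch of a parser derived from a conceptual model rather than from a grammar. ModelCC itself is a Java library and its actual annotations differ; the Assignment model, the PATTERNS table, and derive_parser below are illustrative assumptions, not ModelCC's API.

    import re
    from dataclasses import dataclass, fields

    # Conceptual model of a tiny "name = value" language. In the
    # model-driven approach, this class *is* the language specification.
    @dataclass
    class Assignment:
        name: str    # identifier
        value: int   # integer literal

    # Constraints annotating the model: one token pattern per field type
    # (hypothetical; ModelCC expresses such constraints as Java annotations).
    PATTERNS = {str: r"[A-Za-z_]\w*", int: r"\d+"}

    def derive_parser(model, separator=r"\s*=\s*"):
        """Derive a parser from the model's typed fields; parsing then
        instantiates the model directly, with no semantic actions."""
        fs = fields(model)
        regex = re.compile(separator.join(
            f"(?P<{f.name}>{PATTERNS[f.type]})" for f in fs))
        def parse(text):
            m = regex.fullmatch(text.strip())
            if m is None:
                raise SyntaxError(f"cannot parse {text!r}")
            return model(**{f.name: f.type(m.group(f.name)) for f in fs})
        return parse

    parse = derive_parser(Assignment)
    print(parse("answer = 42"))   # Assignment(name='answer', value=42)

Note how the parse result is a model instance rather than a parse tree: this is the decoupling of language specification from language processing that the paper describes.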
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a parser generator for programming languages, but can also serve as an
alternative to regular expression matching tools.
Our framework also produces dynamic parsers. Its intended use is in IDEs
(accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers.
Comment: master thesis
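As background for the linear-time claim, here is a minimal Python sketch of memoized (packrat-style) top-down parsing, the standard way such parsers reach linear time. Amethyst's own guarantee rests on its structured grammars and regularized regular expressions, which this sketch does not implement; the combinators below are illustrative assumptions.

    # PEG-style combinators: a parser maps (text, position) -> new position
    # on success, or None on failure.
    def lit(s):
        def p(text, i):
            return i + len(s) if text.startswith(s, i) else None
        return p

    def seq(*ps):                      # match all parsers in order
        def p(text, i):
            for q in ps:
                i = q(text, i)
                if i is None:
                    return None
            return i
        return p

    def alt(*ps):                      # first alternative that matches wins
        def p(text, i):
            for q in ps:
                j = q(text, i)
                if j is not None:
                    return j
            return None
        return p

    def memo(parser):
        """Packrat memoization: each position is parsed at most once,
        bounding total work linearly in the input length. (The cache
        assumes a single input string; key by (text, i) in general.)"""
        cache = {}
        def p(text, i):
            if i not in cache:
                cache[i] = parser(text, i)
            return cache[i]
        return p

    # expr <- "(" expr ")" expr / ""   -- balanced parentheses, top-down
    expr = memo(lambda text, i: alt(seq(lit("("), expr, lit(")"), expr),
                                    lit(""))(text, i))

    print(expr("(()())", 0) == len("(()())"))  # True: whole input consumed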
On Tree-Based Neural Sentence Modeling
Neural networks with tree-based sentence encoders have shown better results
on many downstream tasks. Most existing tree-based encoders adopt syntactic
parse trees as an explicit structural prior. To study the effectiveness of
different tree structures, we replace the parse trees with trivial trees
(i.e., binary balanced, left-branching, and right-branching trees) in
the encoders. Though trivial trees contain no syntactic information, those
encoders achieve competitive or even better results on all ten downstream
tasks we investigated. This surprising result indicates that explicit syntax
guidance may not be the main contributor to the superior performances of
tree-based neural sentence modeling. Further analysis shows that tree modeling
gives better results when crucial words are closer to the final representation.
Additional experiments give more clues on how to design an effective tree-based
encoder. Our code is open-source and available at
https://github.com/ExplorerFreda/TreeEnc.
Comment: To Appear at EMNLP 2018
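To illustrate what a trivial tree is, here is a minimal Python sketch that builds a binary balanced tree over the tokens of a sentence and folds it bottom-up into a single representation. The averaging combine step stands in for the paper's learned tree-LSTM-style composition; all names here are illustrative assumptions, not the authors' code (see the repository above).

    def balanced_tree(leaves):
        """Binary balanced tree: split the token span in half recursively."""
        if len(leaves) == 1:
            return leaves[0]
        mid = len(leaves) // 2
        return (balanced_tree(leaves[:mid]), balanced_tree(leaves[mid:]))

    def encode(node, embed, combine):
        """Fold the tree bottom-up into a single sentence representation."""
        if isinstance(node, tuple):
            return combine(encode(node[0], embed, combine),
                           encode(node[1], embed, combine))
        return embed(node)

    tokens = "neural networks like trivial trees".split()
    tree = balanced_tree(tokens)
    # Placeholder embedding/composition: token length as a 1-d "vector",
    # composed by averaging; a real encoder uses learned parameters.
    vec = encode(tree, embed=len, combine=lambda a, b: (a + b) / 2)
    print(tree, vec)

A balanced tree keeps every leaf within O(log n) composition steps of the root, whereas left- or right-branching trees place the last or first tokens closest to the final representation, which connects to the paper's observation about crucial words.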