Recurrent Graph Syntax Encoder for Neural Machine Translation
Syntax-incorporated machine translation models have proven successful in
improving the model's reasoning and meaning-preservation ability. In this
paper, we propose a simple yet effective graph-structured encoder, the
Recurrent Graph Syntax Encoder, dubbed \textbf{RGSE}, which enhances the
ability to capture useful syntactic information. RGSE operates over a
standard encoder (recurrent or self-attention), treating recurrent network
units as graph nodes and injecting syntactic dependencies as edges, so
that RGSE models syntactic dependencies and sequential information
(\textit{i.e.}, word order) simultaneously. Our approach achieves considerable
improvements over several syntax-aware NMT models on English-German
and English-Czech translation tasks, and the RGSE-equipped big model
obtains results competitive with the state-of-the-art model on the WMT14
En-De task. Extensive analysis further verifies that RGSE benefits
long-sentence modeling and produces better translations.
Comment: Work in Progress
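The abstract's central idea is to treat recurrent states as graph nodes and dependency arcs as edges. The following is a minimal sketch of that idea in PyTorch, not the authors' code: the class name RGSESketch, the GRU choice, and the mean aggregation over dependency neighbours are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class RGSESketch(nn.Module):
    """Illustrative sketch: recurrent states as graph nodes, dependency arcs as edges."""
    def __init__(self, d_model):
        super().__init__()
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.edge_proj = nn.Linear(d_model, d_model)
        self.combine = nn.Linear(2 * d_model, d_model)

    def forward(self, x, dep_adj):
        # x: (batch, seq, d_model) word representations
        # dep_adj: (batch, seq, seq) 0/1 float adjacency built from dependency arcs
        h, _ = self.rnn(x)                              # sequential (word-order) signal
        neigh = torch.bmm(dep_adj, self.edge_proj(h))   # sum over syntactic neighbours
        deg = dep_adj.sum(-1, keepdim=True).clamp(min=1)
        neigh = neigh / deg                             # mean over dependency neighbours
        # fuse word-order and dependency information for each node
        return torch.tanh(self.combine(torch.cat([h, neigh], dim=-1)))
```

Under this reading, the same fusion could sit on top of a self-attention encoder by replacing the GRU with a Transformer layer; the paper leaves both options open.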
Transformer-Based Neural Text Generation with Syntactic Guidance
We study the problem of using (partial) constituency parse trees as syntactic
guidance for controlled text generation. Existing approaches to this problem
use recurrent structures, which not only suffer from the long-term dependency
problem but also fall short in modeling the tree structure of the syntactic
guidance. We propose to leverage the parallelism of Transformer to better
incorporate parse trees. Our method first expands a partial template
constituency parse tree to a full-fledged parse tree tailored for the input
source text, and then uses the expanded tree to guide text generation. The
effectiveness of our model in this process hinges upon two new attention
mechanisms: 1) a path attention mechanism that forces one node to attend to
only other nodes located in its path in the syntax tree to better incorporate
syntax guidance; 2) a multi-encoder attention mechanism that allows the decoder
to dynamically attend to information from multiple encoders. Our experiments in
the controlled paraphrasing task show that our method outperforms SOTA models
both semantically and syntactically, improving the best baseline's BLEU score
from 11.83 to 26.27.
Comment: 11 pages, 4 figures and 5 tables
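The first of the two mechanisms restricts each tree node to attending only to nodes on its path in the syntax tree. A minimal sketch of such a mask is below; the function name and the parent-array tree encoding are assumptions for illustration, not the paper's implementation.

```python
import torch

def path_attention_mask(parents):
    """Boolean mask: node i may attend to node j only if one lies on the
    other's path to the root. parents[i] is the parent index of node i,
    with -1 marking the root. Illustrative sketch only."""
    n = len(parents)
    on_path = torch.zeros(n, n, dtype=torch.bool)
    for i in range(n):
        j = i
        while j != -1:          # walk from node i up to the root
            on_path[i, j] = True
            j = parents[j]
    # symmetric closure so ancestors can also attend to their descendants
    return on_path | on_path.T

# Example: a tiny tree with root 0, children 1 and 3, and 2 under 1
mask = path_attention_mask([-1, 0, 1, 0])
```

Such a mask can be added (as -inf on disallowed positions) to the attention logits of a standard Transformer layer, which is one natural way to realise the "attend only along the path" constraint described above.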
Graph-to-Graph Transformer for Transition-based Dependency Parsing
We propose the Graph2Graph Transformer architecture for conditioning on and
predicting arbitrary graphs, and apply it to the challenging task of
transition-based dependency parsing. After proposing two novel Transformer
models of transition-based dependency parsing as strong baselines, we show that
adding the proposed mechanisms for conditioning on and predicting graphs of
Graph2Graph Transformer results in significant improvements, both with and
without BERT pre-training. The novel baselines and their integration with
Graph2Graph Transformer significantly outperform the state-of-the-art in
traditional transition-based dependency parsing on both the English Penn
Treebank and 13 languages of the Universal Dependencies treebanks. Graph2Graph Transformer
can be integrated with many previous structured prediction methods, making it
easy to apply to a wide range of NLP tasks.
Comment: Accepted to Findings of EMNLP 2020
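One common way to condition a Transformer on an input graph, consistent with the abstract's description, is to bias attention scores with embeddings of the graph relation between each token pair. The single-head sketch below illustrates that idea; the relation vocabulary, the einsum-based bias, and the class name are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class GraphConditionedAttention(nn.Module):
    """Single-head attention whose logits are biased by embeddings of
    input-graph relations (e.g. dependency labels). Illustrative sketch."""
    def __init__(self, d_model, n_relations):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # one vector per relation label, including a "no relation" id 0
        self.rel = nn.Embedding(n_relations, d_model)

    def forward(self, x, rel_ids):
        # x: (batch, seq, d_model); rel_ids: (batch, seq, seq) relation id between token pairs
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = torch.matmul(q, k.transpose(-1, -2))
        # bias each (query i, key j) score by the embedding of their graph relation
        rel_bias = torch.einsum('bid,bijd->bij', q, self.rel(rel_ids))
        attn = torch.softmax((scores + rel_bias) / x.size(-1) ** 0.5, dim=-1)
        return torch.matmul(attn, v)
```

Predicting the output graph can then reuse the same relation vocabulary on the decoder side, which is what makes the graph-to-graph framing easy to transfer to other structured prediction tasks.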