5 research outputs found
Dependency-based Hybrid Trees for Semantic Parsing
We propose a novel dependency-based hybrid tree model for semantic parsing,
which converts natural language utterances into machine-interpretable meaning
representations. Unlike previous state-of-the-art models, our joint
representation interprets semantic information as latent dependencies between
the natural language words. Such dependency information can
capture the interactions between the semantics and the natural language words. We
integrate a neural component into our model and propose an efficient
dynamic-programming algorithm to perform tractable inference. Through extensive
experiments on the standard multilingual GeoQuery dataset with eight languages,
we demonstrate that our proposed approach is able to achieve state-of-the-art
performance across several languages. Analysis also justifies the effectiveness
of using our new dependency-based representation.
Comment: Accepted by EMNLP 201
Graph-to-Tree Neural Networks for Learning Structured Input-Output Translation with Applications to Semantic Parsing and Math Word Problem
The celebrated Seq2Seq technique and its numerous variants achieve excellent
performance on many tasks such as neural machine translation, semantic parsing,
and math word problem solving. However, these models either only consider input
objects as sequences while ignoring the important structural information for
encoding, or they simply treat output objects as sequence outputs instead of
structural objects for decoding. In this paper, we present a novel
Graph-to-Tree Neural Network, Graph2Tree, consisting of a graph encoder
and a hierarchical tree decoder, which encodes an augmented graph-structured
input and decodes a tree-structured output. In particular, we investigate our
model on two problems: neural semantic parsing and math word problem solving.
Our extensive experiments demonstrate that our Graph2Tree model outperforms or
matches the performance of other state-of-the-art models on these tasks.
Comment: Long Paper in EMNLP 2020. 12 pages including references
A Pilot Study of Text-to-SQL Semantic Parsing for Vietnamese
Semantic parsing is an important NLP task. However, Vietnamese is a
low-resource language in this research area. In this paper, we present the
first public large-scale Text-to-SQL semantic parsing dataset for Vietnamese.
We extend and evaluate two strong semantic parsing baselines EditSQL (Zhang et
al., 2019) and IRNet (Guo et al., 2019) on our dataset. We compare the two
baselines with key configurations and find that: automatic Vietnamese word
segmentation improves the parsing results of both baselines; the normalized
pointwise mutual information (NPMI) score (Bouma, 2009) is useful for schema
linking; latent syntactic features extracted from a neural dependency parser
for Vietnamese also improve the results; and the monolingual language model
PhoBERT for Vietnamese (Nguyen and Nguyen, 2020) yields higher performance
than the recent best multilingual language model XLM-R (Conneau et al., 2020).
Comment: EMNLP 2020 (Findings)
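The NPMI score used for schema linking above has a simple closed form; a minimal sketch of the measure itself (the probabilities passed in are illustrative, not values from the paper):

```python
import math

def npmi(p_xy, p_x, p_y):
    """Normalized pointwise mutual information (Bouma, 2009).

    NPMI(x, y) = PMI(x, y) / -log p(x, y), ranging from -1
    (never co-occur) through 0 (independent) to 1 (always co-occur).
    """
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)

# Independent events give NPMI = 0; perfect co-occurrence gives 1.
print(npmi(0.25, 0.5, 0.5))  # -> 0.0
print(npmi(0.5, 0.5, 0.5))   # -> 1.0
```

The normalization by -log p(x, y) is what makes scores comparable across word pairs of very different frequencies, which is presumably why it suits schema linking better than raw PMI.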
From Paraphrasing to Semantic Parsing: Unsupervised Semantic Parsing via Synchronous Semantic Decoding
Semantic parsing is challenging due to the structure gap and the semantic gap
between utterances and logical forms. In this paper, we propose an unsupervised
semantic parsing method - Synchronous Semantic Decoding (SSD), which can
simultaneously resolve the semantic gap and the structure gap by jointly
leveraging paraphrasing and grammar constrained decoding. Specifically, we
reformulate semantic parsing as a constrained paraphrasing problem: given an
utterance, our model synchronously generates its canonical utterance and
meaning representation. During synchronous decoding, the paraphrasing is
constrained by the structure of the logical form, so the canonical utterance is
generated in a controlled way; in turn, semantic decoding is guided by the
semantics of the canonical utterance, so the logical form can be generated
without supervision. Experimental results show that SSD is a promising
approach and can achieve competitive unsupervised semantic parsing performance
on multiple datasets.
Comment: Accepted by ACL 202
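The grammar-constrained side of synchronous decoding can be illustrated with a toy sketch: at each step, candidate tokens are filtered down to those the grammar allows, so only well-formed logical forms can be emitted. The tiny grammar and the scoring function below are hypothetical stand-ins for exposition, not the paper's model:

```python
# Toy grammar: which tokens may follow each token in a logical form.
ALLOWED_NEXT = {
    "<s>": {"answer("},
    "answer(": {"capital(", "population("},
    "capital(": {"state_id"},
    "population(": {"state_id"},
    "state_id": {")"},
    ")": {")", "</s>"},
}

def constrained_decode(score, max_len=10):
    """Greedy decoding restricted to grammar-legal continuations."""
    out, prev, depth = [], "<s>", 0
    for _ in range(max_len):
        legal = ALLOWED_NEXT.get(prev, set())
        # A close-bracket is legal only while a bracket is open;
        # </s> is legal only once every bracket is closed.
        legal = {t for t in legal
                 if not (t == ")" and depth == 0)
                 and not (t == "</s>" and depth != 0)}
        if not legal:
            break
        prev = max(legal, key=score)   # pick the highest-scoring legal token
        if prev == "</s>":
            break
        depth += prev.count("(") - prev.count(")")
        out.append(prev)
    return out

# A scorer that (hypothetically) prefers "capital" yields a well-formed form:
tokens = constrained_decode(lambda t: 1.0 if "capital" in t else 0.0)
print("".join(tokens))  # -> answer(capital(state_id))
```

In SSD the scorer would be a paraphrase model conditioned on the input utterance; the point of the sketch is only that masking illegal tokens at every step guarantees grammaticality regardless of what the model prefers.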
Bootstrapping a Crosslingual Semantic Parser
Recent progress in semantic parsing scarcely considers languages other than
English, but professional translation can be prohibitively expensive. We adapt a
semantic parser trained on a single language, such as English, to new languages
and multiple domains with minimal annotation. We ask whether machine
translation is an adequate substitute for training data, and extend this to
investigate
bootstrapping using joint training with English, paraphrasing, and multilingual
pre-trained models. We develop a Transformer-based parser combining paraphrases
by ensembling attention over multiple encoders and present new versions of ATIS
and Overnight in German and Chinese for evaluation. Experimental results
indicate that MT can approximate training data in a new language for accurate
parsing when augmented with paraphrasing through multiple MT engines.
For cases where MT is inadequate, we also find that our approach achieves
parsing accuracy within 2% of complete translation using only 50% of the
training data.
Comment: Camera Ready for EMNLP 2020 Findings
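The idea of ensembling attention over multiple encoders can be sketched as attending to each paraphrase's encoder states separately and averaging the resulting context vectors. All shapes, names, and the averaging scheme here are a hedged guess at one reasonable formulation, not the paper's exact architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def ensemble_attention(query, encoder_states):
    """Attend over each encoder's states, then average the contexts.

    query: (d,) decoder state; encoder_states: list of (T_i, d) arrays,
    one per paraphrase/encoder (sequence lengths T_i may differ).
    """
    contexts = []
    for states in encoder_states:
        scores = states @ query            # (T_i,) dot-product attention
        weights = softmax(scores)          # (T_i,) attention distribution
        contexts.append(weights @ states)  # (d,) context vector
    return np.mean(contexts, axis=0)       # combine across encoders

rng = np.random.default_rng(0)
q = rng.standard_normal(8)
ctx = ensemble_attention(q, [rng.standard_normal((5, 8)),
                             rng.standard_normal((7, 8))])
print(ctx.shape)  # -> (8,)
```

Because each encoder is attended to independently before combination, the decoder can draw on whichever paraphrase encodes a given fragment most cleanly, which matches the motivation of combining paraphrases described above.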