Supervised Syntax-based Alignment between English Sentences and Abstract Meaning Representation Graphs
As alignment links are not given between English sentences and Abstract
Meaning Representation (AMR) graphs in the AMR annotation, automatic alignment
becomes indispensable for training an AMR parser. Previous studies formalize it
as a string-to-string problem and solve it in an unsupervised way, which
suffers from data sparseness due to the small size of training data for
English-AMR alignment. In this paper, we formalize it as a syntax-based
alignment problem and solve it in a supervised manner based on syntax trees,
which can address the data sparseness problem by generalizing English-AMR
tokens to syntax tags. Experiments verify the effectiveness of the proposed
method not only for English-AMR alignment, but also for AMR parsing.
Comment: Updated the paper with AMR parsing results
Multilingual Chart-based Constituency Parse Extraction from Pre-trained Language Models
Since it has been shown that pre-trained language models (PLMs) can, to some
extent, recognize syntactic concepts in natural language, much effort has been
made to develop methods for extracting complete (binary) parses from PLMs
without training separate parsers. We improve upon this
paradigm by proposing a novel chart-based method and an effective top-K
ensemble technique. Moreover, we demonstrate that we can broaden the scope of
application of the approach into multilingual settings. Specifically, we show
that by applying our method on multilingual PLMs, it becomes possible to induce
non-trivial parses for sentences from nine languages in an integrated and
language-agnostic manner, attaining performance superior or comparable to that
of unsupervised PCFGs. We also verify that our approach is robust to
cross-lingual transfer. Finally, we provide analyses on the inner workings of
our method. For instance, we discover universal attention heads which are
consistently sensitive to syntactic information irrespective of the input
language.
Comment: preprint
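The top-down paradigm this work builds on induces a binary parse by splitting each span at its strongest syntactic boundary. A minimal sketch, assuming boundary scores between adjacent tokens (e.g., derived from PLM attention or representation distances) are already given:

```python
def build_tree(tokens, dist):
    """Greedy top-down binary parse induction from boundary scores.

    dist[i] scores the boundary between tokens[i] and tokens[i + 1];
    each span is split at its strongest internal boundary, recursively.
    """
    if len(tokens) == 1:
        return tokens[0]
    k = max(range(len(dist)), key=lambda i: dist[i])  # strongest boundary
    left = build_tree(tokens[:k + 1], dist[:k])
    right = build_tree(tokens[k + 1:], dist[k + 1:])
    return (left, right)
```

The chart-based method proposed here instead scores every span and recovers the best bracketing with dynamic programming, rather than committing greedily to one split at a time.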
Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT
By introducing a small set of additional parameters, a probe learns to solve
specific linguistic tasks (e.g., dependency parsing) in a supervised manner
using feature representations (e.g., contextualized embeddings). The
effectiveness of such probing tasks is taken as evidence that the pre-trained
model encodes linguistic knowledge. However, this approach of evaluating a
language model is undermined by the uncertainty of the amount of knowledge that
is learned by the probe itself. Complementary to those works, we propose a
parameter-free probing technique for analyzing pre-trained language models
(e.g., BERT). Our method does not require direct supervision from the probing
tasks, nor do we introduce additional parameters to the probing process. Our
experiments on BERT show that syntactic trees recovered from BERT using our
method are significantly better than linguistically-uninformed baselines. We
further feed the empirically induced dependency structures into a downstream
sentiment classification task and find its improvement comparable with, or even
superior to, that of a human-designed dependency schema.
Comment: ACL 2020
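The two-stage perturbation can be sketched as follows; `encode` stands in for a masked LM that returns one contextual vector per token (the function name and interface are assumptions, not the paper's API):

```python
import numpy as np

def impact_matrix(tokens, encode):
    """Parameter-free probing by perturbed masking (a sketch).

    encode(tokens, masked) is assumed to return one vector per token,
    with the positions in `masked` replaced by a [MASK] symbol.
    impact[i, j] measures how much additionally masking token j changes
    the representation of the (already masked) token i.
    """
    n = len(tokens)
    impact = np.zeros((n, n))
    for i in range(n):
        base = encode(tokens, masked={i})[i]          # x with x_i masked
        for j in range(n):
            if j == i:
                continue
            both = encode(tokens, masked={i, j})[i]   # x_i and x_j masked
            impact[i, j] = np.linalg.norm(base - both)
    return impact
```

A tree can then be decoded from the impact matrix, e.g. with a maximum-spanning-tree style algorithm over the pairwise scores.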
Depth-bounding is effective: Improvements and evaluation of unsupervised PCFG induction
There have been several recent attempts to improve the accuracy of grammar
induction systems by bounding the recursive complexity of the induction model
(Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016; Jin et al.,
2018). Modern depth-bounded grammar inducers have been shown to be more
accurate than early unbounded PCFG inducers, but this technique has never been
compared against unbounded induction within the same system, in part because
most previous depth-bounding models are built around sequence models, the
complexity of which grows exponentially with the maximum allowed depth. The
present work instead applies depth bounds within a chart-based Bayesian PCFG
inducer (Johnson et al., 2007b), where bounding can be switched on and off, and
then samples trees with and without bounding. Results show that depth-bounding
is indeed significantly effective in limiting the search space of the inducer
and thereby increasing the accuracy of the resulting parsing model. Moreover,
parsing results on English, Chinese and German show that this bounded model
with a new inference technique is able to produce parse trees more accurately
than or competitively with state-of-the-art constituency-based grammar
induction models.
Comment: EMNLP 2018
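As a simplified illustration of what bounding recursive complexity constrains, here is the plain nesting depth of a binary bracketing. Note this is only a proxy: the inducers above bound left-corner memory depth, under which purely right-branching trees stay cheap even though their plain nesting depth grows.

```python
def nesting_depth(tree):
    """Nesting depth of a binary bracketing; leaves are strings."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    return 1 + max(nesting_depth(left), nesting_depth(right))

def within_bound(tree, bound):
    """A depth-bounded inducer would only consider trees passing this."""
    return nesting_depth(tree) <= bound
```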
Self-Training for Unsupervised Parsing with PRPN
Neural unsupervised parsing (UP) models learn to parse without access to
syntactic annotations, while being optimized for another task like language
modeling. In this work, we propose self-training for neural UP models: we
leverage aggregated annotations predicted by copies of our model as supervision
for future copies. To be able to use our model's predictions during training,
we extend a recent neural UP architecture, the PRPN (Shen et al., 2018a), such
that it can be trained in a semi-supervised fashion. We then add examples with
parses predicted by our model to our unlabeled UP training data. Our
self-trained model outperforms the PRPN by 8.1% F1 and the previous state of
the art by 1.6% F1. In addition, we show that our architecture can also be
helpful for semi-supervised parsing in ultra-low-resource settings.
Comment: Accepted for publication at the 16th International Conference on Parsing Technologies (IWPT), 2020
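The self-training recipe can be sketched generically; `train_fn` and `parse_fn` here are placeholders standing in for PRPN training and parsing, not the authors' implementation:

```python
from collections import Counter

def self_train(train_fn, parse_fn, sentences, rounds=2, copies=3):
    """Generic self-training loop (a sketch of the recipe).

    train_fn(sentences, pseudo_labels) -> model  (pseudo_labels may be None)
    parse_fn(model, sentence) -> parse (hashable)
    Each round trains `copies` model instances, aggregates their predicted
    parses by majority vote, and uses the winners as supervision.
    """
    pseudo_labels = None
    for _ in range(rounds):
        models = [train_fn(sentences, pseudo_labels) for _ in range(copies)]
        pseudo_labels = []
        for sent in sentences:
            votes = Counter(parse_fn(m, sent) for m in models)
            pseudo_labels.append(votes.most_common(1)[0][0])
    return train_fn(sentences, pseudo_labels)
```

Aggregating over several model copies smooths out the instability of any single unsupervised run before its predictions are reused as training signal.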
Do latent tree learning models identify meaningful structure in sentences?
Recent work on the problem of latent tree learning has made it possible to
train neural networks that learn to both parse a sentence and use the resulting
parse to interpret the sentence, all without exposure to ground-truth parse
trees at training time. Surprisingly, these models often perform better at
sentence understanding tasks than models that use parse trees from conventional
parsers. This paper aims to investigate what these latent tree learning models
learn. We replicate two such models in a shared codebase and find that (i) only
one of these models outperforms conventional tree-structured models on sentence
classification, (ii) its parsing strategies are not especially consistent
across random restarts, (iii) the parses it produces tend to be shallower than
standard Penn Treebank (PTB) parses, and (iv) they do not resemble those of PTB
or any other semantic or syntactic formalism that the authors are aware of.
Comment: 15 pages, 6 figures, 4 tables. v1 was submitted to TACL, v2 was accepted to TACL; name change, additional baselines (R/L branching).
Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
Natural language is hierarchically structured: smaller units (e.g., phrases)
are nested within larger units (e.g., clauses). When a larger constituent ends,
all of the smaller constituents that are nested within it must also be closed.
While the standard LSTM architecture allows different neurons to track
information at different time scales, it does not have an explicit bias towards
modeling a hierarchy of constituents. This paper proposes to add such an
inductive bias by ordering the neurons; a vector of master input and forget
gates ensures that when a given neuron is updated, all the neurons that follow
it in the ordering are also updated. Our novel recurrent architecture, ordered
neurons LSTM (ON-LSTM), achieves good performance on four different tasks:
language modeling, unsupervised parsing, targeted syntactic evaluation, and
logical inference.
Comment: Published as a conference paper at ICLR 2019
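The ordering mechanism rests on a cumulative softmax (`cumax`), a soft monotone step from 0 to 1 over the neuron dimension. A minimal sketch of the two master gates, with the input-dependent projections that produce the logits omitted:

```python
import numpy as np

def cumax(x):
    """Cumulative softmax: a soft, monotone 0-to-1 step over neurons."""
    e = np.exp(x - x.max())
    return np.cumsum(e / e.sum())

def master_gates(f_logits, i_logits):
    """ON-LSTM master gates (sketch).

    The master forget gate rises monotonically, so once a neuron is kept,
    every higher-ranked neuron (longer time scale) is kept too; the master
    input gate falls monotonically, so updates concentrate in low-ranked
    neurons. Their overlap determines where new information is written.
    """
    f_tilde = cumax(f_logits)          # monotone increasing in [0, 1]
    i_tilde = 1.0 - cumax(i_logits)    # monotone decreasing in [0, 1]
    return f_tilde, i_tilde
```

The monotone gates are what encode the constraint from the abstract: closing a large constituent (forgetting a high-ranked neuron) forces all lower-ranked neurons, i.e. the constituents nested inside it, to close as well.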
CRF Autoencoder for Unsupervised Dependency Parsing
Unsupervised dependency parsing, which tries to discover linguistic
dependency structures from unannotated data, is a very challenging task. Almost
all previous work on this task focuses on learning generative models. In this
paper, we develop an unsupervised dependency parsing model based on the CRF
autoencoder. The encoder part of our model is discriminative and globally
normalized which allows us to use rich features as well as universal linguistic
priors. We propose an exact algorithm for parsing as well as a tractable
learning algorithm. We evaluated our model on eight multilingual treebanks and
found that it achieved performance comparable to that of state-of-the-art
approaches.
Comment: EMNLP 2017
On the Role of Supervision in Unsupervised Constituency Parsing
We analyze several recent unsupervised constituency parsing models, which are
tuned with respect to the parsing score on the Wall Street Journal (WSJ)
development set (1,700 sentences). We introduce strong baselines for them by
training an existing supervised parsing model (Kitaev and Klein, 2018) on the
same labeled examples they access. When training on the 1,700 examples, or even
when using only 50 examples for training and 5 for development, such a few-shot
parsing approach can outperform all the unsupervised parsing methods by a
significant margin. Few-shot parsing can be further improved by a simple data
augmentation method and self-training. This suggests that, in order to arrive
at fair conclusions, we should carefully consider the amount of labeled data
used for model development. We propose two protocols for future work on
unsupervised parsing: (i) use fully unsupervised criteria for hyperparameter
tuning and model selection; (ii) use as few labeled examples as possible for
model development, and compare to few-shot parsing trained on the same labeled
examples.
Comment: EMNLP 2020. Project page: https://ttic.uchicago.edu/~freda/project/rsucp
How Important is Syntactic Parsing Accuracy? An Empirical Evaluation on Rule-Based Sentiment Analysis
Syntactic parsing, the process of obtaining the internal structure of
sentences in natural languages, is a crucial task for artificial intelligence
applications that need to extract meaning from natural language text or speech.
Sentiment analysis is one example of application for which parsing has recently
proven useful.
In recent years, there have been significant advances in the accuracy of
parsing algorithms. In this article, we perform an empirical, task-oriented
evaluation to determine how parsing accuracy influences the performance of a
state-of-the-art rule-based sentiment analysis system that determines the
polarity of sentences from their parse trees. In particular, we evaluate the
system using four well-known dependency parsers, including both current models
with state-of-the-art accuracy and less accurate models that, however,
require fewer computational resources.
The experiments show that all of the parsers produce similarly good results
in the sentiment analysis task, without their accuracy having any relevant
influence on the results. Since parsing is currently a task with a relatively
high computational cost that varies strongly between algorithms, this suggests
that sentiment analysis researchers and users should prioritize speed over
accuracy when choosing a parser; and parsing researchers should investigate
models that improve speed further, even at some cost to accuracy.
Comment: 19 pages. Accepted for publication in Artificial Intelligence Review. This update only adds the DOI link to comply with the journal's terms.