A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing
Restricted non-monotonicity has been shown beneficial for the projective
arc-eager dependency parser in previous research, as posterior decisions can
repair mistakes made in previous states due to the lack of information. In this
paper, we propose a novel, fully non-monotonic transition system based on the
non-projective Covington algorithm. As a non-monotonic system requires
exploration of erroneous actions during the training process, we develop
several non-monotonic variants of the recently defined dynamic oracle for the
Covington parser, based on tight approximations of the loss. Experiments on
datasets from the CoNLL-X and CoNLL-XI shared tasks show that a non-monotonic
dynamic oracle outperforms the monotonic version in the majority of languages.
Comment: 11 pages. Accepted for publication at ACL 201
Elimination of Spurious Ambiguity in Transition-Based Dependency Parsing
We present a novel technique to remove spurious ambiguity from transition
systems for dependency parsing. Our technique chooses a canonical sequence of
transition operations (computation) for a given dependency tree. Our technique
can be applied to a large class of bottom-up transition systems, including for
instance Nivre (2004) and Attardi (2006).
A non-projective greedy dependency parser with bidirectional LSTMs
The LyS-FASTPARSE team presents BIST-COVINGTON, a neural implementation of
the Covington (2001) algorithm for non-projective dependency parsing. The
bidirectional LSTM approach by Kiperwasser and Goldberg (2016) is used to
train a greedy parser with a dynamic oracle to mitigate error propagation. The
model participated in the CoNLL 2017 UD Shared Task. In spite of not using any
ensemble methods and using the baseline segmentation and PoS tagging, the
parser obtained good results on both macro-average LAS and UAS in the big
treebanks category (55 languages), ranking 7th out of 33 teams. In the all
treebanks category (LAS and UAS) we ranked 16th and 12th. The gap between the
all and big categories is mainly due to the poor performance on four parallel
PUD treebanks, suggesting that some `suffixed' treebanks (e.g. Spanish-AnCora)
perform poorly on cross-treebank settings, which does not occur with the
corresponding `unsuffixed' treebank (e.g. Spanish). By changing that, we obtain
the 11th best LAS among all runs (official and unofficial). The code is made
available at https://github.com/CoNLL-UD-2017/LyS-FASTPARSE
Comment: 12 pages, 2 figures, 5 tables
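For readers unfamiliar with the Covington (2001) strategy named in the abstract above, the following is a minimal sketch of its core loop: each new word is compared with every preceding word, and an arc is added only if it keeps every word single-headed and the graph acyclic. It is not the BIST-COVINGTON implementation. The names `covington_parse`, `gold_oracle` and `has_path` are hypothetical, and the neural scorer is replaced here by a static oracle that reads a known gold tree.

```python
from typing import Callable, Dict


def has_path(heads: Dict[int, int], start: int, target: int) -> bool:
    """Return True if following head links upward from `start` reaches `target`."""
    node = start
    while node in heads:
        node = heads[node]
        if node == target:
            return True
    return False


def covington_parse(n: int, choose: Callable[[int, int], str]) -> Dict[int, int]:
    """Parse words 1..n; `choose(i, j)` returns "left", "right" or "none".

    Returns a dependent -> head map. Words left without a head are roots.
    """
    heads: Dict[int, int] = {}
    for j in range(2, n + 1):          # each new word j ...
        for i in range(j - 1, 0, -1):  # ... is compared with every earlier word i
            action = choose(i, j)
            if action == "left" and i not in heads and not has_path(heads, j, i):
                heads[i] = j           # left arc: j becomes the head of i
            elif action == "right" and j not in heads and not has_path(heads, i, j):
                heads[j] = i           # right arc: i becomes the head of j
    return heads


def gold_oracle(gold: Dict[int, int]) -> Callable[[int, int], str]:
    """Hypothetical static oracle: pick the action that matches the gold tree."""
    def choose(i: int, j: int) -> str:
        if gold[i] == j:
            return "left"
        if gold[j] == i:
            return "right"
        return "none"
    return choose


# Toy non-projective tree (assumed data): arcs 3->1 and 4->2 cross; 0 marks the root.
gold = {1: 3, 2: 4, 3: 0, 4: 3}
parsed = covington_parse(4, gold_oracle(gold))
```

Because every pair of words is considered, crossing arcs such as 3->1 and 4->2 can both be built, which is what makes the strategy suitable for non-projective trees.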
The Comparative Evaluation of Dependency Parsers in Parsing Estonian
Natural Language Processing (NLP) technology has developed steadily and has improved greatly over the last couple of decades. One key NLP task is dependency parsing, which is often a prerequisite for other tasks such as machine translation and Named Entity Recognition (NER). Dependency parsing performs a syntactic analysis of a sentence and extracts the grammatical relations among its words. Most research on dependency parsing has focused on English. This thesis evaluates and compares the performance of several state-of-the-art dependency parsers on Estonian.
The dependency parsers chosen for evaluation are MaltParser, spaCy, the Stanford neural network dependency parser (nndep), SyntaxNet and UDPipe. The comparison mainly uses Labelled Attachment Score (LAS), Unlabelled Attachment Score (UAS) and Label Accuracy (LA). New Estonian models were trained for the spaCy, Stanford nndep and UDPipe parsers, while pretrained models were used for MaltParser and SyntaxNet.
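The three metrics named above can be illustrated with a short sketch. It assumes each token is represented as a (head index, relation label) pair and that gold and predicted sequences are aligned token by token. The function name `attachment_scores` and the toy sentence are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

Token = Tuple[int, str]  # (head index, dependency label); head 0 marks the root


def attachment_scores(gold: List[Token], pred: List[Token]) -> Dict[str, float]:
    """Compute UAS, LAS and LA over aligned gold and predicted tokens."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    total = len(gold)
    # UAS: the predicted head is correct, regardless of the label.
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / total
    # LAS: both the predicted head and the predicted label are correct.
    las = sum(g == p for g, p in zip(gold, pred)) / total
    # LA: the predicted label is correct, regardless of the head.
    la = sum(g[1] == p[1] for g, p in zip(gold, pred)) / total
    return {"UAS": uas, "LAS": las, "LA": la}


# Toy four-token sentence (assumed data).
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (2, "amod")]
scores = attachment_scores(gold, pred)
# scores == {"UAS": 0.75, "LAS": 0.5, "LA": 0.75}
```

In the toy sentence the third token has the right head but the wrong label, and the fourth has the right label but the wrong head, so UAS and LA are both 0.75 while LAS drops to 0.5.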
On the Challenges of Fully Incremental Neural Dependency Parsing
Since the popularization of BiLSTMs and Transformer-based bidirectional
encoders, state-of-the-art syntactic parsers have lacked incrementality,
requiring access to the whole sentence and deviating from human language
processing. This paper explores whether fully incremental dependency parsing
with modern architectures can be competitive. We build parsers combining
strictly left-to-right neural encoders with fully incremental sequence-labeling
and transition-based decoders. The results show that fully incremental parsing
with modern architectures considerably lags behind bidirectional parsing,
noting the challenges of psycholinguistically plausible parsing.
Comment: Accepted at IJCNLP-AACL 202