289 research outputs found

Overview of the SPMRL 2013 shared task: cross-framework evaluation of parsing morphologically rich languages

Author: Candito Marie
Choi Jinho
Farkas Richard
Foster Jennifer
Goenaga Iakes
Gojenola Koldo
Goldberg Yoav
Green Spence
Habash Nizar
Kuhlmann Marco
Kübler Sandra
Maier Wolfgang
Nivre Joakim
Przepiórkowski Adam
Roth Ryan
Seddah Djamé
Seeker Wolfgang
Tsarfaty Reut
Versley Yannick
Villemonte de la Clergerie Éric
Vincze Veronika
Wolinski Marcin
Wróblewska Alina
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 18/10/2013

This paper reports on the first shared task on statistical parsing of morphologically rich languages (MRLs). The task features data sets from nine languages, each available both in constituency and dependency annotation. We report on the preparation of the data sets, on the proposed parsing scenarios, and on the evaluation metrics for parsing MRLs given different representation types. We present and analyze parsing results obtained by the task participants, and then provide an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios

Irish Universities

DCU Online Research Access Service

Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages

Author: Candito Marie
Choi Jinho D.
Farkas Richárd
Foster Jennifer
Goenaga Iakes
Gojenola Galletebeitia Koldo
Goldberg Yoav
Green Spence
Habash Nizar
Kuhlmann Marco
Kübler Sandra
Maier Wolfgang
Nivre Joakim
Przepiórkowski Adam
Roth Ryan
Seddah Djamé
Seeker Wolfgang
Tsarfaty Reut
Versley Yannick
Villemonte de La Clergerie Éric
Vincze Veronika
Woliński Marcin
Wróblewska Alina
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 18/10/2013

This paper reports on the first shared task on statistical parsing of morphologically rich languages (MRLs). The task features data sets from nine languages, each available both in constituency and dependency annotation. We report on the preparation of the data sets, on the proposed parsing scenarios, and on the evaluation metrics for parsing MRLs given different representation types. We present and analyze parsing results obtained by the task participants, and then provide an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios

INRIA a CCSD electronic archive server

Hal-Diderot

Dependency parsing of Turkish

Author: Eryiğit Gülşen
Nivre Joakim
Oflazer Kemal
Publication venue: 'MIT Press - Journals'
Publication date: 01/09/2006

The suitability of different parsing methods for different languages is an important topic in syntactic parsing. Especially lesser-studied languages, typologically different from the languages for which methods have originally been developed, pose interesting challenges in this respect. This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative free constituent order language that can be seen as the representative of a wider class of languages of similar type. Our investigations show that morphological structure plays an essential role in finding syntactic relations in such a language. In particular, we show that employing sublexical representations called inflectional groups, rather than word forms, as the basic parsing units improves parsing accuracy. We compare two different parsing methods, one based on a probabilistic model with beam search, the other based on discriminative classifiers and a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless of parsing method. We examine the impact of morphological and lexical information in detail and show that, properly used, this kind of information can improve parsing accuracy substantially. Applying the techniques presented in this article, we achieve the highest reported accuracy for parsing the Turkish Treebank

CiteSeerX

Crossref

Sabanci University Research Database

Proceedings

Author: Bick Eckhard
Hagen Kristin
Müürisep Kaili
Trosterud Trond
Publication date: 17/11/2011

Proceedings of the NODALIDA 2011 Workshop Constraint Grammar Applications. Editors: Eckhard Bick, Kristin Hagen, Kaili Müürisep, Trond Trosterud. NEALT Proceedings Series, Vol. 14 (2011), vi+69 pp. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/19231

DSpace at Tartu University Library

Token-based typology and word order entropy: A study based on universal dependencies

Author: Levshina N.
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2019

The present paper discusses the benefits and challenges of token-based typology, which takes into account the frequencies of words and constructions in language use. This approach makes it possible to introduce new criteria for language classification, which would be difficult or impossible to achieve with the traditional, type-based approach. This point is illustrated by several quantitative studies of word order variation, which can be measured as entropy at different levels of granularity. I argue that this variation can be explained by general functional mechanisms and pressures, which manifest themselves in language use, such as optimization of processing (including avoidance of ambiguity) and grammaticalization of predictable units occurring in chunks. The case studies are based on multilingual corpora, which have been parsed using the Universal Dependencies annotation scheme

MPG.PuRe

Subject-verb agreement in real time: active feature maintenance as syntactic prediction

Author: Ristic Bojana
Publication date: 10/01/2020

179 p. The current dissertation tests whether the long-distance subject-verb establishment is maintained active over the course of the sentence, by maintaining morphosyntactic information such as syntactic category and number features. To this end, we looked at how the maintained representation affects the interpolated elements, focusing on two effects that the maintained features might generate: similarity-based interference and disambiguation. We performed four eye-tracking experiments (reading and visual world paradigm) and showed that subject-verb dependency establishment is characterized by active maintenance of the subject's category feature (English and Spanish experiments) and number feature (Basque experiments). Our effects, which occur prior to the integration site (the verb), can be ascribed to the top-down pre-activation mechanisms and thus syntactic prediction. Importantly, this implies that subject-verb agreement occurs in real-time sentence comprehension, i.e. it is psychologically real

Archivo Digital para la Docencia y la Investigación

Inquiries into the lexicon-syntax relations in Basque

Author: Oyharçabal Bernard
Publication venue: Servicio Editorial de la Universidad del País Vasco/Euskal Herriko Unibertsitatearen Argitalpen Zerbitzua
Publication date: 15/02/2003

Index:- Foreword. B. Oyharçabal.- Morphosyntactic disambiguation and shallow parsing in computational processing in Basque. I. Aduriz, A. Díaz de Ilarraza.- The transitivity of borrowed verbs in Basque: an outline. X. Alberdi.- Patrixa: a unification-based parser for Basque and its application to the automatic analysis of verbs. I. Aldezabal, M. J. Aranzabe, A. Atutxa, K. Gojenola, K. Sarasola.- Learning argument/adjunct distinction for Basque. I. Aldezabal, M. J. Aranzabe, K. Gojenola, K. Sarasola, A. Atutxa.- Analyzing verbal subcategorization aimed at its computational application. I. Aldezabal, P. Goenaga.- Automatic extraction of verb patterns from “hauta-lanerako euskal hiztegia”. J. M. Arriola, X. Artola, A. Soroa.- The case of an enlightening, provoking and admirable Basque derivational suffix with implications for the theory of argument structure. X. Artiagoitia.- Verb-deriving processes in Basque. J. C. Odriozola.- Lexical causatives and causative alternation in Basque. B. Oyharçabal.- Causation and semantic control; diagnosis of incorrect use in minorized languages. I. Zabala.- Subject index.- Contributions

Archivo Digital para la Docencia y la Investigación

Universidad del País Vasco / Euskal Herriko Unibertsitatea: Ciencia - Portal de revistas digitales de la UPV/EHU