
154 research outputs found

Anaphora Annotation in Hindi Dependency TreeBank

Author: Dakwale Praveen
Sharma Dipti M
Sharma Himanshu
Publication venue: Faculty of Computer Science, Universitas Indonesia
Publication date: 01/01/2012

Statistical parsing of morphologically rich languages (SPMRL): what, how and whither

Author: Candito Marie
Foster Jennifer
Goldberg Yoav
Kübler Sandra
Rehbein Ines
Seddah Djamé
Tounsi Lamia
Tsarfaty Reut
Versley Yannick
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2010

The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at word-level. There is ample evidence that the application of readily available statistical parsing models to such languages is susceptible to serious performance degradation. The first workshop on statistical parsing of MRLs hosts a variety of contributions which show that despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state-of-affairs with respect to parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests itself as a source of directions for future investigations

CiteSeerX

INRIA a CCSD electronic archive server

Irish Universities

DCU Online Research Access Service

Hal-Diderot

Towards interoperable discourse annotation: discourse features in the Ontologies of Linguistic Annotation

Author: Chiarcos Christian
Publication date: 03/05/2023

This paper describes the extension of the Ontologies of Linguistic Annotation (OLiA) with respect to discourse features. The OLiA ontologies provide a terminology repository that can be employed to facilitate the conceptual (semantic) interoperability of annotations of discourse phenomena as found in the most important corpora available to the community, including OntoNotes, the RST Discourse Treebank and the Penn Discourse Treebank. Along with selected schemes for information structure and coreference, discourse relations are discussed with special emphasis on the Penn Discourse Treebank and the RST Discourse Treebank. For an example contained in the intersection of both corpora, I show how ontologies can be employed to generalize over divergent annotation schemes

OPUS Augsburg

Hindi CCGbank: CCG Treebank from the Hindi Dependency Treebank

Author: Bharat Ram Ambati
Tejaswini Deoskar
Mark Steedman
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Crossref

Springer - Publisher Connector

Edinburgh Research Explorer

Treebanking User-Generated Content: A Proposal for a Unified Representation in Universal Dependencies

Author: Bosco Cristina
Cassidy Lauren
Cetinoglu Ozlem
Cignarella Alessandra Teresa
Lynn Teresa
Rehbein Ines
Ruppenhofer Josef
Sanguinetti Manuela
Seddah Djamé
Zeldes Amir
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2020

The paper presents a discussion on the main linguistic phenomena of user-generated texts found in web and social media, and proposes a set of annotation guidelines for their treatment within the Universal Dependencies (UD) framework. Given on the one hand the increasing number of treebanks featuring user-generated content, and its somewhat inconsistent treatment in these resources on the other, the aim of this paper is twofold: (1) to provide a short, though comprehensive, overview of such treebanks - based on available literature - along with their main features and a comparative analysis of their annotation criteria, and (2) to propose a set of tentative UD-based annotation guidelines, to promote consistent treatment of the particular phenomena found in these types of texts. The main goal of this paper is to provide a common framework for those teams interested in developing similar resources in UD, thus enabling cross-linguistic consistency, which is a principle that has always been in the spirit of UD

Archivio istituzionale della ricerca - Università di Cagliari

Treebanking User-Generated Content: A Proposal for a Unified Representation in Universal Dependencies

Author: Amir Zeldes
Bosco Cristina
Cignarella Alessandra Teresa
Djamé Seddah
Ines Rehbein
Josef Ruppenhofer
Lauren Cassidy
Ozlem Cetinoglu
Sanguinetti Manuela
Teresa Lynn
Publication venue: ELRA – European Language Resources Association
Publication date: 01/01/2020

Institutional Research Information System University of Turin

Treebanking user-generated content: A proposal for a unified representation in universal dependencies

Author: Bosco Cristina
Cassidy Lauren
Cignarella Alessandra Teresa
Lynn Teresa
Rehbein Ines
Ruppenhofer Josef
Sanguinetti Manuela
Seddah Djamé
Zeldes Amir
Çetinoğlu Özlem
Publication venue: ELRA ; IDS, Bibliothek
Publication date: 01/01/2020

MAnnheim DOCument Server

Treebanking user-generated content: a proposal for a unified representation in universal dependencies

Author: Bosco Cristina
Cassidy Lauren
Cignarella Alessandra Teresa
Lynn Teresa
Rehbein Ines
Ruppenhofer Josef
Sanguinetti Manuela
Seddah Djamé
Zeldes Amir
Çetinoglu Özlem
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/05/2020

DCU Online Research Access Service

Discourse structure and language technology

Author: B. Webber
M. Egg
V. Kordoni
Publication venue: Cambridge University Press (CUP)
Publication date: 08/12/2011

This publication is with permission of the rights owner freely accessible due to an Alliance licence and a national licence (funded by the DFG, German Research Foundation) respectively. An increasing number of researchers and practitioners in Natural Language Engineering face the prospect of having to work with entire texts, rather than individual sentences. While it is clear that text must have useful structure, its nature may be less clear, making it more difficult to exploit in applications. This survey of work on discourse structure thus provides a primer on the bases of which discourse is structured along with some of their formal properties. It then lays out the current state-of-the-art with respect to algorithms for recognizing these different structures, and how these algorithms are currently being used in Language Technology applications. After identifying resources that should prove useful in improving algorithm performance across a range of languages, we conclude by speculating on future discourse structure-enabled technology. Peer Reviewed

Crossref

Edinburgh Research Explorer

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin

Linguistic Tests for Discourse Relations

Author: Gastel Anna
Versley Yannick
Publication venue: University of Illinois at Chicago Library
Publication date: 09/07/2013

Discourse structure and discourse relations are an important ingredient in systems for the analysis of text that go beyond the boundary of single clauses. Discourse relations often indicate important additional information about the connection between two clauses, such as causality, and are widely believed to have an influence on aspects of reference resolution. In this article, we first present the general design choices that are to be made in the design of an annotation scheme for discourse structure and discourse relations. In a second part, we present the scheme used in our annotation of selected articles from the TüBa-D/Z treebank of German (Telljohann et al., 2009). The scheme used in the annotation is theory-neutral, but informed by more detailed linguistic knowledge in the way of linguistic tests that can help disambiguate between several plausible relations

University of Illinois at Chicago: Journals@UIC

Dialogue & Discourse (E-Journal - Universität Bielefeld)