8 research outputs found

    Intégration des constructions à verbe support dans TimeML [Integrating light verb constructions into TimeML]

    TimeML is a markup language developed for the annotation of temporal information in texts, in particular events, temporal expressions and the relations which hold between the two. General annotation guidelines have been developed to guide the annotator in this task, but certain linguistic phenomena have yet to be dealt with in detail. A common problem in NLP tasks, whether in translation, generation or understanding, is that of the encoding of light verb constructions. Relatively little attention has been paid to this problem, until now, in the TimeML framework. In this article, we propose annotation guidelines for light verb constructions.
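    The abstract describes the markup without showing it; as a point of reference, the sketch below builds a tiny TimeML-style fragment in Python (xml.etree). It is illustrative only: the example sentence, the identifiers e1/t1, and the choice to tag the noun of the light verb construction rather than the verb are assumptions made here for demonstration, not the guidelines the paper proposes.

    # Minimal sketch of TimeML-style inline markup (EVENT, TIMEX3 are standard
    # TimeML tags). Which element(s) of a light verb construction such as
    # "made a decision" should carry the EVENT tag is exactly the question the
    # proposed guidelines address; tagging the noun here is an arbitrary choice.
    import xml.etree.ElementTree as ET

    s = ET.Element("s")
    s.text = "The board made a "
    event = ET.SubElement(s, "EVENT", {"eid": "e1", "class": "OCCURRENCE"})
    event.text = "decision"
    event.tail = " on "
    timex = ET.SubElement(s, "TIMEX3", {"tid": "t1", "type": "DATE", "value": "2004-06-01"})
    timex.text = "June 1, 2004"
    timex.tail = "."

    print(ET.tostring(s, encoding="unicode"))
    # <s>The board made a <EVENT eid="e1" class="OCCURRENCE">decision</EVENT> on
    # <TIMEX3 tid="t1" type="DATE" value="2004-06-01">June 1, 2004</TIMEX3>.</s>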

    CoNLL-Merge: efficient harmonization of concurrent tokenization and textual variation

    The proper detection of tokens in running text represents the initial processing step in modular NLP pipelines. But strategies for defining these minimal units can differ, and conflicting analyses of the same text seriously limit the integration of subsequent linguistic annotations into a shared representation. As a solution, we introduce CoNLL Merge, a practical tool for harmonizing TSV-related data models, as they occur, e.g., in multi-layer corpora with non-sequential, concurrent tokenizations, but also in ensemble combinations in Natural Language Processing. CoNLL Merge works unsupervised, requires no manual intervention or external data sources, and comes with a flexible API for fully automated merging routines, validity and sanity checks. Users can choose from several merging strategies, and either preserve a reference tokenization (with possible losses of annotation granularity), create a common tokenization layer consisting of minimal shared subtokens (loss-less in terms of annotation granularity, destructive against a reference tokenization), or present tokenization clashes (loss-less and non-destructive, but introducing empty tokens as place-holders for unaligned elements). We demonstrate the applicability of the tool on two use cases from natural language processing and computational philology.
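    As a rough illustration of the "minimal shared subtokens" strategy mentioned in the abstract (not the CoNLL Merge tool or its API), the Python sketch below merges two conflicting tokenizations of the same string by cutting at the union of their token boundaries. It assumes each tokenization concatenates exactly to the underlying text; the function names are invented for this example.

    # Sketch of the "minimal shared subtokens" idea, not the CoNLL Merge API.
    # Assumes each tokenization concatenates exactly to the underlying text.
    def boundaries(tokens):
        """Character offsets at which token boundaries fall."""
        offsets, pos = {0}, 0
        for tok in tokens:
            pos += len(tok)
            offsets.add(pos)
        return offsets

    def shared_subtokens(text, tokenization_a, tokenization_b):
        """Cut the text at every boundary used by either tokenization."""
        cuts = sorted(boundaries(tokenization_a) | boundaries(tokenization_b))
        return [text[i:j] for i, j in zip(cuts, cuts[1:])]

    # Two conflicting analyses of the same character stream:
    print(shared_subtokens("don't", ["don't"], ["do", "n't"]))          # ['do', "n't"]
    print(shared_subtokens("don't", ["don", "'", "t"], ["do", "n't"]))  # ['do', 'n', "'", 't']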

    Merging PropBank, NomBank, TimeBank, Penn Discourse Treebank and Coreference

    Many recent annotation efforts for English have focused on pieces of the larger problem of semantic annotation, rather than initially producing a single unified representation. This paper discusses the issues involved in merging four of these efforts into a unified representation.

    Discourse structure and language technology

    An increasing number of researchers and practitioners in Natural Language Engineering face the prospect of having to work with entire texts, rather than individual sentences. While it is clear that text must have useful structure, its nature may be less clear, making it more difficult to exploit in applications. This survey of work on discourse structure thus provides a primer on the bases on which discourse is structured, along with some of their formal properties. It then lays out the current state of the art with respect to algorithms for recognizing these different structures, and how these algorithms are currently being used in Language Technology applications. After identifying resources that should prove useful in improving algorithm performance across a range of languages, we conclude by speculating on future discourse structure-enabled technology.