288 research outputs found

A Discourse-Annotated Corpus of Conjoined VPs

Author: Joshi Aravind
Lee Alan
Prasad Rashmi
Webber Bonnie
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 11/08/2016

Discourse relations and conjoined VPs: automated sense recognition

Author: Webber Bonnie
Pyatkin Valentina
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

Sense classification of discourse relations is a sub-task of shallow discourse parsing. Discourse relations can occur both across sentences (inter-sentential) and within sentences (intra-sentential), and more than one discourse relation can hold between the same units. Using a newly available corpus of discourse-annotated intra-sentential conjoined verb phrases, we demonstrate a sequential classification system for their multi-label sense classification. We assess the importance of each feature used in the classification, the feature scope, and what is lost in moving from gold-standard manual parses to the output of an off-the-shelf parser.

Archivio della ricerca- Università di Roma La Sapienza

Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016), August 11, 2016, Berlin, Germany

Author: Friedrich Annemarie
Tomanek Katrin
Publication date: 01/01/2016

OPUS Augsburg

CRPC-DB – A Discourse Bank for Portuguese

Author: Lejeune Pierre
Mendes Amália
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2022

Universidade de Lisboa: Repositório.UL

An annotated corpus for the analysis of VP ellipsis

Author: Bos Johan
Spenader Jennifer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Verb Phrase Ellipsis (VPE) has been studied in great depth in theoretical linguistics, but empirical studies of VPE are rare. We extend the few previous corpus studies with an annotated corpus of VPE in all 25 sections of the Wall Street Journal corpus (WSJ) distributed with the Penn Treebank. We annotated the raw files using a stand-off annotation scheme that codes the auxiliary verb triggering the elided verb phrase, the start and end of the antecedent, the syntactic type of antecedent (VP, TV, NP, PP or AP), and the type of syntactic pattern between the source and target clauses of the VPE and its antecedent. We found 487 instances of VPE (including predicative ellipsis, antecedent-contained deletion, comparative constructions, and pseudo-gapping) plus 67 cases of related phenomena such as "do so" anaphora. Inter-annotator agreement was high, with a 0.97 average F-score for three annotators for one section of the WSJ. Our annotation is theory-neutral and has better coverage than earlier efforts that relied on automatic methods. For example, simply searching the parsed version of the Penn Treebank for empty VPs achieves high precision (0.95) but low recall (0.58) when compared with our manual annotation. The distribution of VPE source-target patterns deviates highly from the standard examples found in the theoretical linguistics literature on VPE, once more underlining the value of corpus studies. The resulting corpus will be useful for studying VPE phenomena as well as for evaluating natural language processing systems equipped with ellipsis resolution algorithms, and we propose evaluation measures for VPE detection and VPE antecedent selection. The stand-off annotation is freely available for research purposes.
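
The abstract above quotes precision (0.95) and recall (0.58) for the automatic empty-VP search but no combined score for it. As a rough illustration only, the sketch below combines the two figures into a balanced F-score. The helper name `f_score` and the choice of beta = 1 are assumptions made here, not something the abstract specifies, and the result is not a figure reported by the authors.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as quoted in the abstract for the automatic search.
# Assumption: a balanced F1 is the measure of interest.
baseline_f1 = f_score(0.95, 0.58)
print(round(baseline_f1, 2))  # 0.72
```

Under these assumptions the automatic search comes out at roughly 0.72, which shows how the low recall pulls the combined score down despite the high precision. The 0.97 figure in the abstract measures inter-annotator agreement, so the two numbers are not directly comparable.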

Proceedings - University of Groningen

University of Groningen

Springer - Publisher Connector

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Prosodic Cues "and" Syntactic Disambiguation

Author: Blodgett Allison R.
Publication venue: Ohio State University. Department of Linguistics
Publication date: 01/01/2000

This work was supported in part by a Summer Graduate Research Fellowship in Cognitive Science provided by the Center for Cognitive Science at The Ohio State University.

KnowledgeBank at OSU

Learning Sentence-internal Temporal Relations

Author: Lapata M.
Lascarides A.
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2006

In this paper we propose a data-intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like "after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated with rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold-standard corpora and also against human subjects.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Edinburgh Research Explorer

The Bracketing Guidelines for the Penn Chinese Treebank (3.0)

Author: Huang Shizhe
Kroch Anthony
Xia Fei
Xue Nianwen
Publication venue: ScholarlyCommons
Publication date: 01/10/2000

This document describes the bracketing guidelines for the Penn Chinese Treebank Project. The goal of the project is the creation of a 100-thousand-word corpus of Mandarin Chinese text with syntactic bracketing. The Chinese Treebank has been released via the Linguistic Data Consortium (LDC) and is available to the public. This document can be divided into six parts. Section I discusses six fundamental grammatical relations that are represented in the Treebank. Section II introduces the bracketing tagset, which includes 23 syntactic labels, 26 functional tags, and 7 tags for null elements. Sections III, IV and V specify our annotation schemata for noun phrases, verb phrases, and other minor categories, respectively. Section VI describes our treatment of empty categories, such as trace for syntactic movement, PRO for control, and pro for argument drop. Sections VII and VIII cover coordinated clauses and subordinate clauses. Sections IX, X and XI specify the way we handle punctuation, ambiguity, and some problematic cases.

ScholarlyCommons@Penn

A silence more eloquent: NP ellipsis in Mandarin discourse

Author: Charters A. Helen
Publication date: 08/06/2018

The Australian National University