Identification of context markers for Russian nouns
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011).
Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa.
NEALT Proceedings Series, Vol. 11 (2011), 344-347.
© 2011 The editors and contributors.
Published by the Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt.
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/1695
Split and Rephrase
We propose a new sentence simplification task (Split-and-Rephrase) whose aim is to split a complex sentence into a meaning-preserving sequence of shorter sentences. Like sentence simplification, splitting-and-rephrasing has the potential to benefit both natural language processing and societal applications. Because shorter sentences are generally better processed by NLP systems, it could be used as a preprocessing step that facilitates and improves the performance of parsers, semantic role labellers and machine translation systems. It should also be of use to people with reading disabilities because it allows longer sentences to be converted into shorter ones. This paper makes two contributions towards this new task. First, we create and make available a benchmark consisting of 1,066,115 tuples mapping a single complex sentence to a sequence of sentences expressing the same meaning. Second, we propose five models (from vanilla sequence-to-sequence to semantically motivated models) to understand the difficulty of the proposed task. Comment: 11 pages, EMNLP 201
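To make the benchmark format described above concrete, here is a minimal, hypothetical sketch of one such tuple: a complex sentence paired with a sequence of shorter sentences expressing the same meaning. The field names and the example sentence are illustrative only, not the dataset's actual schema.

```python
# Hypothetical sketch of a Split-and-Rephrase benchmark tuple:
# one complex sentence mapped to a sequence of shorter sentences.
split_and_rephrase_example = {
    "complex": "Alan Bean, who was born in Wheeler, Texas, was a crew member of Apollo 12.",
    "simple": [
        "Alan Bean was born in Wheeler, Texas.",
        "Alan Bean was a crew member of Apollo 12.",
    ],
}

def is_valid_tuple(entry):
    """Check the minimal shape of a tuple: one complex sentence split
    into at least two outputs, each shorter than the input."""
    return (
        isinstance(entry["complex"], str)
        and len(entry["simple"]) >= 2
        and all(len(s) < len(entry["complex"]) for s in entry["simple"])
    )

print(is_valid_tuple(split_and_rephrase_example))  # → True
```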
The human evaluation datasheet: a template for recording details of human evaluation experiments in NLP
This paper presents the Human Evaluation Datasheet (HEDS), a template for recording the details of individual human evaluation experiments in Natural Language Processing (NLP), and reports on researchers' first experiences of using HEDS sheets in practice. Originally taking inspiration from seminal papers by Bender and Friedman (2018), Mitchell et al. (2019), and Gebru et al. (2020), HEDS facilitates the recording of properties of human evaluations in sufficient detail, and with sufficient standardisation, to support comparability, meta-evaluation, and reproducibility assessments for human evaluations. These are crucial for scientifically principled evaluation, but the overhead of completing a detailed datasheet is substantial, and we discuss possible ways of addressing this and other issues observed in practice.
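As a rough illustration of what a standardised record of a human evaluation might look like, here is a hypothetical, heavily abbreviated sketch; the field names below are invented for illustration and are not the actual HEDS questions.

```python
# Hypothetical, abbreviated record in the spirit of a human-evaluation
# datasheet; field names are illustrative, not the real HEDS template.
heds_like_record = {
    "system": "example-nlg-system",
    "evaluated_property": "fluency",
    "response_elicitation": "1-7 Likert scale",
    "num_evaluators": 5,
    "num_items": 100,
}

def is_complete(record, required=("system", "evaluated_property",
                                  "response_elicitation")):
    """Check that the minimal fields needed for comparability between
    experiments are filled in."""
    return all(record.get(field) for field in required)

print(is_complete(heds_like_record))  # → True
```

Standardised fields of this kind are what make comparability and reproducibility assessments across experiments feasible.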
ReproGen : Proposal for a Shared Task on Reproducibility of Human Evaluations in NLG
Peer reviewed. Publisher PDF.
The 2022 ReproGen Shared Task on Reproducibility of Evaluations in NLG : Overview and Results
Publisher PDF.
Creating Training Corpora for NLG Micro-Planning
In this paper, we focus on how to create data-to-text corpora which can support the learning of wide-coverage micro-planners, i.e., generation systems that handle lexicalisation, aggregation, surface realisation, sentence segmentation and referring expression generation. We start by reviewing common practice in designing training benchmarks for Natural Language Generation. We then present a novel framework for semi-automatically creating linguistically challenging NLG corpora from existing Knowledge Bases. We apply our framework to DBpedia data and compare the resulting dataset with the dataset of Wen et al. (2016). We show that while Wen et al. (2016)'s dataset is more than twice as large as ours, it is less diverse both in terms of input and in terms of text. We thus propose our corpus generation framework as a novel method for creating challenging datasets from which NLG models can be learned that are capable of generating text from KB data.
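To give a concrete sense of the data-to-text setting described above, here is a minimal, hypothetical sketch of a training instance: a set of DBpedia-style (subject, property, object) triples paired with a reference verbalisation, plus a naive lexicalisation step. The triples, the reference text, and the helper function are illustrative assumptions, not the paper's framework.

```python
# Hypothetical data-to-text training instance: KB triples plus a
# reference verbalisation a micro-planner would learn to produce.
triples = [
    ("John_E_Blaha", "birthDate", "1942-08-26"),
    ("John_E_Blaha", "occupation", "Fighter_pilot"),
]
reference_text = "John E. Blaha, born on 26 August 1942, served as a fighter pilot."

def lexicalise(triple):
    """Naive baseline lexicalisation: one clause per triple. A real
    micro-planner must also aggregate clauses, segment sentences and
    choose referring expressions."""
    subj, prop, obj = triple
    return f"{subj.replace('_', ' ')} {prop} {obj.replace('_', ' ')}"

for t in triples:
    print(lexicalise(t))
```

The gap between the naive clause-per-triple output and the fluent reference text is exactly the micro-planning work (aggregation, segmentation, referring expressions) the corpus is designed to teach.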
WebNLG Challenge: Human Evaluation Results
This report presents the human evaluation results for the WebNLG Challenge, which was held in 2017. The automatic evaluation results can be found in [Gardent et al., 2017a]. In this report, we describe the human evaluation design, communicate the results, and explore the correlation between automatic and human assessments.
Mapping Natural Language to Description Logic
While much work on automated ontology enrichment has focused on mining text for concepts and relations, little attention has been paid to the task of enriching ontologies with complex axioms. In this paper, we focus on a form of text that is frequent in industry, namely system installation design principles (SIDPs), and we present a framework which can be used both to map SIDPs to OWL DL axioms and to assess the quality of these automatically derived axioms. We present experimental results on a set of 960 SIDPs provided by Airbus which demonstrate (i) that the approach is robust (97.50% of the SIDPs can be parsed) and (ii) that the DL axioms assigned to full parses are very likely to be correct (96% of the cases).
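To illustrate the kind of sentence-to-axiom mapping described above, here is a hypothetical toy sketch: a single regular-expression pattern over an SIDP-style sentence rewritten as a subsumption axiom in Manchester-style syntax. The pattern, the property name `connectedTo`, and the axiom rendering are invented for illustration; they are not the paper's grammar or output format.

```python
import re

# Toy sketch: map one SIDP-style sentence pattern to a DL axiom string.
# A full parse yields an axiom; anything else yields no axiom, mirroring
# the idea that axioms are only assigned to sentences that parse fully.
def sidp_to_axiom(sentence):
    """Map 'Every X shall be connected to a Y.' to a subsumption axiom
    with an existential restriction (Manchester-style syntax)."""
    m = re.match(r"Every (\w+) shall be connected to a (\w+)\.", sentence)
    if not m:
        return None  # no full parse -> no axiom assigned
    x, y = m.groups()
    return f"{x} SubClassOf (connectedTo some {y})"

print(sidp_to_axiom("Every Pump shall be connected to a Controller."))
# → Pump SubClassOf (connectedTo some Controller)
```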