7 research outputs found
The KBGen Challenge
Given a preselected set of relations extracted from the AURA knowledge base on biology, the KBGen task consisted in generating a sentence verbalising these relations. Three teams submitted results from their systems. The systems were compared using both automatic metrics (BLEU, NIST) and subjective ratings by 12 human users on three dimensions, namely fluency, grammaticality and meaning similarity. In this report, we summarise the KBGen task, the evaluation methods and the results obtained.
Surface Realisation from Knowledge-Bases
We present a simple, data-driven approach to generation from knowledge bases (KB). A key feature of this approach is that grammar induction is driven by the extended domain of locality principle of TAG (Tree Adjoining Grammar), and that it takes into account both syntactic and semantic information. The resulting extracted TAG includes a unification-based semantics and can be used by an existing surface realiser to generate sentences from KB data. Experimental evaluation on the KBGen data shows that our model outperforms a data-driven generate-and-rank approach based on an automatically induced probabilistic grammar, and is comparable with a handcrafted symbolic approach.
Automatic Verbalisation of Biological Events
We present a method for automatically generating descriptions of biological events encoded in the KB BIO 101 knowledge base. In this knowledge base, events are concepts (e.g., RELEASE) related by role relations (e.g., AGENT, PATIENT, PATH, INSTRUMENT) to the concepts denoting their arguments (e.g., GATED-CHANNEL, VASCULAR-TISSUE, IRON). We propose a probabilistic, unsupervised method which extracts possible verbalisation frames from large, biology-specific corpora and uses probabilities both to select an appropriate frame given an event description and to determine the mapping between syntactic and semantic arguments. That is, probabilities are used to determine which event argument fills which syntactic function (e.g., subject, object) in the produced verbalisation. We evaluate our approach on a corpus of 336 event descriptions, provide a qualitative and quantitative analysis of the results obtained, and discuss possible directions for further work.
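The two uses of probabilities described in this abstract (choosing a frame, then mapping event arguments to syntactic functions) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the probability tables, the frame inventory, the `pp_from` function label and the function name `best_verbalisation` are hypothetical and are not taken from the paper, whose actual model and estimates are not given here.

```python
from itertools import permutations

# Hypothetical toy estimates; the abstract does not give the model's parameters.
# P(frame | event concept), where a frame is a tuple of syntactic functions.
FRAME_PROBS = {
    "RELEASE": {
        ("subject", "object"): 0.7,
        ("subject", "object", "pp_from"): 0.3,
    },
}
# P(syntactic function | semantic role), also invented for this example.
ROLE_PROBS = {
    "AGENT": {"subject": 0.8, "object": 0.1, "pp_from": 0.1},
    "PATIENT": {"subject": 0.15, "object": 0.8, "pp_from": 0.05},
}


def best_verbalisation(event, roles):
    """Return (score, frame, mapping) maximising
    P(frame | event) * product of P(function | role),
    or None when no frame has as many slots as there are roles."""
    best = None
    for frame, p_frame in FRAME_PROBS[event].items():
        if len(frame) != len(roles):
            continue  # this sketch only considers frames that use every role
        for functions in permutations(frame):
            score = p_frame
            for role, function in zip(roles, functions):
                score *= ROLE_PROBS[role].get(function, 0.0)
            if best is None or score > best[0]:
                best = (score, frame, dict(zip(roles, functions)))
    return best


result = best_verbalisation("RELEASE", ["AGENT", "PATIENT"])
```

With these invented numbers, the two-slot frame is selected and AGENT is mapped to the subject, PATIENT to the object. Exhaustive enumeration over permutations is only practical for the small frames in this toy setting.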
KBGen - Text Generation for Knowledge Bases as a New Shared Task
In this paper we propose a new shared task where the aim is to produce coherent descriptions of concepts and relationships in a frame-based knowledge base (KB). We propose to use AURA, a freely available KB, for the shared task and illustrate an application context for NLG. We show that the same application context and the need for language generation tools can be generalized to other biology knowledge bases. We argue that the easy availability of input data and a larger research community (both domain experts and knowledge representation experts) which actively uses these knowledge bases, along with regular evaluation experiments, create an ideal scenario for a shared task.
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.
Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table
Semantic consistency in text generation
Automatic input-grounded text generation tasks process input texts and generate human-understandable natural language text for the processed information. The development of neural sequence-to-sequence (seq2seq) models, which are usually trained in an end-to-end fashion, has rapidly pushed the frontier of performance on text generation tasks. However, these models are reported to lack semantic consistency with respect to their corresponding input texts. The models are not solely to blame: the corpora themselves always include examples whose output is semantically inconsistent with its input. Any model that is agnostic to such data divergence issues will be prone to semantic inconsistency. Meanwhile, the most widely used overlap-based evaluation metrics, which compare the generated texts to their corresponding references, do not evaluate input-output semantic consistency explicitly, which makes this problem hard to detect.
In this thesis, we study semantic consistency in three automatic text generation scenarios: Data-to-text Generation, Single Document Abstractive Summarization, and Chit-chat Dialogue Generation. We seek answers to the following research questions: (1) how to define input-output semantic consistency in different text generation tasks? (2) how to quantitatively evaluate input-output semantic consistency? (3) how to achieve better semantic consistency in individual tasks?
We systematically define the semantic inconsistency phenomena in these three tasks as omission, intrinsic hallucination, and extrinsic hallucination. For Data-to-text Generation, we jointly learn a sentence planner, which tightly controls which part of the input source gets generated in what sequence, with a neural seq2seq text generator, to decrease all three types of semantic inconsistency in model-generated texts. The evaluation results confirm that the texts generated by our model contain far fewer omissions while maintaining a low level of extrinsic hallucinations, without sacrificing fluency compared to seq2seq models. For Single Document Abstractive Summarization, we reduce the level of extrinsic hallucinations in the training data by automatically introducing assisting articles to each document-summary instance, providing the supplemental world knowledge that is present in the summary but missing from the document. With the help of a novel metric, we show that seq2seq models trained with assisting articles produce fewer extrinsic hallucinations than those trained without them. For Chit-chat Dialogue Generation, by filtering out the omitted and hallucinated examples from the training set using a newly introduced evaluation metric, and encoding it into the neural seq2seq response generation models as a control factor, we diminish the level of omissions and extrinsic hallucinations in the generated dialogue responses.
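The three inconsistency types named in this abstract can be made concrete for the data-to-text case with a minimal Python sketch. It rests on assumptions that the abstract does not state: that both the input record and the generated text have already been reduced to slot-value pairs, and that the three categories are read at slot level as shown in the comments. The function name `consistency_errors` and the example slots are hypothetical, and the thesis's own definitions and metrics may differ.

```python
def consistency_errors(input_facts, output_facts):
    """Classify slot-level mismatches between input and output.

    Both arguments map slot names to values. Assumed reading:
    - omission: an input slot never mentioned in the output
    - intrinsic hallucination: an input slot realised with a different value
    - extrinsic hallucination: an output slot absent from the input
    """
    omission = {s for s in input_facts if s not in output_facts}
    intrinsic = {
        s for s in output_facts
        if s in input_facts and output_facts[s] != input_facts[s]
    }
    extrinsic = {s for s in output_facts if s not in input_facts}
    return {"omission": omission, "intrinsic": intrinsic, "extrinsic": extrinsic}


# Invented example: a restaurant record and the facts found in a generated text.
source = {"name": "Aromi", "food": "Italian", "area": "riverside"}
generated = {"name": "Aromi", "food": "French", "price": "cheap"}
errors = consistency_errors(source, generated)
```

Here `area` is omitted, `food` is an intrinsic hallucination and `price` is an extrinsic one. Extracting the slot-value pairs from free text is the hard part in practice and is outside the scope of this sketch.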