Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model
This study examines the effect of semi-supervised learning in combination
with pretrained language models for data-to-text generation. It is not known
whether semi-supervised learning remains helpful when a large-scale language
model is also used. This study aims to answer this question by comparing a
data-to-text system supplemented only with a language model to two
data-to-text systems that are additionally enriched by a data augmentation or a
pseudo-labeling semi-supervised learning approach.
Results show that semi-supervised learning leads to higher scores on
diversity metrics. In terms of output quality, extending the training set of a
data-to-text system with a language model using the pseudo-labeling approach
did increase text quality scores, but the data augmentation approach yielded
similar scores to the system without training set extension. These results
indicate that semi-supervised learning approaches can bolster output quality
and diversity, even when a language model is also present.
Comment: 22 pages (excluding bibliography and appendix)
Enhancing sequence-to-sequence modeling for RDF triples to natural text
This work establishes key guidelines on which Machine Translation (MT) techniques are worth applying to the RDF-to-Text task, and how and when to apply them. Not only do we apply and compare the most prominent MT architecture, the Transformer, but we also analyze state-of-the-art techniques such as Byte Pair Encoding and Back Translation to demonstrate an improvement in generalization. In addition, we empirically show how to tailor these techniques to enhance models relying on learned embeddings rather than pretrained ones. Automatic metrics suggest that Back Translation can significantly improve model performance, by up to 7 BLEU points, opening a window for surpassing state-of-the-art results with appropriate architectures.
Peer Reviewed. Postprint (published version)
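Back translation, as mentioned above, creates synthetic training pairs by running monolingual target-side text through a reverse model. A minimal sketch for the RDF-to-text setting follows; the rule-based `reverse_model` stub is a hypothetical stand-in for a learned text-to-RDF model, not the paper's implementation.

```python
# Back-translation sketch for RDF-to-text augmentation (illustrative).
# A real setup would use a trained text-to-RDF model; here a trivial
# "Subject Verb Object." parser stands in as the reverse model.

def reverse_model(sentence):
    """Stub text->RDF model: parse 'S V O.' into a (subject, predicate, object) triple."""
    words = sentence.rstrip(".").split()
    return (words[0], words[1], " ".join(words[2:]))

def back_translate(parallel_data, monolingual_text):
    """Augment (triple, text) training pairs with synthetic pairs
    produced by back-translating monolingual target-side sentences."""
    synthetic = [(reverse_model(s), s) for s in monolingual_text]
    return parallel_data + synthetic

parallel = [(("Alan", "bornIn", "London"), "Alan bornIn London.")]
augmented = back_translate(parallel, ["Ada wrote programs."])
```

The forward RDF-to-text model is then trained on `augmented`, so cheap monolingual text enlarges the effectively parallel corpus.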
Multiple Relations Classification using Imbalanced Predictions Adaptation
The relation classification task assigns the proper semantic relation to a
pair of subject and object entities; the task plays a crucial role in various
text mining applications, such as knowledge graph construction and entity
interaction discovery in biomedical text. Current relation classification
models employ additional procedures to identify multiple relations in a single
sentence. Furthermore, they overlook the imbalanced predictions pattern, which
arises from the presence of a few valid relations that need positive
labeling within a relatively large predefined relation set. We propose a multiple
relations classification model that tackles these issues through a customized
output architecture and by exploiting additional input features. Our findings
suggest that handling the imbalanced predictions leads to significant
improvements, even with a modest training design. The results demonstrate
superior performance on benchmark datasets commonly used in relation
classification. To the best of our knowledge, this work is the first that
recognizes the imbalanced predictions within the relation classification task.
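One standard way to adapt to the imbalance described above, where only a few of many predefined relations are positive per entity pair, is to up-weight the positive class in a multi-label loss. The sketch below shows that idea with a weighted binary cross-entropy; the weighting scheme is a common technique offered for illustration, not the paper's specific customized output architecture.

```python
import math

# Imbalance-adapted multi-label loss sketch (illustrative technique;
# not the paper's customized output architecture).

def weighted_bce(probs, targets, pos_weight):
    """Binary cross-entropy over the relation set. Positives are
    up-weighted because only a few of the many predefined relations
    are valid for any given subject-object pair."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, 1e-7), 1.0 - 1e-7)   # clip for numerical stability
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)
```

With `pos_weight > 1`, missing a valid relation costs more than a spurious positive, which counteracts the model's bias toward predicting all-negative.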
ASPIRO: Any-shot Structured Parsing-error-Induced ReprOmpting for Consistent Data-to-Text Generation
We present ASPIRO, an approach for structured data verbalisation into short
template sentences in zero to few-shot settings. Unlike previous methods, our
approach prompts large language models (LLMs) to directly produce
entity-agnostic templates, rather than relying on LLMs to faithfully copy the
given example entities, or validating/crafting the templates manually. We
incorporate LLM re-prompting, triggered by algorithmic parsing checks, as well
as consistency validation based on the PARENT metric, to identify and rectify
template generation problems in real time. Compared to direct LLM output,
ASPIRO achieves on average a 66% reduction in parsing error rate in generated verbalisations
of RDF triples on the DART dataset. Our best 5-shot text-davinci-003 setup,
scoring BLEU of 50.62, METEOR of 45.16, BLEURT of 0.82, NUBIA of 0.87, and
PARENT of 0.8962 on the Rel2Text dataset, competes effectively with recent
fine-tuned pre-trained language models.
Comment: Accepted to Findings of EMNLP 2023, code available at
https://github.com/vejvarm/ASPIR
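The parse-check-triggered re-prompting described above can be sketched as a retry loop: generate a template, run an algorithmic check, and re-prompt on failure. The placeholder names and the simple check below are illustrative assumptions; the real pipeline calls an LLM and additionally validates consistency with the PARENT metric.

```python
# Parse-check-triggered re-prompting sketch (illustrative; the real
# ASPIRO pipeline prompts an LLM and also runs PARENT-based validation).

def parse_ok(template):
    """Algorithmic check: an entity-agnostic template must mention
    each placeholder exactly once (assumed placeholder names)."""
    return template.count("<subject>") == 1 and template.count("<object>") == 1

def reprompt_loop(generate, max_retries=3):
    """Call `generate(attempt)` until its output passes the parse
    check; return (template, attempts_used) or (None, max_retries)."""
    for attempt in range(max_retries):
        template = generate(attempt)
        if parse_ok(template):
            return template, attempt
    return None, max_retries  # give up after max_retries

# Usage with a fake generator: first attempt drops a placeholder,
# the re-prompt succeeds.
outputs = ["<subject> is famous.", "<subject> was born in <object>."]
template, retries = reprompt_loop(lambda i: outputs[i])
```

In the real setting, `generate` would embed the failed attempt and the check's error message into the follow-up prompt, steering the LLM toward a parseable template.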
Neural data-to-text generation: A comparison between pipeline and end-to-end architectures
Traditionally, most data-to-text applications have been designed using a
modular pipeline architecture, in which non-linguistic input data is converted
into natural language through several intermediate transformations. In
contrast, recent neural models for data-to-text generation have been proposed
as end-to-end approaches, in which the non-linguistic input is rendered in
natural language with far fewer explicit intermediate representations in
between. This study introduces a systematic comparison between neural pipeline
and end-to-end data-to-text approaches for the generation of text from RDF triples. Both
architectures were implemented making use of state-of-the-art deep learning
methods such as encoder-decoder Gated Recurrent Units (GRUs) and the Transformer.
Automatic and human evaluations, together with a qualitative analysis, suggest
that having explicit intermediate steps in the generation process results in
better texts than those generated by end-to-end approaches. Moreover, the
pipeline models generalize better to unseen inputs. Data and code are publicly
available.
Comment: Preprint version of the EMNLP 2019 article
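The pipeline architecture contrasted above can be sketched as a chain of explicit stages, each producing an intermediate representation. The rule-based stages below (discourse ordering, lexicalization, surface realization) are toy stand-ins for the paper's neural GRU/Transformer modules.

```python
# Pipeline sketch for RDF-to-text (illustrative rule-based stages;
# the paper implements each stage with neural encoder-decoders).

def order_triples(triples):
    """Discourse ordering stage: fix a stable mention order (here, by subject)."""
    return sorted(triples, key=lambda t: t[0])

def lexicalize(triple):
    """Lexicalization stage: map a (subject, predicate, object) triple to a clause."""
    s, p, o = triple
    return f"{s} {p.replace('_', ' ')} {o}"

def realize(clauses):
    """Surface realization stage: join clauses into the final text."""
    return ". ".join(clauses) + "."

def pipeline(triples):
    """Full pipeline: each stage's output is an explicit, inspectable artifact."""
    return realize([lexicalize(t) for t in order_triples(triples)])
```

An end-to-end model would instead map the triple set to text in a single step; the pipeline's explicit intermediate outputs are what the study credits for better texts and generalization to unseen inputs.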