220 research outputs found

The Impact of Word Representations on Sequential Neural MWE Identification

Authors: Damnati Geraldine; Ramisch Carlos; Zampieri Nicolas
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

Recent initiatives such as the PARSEME shared task have allowed the rapid development of MWE identification systems. Many of those are based on recent NLP advances, using neural sequence models that take continuous word representations as input. We study two related questions in neural verbal MWE identification: (a) the use of lemmas and/or surface forms as input features, and (b) the use of word-based or character-based embeddings to represent them. Our experiments on Basque, French, and Polish show that character-based representations yield systematically better results than word-based ones. In some cases, character-based representations of surface forms can be used as a proxy for lemmas, depending on the morphological complexity of the language.

Crossref

HAL AMU

Proceedings of the 13th Linguistic Annotation Workshop, August 1, 2019, Florence, Italy

Authors: Friedrich Annemarie; Hoek Jet; Zeyrek Deniz
Publication date: 07/07/2023

OPUS Augsburg

Uncertainty in Natural Language Generation: From Theory to Applications

Authors: Aziz Wilker; Baan Joris; Daheim Nico; Fernández Raquel; Ilia Evgenia; Li Haau-Sing; Plank Barbara; Sennrich Rico; Ulmer Dennis; Zerva Chrysoula
Publication date: 28/07/2023

Recent advances of powerful Language Models have allowed Natural Language Generation (NLG) to emerge as an important technology that can not only perform traditional tasks like summarisation or translation, but also serve as a natural language interface to a variety of applications. As such, it is crucial that NLG systems are trustworthy and reliable, for example by indicating when they are likely to be wrong; and supporting multiple views, backgrounds and writing styles, reflecting diverse human sub-populations. In this paper, we argue that a principled treatment of uncertainty can assist in creating systems and evaluation protocols better aligned with these goals. We first present the fundamental theory, frameworks and vocabulary required to represent uncertainty. We then characterise the main sources of uncertainty in NLG from a linguistic perspective, and propose a two-dimensional taxonomy that is more informative and faithful than the popular aleatoric/epistemic dichotomy. Finally, we move from theory to applications and highlight exciting research directions that exploit uncertainty to power decoding, controllable generation, self-assessment, selective answering, active learning and more.

arXiv.org e-Print Archive

Biases in Large Language Models: Origins, Inventory and Discussion

Authors: Conia Simone; Navigli Roberto; Ross Björn
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/06/2023

Edinburgh Research Explorer

Argumentation models and their use in corpus annotation: practice, prospects, and challenges

Authors: Cardoso Henrique Lopes; Carvalho Paula; Martins Bruno; Sousa-Silva Rui
Publication date: 01/07/2023

The study of argumentation is transversal to several research domains, from philosophy to linguistics, from the law to computer science and artificial intelligence. In discourse analysis, several distinct models have been proposed to harness argumentation, each with a different focus or aim. To analyze the use of argumentation in natural language, several corpus annotation efforts have been carried out, with a more or less explicit grounding in one of these theoretical argumentation models. In fact, given the recent growing interest in argument mining applications, argument-annotated corpora are crucial to train machine learning models in a supervised way. However, the proliferation of such corpora has led to a wide disparity in the granularity of the argument annotations employed. In this paper, we review the most relevant theoretical argumentation models, after which we survey argument annotation projects closely following those theoretical models. We also highlight the main simplifications that are often introduced in practice. Furthermore, we glimpse other annotation efforts that are not so theoretically grounded but instead follow a shallower approach. It turns out that most argument annotation projects make their own assumptions and simplifications, both in terms of the textual genre they focus on and in terms of adapting the adopted theoretical argumentation model for their own agenda. Issues of compatibility among argument-annotated corpora are discussed by looking at the problem from a syntactical, semantic, and practical perspective. Finally, we discuss current and prospective applications of models that take advantage of argument-annotated corpora.

Repositório Aberto da Universidade do Porto

Annotating Argument Schemes

Authors: Lawrence John; Reed Chris; Visser Jacky; Wagemans Jean; Walton Douglas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2021

University of Dundee Online Publications

International Migration, Integration and Social Cohesion online publications

UvA-DARE