1,638 research outputs found
Improving Cross-Lingual Transfer Learning for Event Detection
The widespread adoption of applications powered by Artificial Intelligence (AI) backbones has unquestionably changed the way we interact with the world around us. Applications such as automated personal assistants, automatic question answering, and machine-based translation systems have become mainstays of modern culture thanks to the recent considerable advances in Natural Language Processing (NLP) research. Nonetheless, with over 7000 spoken languages in the world, there still remain a considerable number of marginalized communities that are unable to benefit from these technological advancements largely due to the language they speak. Cross-Lingual Learning (CLL) looks to address this issue by transferring the knowledge acquired from a popular, high-resource source language (e.g., English, Chinese, or Spanish) to a less favored, lower-resourced target language (e.g., Urdu or Swahili). This dissertation leverages the Event Detection (ED) sub-task of Information Extraction (IE) as a testbed and presents three novel approaches that improve cross-lingual transfer learning from distinct perspectives: (1) direct knowledge transfer, (2) hybrid knowledge transfer, and (3) few-shot learning
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.info:eu-repo/semantics/publishedVersio
Rules, frequency, and predictability in morphological generalization: behavioral and computational evidence from the German plural system
Morphological generalization, or the task of mapping an unknown word (such as a novel noun Raun) to an inflected form (such as the plural Rauns), has historically proven a contested topic within computational linguistics and cognitive science, e.g. within the past tense debate (Rumelhart and McClelland, 1986; Pinker and Prince, 1988; Seidenberg and Plaut, 2014). Marcus et al. (1995) identified German plural inflection as a key challenge domain to evaluate two competing accounts of morphological generalization: a rule generation view focused on linguistic features of input words, and a type frequency view focused on the distribution of output inflected forms, thought to reflect more domain-general cognitive processes. More recent behavioral and computational research developments support a new view based on predictability, which integrates both input and output distributions. My research uses these methodological innovations to revisit a core dispute of the past tense debate: how do German speakers generalize plural inflection, and can computational learners generalize similarly?
This dissertation evaluates the rule generation, type frequency, and predictability accounts of morphological generalization in a series of behavioral and computational experiments with the stimuli developed by Marcus et al.. I assess predictions for three aspects of German plural generalization: distribution of infrequent plural classes, influence of grammatical gender, and within-item variability. Overall, I find that speaker behavior is best characterized as frequency-matching to a phonologically-conditioned lexical distribution. This result does not support the rule generation view, and qualifies the predictability view: speakers use some, but not all available information to reduce uncertainty in morphological generalization. Neural and symbolic model predictions are typically overconfident relative to speakers; simple Bayesian models show somewhat higher speaker-like variability and accuracy. All computational models are outperformed by a static phonologically-conditioned lexical baseline, suggesting these models have not learned the selective feature preferences that inform speaker generalization
Recommended from our members
Mitigating Data Scarcity for Neural Language Models
In recent years, pretrained neural language models (PNLMs) have taken the field of natural language processing by storm, achieving new benchmarks and state-of-theart performances. These models often rely heavily on annotated data, which may not always be available. Data scarcity are commonly found in specialized domains, such as medical, or in low-resource languages that are underexplored by AI research. In this dissertation, we focus on mitigating data scarcity using data augmentation and neural ensemble learning techniques for neural language models. In both research directions, we implement neural network algorithms and evaluate their impact on assisting neural language models in downstream NLP tasks. Specifically, for data augmentation, we explore two techniques: 1) creating positive training data by moving an answer span around its original context and 2) using text simplification techniques to introduce a variety of writing styles to the original training data. Our results indicate that these simple and effective solutions improve the performance of neural language models considerably in low-resource NLP domains and tasks. For neural ensemble learning, we use a multi-label neural classifier to select the best prediction outcome from a variety of individual pretrained neural language models trained for a low-resource medical text simplification task
Evaluating automated and hybrid neural disambiguation for African historical named entities
Documents detailing South African history contain ambiguous names. Ambiguous names may be due to people having the same name or the same person being referred to by multiple different names. Thus when searching for or attempting to extract information about a particular person, the name used may affect the results. This problem may be alleviated by using a Named Entity Disambiguation (NED) system to disambiguate names by linking them to a knowledge base. In recent years, transformer-based language models have led to improvements in NED systems. Furthermore, multilingual language models have shown the ability to learn concepts across languages, reducing the amount of training data required in low-resource languages. Thus a multilingual language model-based NED system was developed to disambiguate people's names within a historical South African context using documents written in English and isiZulu from the 500 Year Archive (FHYA). The multilingual language model-based system substantially improved on a probability-based baseline and achieved a micro F1-score of 0.726. At the same time, the entity linking component was able to link 81.9% of the mentions to the correct entity. However, the system's performance on documents written in isiZulu was significantly lower than on the documents written in English. Thus the system was augmented with handcrafted rules to improve its performance. The addition of handcrafted rules resulted in a small but significant improvement in performance when compared to the unaugmented NED system
A corpus-based CDA study of ideological mediation through translation shifts: an analysis of the official Chinese-English translation of the governance of China
This study aims to explore the extent to which President Xi’s ideological message is mediated in the official Chinese-English translation of The Governance of China via various translation shifts and analyze the possible ideological reasons behind it. Unlike previous studies whose interpretation of translation shifts has been restricted to either the linguistic level or the speech situation, this research project focuses on exploring the translation shifts’ ideological significance within the broader sociopolitical context. It adopts a mixed-methods approach, merging critical discourse analysis (CDA) and corpus-based translation studies. A parallel corpus based on the source and target texts of President Xi’s domestic speeches to officials and Party members, published in The Governance of China, was built to ensure a quantitative and qualitative analysis. It is also noteworthy that this study concentrates on the key Chinese modality markers, transitivity processes, metaphorical expressions, and referring terms that stand out in the present research corpus compared to general Chinese discourse instead of all the existing or the most frequent ones. The overall results suggest that translation shifts in modality, transitivity, metaphor, and reference have slightly increased the ideological significance of strengthening the government and the Party’s self-discipline compared to other national issues, and exhibited a tendency to contextualize considering the foreign audiences’ ideological positions. Such shifts may be related to the translation agency’s commitment and the state’s current foreign policy. Ultimately, this study reveals subtle ideological translation shifts that will be buried if researchers treat source and target texts separately. It calls for translators to raise awareness of textual features’ ideological potential and encourages audiences to pay attention to the institutional and sociopolitical background of translated texts
UTDRM: unsupervised method for training debunked-narrative retrieval models
A key task in the fact-checking workflow is to establish whether the claim under investigation has already been debunked or fact-checked before. This is essentially a retrieval task where a misinformation claim is used as a query to retrieve from a corpus of debunks. Prior debunk retrieval methods have typically been trained on annotated pairs of misinformation claims and debunks. The novelty of this paper is an Unsupervised Method for Training Debunked-Narrative Retrieval Models (UTDRM) in a zero-shot setting, eliminating the need for human-annotated pairs. This approach leverages fact-checking articles for the generation of synthetic claims and employs a neural retrieval model for training. Our experiments show that UTDRM tends to match or exceed the performance of state-of-the-art methods on seven datasets, which demonstrates its effectiveness and broad applicability. The paper also analyses the impact of various factors on UTDRM’s performance, such as the quantity of fact-checking articles utilised, the number of synthetically generated claims employed, the proposed entity inoculation method, and the usage of large language models for retrieval
A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Body language (BL) refers to the non-verbal communication expressed through
physical movements, gestures, facial expressions, and postures. It is a form of
communication that conveys information, emotions, attitudes, and intentions
without the use of spoken or written words. It plays a crucial role in
interpersonal interactions and can complement or even override verbal
communication. Deep multi-modal learning techniques have shown promise in
understanding and analyzing these diverse aspects of BL. The survey emphasizes
their applications to BL generation and recognition. Several common BLs are
considered i.e., Sign Language (SL), Cued Speech (CS), Co-speech (CoS), and
Talking Head (TH), and we have conducted an analysis and established the
connections among these four BL for the first time. Their generation and
recognition often involve multi-modal approaches. Benchmark datasets for BL
research are well collected and organized, along with the evaluation of SOTA
methods on these datasets. The survey highlights challenges such as limited
labeled data, multi-modal learning, and the need for domain adaptation to
generalize models to unseen speakers or languages. Future research directions
are presented, including exploring self-supervised learning techniques,
integrating contextual information from other modalities, and exploiting
large-scale pre-trained multi-modal models. In summary, this survey paper
provides a comprehensive understanding of deep multi-modal learning for various
BL generations and recognitions for the first time. By analyzing advancements,
challenges, and future directions, it serves as a valuable resource for
researchers and practitioners in advancing this field. n addition, we maintain
a continuously updated paper list for deep multi-modal learning for BL
recognition and generation: https://github.com/wentaoL86/awesome-body-language
KIT's Multilingual Speech Translation System for IWSLT 2023
Many existing speech translation benchmarks focus on native-English speech in
high-quality recording conditions, which often do not match the conditions in
real-life use-cases. In this paper, we describe our speech translation system
for the multilingual track of IWSLT 2023, which focuses on the translation of
scientific conference talks. The test condition features accented input speech
and terminology-dense contents. The tasks requires translation into 10
languages of varying amounts of resources. In absence of training data from the
target domain, we use a retrieval-based approach (kNN-MT) for effective
adaptation (+0.8 BLEU for speech translation). We also use adapters to easily
integrate incremental training data from data augmentation, and show that it
matches the performance of re-training. We observe that cascaded systems are
more easily adaptable towards specific target domains, due to their separate
modules. Our cascaded speech system substantially outperforms its end-to-end
counterpart on scientific talk translation, although their performance remains
similar on TED talks.Comment: IWSLT 202
Computational sarcasm detection and understanding in online communication
The presence of sarcasm in online communication has motivated an increasing number of computational investigations of sarcasm across the scientific community. In this thesis, we build upon these investigations. Pointing out their limitations, we bring four contributions that span two research directions: sarcasm detection and sarcasm understanding.
Sarcasm detection is the task of building computational models optimised for recognising sarcasm in a given text.
These models are often built in a supervised learning paradigm, relying on datasets of texts labelled for sarcasm.
We bring two contributions in this direction.
First, we question the effectiveness of previous methods used to label texts for sarcasm. We argue that the labels they produce might not coincide with the sarcastic intention of the authors of the texts that they are labelling.
In response, we suggest a new method, and we use it to build iSarcasm, a novel dataset of sarcastic and non-sarcastic tweets.
We show that previous models achieve considerably lower performance on iSarcasm than on previous datasets, while human annotators achieve a considerably higher performance, compared to models, pointing out the need for more effective models.
Therefore, as a second contribution, we organise a competition that invites the community to create such models.
Sarcasm understanding is the task of explicating the phenomena that are subsumed under the umbrella of sarcasm through computational investigation.
We bring two contributions in this direction.
First, we conduct an alaysis into the socio-demographic ecology of sarcastic exchanges between human interlocutors. We find that the effectiveness of such exchanges is influenced by the socio-demographic similarity between the interlocutors, with factors such as English language nativeness, age, and gender, being particualry influential. We suggest that future social analysis tools should account for these factors.
Second, we challenge the motivation of a recent endeavour of the community; mainly, that of augmenting dialogue systems with the ability to generate sarcastic responses. Through a series of social experiments, we provide guidelines for dialogue systems concerning the appropriateness of generating sarcastic responses, and the formulation of such responses.
Through our work, we aim to encourage the community to consider computational investigations of sarcasm interdisciplinarily, at the intersection of natural language processing and computational social science
- …