Sequence to Sequence Mixture Model for Diverse Machine Translation
Sequence to sequence (SEQ2SEQ) models often lack diversity in their generated
translations. This can be attributed to the limitation of SEQ2SEQ models in
capturing lexical and syntactic variations in a parallel corpus resulting from
different styles, genres, topics, or ambiguity of the translation process. In
this paper, we develop a novel sequence to sequence mixture (S2SMIX) model that
improves both translation diversity and quality by adopting a committee of
specialized translation models rather than a single translation model. Each
mixture component selects its own training dataset via optimization of the
marginal log-likelihood, which leads to a soft clustering of the parallel
corpus. Experiments on four language pairs demonstrate the superiority of our
mixture model compared to a SEQ2SEQ baseline with standard or diversity-boosted
beam search. Our mixture model uses negligible additional parameters and incurs
no extra computation cost during decoding.
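As a rough sketch of the training objective described above (not the paper's implementation; the names, shapes, and uniform prior below are assumptions), the marginal log-likelihood over K specialized components and the resulting soft cluster assignments could look like this in PyTorch:

    import math
    import torch

    def mixture_marginal_nll(component_logliks: torch.Tensor,
                             log_prior: torch.Tensor) -> torch.Tensor:
        """component_logliks: [batch, K] log p(y | x, z=k) from K specialized decoders.
        log_prior: [K] log p(z=k). Returns the negative marginal log-likelihood."""
        # log p(y | x) = logsumexp_k [ log p(z=k) + log p(y | x, z=k) ]
        marginal = torch.logsumexp(component_logliks + log_prior, dim=-1)
        return -marginal.mean()

    def responsibilities(component_logliks: torch.Tensor,
                         log_prior: torch.Tensor) -> torch.Tensor:
        """Posterior p(z=k | x, y): the soft clustering of each sentence pair."""
        return torch.softmax(component_logliks + log_prior, dim=-1)

    # Usage with K=3 components and a uniform prior over components.
    K = 3
    log_prior = torch.full((K,), math.log(1.0 / K))
    logliks = torch.randn(8, K)                   # stand-in for per-component sequence log-probs
    loss = mixture_marginal_nll(logliks, log_prior)
    gamma = responsibilities(logliks, log_prior)  # [8, 3], each row sums to 1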
Word Representation Models for Morphologically Rich Languages in Neural Machine Translation
Dealing with the complex word forms in morphologically rich languages is an
open problem in language processing, and is particularly important in
translation. In contrast to most modern neural translation systems, which
discard the identity of rare words, in this paper we propose several
architectures for learning word representations from character and morpheme
level word decompositions. We incorporate these representations in a novel
machine translation model which jointly learns word alignments and translations
via a hard attention mechanism. Evaluating on translating from several
morphologically rich languages into English, we show consistent improvements
of between 1 and 1.5 BLEU points over strong baseline methods.
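One plausible way to build such representations (only an illustrative variant, since the paper proposes several architectures; the module below is an assumption, not its exact model) is a BiLSTM over character embeddings whose final states replace the usual word-embedding lookup:

    import torch
    import torch.nn as nn

    class CharWordEncoder(nn.Module):
        """Compose a word vector from its characters instead of a lookup table."""
        def __init__(self, n_chars: int, char_dim: int = 32, word_dim: int = 256):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
            self.bilstm = nn.LSTM(char_dim, word_dim // 2,
                                  batch_first=True, bidirectional=True)

        def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
            # char_ids: [num_words, max_word_len] integer character indices
            emb = self.char_emb(char_ids)               # [W, L, char_dim]
            _, (h_n, _) = self.bilstm(emb)              # h_n: [2, W, word_dim // 2]
            # Concatenate the final forward and backward states as the word vector.
            return torch.cat([h_n[0], h_n[1]], dim=-1)  # [W, word_dim]

    # Usage: encode 4 words of up to 10 characters each.
    enc = CharWordEncoder(n_chars=100)
    words = torch.randint(1, 100, (4, 10))
    vectors = enc(words)    # [4, 256], usable in place of lookup-table embeddings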
Generative Models are Self-Watermarked: Declaring Model Authentication through Re-Generation
As machine- and AI-generated content proliferates, protecting the
intellectual property of generative models has become imperative, yet verifying
data ownership poses formidable challenges, particularly in cases of
unauthorized reuse of generated data. This challenge is further amplified by
the use of Machine Learning as a Service (MLaaS), which often functions as a
black-box system.
Our work is dedicated to detecting data reuse from even an individual sample.
Traditionally, watermarking has been leveraged to detect AI-generated content.
However, unlike watermarking techniques that embed additional information as
triggers into models or generated content, potentially compromising output
quality, our approach identifies latent fingerprints inherently present within
the outputs through re-generation. We propose an explainable verification
procedure that attributes data ownership through re-generation, and further
amplifies these fingerprints in the generative models through iterative data
re-generation. This methodology is theoretically grounded and demonstrates
viability and robustness using recent advanced text and image generative
models. Our methodology is significant as it goes beyond protecting the
intellectual property of APIs and addresses important issues such as the spread
of misinformation and academic misconduct. It provides a useful tool to ensure
the integrity of sources and authorship, expanding its application in different
scenarios where authenticity and ownership verification are essential.
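A minimal sketch of the re-generation test (illustrative only; regenerate and distance are placeholders for a model-specific round-trip, such as paraphrasing or image re-synthesis, and a similarity metric, and the attribution rule below is an assumption about the intended mechanism):

    from typing import Any, Callable, Dict

    def regeneration_score(sample: Any,
                           regenerate: Callable[[Any], Any],
                           distance: Callable[[Any, Any], float],
                           steps: int = 3) -> float:
        """Average per-step drift of `sample` under iterated re-generation.
        Smaller scores suggest the sample already carries that model's fingerprint."""
        current, total = sample, 0.0
        for _ in range(steps):
            nxt = regenerate(current)
            total += distance(current, nxt)
            current = nxt
        return total / steps

    def attribute(sample: Any,
                  models: Dict[str, Callable[[Any], Any]],
                  distance: Callable[[Any, Any], float],
                  steps: int = 3) -> str:
        """Attribute `sample` to the model whose re-generation changes it the least."""
        scores = {name: regeneration_score(sample, gen, distance, steps)
                  for name, gen in models.items()}
        return min(scores, key=scores.get)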
Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Spurred by the recent rapid increase in the development and distribution of
large language models (LLMs) across industry and academia, much recent work has
drawn attention to safety- and security-related threats and vulnerabilities of
LLMs, including in the context of potentially criminal activities.
Specifically, it has been shown that LLMs can be misused for fraud,
impersonation, and the generation of malware; while other authors have
considered the more general problem of AI alignment. It is important that
developers and practitioners alike are aware of security-related problems with
such models. In this paper, we provide an overview of existing - predominantly
scientific - efforts on identifying and mitigating threats and vulnerabilities
arising from LLMs. We present a taxonomy describing the relationship between
threats caused by the generative capabilities of LLMs, prevention measures
intended to address such threats, and vulnerabilities arising from imperfect
prevention measures. With our work, we hope to raise awareness of the
limitations of LLMs in light of such security concerns, among both experienced
developers and new users of such technologies.
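The taxonomy's three-way relationship could be encoded, purely for illustration (the field names below are assumptions, not the paper's notation), as typed records linking threats, the measures meant to prevent them, and the vulnerabilities those imperfect measures leave open:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Threat:
        name: str            # e.g. "fraud", "impersonation", "malware generation"

    @dataclass
    class Vulnerability:
        name: str            # e.g. "jailbreak prompt bypasses the safety filter"

    @dataclass
    class PreventionMeasure:
        name: str            # e.g. "content filtering", "alignment fine-tuning"
        addresses: List[Threat] = field(default_factory=list)
        exposes: List[Vulnerability] = field(default_factory=list)  # gaps it leaves open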
IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks
Backdoor attacks are an insidious security threat against machine learning models. Adversaries can manipulate the predictions of compromised models by inserting triggers into the training phase. Various backdoor attacks have been devised which can achieve nearly perfect attack success without affecting model predictions for clean inputs. Means of mitigating such vulnerabilities are underdeveloped, especially in natural language processing. To fill this gap, we introduce IMBERT, which uses either gradients or self-attention scores derived from victim models to self-defend against backdoor attacks at inference time. Our empirical studies demonstrate that IMBERT can effectively identify up to 98.5% of inserted triggers. Compared to two baselines, it significantly reduces the attack success rate across a wide range of insertion-based attacks while attaining competitive accuracy on clean data. Finally, we show that our approach is model-agnostic and can be easily ported to several pre-trained transformer models.
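A hedged sketch of this kind of inference-time defence (not IMBERT's actual code; the attention aggregation and the top-k threshold below are assumptions): score each token by the self-attention mass it receives and mask the most suspicious ones before classifying. With HuggingFace transformer models, the attention tensors can be obtained by passing output_attentions=True and indexing the example of interest.

    import torch

    def mask_suspected_triggers(input_ids: torch.Tensor,
                                attentions: torch.Tensor,
                                mask_token_id: int,
                                k: int = 2) -> torch.Tensor:
        """input_ids:  [seq_len] token ids of one example.
        attentions:   [layers, heads, seq_len, seq_len] self-attention weights.
        Returns a copy of input_ids with the k most-attended tokens replaced by [MASK]."""
        # Attention mass each token *receives*, averaged over layers and heads.
        received = attentions.mean(dim=(0, 1)).sum(dim=0)   # [seq_len]
        received[0] = 0.0                                    # keep [CLS] intact
        received[-1] = 0.0                                   # keep [SEP] intact
        suspect = torch.topk(received, k).indices
        cleaned = input_ids.clone()
        cleaned[suspect] = mask_token_id
        return cleaned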
Koala: An Index for Quantifying Overlaps with Pre-training Corpora
In recent years, increasing attention has been paid to probing the role of
pre-training data in the downstream behaviour of Large Language Models (LLMs).
Despite its importance, there is no public tool that supports such analysis of
pre-training corpora at large scale. To help research in this space, we launch
Koala, a searchable index over large pre-training corpora based on compressed
suffix arrays, which offer a highly efficient compression rate and fast search
support. In its first release, we index the public portion of the OPT 175B
pre-training data. Koala provides a framework for forensic analysis of current
and future benchmarks, as well as for assessing the degree of memorization in
LLM outputs. Koala is available for public use at
https://koala-index.erc.monash.edu/.
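For intuition only (this is not Koala's implementation, which relies on compressed suffix arrays at far larger scale), a toy suffix array makes the overlap query concrete: the suffixes of a corpus are sorted once, after which any substring lookup, such as checking whether a benchmark sentence appears verbatim in pre-training text, is a binary search.

    def build_suffix_array(text: str) -> list:
        """Indices of all suffixes of `text`, sorted lexicographically (quadratic toy build)."""
        return sorted(range(len(text)), key=lambda i: text[i:])

    def contains(text: str, sa: list, query: str) -> bool:
        """True if `query` occurs in `text`, via binary search over the sorted suffixes."""
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            if text[sa[mid]:sa[mid] + len(query)] < query:
                lo = mid + 1
            else:
                hi = mid
        return lo < len(sa) and text[sa[lo]:sa[lo] + len(query)] == query

    corpus = "the cat sat on the mat. the dog sat on the log."
    sa = build_suffix_array(corpus)
    print(contains(corpus, sa, "sat on the"))   # True: the query span overlaps the corpus
    print(contains(corpus, sa, "sat on a"))     # False: no verbatim overlap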