9 research outputs found

The Ethical Need for Watermarks in Machine-Generated Language

Authors: Adomaitis Laurynas, Grinbaum Alexei
Publication date: 01/01/2022

Watermarks should be introduced in the natural language outputs of AI systems in order to maintain the distinction between human and machine-generated text. The ethical imperative to not blur this distinction arises from the asemantic nature of large language models and from human projections of emotional and cognitive states on machines, possibly leading to manipulation, spreading falsehoods or emotional distress. Enforcing this distinction requires unintrusive, yet easily accessible marks of the machine origin. We propose to implement a code based on equidistant letter sequences. While no such code exists in human-written texts, its appearance in machine-generated ones would prove helpful for ethical reasons.

arXiv.org e-Print Archive

HAL-CEA

Extracted BERT Model Leaks More Information than You Think!

Authors: Chen C, He X, Lyu L, Xu Q
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/12/2022

The collection and availability of big data, combined with advances in pre-trained models (e.g. BERT), have revolutionized the predictive performance of natural language processing tasks. This allows corporations to provide machine learning as a service (MLaaS) by encapsulating fine-tuned BERT-based models as APIs. Due to significant commercial interest, there has been a surge of attempts to steal remote services via model extraction. Although previous works have made progress in defending against model extraction attacks, there has been little discussion on their performance in preventing privacy leakage. This work bridges this gap by launching an attribute inference attack against the extracted BERT model. Our extensive experiments reveal that model extraction can cause severe privacy leakage even when victim models are facilitated with advanced defensive strategies.

UCL Discovery

Single-Node Attack for Fooling Graph Neural Networks

Authors: Alon Uri, Baskin Chaim, Finkelshtein Ben, Zheltonozhskii Evgenii
Publication date: 06/11/2020

Graph neural networks (GNNs) have shown broad applicability in a variety of domains. Some of these domains, such as social networks and product recommendations, are fertile ground for malicious users and behavior. In this paper, we show that GNNs are vulnerable to the extremely limited scenario of a single-node adversarial example, where the node cannot be picked by the attacker. That is, an attacker can force the GNN to classify any target node to a chosen label by only slightly perturbing another single arbitrary node in the graph, even when not being able to pick that specific attacker node. When the adversary is allowed to pick a specific attacker node, the attack is even more effective. We show that this attack is effective across various GNN types, such as GraphSAGE, GCN, GAT, and GIN, across a variety of real-world datasets, and as a targeted and a non-targeted attack. Our code is available at https://github.com/benfinkelshtein/SINGLE

arXiv.org e-Print Archive

Red Teaming Language Model Detectors with Language Models

Authors: Chang Kai-Wei, Chen Xiangning, Hsieh Cho-Jui, Shi Zhouxing, Wang Yihan, Yin Fan
Publication date: 19/10/2023

The prevalence and strong capability of large language models (LLMs) present significant safety and ethical risks if exploited by malicious users. To prevent the potentially deceptive usage of LLMs, recent works have proposed algorithms to detect LLM-generated text and protect LLMs. In this paper, we investigate the robustness and reliability of these LLM detectors under adversarial attacks. We study two types of attack strategies: 1) replacing certain words in an LLM's output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation. In both strategies, we leverage an auxiliary LLM to generate the word replacements or the instructional prompt. Different from previous works, we consider a challenging setting where the auxiliary LLM can also be protected by a detector. Experiments reveal that our attacks effectively compromise the performance of all detectors in the study with plausible generations, underscoring the urgent need to improve the robustness of LLM-generated text detection systems. Comment: Preprint. Accepted by TAC

arXiv.org e-Print Archive

Recommended from our members

CROSS: a framework for cyber risk optimisation in smart homes

Authors: Loukas George, Malacaria Pasquale, Panaousis Emmanouil, Zhang Yunxiao
Publication venue: Elsevier BV
Publication date: 05/04/2023

This work introduces a decision support framework, called Cyber Risk Optimiser for Smart homeS (CROSS), which advises both smart home users and smart home service providers on how to select an optimal portfolio of cyber security controls to counteract cyber attacks in a smart home including traditional cyber attacks and adversarial machine learning attacks. CROSS is based on a multi-objective bi-level two-stage optimisation. In stage-one optimisation, the problem is modelled as a multi-leader-follower game that considers both security and economic objectives, where the provider selects a security portfolio to protect both itself and its users, while rational attackers target the weakest path. Stage-two optimisation is a Stackelberg security game that focuses on additional user security controls under the remit of smart home users. While CROSS can potentially be applied to other similar use cases, in this paper, our aim is to address threats against artificial intelligence (AI) applications as the use of AI in smart Internet of Things (IoT) devices introduces new cyber threats to home environments. Specifically, we have implemented and assessed CROSS in a smart heating use case in a prototypical AI-enabled IoT environment that combines characteristics and vulnerabilities currently present on existing commercial off-the-shelf (COTS) devices, demonstrating the selection of optimal decisions.

Greenwich Academic Literature Archive

Queen Mary Research Online