Multi-modal embeddings encode images, sounds, texts, videos, etc. into a
single embedding space, aligning representations across modalities (e.g.,
associating an image of a dog with a barking sound). We show that multi-modal
embeddings can be vulnerable to an attack we call "adversarial illusions."
Given an image or a sound, an adversary can perturb it so as to make its
embedding close to an arbitrary, adversary-chosen input in another modality.
This enables the adversary to align any image and any sound with any text.
Adversarial illusions exploit proximity in the embedding space and are thus
agnostic to downstream tasks. Using ImageBind embeddings, we demonstrate how
adversarially aligned inputs, generated without knowledge of specific
downstream tasks, mislead image generation, text generation, and zero-shot
classification.
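
To make the attack concrete, the following is a minimal sketch of the core idea: perturb an input within a small L-infinity budget so that its embedding maximizes cosine similarity with an adversary-chosen target embedding from another modality. The names here are illustrative assumptions, not the paper's code: `image_encoder` stands in for the image branch of a multi-modal model such as ImageBind, and `target_emb` stands in for the embedding of the adversary-chosen text or audio.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in encoder; in practice this would be a pretrained multi-modal encoder
# (e.g., ImageBind's image branch) producing embeddings in the shared space.
image_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 512))

def adversarial_illusion(image, target_emb, eps=8 / 255, step=1 / 255, iters=100):
    """PGD-style perturbation of `image` (inside an L-inf ball of radius eps)
    that pulls its embedding toward `target_emb`."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(iters):
        emb = image_encoder(image + delta)
        # Minimize negative cosine similarity, i.e., maximize alignment
        # between the perturbed input and the adversary-chosen target.
        loss = -F.cosine_similarity(emb, target_emb).mean()
        loss.backward()
        with torch.no_grad():
            delta -= step * delta.grad.sign()                 # signed gradient step
            delta.clamp_(-eps, eps)                           # stay inside the eps-ball
            delta.copy_((image + delta).clamp(0, 1) - image)  # keep pixels valid
        delta.grad.zero_()
    return (image + delta).detach()

# Usage sketch: align a random "image" with an arbitrary target embedding
# (e.g., the embedding of adversary-chosen text).
image = torch.rand(1, 3, 224, 224)
target_emb = torch.randn(1, 512)
adv_image = adversarial_illusion(image, target_emb)
```

Because the objective is defined purely by proximity in the embedding space, the same perturbed input can then be fed to any downstream model that consumes these embeddings, which is what makes the attack task-agnostic.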