70 research outputs found
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models
Watermarking generative models consists of planting a statistical signal
(watermark) in a model's output so that it can be later verified that the
output was generated by the given model. A strong watermarking scheme satisfies
the property that a computationally bounded attacker cannot erase the watermark
without causing significant quality degradation. In this paper, we study the
(im)possibility of strong watermarking schemes. We prove that, under
well-specified and natural assumptions, strong watermarking is impossible to
achieve. This holds even in the private detection algorithm setting, where the
watermark insertion and detection algorithms share a secret key, unknown to the
attacker. To prove this result, we introduce a generic efficient watermark
attack; the attacker is not required to know the private key of the scheme or
even which scheme is used. Our attack is based on two assumptions: (1) The
attacker has access to a "quality oracle" that can evaluate whether a candidate
output is a high-quality response to a prompt, and (2) The attacker has access
to a "perturbation oracle" which can modify an output with a nontrivial
probability of maintaining quality, and which induces an efficiently mixing
random walk on high-quality outputs. We argue that both assumptions can be
satisfied in practice by an attacker with weaker computational capabilities
than the watermarked model itself, to which the attacker has only black-box
access. Furthermore, our assumptions will likely only be easier to satisfy over
time as models grow in capabilities and modalities. We demonstrate the
feasibility of our attack by instantiating it to attack three existing
watermarking schemes for large language models: Kirchenbauer et al. (2023),
Kuditipudi et al. (2023), and Zhao et al. (2023). The same attack successfully
removes the watermarks planted by all three schemes, with only minor quality
degradation.Comment: Blog post:
https://www.harvard.edu/kempner-institute/2023/11/09/watermarking-in-the-sand
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models
Watermarking generative models consists of planting a statistical signal (watermark) in a model’s output so that it can be later verified that the output was generated by the given model. A strong watermarking scheme satisfies the property that a computationally bounded attacker cannot erase the watermark without causing significant quality degradation. In this paper, we study the (im)possibility of strong watermarking schemes. We prove that, under well-specified and natural assumptions, strong watermarking is impossible to achieve. This holds even in the private detection algorithm setting, where the watermark insertion and detection algorithms share a secret key, unknown to the attacker. To prove this result, we introduce a generic efficient watermark attack; the attacker is not required to know the private key of the scheme or even which scheme is used.
Our attack is based on two assumptions: (1) The attacker has access to a “quality oracle” that can evaluate whether a candidate output is a high-quality response to a prompt, and (2) The attacker has access to a “perturbation oracle” which can modify an output with a nontrivial probability of maintaining quality, and which induces an efficiently mixing random walk on high-quality outputs. We argue that both assumptions can be satisfied in practice by an attacker with weaker computational capabilities than the watermarked model itself, to which the attacker has only black-box access. Furthermore, our assumptions will likely only be easier to satisfy over time as models grow in capabilities and modalities.
We demonstrate the feasibility of our attack by instantiating it to attack three existing watermarking schemes for large language models: Kirchenbauer et al. (2023), Kuditipudi et al. (2023), and Zhao et al. (2023). The same attack successfully removes the watermarks planted by all three schemes, with only minor quality degradation
An Asymmetric Watermarking Method
Special Issue on Signal Processing for Data Hiding in Digital Media and Secure Content DeliveryThis article presents an asymmetric watermarking method as an alternative to classical Direct Sequence Spread Spectrum and Watermarking Costa Schemes techniques. This new method provides a higher security level against malicious attacks threatening watermarking techniques used for a copy protection purpose. This application, which is quite different from the classical copyright enforcement issue, is extremely challenging as no public algorithm is so far known to be secure enough and some proposed proprietary techniques have been already hacked. Our method is thus a try towards the proof that the Kerckhoffs principle can be stated in the copy protection framework
Data Hiding and Its Applications
Data hiding techniques have been widely used to provide copyright protection, data integrity, covert communication, non-repudiation, and authentication, among other applications. In the context of the increased dissemination and distribution of multimedia content over the internet, data hiding methods, such as digital watermarking and steganography, are becoming increasingly relevant in providing multimedia security. The goal of this book is to focus on the improvement of data hiding algorithms and their different applications (both traditional and emerging), bringing together researchers and practitioners from different research fields, including data hiding, signal processing, cryptography, and information theory, among others
Information Analysis for Steganography and Steganalysis in 3D Polygonal Meshes
Information hiding, which embeds a watermark/message over a cover signal, has recently found extensive applications in, for example, copyright protection, content authentication and covert communication. It has been widely considered as an appealing technology to complement conventional cryptographic processes in the field of multimedia security by embedding information into the signal being protected. Generally, information hiding can be classified into two categories: steganography and watermarking. While steganography attempts to embed as much information as possible into a cover signal, watermarking tries to emphasize the robustness of the embedded information at the expense of embedding capacity.
In contrast to information hiding, steganalysis aims at detecting whether a given medium has hidden message in it, and, if possible, recover that hidden message. It can be used to measure the security performance of information hiding techniques, meaning a steganalysis resistant steganographic/watermarking method should be imperceptible not only to Human Vision Systems (HVS), but also to intelligent analysis.
As yet, 3D information hiding and steganalysis has received relatively less attention compared to image information hiding, despite the proliferation of 3D computer graphics models which are fairly promising information carriers. This thesis focuses on this relatively neglected research area and has the following primary objectives: 1) to investigate the trade-off between embedding capacity and distortion by considering the correlation between spatial and normal/curvature noise in triangle meshes; 2) to design satisfactory 3D steganographic algorithms, taking into account this trade-off; 3) to design robust 3D watermarking algorithms; 4) to propose a steganalysis framework for detecting the existence of the hidden information in 3D models and introduce a universal 3D steganalytic method under this framework. %and demonstrate the performance of the proposed steganalysis by testing it against six well-known 3D steganographic/watermarking methods.
The thesis is organized as follows. Chapter 1 describes in detail the background relating to information hiding and steganalysis, as well as the research problems this thesis will be studying. Chapter 2 conducts a survey on the previous information hiding techniques for digital images, 3D models and other medium and also on image steganalysis algorithms.
Motivated by the observation that the knowledge of the spatial accuracy of the mesh vertices does not easily translate into information related to the accuracy of other visually important mesh attributes such as normals, Chapters 3 and 4 investigate the impact of modifying vertex coordinates of 3D triangle models on the mesh normals. Chapter 3 presents the results of an empirical investigation, whereas Chapter 4 presents the results of a theoretical study. Based on these results, a high-capacity 3D steganographic algorithm capable of controlling embedding distortion is also presented in Chapter 4.
In addition to normal information, several mesh interrogation, processing and rendering algorithms make direct or indirect use of curvature information. Motivated by this, Chapter 5 studies the relation between Discrete Gaussian Curvature (DGC) degradation and vertex coordinate modifications.
Chapter 6 proposes a robust watermarking algorithm for 3D polygonal models, based on modifying the histogram of the distances from the model vertices to a point in 3D space. That point is determined by applying Principal Component Analysis (PCA) to the cover model. The use of PCA makes the watermarking method robust against common 3D operations, such as rotation, translation and vertex reordering. In addition, Chapter 6 develops a 3D specific steganalytic algorithm to detect the existence of the hidden messages embedded by one well-known watermarking method. By contrast, the focus of Chapter 7 will be on developing a 3D watermarking algorithm that is resistant to mesh editing or deformation attacks that change the global shape of the mesh.
By adopting a framework which has been successfully developed for image steganalysis, Chapter 8 designs a 3D steganalysis method to detect the existence of messages hidden in 3D models with existing steganographic and watermarking algorithms. The efficiency of this steganalytic algorithm has been evaluated on five state-of-the-art 3D watermarking/steganographic methods. Moreover, being a universal steganalytic algorithm can be used as a benchmark for measuring the anti-steganalysis performance of other existing and most importantly future watermarking/steganographic algorithms.
Chapter 9 concludes this thesis and also suggests some potential directions for future work
Collusion-resistant fingerprinting for multimedia in a broadcast channel environment
Digital fingerprinting is a method by which a copyright owner can uniquely
embed a buyer-dependent, inconspicuous serial number (representing the fingerprint)
into every copy of digital data that is legally sold. The buyer of a legal copy is
then deterred from distributing further copies, because the unique fingerprint can be
used to trace back the origin of the piracy. The major challenge in fingerprinting is
collusion, an attack in which a coalition of pirates compare several of their uniquely
fingerprinted copies for the purpose of detecting and removing the fingerprints.
The objectives of this work are two-fold. First, we investigate the need for robustness
against large coalitions of pirates by introducing the concept of a malicious
distributor that has been overlooked in prior work. A novel fingerprinting code that
has superior codeword length in comparison to existing work under this novel malicious
distributor scenario is developed. In addition, ideas presented in the proposed
fingerprinting design can easily be applied to existing fingerprinting schemes, making
them more robust to collusion attacks.
Second, a new framework termed Joint Source Fingerprinting that integrates the
processes of watermarking and codebook design is introduced. The need for this new
paradigm is motivated by the fact that existing fingerprinting methods result in a
perceptually undistorted multimedia after collusion is applied. In contrast, the new
paradigm equates the process of collusion amongst a coalition of pirates, to degrading
the perceptual characteristics, and hence commercial value of the multimedia in question.
Thus by enforcing that the process of collusion diminishes the commercial value
of the content, the pirates are deterred from attacking the fingerprints. A fingerprinting
algorithm for video as well as an efficient means of broadcasting or distributing
fingerprinted video is also presented. Simulation results are provided to verify our
theoretical and empirical observations
Digital watermarking methods for data security and authentication
Philosophiae Doctor - PhDCryptology is the study of systems that typically originate from a consideration of the ideal circumstances under which secure information exchange is to take place. It involves the study of cryptographic and other processes that might be introduced for breaking the output of such systems - cryptanalysis. This includes the introduction of formal mathematical methods for the design of a cryptosystem and for estimating its theoretical level of securit
Multi-algorithmic Cryptography using Deterministic Chaos with Applications to Mobile Communications
In this extended paper, we present an overview of the principal issues associated with cryptography, providing historically significant examples for illustrative purposes as part of a short tutorial for readers that are not familiar with the subject matter. This is used to introduce the role that nonlinear dynamics and chaos play in the design of encryption engines which utilize different types of Iteration Function Systems (IFS). The design of such encryption engines requires that they conform to the principles associated with diffusion and confusion for generating ciphers that are of a maximum entropy type. For this reason, the role of confusion and diffusion in cryptography is discussed giving a design guide to the construction of ciphers that are based on the use of IFS. We then present the background and operating framework associated with a new product - CrypsticTM - which is based on the application of multi-algorithmic IFS to design encryption engines mounted on a USB memory stick using both disinformation and obfuscation to ‘hide’ a forensically inert application. The protocols and procedures associated with the use of this product are also briefly discussed
ИНТЕЛЛЕКТУАЛЬНЫЙ числовым программным ДЛЯ MIMD-компьютер
For most scientific and engineering problems simulated on computers the solving of problems of the computational mathematics with approximately given initial data constitutes an intermediate or a final stage. Basic problems of the computational mathematics include the investigating and solving of linear algebraic systems, evaluating of eigenvalues and eigenvectors of matrices, the solving of systems of non-linear equations, numerical integration of initial- value problems for systems of ordinary differential equations.Для більшості наукових та інженерних задач моделювання на ЕОМ рішення задач обчислювальної математики з наближено заданими вихідними даними складає проміжний або остаточний етап. Основні проблеми обчислювальної математики відносяться дослідження і рішення лінійних алгебраїчних систем оцінки власних значень і власних векторів матриць, рішення систем нелінійних рівнянь, чисельного інтегрування початково задач для систем звичайних диференціальних рівнянь.Для большинства научных и инженерных задач моделирования на ЭВМ решение задач вычислительной математики с приближенно заданным исходным данным составляет промежуточный или окончательный этап. Основные проблемы вычислительной математики относятся исследования и решения линейных алгебраических систем оценки собственных значений и собственных векторов матриц, решение систем нелинейных уравнений, численного интегрирования начально задач для систем обыкновенных дифференциальных уравнений
- …