thesis

Information theoretic analysis of watermarking systems

Abstract

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001.Includes bibliographical references (p. 185-193).This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Watermarking models a copyright protection mechanism where an original data sequence is modified before distribution to the public in order to embed some extra information. The embedding should be transparent (i.e., the modified data should be similar to the original data) and robust (i.e., the information should be recoverable even if the data is modified further). In this thesis, we describe the information-theoretic capacity of such a system as a function of the statistics of the data to be watermarked and the desired level of transparency and robustness. That is, we view watermarking from a communication perspective and describe the maximum bit-rate that can be reliably transmitted from encoder to decoder. We make the conservative assumption that there is a malicious attacker who knows how the watermarking system works and who attempts to design a forgery that is similar to the original data but that does not contain the watermark. Conversely, the watermarking system must meet its performance criteria for any feasible attacker and would like to force the attacker to effectively destroy the data in order to remove the watermark. Watermarking can thus be viewed as a dynamic game between these two players who are trying to minimize and maximize, respectively, the amount of information that can be reliably embedded. We compute the capacity for several scenarios, focusing largely on Gaussian data and a squared difference similarity measure.(cont.) In contrast to many suggested watermarking techniques that view the original data as interference, we find that the capacity increases with the uncertainty in the original data. Indeed, we find that out of all distributions with the same variance, a Gaussian distribution on the original data results in the highest capacity. Furthermore, for Gaussian data, the capacity increases with its variance. One surprising result is that with Gaussian data the capacity does not increase if the original data can be used to decode the watermark. This is reminiscent of a similar model, Costa's "writing on dirty paper", in which the attacker simply adds independent Gaussian noise. Unlike with a more sophisticated attacker, we show that the capacity does not change for Costa's model if the original data is not Gaussian.by Aaron Seth Cohen.Ph.D

    Similar works