7 research outputs found

    Joint Compression and Digital Watermarking: Information-Theoretic Study and Algorithms Development

    In digital watermarking, a watermark is embedded into a covertext in such a way that the resulting watermarked signal is robust to certain distortion caused by either standard data processing in a friendly environment or malicious attacks in an unfriendly environment. The watermarked signal can then be used for purposes ranging from copyright protection, data authentication, and fingerprinting to information hiding. In this thesis, digital watermarking is investigated from both an information-theoretic viewpoint and a numerical computation viewpoint. From the information-theoretic viewpoint, we first study a new digital watermarking scenario, in which watermarks and covertexts are generated from a joint memoryless watermark and covertext source. The configuration of this scenario is different from that treated in existing digital watermarking works, where watermarks are assumed independent of covertexts. In the case of public watermarking, where the covertext is not accessible to the watermark decoder, a necessary and sufficient condition is determined under which the watermark can be fully recovered with high probability at the end of watermark decoding after the watermarked signal is disturbed by a fixed memoryless attack channel. Moreover, by using similar techniques, a combined source coding and Gel'fand-Pinsker channel coding theorem is established, and an open problem proposed recently by Cox et al. is solved. Interestingly, from the necessary and sufficient condition we can show that, in light of the correlation between the watermark and covertext, watermarks can still be fully recovered with high probability even if the entropy of the watermark source is strictly above the standard public watermarking capacity. We then extend the above watermarking scenario to a case of joint compression and watermarking, where the watermark and covertext are correlated, and the watermarked signal has to be further compressed.
Given an additional constraint on the compression rate of the watermarked signals, a necessary and sufficient condition is again determined under which the watermark can be fully recovered with high probability at the end of public watermark decoding after the watermarked signal is disturbed by a fixed memoryless attack channel. The above two joint compression and watermarking models are further investigated under a less stringent environment where the reproduced watermark at the end of decoding is allowed to be within a certain distortion of the original watermark. Sufficient conditions are determined in both cases, under which the original watermark can be reproduced with distortion less than a given distortion level after the watermarked signal is disturbed by a fixed memoryless attack channel and the covertext is not available to the watermark decoder. Watermarking capacities and joint compression and watermarking rate regions are often characterized and/or presented as optimization problems in information-theoretic research. However, this does not mean that they can be calculated easily. In this thesis we first derive closed forms of watermarking capacities of private Laplacian watermarking systems with the magnitude-error distortion measure under a fixed additive Laplacian attack and a fixed arbitrary additive attack, respectively. Then, based on the idea of the Blahut-Arimoto algorithm for computing channel capacities and rate distortion functions, two iterative algorithms are proposed for calculating private watermarking capacities and compression and watermarking rate regions of joint compression and private watermarking systems with finite alphabets. Finally, iterative algorithms are developed for calculating public watermarking capacities and compression and watermarking rate regions of joint compression and public watermarking systems with finite alphabets, based on the Blahut-Arimoto algorithm and Shannon's strategy.
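For orientation, the classical Blahut-Arimoto iteration that the abstract builds on can be sketched as follows. This is a minimal illustration for the capacity of an ordinary discrete memoryless channel only; it is not the thesis's watermarking-capacity or rate-region algorithm, and the function name `blahut_arimoto`, its parameters, and the example channels are assumptions made for this sketch.

```python
import math


def blahut_arimoto(W, tol=1e-9, max_iter=10_000):
    """Approximate the capacity (in bits) of a discrete memoryless channel.

    W[x][y] is the transition probability P(y | x); every row must sum to 1.
    Returns (capacity_estimate, input_distribution). Hypothetical helper,
    written for illustration only.
    """
    n_in, n_out = len(W), len(W[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * W[x][y] for x in range(n_in)) for y in range(n_out)]
        # D[x] = KL divergence (in nats) between row x of W and q.
        D = [
            sum(W[x][y] * math.log(W[x][y] / q[y])
                for y in range(n_out) if W[x][y] > 0)
            for x in range(n_in)
        ]
        weights = [p[x] * math.exp(D[x]) for x in range(n_in)]
        total = sum(weights)
        # Standard lower and upper bounds on capacity (in nats).
        lower, upper = math.log(total), max(D)
        # Multiplicative update of the input distribution.
        p = [w / total for w in weights]
        if upper - lower < tol:
            break
    return lower / math.log(2), p


# Example: binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
capacity_bsc, p_bsc = blahut_arimoto(bsc)
```

The loop stops once the gap between the two capacity bounds falls below `tol`, so the returned estimate is a lower bound that is within `tol` nats of the true capacity whenever the iteration converges before `max_iter`.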

    Pseudorandom Error-Correcting Codes

    We construct pseudorandom error-correcting codes (or simply pseudorandom codes), which are error-correcting codes with the property that any polynomial number of codewords are pseudorandom to any computationally-bounded adversary. Efficient decoding of corrupted codewords is possible with the help of a decoding key. We build pseudorandom codes that are robust to substitution and deletion errors, where pseudorandomness rests on standard cryptographic assumptions. Specifically, pseudorandomness is based on either $2^{O(\sqrt{n})}$-hardness of LPN, or polynomial hardness of LPN and the planted XOR problem at low density. As our primary application of pseudorandom codes, we present an undetectable watermarking scheme for outputs of language models that is robust to cropping and a constant rate of random substitutions and deletions. The watermark is undetectable in the sense that any number of samples of watermarked text are computationally indistinguishable from text output by the original model. This is the first undetectable watermarking scheme that can tolerate a constant rate of errors. Our second application is to steganography, where a secret message is hidden in innocent-looking content. We present a constant-rate stateless steganography scheme with robustness to a constant rate of substitutions. Ours is the first stateless steganography scheme with provable steganographic security and any robustness to errors.

    Information theoretic analysis of watermarking systems

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (p. 185-193). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Watermarking models a copyright protection mechanism where an original data sequence is modified before distribution to the public in order to embed some extra information. The embedding should be transparent (i.e., the modified data should be similar to the original data) and robust (i.e., the information should be recoverable even if the data is modified further). In this thesis, we describe the information-theoretic capacity of such a system as a function of the statistics of the data to be watermarked and the desired level of transparency and robustness. That is, we view watermarking from a communication perspective and describe the maximum bit-rate that can be reliably transmitted from encoder to decoder. We make the conservative assumption that there is a malicious attacker who knows how the watermarking system works and who attempts to design a forgery that is similar to the original data but that does not contain the watermark. Conversely, the watermarking system must meet its performance criteria for any feasible attacker and would like to force the attacker to effectively destroy the data in order to remove the watermark. Watermarking can thus be viewed as a dynamic game between these two players who are trying to minimize and maximize, respectively, the amount of information that can be reliably embedded. We compute the capacity for several scenarios, focusing largely on Gaussian data and a squared difference similarity measure. In contrast to many suggested watermarking techniques that view the original data as interference, we find that the capacity increases with the uncertainty in the original data.
Indeed, we find that out of all distributions with the same variance, a Gaussian distribution on the original data results in the highest capacity. Furthermore, for Gaussian data, the capacity increases with its variance. One surprising result is that with Gaussian data the capacity does not increase if the original data can be used to decode the watermark. This is reminiscent of a similar model, Costa's "writing on dirty paper", in which the attacker simply adds independent Gaussian noise. Unlike with a more sophisticated attacker, we show that the capacity does not change for Costa's model if the original data is not Gaussian. by Aaron Seth Cohen. Ph.D.
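For reference, the classical result behind Costa's "writing on dirty paper" model can be stated as below. The symbols are the usual textbook notation and are assumptions here, not taken from the abstract: $S$ is the Gaussian interference known to the encoder but not the decoder, $Z$ is the independent Gaussian noise, and $P$, $Q$, $N$ are the respective powers.

```latex
\begin{align}
  Y &= X + S + Z, \qquad S \sim \mathcal{N}(0, Q), \quad Z \sim \mathcal{N}(0, N),
      \quad \frac{1}{n}\sum_{i=1}^{n} X_i^2 \le P, \\
  C &= \frac{1}{2}\log_2\!\left(1 + \frac{P}{N}\right).
\end{align}
```

The capacity does not depend on the interference power $Q$: it equals the capacity of the same channel with $S$ absent, which is the sense in which knowing the original data at the decoder brings no gain.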

    On Information Embedding When Watermarks and Covertexts are Correlated


    On Information Embedding When Watermarks and Covertexts Are Correlated


    On Joint Compression and Information Embedding When Watermarks and Covertexts Are Correlated
