20 research outputs found

    Information hiding in printed documents

    In today's digital world securing different forms of content is very important in terms of protecting copyright and verifying authenticity. One example is watermarking of digital audio and images. We believe that a marking scheme analogous to digital watermarking but for documents is very important. There currently exist techniques to secure documents such as bank notes using paper watermarks, security fibers, holograms, or special inks. There are a number of applications in which it is desirable to be able to identify the technology, manufacturer, model, or specific unit that was used to print a given document even if the printer in question does not make use of these existing security devices to explicitly identify itself. It would be useful to achieve the same or a better level of protection without the use of any additional devices or technologies. Two strategies are proposed for printer identification based upon examination of a printed document. The first strategy is passive. It involves characterization of the printer by finding features in the printed document that are intrinsic to that particular printer, model, or manufacturer's products. The second strategy is active. It involves the embedding of an extrinsic signature into a printed page. This signature can be generated by modulating the process parameters of the printer mechanism to encode identifying information such as the printer serial number and date of printing. It is shown that good separation between printers is achievable using gray-level co-occurrence based texture features obtained from text documents. Experiments using ten printers and a support vector machine classifier show very low classification error even between printers with the same electromechanical structure. The technique is also shown to work for various font sizes, font types, paper types, and printer age. 
The features are observed to migrate with the age of the consumables, indicating that it may be possible to estimate the age of the consumables at the time of printing. In addition, the intrinsic nature of the features makes it difficult to obscure or remove them without physically modifying the printer itself. By combining texture features and banding features, it is possible to identify a printer under several attack scenarios. A coding technique for embedding extrinsic signatures in text documents is presented. Both time and frequency domain signaling and detection schemes are investigated. It is shown that better performance is achieved using a time domain signaling scheme with a correlation detector due to the limited length of text character edges. It is also shown that by treating the document as a communication channel, a coding technique allowing approximately 3600 bits in a full page of 12 point text is achievable with a 7.74% bit error rate. Using the data hiding technique described above, a counterfeit and tamper detection method based on combinatorial group testing is developed and investigated. The low error rate achievable by the data hiding system allows reliable determination of document authenticity and the location of tampered data within a document. From results of previous work, a printer dot model is proposed to simulate the printing of cluster-dot halftone patterns. It has been shown that the original parameters chosen for that model do not adequately represent vertical edges in saturated regions such as text. Estimating the parameters by minimizing the error between the simulated and experimental edge profiles and edge sharpness for both the left and right edges provides values that more accurately represent the actual edge with and without embedded signals.
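
The gray-level co-occurrence texture features mentioned in this abstract can be illustrated with a minimal sketch. The pixel offset, the two features shown (energy and contrast), and the function names `glcm` and `texture_features` are assumptions made for illustration; the abstract does not say which offsets or features the authors used.

```python
from collections import Counter


def glcm(image, dr=1, dc=0):
    """Normalised gray-level co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels. The offset (dr, dc)
    is an illustrative choice, not a value taken from the abstract.
    Returns a dict mapping (level_a, level_b) to its relative frequency.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """Two common co-occurrence statistics (hypothetical feature set)."""
    energy = sum(v * v for v in p.values())
    contrast = sum((i - j) ** 2 * v for (i, j), v in p.items())
    return {"energy": energy, "contrast": contrast}


# Tiny example: a 2x2 patch whose rows differ by one gray level.
patch = [[0, 0],
         [1, 1]]
features = texture_features(glcm(patch))
```

In a setup like the one the abstract describes, feature vectors of this kind, computed from scanned text, would be passed to a classifier such as a support vector machine.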

    High-Capacity Data Hiding in Text Documents

    In today's digital world securing different forms of content is very important in terms of protecting copyright and verifying authenticity. One example is watermarking of digital audio and images. We believe that a marking scheme analogous to digital watermarking but for documents is very important. In this paper we describe the use of laser amplitude modulation in electrophotographic printers to embed information in a text document. In particular we describe an embedding and detection process which has the capability to embed 14 bits into characters that have a left vertical edge. For a typical 12 point document this translates to approximately 12000 bits per page.

    Data Hiding Capacity and Embedding Techniques for Printed Text Documents

    In previous publications we have demonstrated the use of laser intensity modulation to embed information in halftone and text documents. In those experiments we were able to embed and correctly decode 33 bits in a 12 point page of printed text. In this paper we will present our current work on developing a channel model for a text document. This model will allow us to define capacity bounds for the channel and to better understand the modulation and detection techniques that can be used to reach that capacity.

    Scanner Identification Using Feature-Based Processing and Analysis (IEEE Transactions on Information Forensics and Security)

    Digital images can be obtained through a variety of sources including digital cameras and scanners. In many cases the ability to determine the source of a digital image is important. This paper presents methods for authenticating images that have been acquired using flatbed desktop scanners. These methods use scanner fingerprints based on statistics of imaging sensor pattern noise. To capture different types of sensor noise, a denoising filterbank consisting of four different denoising filters is used for obtaining the noise patterns. To identify the source scanner, a support vector machine (SVM) classifier based on these fingerprints is used. These features are shown to achieve a high classification accuracy. Furthermore, the selected fingerprints based on statistical properties of the sensor noise are shown to be robust under post-processing operations such as JPEG compression, contrast stretching and sharpening.
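
The idea of a noise-based fingerprint can be sketched as follows: denoise the image, subtract the result from the original, and summarise the residual with simple statistics. The 3x3 averaging filter here is only a stand-in for one of the four denoising filters, which the abstract does not name, and the helper names and the two statistics are likewise assumptions.

```python
import statistics


def mean_filter(image):
    """3x3 averaging filter with edge replication (a stand-in denoiser)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            row.append(sum(window) / 9)
        out.append(row)
    return out


def noise_features(image):
    """Statistics of the residual (image minus its denoised version)."""
    denoised = mean_filter(image)
    residual = [
        pixel - smooth
        for row, smooth_row in zip(image, denoised)
        for pixel, smooth in zip(row, smooth_row)
    ]
    return {
        "mean": statistics.fmean(residual),
        "std": statistics.pstdev(residual),
    }


# A flat patch has no residual noise; a patch with one bright pixel does.
flat = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
spiky = [[7, 7, 7], [7, 16, 7], [7, 7, 7]]
flat_stats = noise_features(flat)
spiky_stats = noise_features(spiky)
```

A fuller version would repeat this for each filter in the filterbank and concatenate the statistics into the feature vector given to the SVM.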

    Channel Model and Operational Capacity Analysis of Printed Text Documents

    In today’s digital world securing different forms of content is very important in terms of protecting copyright and verifying authenticity. One example is watermarking of digital audio and images. We believe that a marking scheme analogous to digital watermarking but for documents is very important. In this paper we describe the use of laser amplitude modulation in electrophotographic printers to embed information in a text document. In particular we describe an embedding and detection process which allows the embedding of between 2 and 8 bits in a single line of text. For a typical 12 point document this translates to between 100 and 400 bits per page. We also perform an operational analysis to compare two decoding methods using different embedding densities.
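
The per-line and per-page figures in this abstract are consistent with roughly 50 lines of text per page. That line count is inferred from the stated numbers (100 / 2 = 400 / 8 = 50) and is not given in the abstract, so the short check below treats it as an assumption.

```python
# Assumed lines per page, inferred from the abstract's own figures.
ASSUMED_LINES_PER_PAGE = 50


def bits_per_page(bits_per_line, lines_per_page=ASSUMED_LINES_PER_PAGE):
    """Page capacity when every line carries the same number of bits."""
    return bits_per_line * lines_per_page


low = bits_per_page(2)   # lower embedding density quoted in the abstract
high = bits_per_page(8)  # upper embedding density quoted in the abstract
```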