50 research outputs found

    Natural Language Watermarking and Tamperproofing

    Two main results in the area of information hiding in natural language text are presented. A semantically-based scheme dramatically improves the information-hiding capacity of any text through two techniques: (i) modifying the granularity of meaning of individual sentences, whereas our own previous scheme kept the granularity fixed, and (ii) halving the number of sentences affected by the watermark. No longer a "long text, short watermark" approach, it now makes it possible to watermark short texts such as wire agency reports. Combining this semantic marking scheme with our previous syntactically-based method hides information in a way that reveals any non-trivial tampering with the text (re-formatting is not considered tampering; otherwise the problem would be solved trivially by hiding a hash of the text) with probability $1 - 2^{-\beta(n+1)}$, $n$ being its number of sentences and $\beta$ a small positive integer based on the extent of co-referencing

    Digital watermarking: a state-of-the-art review

    Digital watermarking is the art of embedding data, called a watermark, into a multimedia object such that the watermark can be detected or extracted later without impairing the object. Concealment of secret messages inside natural language, known as steganography, has existed since at least the 16th century. However, the increase in electronic and digital information transmission and distribution has spread watermarking from ordinary text to multimedia transmission. In this paper, we review various approaches and methods that have been used to conceal and preserve messages. Examples of real-world applications are also discussed. SANPAD, Telkom, Cisco, Aria Technologies, THRIP. Department of HE and Training approved lis

    Printed document integrity verification using barcode

    Printed documents are still relevant in our daily lives, and the information in them must be protected from threats and attacks such as forgery, falsification, or unauthorized modification. Such threats cause a document to lose its integrity and authenticity. Several techniques have been proposed and used to ensure the authenticity and originality of printed documents, but some are not suitable for public use because they are complex, expensive, or require special materials that are hard to obtain. This paper discusses several techniques for printed document security, such as watermarking and barcodes, as well as the usability of two-dimensional barcodes in document authentication and data compression with the barcode. A simple and efficient conceptual solution for securing a document's integrity and its sender's authenticity is proposed, which uses a two-dimensional barcode to carry integrity and authenticity information in the document. The information stored in the barcode contains a digital signature that provides the sender's authenticity and a hash value that ensures the integrity of the printed document
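The barcode payload described in this abstract (a hash for integrity plus a signature for sender authenticity) can be sketched roughly as follows. This is a minimal illustration, not the paper's construction: HMAC-SHA256 stands in for the digital signature because the Python standard library has no asymmetric signing, and the whitespace normalization, the field names, and the helper names `build_payload` and `verify_payload` are all assumptions. Encoding the payload string into an actual two-dimensional barcode is left out.

```python
import hashlib
import hmac
import json


def normalize(text: str) -> str:
    # Assumption: collapse whitespace so re-typed or scanned text hashes the same.
    return " ".join(text.split())


def build_payload(text: str, sender: str, key: bytes) -> str:
    """Return the string that would be encoded into the 2D barcode."""
    digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
    # HMAC is a stand-in for a real digital signature (hypothetical choice).
    tag = hmac.new(key, f"{sender}|{digest}".encode("utf-8"), hashlib.sha256).hexdigest()
    return json.dumps({"sender": sender, "hash": digest, "tag": tag}, sort_keys=True)


def verify_payload(text: str, payload: str, key: bytes) -> bool:
    """Check that the text matches the stored hash and the tag matches the sender."""
    fields = json.loads(payload)
    digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
    expected = hmac.new(
        key, f"{fields['sender']}|{digest}".encode("utf-8"), hashlib.sha256
    ).hexdigest()
    hash_ok = hmac.compare_digest(digest, fields["hash"])
    tag_ok = hmac.compare_digest(expected, fields["tag"])
    return hash_ok and tag_ok
```

With a shared key, a verifier would recompute the hash from the printed text and compare it with the barcode contents; a public-key signature would remove the need to share the key with verifiers.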

    Lime: Data Lineage in the Malicious Environment

    Intentional or unintentional leakage of confidential data is undoubtedly one of the most severe security threats that organizations face in the digital era. The threat now extends to our personal lives: a plethora of personal information is available to social networks and smartphone providers and is indirectly transferred to untrustworthy third-party and fourth-party applications. In this work, we present LIME, a generic data lineage framework for data flow across multiple entities that take two characteristic, principal roles (i.e., owner and consumer). We define the exact security guarantees required of such a data lineage mechanism for identifying a guilty entity, and identify the simplifying non-repudiation and honesty assumptions. We then develop and analyze a novel accountable data transfer protocol between two entities within a malicious environment by building upon oblivious transfer, robust watermarking, and signature primitives. Finally, we perform an experimental evaluation to demonstrate the practicality of our protocol

    Who Wrote this Code? Watermarking for Code Generation

    Large language models for code have recently shown remarkable performance in generating executable code. However, this rapid advancement has been accompanied by many legal and ethical concerns, such as code licensing issues, code plagiarism, and malware generation, making watermarking machine-generated code a very timely problem. Despite this pressing need, we find that existing watermarking and machine-generated text detection methods for LLMs fail to function properly on code generation tasks. Hence, in this work, we propose a new watermarking method, SWEET, that significantly improves upon previous approaches when watermarking machine-generated code. Our proposed method selectively applies watermarking only to tokens whose entropy exceeds a defined threshold. Experiments on code generation benchmarks show that our watermarked code has superior quality compared to code produced by the previous state-of-the-art LLM watermarking method. Furthermore, our watermarking method also outperforms DetectGPT on the task of machine-generated code detection
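The entropy-gated idea in this abstract can be illustrated with a small sketch. Only the gating step (watermark a position only when its next-token entropy exceeds a threshold) comes from the abstract. The green-list logit bias used here as the underlying watermark, the seeding by previous token, and the parameters `threshold`, `delta`, and `gamma` are assumptions for illustration and may differ from SWEET's actual procedure.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    # Shannon entropy in nats.
    return -sum(p * math.log(p) for p in probs if p > 0)


def green_list(prev_token, vocab_size, gamma=0.5):
    # Assumption: a pseudo-random "green" subset of the vocabulary, seeded by the previous token.
    rng = random.Random(prev_token)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def watermark_logits(logits, prev_token, threshold=1.0, delta=2.0):
    """Bias green-list logits only when the next-token entropy exceeds the threshold."""
    if entropy(softmax(logits)) <= threshold:
        return list(logits)  # low entropy: leave the distribution untouched
    greens = green_list(prev_token, len(logits))
    return [x + delta if i in greens else x for i, x in enumerate(logits)]
```

Positions where the model is nearly certain of the next token, which are common in code, are left unmodified, so the bias is spent only where several continuations are plausible.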

    DATA ACROSS IN MALIGNANT SECURITY GUARANTEES IN PUBLIC NETS

    In this work, we present LIME, a generic data lineage framework for data flow across multiple entities that take two characteristic, principal roles. In some instances the leaker can be identified through forensic techniques, but these are typically costly and do not always produce the desired results. LIME is a model for accountable data transfer across multiple entities. We define the participating parties and their inter-relationships, and provide a concrete instantiation of a data transfer protocol using a novel combination of oblivious transfer, robust watermarking, and digital signatures. We define the precise security guarantees required of such a data lineage mechanism for identifying the guilty entity, and identify the simplifying non-repudiation and honesty assumptions. We then develop and evaluate a novel accountable data transfer protocol between two entities in a malicious environment, building upon oblivious transfer, robust watermarking, and signature primitives. Finally, we perform an experimental evaluation to demonstrate the practicality of our protocol and apply our framework to the important data leakage scenarios of data outsourcing and social networks. In general, we consider LIME, our lineage framework for data transfer, to be a key step towards achieving accountability by design. The key benefit of the model is that it enforces accountability by design: it drives the system designer to consider possible data leakages and the corresponding accountability constraints at the design stage