290 research outputs found

    Lime: Data Lineage in the Malicious Environment

    Full text link
    Intentional or unintentional leakage of confidential data is undoubtedly one of the most severe security threats that organizations face in the digital era. The threat now extends to our personal lives: a plethora of personal information is available to social networks and smartphone providers and is indirectly transferred to untrustworthy third party and fourth party applications. In this work, we present a generic data lineage framework LIME for data flow across multiple entities that take two characteristic, principal roles (i.e., owner and consumer). We define the exact security guarantees required by such a data lineage mechanism toward identification of a guilty entity, and identify the simplifying non repudiation and honesty assumptions. We then develop and analyze a novel accountable data transfer protocol between two entities within a malicious environment by building upon oblivious transfer, robust watermarking, and signature primitives. Finally, we perform an experimental evaluation to demonstrate the practicality of our protocol

    Models and Algorithms for Graph Watermarking

    Full text link
    We introduce models and algorithmic foundations for graph watermarking. Our frameworks include security definitions and proofs, as well as characterizations when graph watermarking is algorithmically feasible, in spite of the fact that the general problem is NP-complete by simple reductions from the subgraph isomorphism or graph edit distance problems. In the digital watermarking of many types of files, an implicit step in the recovery of a watermark is the mapping of individual pieces of data, such as image pixels or movie frames, from one object to another. In graphs, this step corresponds to approximately matching vertices of one graph to another based on graph invariants such as vertex degree. Our approach is based on characterizing the feasibility of graph watermarking in terms of keygen, marking, and identification functions defined over graph families with known distributions. We demonstrate the strength of this approach with exemplary watermarking schemes for two random graph models, the classic Erd\H{o}s-R\'{e}nyi model and a random power-law graph model, both of which are used to model real-world networks

    Towards Provably Invisible Network Flow Fingerprints

    Full text link
    Network traffic analysis reveals important information even when messages are encrypted. We consider active traffic analysis via flow fingerprinting by invisibly embedding information into packet timings of flows. In particular, assume Alice wishes to embed fingerprints into flows of a set of network input links, whose packet timings are modeled by Poisson processes, without being detected by a watchful adversary Willie. Bob, who receives the set of fingerprinted flows after they pass through the network modeled as a collection of independent and parallel M/M/1M/M/1 queues, wishes to extract Alice's embedded fingerprints to infer the connection between input and output links of the network. We consider two scenarios: 1) Alice embeds fingerprints in all of the flows; 2) Alice embeds fingerprints in each flow independently with probability pp. Assuming that the flow rates are equal, we calculate the maximum number of flows in which Alice can invisibly embed fingerprints while having those fingerprints successfully decoded by Bob. Then, we extend the construction and analysis to the case where flow rates are distinct, and discuss the extension of the network model

    Authentication with Distortion Criteria

    Full text link
    In a variety of applications, there is a need to authenticate content that has experienced legitimate editing in addition to potential tampering attacks. We develop one formulation of this problem based on a strict notion of security, and characterize and interpret the associated information-theoretic performance limits. The results can be viewed as a natural generalization of classical approaches to traditional authentication. Additional insights into the structure of such systems and their behavior are obtained by further specializing the results to Bernoulli and Gaussian cases. The associated systems are shown to be substantially better in terms of performance and/or security than commonly advocated approaches based on data hiding and digital watermarking. Finally, the formulation is extended to obtain efficient layered authentication system constructions.Comment: 22 pages, 10 figure

    Publicly Detectable Watermarking for Language Models

    Full text link
    We construct the first provable watermarking scheme for language models with public detectability or verifiability: we use a private key for watermarking and a public key for watermark detection. Our protocol is the first watermarking scheme that does not embed a statistical signal in generated text. Rather, we directly embed a publicly-verifiable cryptographic signature using a form of rejection sampling. We show that our construction meets strong formal security guarantees and preserves many desirable properties found in schemes in the private-key watermarking setting. In particular, our watermarking scheme retains distortion-freeness and model agnosticity. We implement our scheme and make empirical measurements over open models in the 7B parameter range. Our experiments suggest that our watermarking scheme meets our formal claims while preserving text quality

    Watermarking for multimedia security using complex wavelets

    Get PDF
    This paper investigates the application of complex wavelet transforms to the field of digital data hiding. Complex wavelets offer improved directional selectivity and shift invariance over their discretely sampled counterparts allowing for better adaptation of watermark distortions to the host media. Two methods of deriving visual models for the watermarking system are adapted to the complex wavelet transforms and their performances are compared. To produce improved capacity a spread transform embedding algorithm is devised, this combines the robustness of spread spectrum methods with the high capacity of quantization based methods. Using established information theoretic methods, limits of watermark capacity are derived that demonstrate the superiority of complex wavelets over discretely sampled wavelets. Finally results for the algorithm against commonly used attacks demonstrate its robustness and the improved performance offered by complex wavelet transforms

    Hashing Based Software Watermarking for Source Code Files

    Get PDF
    Software is developed and delivered to clients as a routine part of software engineering life cycle . Software is quite an expensive entity. However various attacks are possible on software to make its illegal use. Different solutions are there to prevent piracy. Software watermarking embeds a watermark in the source code so that it is undetectable yet it proves the ownership of the developer. The technique has been tested for C++ source code files, however, it can be applicable on any other language. The proposed techniques scans the code for all possible constants, forms a hash sequence using MD5 algorithm that calculates the watermark and stores in Date & Watermark Value Repository (DWVR)
    • …
    corecore