715 research outputs found

    Fraternal Twins: Unifying Attacks on Machine Learning and Digital Watermarking

    Machine learning is increasingly used in security-critical applications, such as autonomous driving, face recognition and malware detection. Most learning methods, however, have not been designed with security in mind and thus are vulnerable to different types of attacks. This problem has motivated the research field of adversarial machine learning, which is concerned with attacking and defending learning methods. Concurrently, a different line of research has tackled a very similar problem: in digital watermarking, information is embedded in a signal in the presence of an adversary. As a consequence, this research field has also extensively studied techniques for attacking and defending watermarking methods. The two research communities have so far worked in parallel, unknowingly developing similar attack and defense strategies. This paper is a first effort to bring these communities together. To this end, we present a unified notation of black-box attacks against machine learning and watermarking that reveals the similarity of both settings. To demonstrate the efficacy of this unified view, we apply concepts from watermarking to machine learning and vice versa. We show that countermeasures from watermarking can mitigate recent model-extraction attacks and, similarly, that techniques for hardening machine learning can fend off oracle attacks against watermarks. Our work provides a conceptual link between two research fields and thereby opens novel directions for improving the security of both machine learning and digital watermarking.
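
    To make the shared black-box setting concrete: in both fields the adversary can only query an oracle (a deployed classifier or a watermark detector) and observe its outputs. A minimal sketch of the model-extraction side, fitting a surrogate from query/response pairs; all names and the toy linear oracle are illustrative, not the paper's notation:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def extract_surrogate(oracle, input_dim, n_queries=1000, seed=0):
        rng = np.random.default_rng(seed)
        X = rng.uniform(-1, 1, size=(n_queries, input_dim))  # probe inputs
        y = np.array([oracle(x) for x in X])                 # oracle's answers
        return LogisticRegression(max_iter=1000).fit(X, y)   # surrogate model

    # Example: "steal" a toy linear decision boundary through queries alone.
    secret = np.array([0.7, -1.2, 0.4])
    surrogate = extract_surrogate(lambda x: int(x @ secret > 0), input_dim=3)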

    BlackMarks: Blackbox Multibit Watermarking for Deep Neural Networks

    Deep Neural Networks have created a paradigm shift in our ability to comprehend raw data in various important fields ranging from computer vision and natural language processing to intelligence warfare and healthcare. While DNNs are increasingly deployed either in a white-box setting, where the model internals are publicly known, or a black-box setting, where only the model outputs are observable, a practical concern is protecting the models against Intellectual Property (IP) infringement. We propose BlackMarks, the first end-to-end multi-bit watermarking framework that is applicable in the black-box scenario. BlackMarks takes the pre-trained unmarked model and the owner's binary signature as inputs and outputs the corresponding marked model with a set of watermark keys. To do so, BlackMarks first designs a model-dependent encoding scheme that maps all possible classes in the task to bit '0' or bit '1' by clustering the output activations into two groups. Given the owner's watermark signature (a binary string), a set of key image-and-label pairs is designed using targeted adversarial attacks. The watermark (WM) is then embedded in the prediction behavior of the target DNN by fine-tuning the model with the generated WM key set. To extract the WM, the remote model is queried with the WM key images and the owner's signature is decoded from the corresponding predictions according to the designed encoding scheme. We perform a comprehensive evaluation of BlackMarks's performance on the MNIST, CIFAR10 and ImageNet datasets and corroborate its effectiveness and robustness. BlackMarks preserves the functionality of the original DNN and incurs a negligible WM embedding runtime overhead as low as 2.054%.
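
    A minimal sketch of the extraction step described above: the remote model is queried with the WM key images and the predicted classes are decoded back into bits via the class-to-bit map (which the paper builds by clustering output activations). Here query_model and class_to_bit are illustrative stand-ins, not the paper's code:

    import numpy as np

    def extract_signature(query_model, key_images, class_to_bit):
        # Query the suspect model and map each predicted class to its bit.
        preds = [query_model(x) for x in key_images]
        return np.array([class_to_bit[c] for c in preds], dtype=np.uint8)

    def bit_error_rate(decoded, signature):
        # 0.0 means the owner's signature was recovered exactly.
        return float(np.mean(decoded != signature))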

    Robust Spatial-spread Deep Neural Image Watermarking

    Watermarking is the operation of embedding information into an image in a way that allows the ownership of the image to be identified despite distortions applied to it. In this paper, we present a novel end-to-end solution for embedding and recovering a watermark in a digital image using convolutional neural networks. The method is based on spreading the message over the spatial domain of the image, hence reducing the "local bits per pixel" capacity. To obtain the model, we used adversarial training and applied noiser layers between the encoder and the decoder. Moreover, we broadened the spectrum of typically considered attacks on the watermark and, by grouping the attacks according to their scope, achieved high general robustness, most notably against JPEG compression, Gaussian blurring, subsampling and resizing. To aid model training, we also propose a precise differentiable approximation of JPEG.
    Comment: Accepted at TrustCom 2020: The 19th IEEE International Conference on Trust, Security and Privacy in Computing and Communications.
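
    A differentiable JPEG approximation is what lets JPEG sit inside the noiser layers during training. A minimal PyTorch sketch of the general idea, assuming blockwise 8x8 DCT, a toy flat quantization table, and a cubic rounding surrogate that has nonzero gradients; these are illustrative choices, not the paper's exact approximation:

    import math
    import torch

    def dct_matrix(n=8):
        # Orthonormal DCT-II basis: D[k, i] = c_k * cos(pi*(2i+1)*k/(2n)).
        d = torch.zeros(n, n)
        for k in range(n):
            c = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            for i in range(n):
                d[k, i] = c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        return d

    def soft_round(x):
        # torch.round has zero gradient; the cubic term restores a usable one.
        return torch.round(x) + (x - torch.round(x)) ** 3

    def jpeg_approx(img, quality_scale=1.0):
        # img: (B, C, H, W) in [0, 1], H and W multiples of 8.
        b, c, h, w = img.shape
        d = dct_matrix().to(img.dtype)
        x = img * 255.0 - 128.0
        x = x.reshape(b, c, h // 8, 8, w // 8, 8).permute(0, 1, 2, 4, 3, 5)
        coeff = d @ x @ d.t()                        # blockwise 2-D DCT
        q = torch.full((8, 8), 16.0) * quality_scale  # toy quantization table
        coeff = soft_round(coeff / q) * q            # differentiable quantization
        x = d.t() @ coeff @ d                        # inverse DCT
        x = x.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)
        return (x + 128.0) / 255.0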

    Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring

    Deep Neural Networks have recently achieved great success, enabling several breakthroughs in notoriously challenging problems. Training these networks is computationally expensive and requires vast amounts of training data. Selling such pre-trained models can, therefore, be a lucrative business model. Unfortunately, once the models are sold they can be easily copied and redistributed. To avoid this, a tracking mechanism that identifies models as the intellectual property of a particular vendor is necessary. In this work, we present an approach for watermarking Deep Neural Networks in a black-box way. Our scheme works for general classification tasks and can easily be combined with current learning algorithms. We show experimentally that such a watermark has no noticeable impact on the primary task that the model is designed for, and we evaluate the robustness of our proposal against a multitude of practical attacks. Moreover, we provide a theoretical analysis relating our approach to previous work on backdooring.
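
    The black-box verification this enables is easy to state: query the suspect model on the secret trigger set and test whether it reproduces the owner's chosen labels far more often than chance. A minimal sketch, where query_model is a hypothetical stand-in for the remote prediction API and the thresholding details are illustrative, not the paper's analysis:

    import numpy as np

    def verify_ownership(query_model, trigger_images, trigger_labels,
                         num_classes=10, alpha=1e-6):
        preds = np.array([query_model(x) for x in trigger_images])
        matches = int((preds == trigger_labels).sum())
        n = len(trigger_labels)
        # Under the null hypothesis (no watermark) each trigger matches with
        # probability ~1/num_classes; bound the tail probability of the
        # observed match count and claim ownership only if it is tiny.
        mean0 = n / num_classes
        if matches <= mean0:
            return False
        tail = np.exp(-2 * (matches - mean0) ** 2 / n)  # Hoeffding bound
        return tail < alpha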

    Robust Watermarking of Neural Network with Exponential Weighting

    Deep learning has been achieving top performance in many tasks. Since training a deep learning model is very costly, neural network models need to be treated as valuable intellectual property. One concern in such a situation is that a malicious user might redistribute the model or provide a prediction service using the model without permission. One promising solution is digital watermarking: embedding a mechanism into the model so that the owner can verify ownership of the model externally. In this study, we present a novel attack method against watermarks, query modification, and demonstrate that all existing watermarking methods are vulnerable to either query modification or an existing attack method (model modification). To overcome this vulnerability, we present a novel watermarking method, exponential weighting. We experimentally show that our watermarking method achieves high watermark-verification performance even under malicious attempts by unauthorized service providers, such as model modification and query modification, without sacrificing the predictive performance of the neural network model.
    Comment: 13 pages.
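
    The abstract does not spell out the query-modification attack; one plausible instance, purely illustrative: an unauthorized service provider slightly perturbs every incoming query (here a tiny shift plus noise) before classification, so exact-match watermark triggers no longer fire while ordinary accuracy barely drops:

    import numpy as np

    def modify_query(x, rng, max_shift=2, noise_std=0.01):
        # x: 2-D image in [0, 1]; small spatial shift plus additive noise.
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        x = np.roll(x, (dy, dx), axis=(0, 1))
        return np.clip(x + rng.normal(0, noise_std, x.shape), 0.0, 1.0)

    def serve(model, x, rng):
        # The service classifies the modified query, not the original.
        return model(modify_query(x, rng))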

    Local Gradients Smoothing: Defense against localized adversarial attacks

    Deep neural networks (DNNs) have shown vulnerability to adversarial attacks, i.e., carefully perturbed inputs designed to mislead the network at inference time. Recently introduced localized attacks, Localized and Visible Adversarial Noise (LaVAN) and Adversarial Patch, pose a new challenge to deep learning security by adding adversarial noise only within a specific region, without affecting the salient objects in an image. Driven by the observation that such attacks introduce concentrated high-frequency changes at a particular image location, we developed an effective method to estimate the noise location in the gradient domain and transform the high-activation regions caused by adversarial noise in the image domain, while having minimal effect on the salient object that is important for correct classification. Our proposed Local Gradients Smoothing (LGS) scheme achieves this by regularizing gradients in the estimated noisy region before feeding the image to the DNN for inference. We show the effectiveness of our method in comparison to other defenses, including Digital Watermarking, JPEG compression, Total Variance Minimization (TVM) and Feature Squeezing, on the ImageNet dataset. In addition, we systematically study the robustness of the proposed defense mechanism against Backward Pass Differentiable Approximation (BPDA), a state-of-the-art attack recently developed to break defenses that transform an input sample to minimize the adversarial effect. Compared to other defense mechanisms, LGS is by far the most resistant to BPDA in the localized adversarial attack setting.
    Comment: Accepted at WACV 2019.
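
    A minimal sketch of the LGS idea on a grayscale image: estimate where high-frequency gradient activity is concentrated in windows, then attenuate the image there before inference. The window size, lambda and threshold below are illustrative defaults, not the paper's tuned values:

    import numpy as np

    def local_gradients_smoothing(img, win=16, lam=2.0, thresh=0.1):
        # First-order gradient magnitude, normalized to [0, 1].
        gy, gx = np.gradient(img)
        g = np.hypot(gx, gy)
        g = g / (g.max() + 1e-8)
        # Keep the gradient map only inside windows whose mean activity
        # exceeds the threshold (localized high-frequency regions).
        mask = np.zeros_like(g)
        for i in range(0, img.shape[0] - win + 1, win):
            for j in range(0, img.shape[1] - win + 1, win):
                block = g[i:i + win, j:j + win]
                if block.mean() > thresh:
                    mask[i:i + win, j:j + win] = block
        # Suppress pixels proportionally to (clipped) gradient strength.
        return img * (1.0 - np.clip(lam * mask, 0.0, 1.0))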

    Performance Comparison of Contemporary DNN Watermarking Techniques

    DNNs should be considered the intellectual property (IP) of the model builder due to the substantial cost of designing and training a highly accurate model. Research attempts have been made to protect the authorship of trained models and prevent IP infringement using DNN watermarking techniques. In this paper, we provide a comprehensive performance comparison of state-of-the-art DNN watermarking methodologies according to the essential requisites for an effective watermarking technique. We identify the pros and cons of each scheme and provide insights into the underlying rationale. Empirical results corroborate that the DeepSigns framework proposed in [4] has the best overall performance in terms of the evaluation metrics. Our comparison facilitates the development of pending watermarking approaches and enables the model owner to deploy the watermarking scheme that satisfies her requirements.

    HiDDeN: Hiding Data With Deep Networks

    Recent work has shown that deep neural networks are highly sensitive to tiny perturbations of input images, giving rise to adversarial examples. Though this property is usually considered a weakness of learned models, we explore whether it can be beneficial. We find that neural networks can learn to use invisible perturbations to encode a rich amount of useful information. In fact, one can exploit this capability for the task of data hiding. We jointly train encoder and decoder networks, where given an input message and cover image, the encoder produces a visually indistinguishable encoded image, from which the decoder can recover the original message. We show that these encodings are competitive with existing data hiding algorithms, and further that they can be made robust to noise: our models learn to reconstruct hidden information in an encoded image despite the presence of Gaussian blurring, pixel-wise dropout, cropping, and JPEG compression. Even though JPEG is non-differentiable, we show that a robust model can be trained using differentiable approximations. Finally, we demonstrate that adversarial training improves the visual quality of encoded images.
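
    A minimal PyTorch sketch of one HiDDeN-style joint training step: an encoder embeds an L-bit message into a cover image, a noise layer distorts the encoded image, and a decoder recovers the bits. The tiny conv nets and the single dropout noise layer are illustrative stand-ins for the paper's architecture and its full set of noise layers:

    import torch
    import torch.nn as nn

    L = 30  # message length in bits
    encoder = nn.Sequential(nn.Conv2d(3 + L, 32, 3, padding=1), nn.ReLU(),
                            nn.Conv2d(32, 3, 3, padding=1))
    decoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                            nn.Linear(32, L))
    opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], 1e-3)

    cover = torch.rand(8, 3, 32, 32)            # toy batch of cover images
    msg = torch.randint(0, 2, (8, L)).float()   # random bit messages

    m = msg[:, :, None, None].expand(-1, -1, 32, 32)  # broadcast bits spatially
    encoded = cover + encoder(torch.cat([cover, m], dim=1))  # residual embedding
    noised = nn.functional.dropout(encoded, p=0.3)           # noise layer
    bits = decoder(noised)
    loss = (nn.functional.binary_cross_entropy_with_logits(bits, msg)
            + 0.7 * nn.functional.mse_loss(encoded, cover))  # message + distortion
    opt.zero_grad(); loss.backward(); opt.step()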

    StegaStamp: Invisible Hyperlinks in Physical Photographs

    Printed and digitally displayed photos have the ability to hide imperceptible digital data that can be accessed through internet-connected imaging systems. Another way to think about this is physical photographs that have unique QR codes invisibly embedded within them. This paper presents an architecture, algorithms, and a prototype implementation addressing this vision. Our key technical contribution is StegaStamp, a learned steganographic algorithm that enables robust encoding and decoding of arbitrary hyperlink bitstrings into photos in a manner that approaches perceptual invisibility. StegaStamp comprises a deep neural network that learns an encoding/decoding algorithm robust to image perturbations approximating the space of distortions resulting from real printing and photography. We demonstrate real-time decoding of hyperlinks in photos from in-the-wild videos that contain variation in lighting, shadows, perspective, occlusion and viewing distance. Our prototype system robustly retrieves 56-bit hyperlinks after error correction, sufficient to embed a unique code within every photo on the internet.
    Comment: CVPR 2020. Project page: http://www.matthewtancik.com/stegastamp
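
    The 56-bit payload survives real-world distortion because the learned encoder is paired with conventional error correction. A toy stand-in using a 3x repetition code with majority vote, which only illustrates the encode/corrupt/correct round trip, not the stronger code an actual system would use:

    import numpy as np

    def ecc_encode(bits, reps=3):
        return np.repeat(bits, reps)                 # 56 bits -> 168 bits

    def ecc_decode(noisy_bits, reps=3):
        votes = noisy_bits.reshape(-1, reps).sum(axis=1)
        return (votes > reps // 2).astype(np.uint8)  # majority vote per bit

    rng = np.random.default_rng(0)
    payload = rng.integers(0, 2, 56, dtype=np.uint8)
    coded = ecc_encode(payload)
    flips = rng.random(coded.size) < 0.05            # 5% channel bit flips
    received = coded ^ flips.astype(np.uint8)
    print((ecc_decode(received) == payload).mean())  # fraction of bits recovered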

    Deep Learning in steganography and steganalysis from 2015 to 2018

    For almost 10 years, the detection of a hidden message in an image was mainly carried out by computing Rich Models (RM), followed by classification with an Ensemble Classifier (EC). In 2015, the first study using a convolutional neural network (CNN) obtained steganalysis results by Deep Learning approaching the performance of the two-step approach (EC + RM). Between 2015 and 2018, numerous publications showed that improved performance is achievable, notably in spatial steganalysis, JPEG steganalysis, Selection-Channel-Aware steganalysis, and quantitative steganalysis. This chapter covers deep learning in steganalysis from the point of view of current methods, presenting different neural networks from the period 2015-2018 that have been evaluated with a methodology specific to the discipline of steganalysis. The chapter is not intended to repeat the basic concepts of machine learning or deep learning. Instead, we present the structure of a deep neural network in a generic way, survey the networks proposed in the existing literature for the different scenarios of steganalysis, and finally discuss steganography by deep learning.
    Comment: Book chapter, final version (October 2019). This chapter will appear in 2020 in the book "Digital Media Steganography: Principles, Algorithms, Advances", Book Editor: M. Hassaballah. 46 pages.
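
    A minimal sketch of the generic steganalysis CNN structure the chapter describes: a fixed high-pass preprocessing filter (here the classic 5x5 "KV" kernel several 2015-2018 networks use to extract noise residuals) followed by ordinary conv blocks and a binary cover/stego classifier; the layer sizes are illustrative:

    import torch
    import torch.nn as nn

    KV = torch.tensor([[-1,  2,  -2,  2, -1],
                       [ 2, -6,   8, -6,  2],
                       [-2,  8, -12,  8, -2],
                       [ 2, -6,   8, -6,  2],
                       [-1,  2,  -2,  2, -1]], dtype=torch.float32) / 12.0

    class StegNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.hpf = nn.Conv2d(1, 1, 5, padding=2, bias=False)
            self.hpf.weight.data = KV.view(1, 1, 5, 5)
            self.hpf.weight.requires_grad = False   # fixed residual extractor
            self.features = nn.Sequential(
                nn.Conv2d(1, 8, 5, padding=2), nn.ReLU(), nn.AvgPool2d(2),
                nn.Conv2d(8, 16, 5, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.classifier = nn.Linear(16, 2)      # cover vs. stego

        def forward(self, x):
            return self.classifier(self.features(self.hpf(x)))

    logits = StegNet()(torch.rand(4, 1, 64, 64))    # toy grayscale batch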