Fraternal Twins: Unifying Attacks on Machine Learning and Digital Watermarking
Machine learning is increasingly used in security-critical applications, such
as autonomous driving, face recognition and malware detection. Most learning
methods, however, have not been designed with security in mind and thus are
vulnerable to different types of attacks. This problem has motivated the
research field of adversarial machine learning that is concerned with attacking
and defending learning methods. Concurrently, a different line of research has
tackled a very similar problem: in digital watermarking, information is
embedded in a signal in the presence of an adversary. As a consequence, this
research field has also extensively studied techniques for attacking and
defending watermarking methods.
The two research communities have so far worked in parallel, independently
developing similar attack and defense strategies. This paper is a first effort
to bring these communities together. To this end, we present a unified notation
of black-box attacks against machine learning and watermarking that reveals the
similarity of both settings. To demonstrate the efficacy of this unified view,
we apply concepts from watermarking to machine learning and vice versa. We show
that countermeasures from watermarking can mitigate recent model-extraction
attacks and, similarly, that techniques for hardening machine learning can fend
off oracle attacks against watermarks. Our work provides a conceptual link
between two research fields and thereby opens novel directions for improving
the security of both machine learning and digital watermarking.
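To make the unified black-box view concrete, here is a minimal Python sketch (our own illustration, not the paper's notation): both a model-extraction attack and a watermark oracle attack reduce to repeatedly querying an opaque function and learning from its responses.

```python
# Illustrative only: both attack families reduce to querying a black box.
from typing import Any, Callable, List, Tuple

def black_box_attack(oracle: Callable[[Any], int],
                     sample_input: Callable[[], Any],
                     n_queries: int) -> List[Tuple[Any, int]]:
    """Collect (input, response) pairs from an opaque oracle.

    For model extraction, `oracle` is the victim classifier and the pairs
    train a surrogate model; for a watermark oracle attack, `oracle` is a
    watermark detector and the pairs guide perturbations that strip the mark.
    """
    transcript = []
    for _ in range(n_queries):
        x = sample_input()   # adversary-chosen query
        y = oracle(x)        # black-box response (class label / detector bit)
        transcript.append((x, y))
    return transcript
```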
BlackMarks: Blackbox Multibit Watermarking for Deep Neural Networks
Deep Neural Networks have created a paradigm shift in our ability to
comprehend raw data in various important fields ranging from computer vision
and natural language processing to intelligence warfare and healthcare. While
DNNs are increasingly deployed either in a white-box setting where the model
internals are publicly known, or a black-box setting where only the model outputs
are known, a practical concern is protecting the models against Intellectual
Property (IP) infringement. We propose BlackMarks, the first end-to-end
multi-bit watermarking framework that is applicable in the black-box scenario.
BlackMarks takes the pre-trained unmarked model and the owner's binary
signature as inputs and outputs the corresponding marked model with a set of
watermark keys. To do so, BlackMarks first designs a model-dependent encoding
scheme that maps all possible classes in the task to bit '0' and bit '1' by
clustering the output activations into two groups. Given the owner's watermark
signature (a binary string), a set of key image-label pairs is designed
using targeted adversarial attacks. The watermark (WM) is then embedded in the
prediction behavior of the target DNN by fine-tuning the model with the
generated WM key set. To extract the WM, the remote model is queried with the WM key images
and the owner's signature is decoded from the corresponding predictions
according to the designed encoding scheme. We perform a comprehensive
evaluation of BlackMarks's performance on the MNIST, CIFAR-10, and ImageNet
datasets and corroborate its effectiveness and robustness. BlackMarks preserves
the functionality of the original DNN and incurs a negligible WM embedding
runtime overhead, as low as 2.054%.
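The extraction step can be pictured with a short sketch. This is a simplified illustration under our own assumptions (a `model_predict` query interface and a `class_to_bit` table standing in for the model-dependent encoding scheme), not BlackMarks's actual implementation.

```python
import numpy as np

def decode_signature(model_predict, key_images, class_to_bit):
    """Map each WM key image's predicted class to one bit of the signature."""
    bits = []
    for img in key_images:
        pred_class = int(np.argmax(model_predict(img)))
        bits.append(class_to_bit[pred_class])  # encoding scheme: class -> {0, 1}
    return bits

def bit_error_rate(decoded, signature):
    """Ownership is claimed when the BER against the owner's signature is low."""
    mismatches = sum(a != b for a, b in zip(decoded, signature))
    return mismatches / len(signature)
```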
Robust Spatial-spread Deep Neural Image Watermarking
Watermarking is the operation of embedding information into an image in a
way that allows the ownership of the image to be identified even after
distortions are applied to it. In this paper, we present a novel end-to-end solution for
embedding and recovering the watermark in the digital image using convolutional
neural networks. The method is based on spreading the message over the spatial
domain of the image, hence reducing the "local bits per pixel" capacity. To
obtain the model, we use adversarial training and apply noiser layers between
the encoder and the decoder. Moreover, we broaden the spectrum of attacks
typically considered on the watermark and, by grouping the attacks according to
their scope, achieve high general robustness, most notably against JPEG
compression, Gaussian blurring, subsampling, and resizing. To aid model
training, we also propose a precise differentiable approximation of JPEG compression.
Comment: Accepted at TrustCom 2020: The 19th IEEE International Conference on Trust, Security and Privacy in Computing and Communications
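One common way to make JPEG differentiable, and plausibly the core of such an approximation, is to replace the hard rounding in the quantization step with a smooth surrogate. The sketch below shows that generic trick; it is our illustration, not necessarily the paper's exact formulation.

```python
import torch

def soft_round(x: torch.Tensor) -> torch.Tensor:
    """Differentiable stand-in for round(): closely tracks round(x) in value
    while the cubic correction term keeps nonzero gradients in the backward pass."""
    return x + (torch.round(x) - x) ** 3

def quantize_dct_block(dct_coeffs: torch.Tensor, q_table: torch.Tensor) -> torch.Tensor:
    """JPEG-style quantize/dequantize of an 8x8 DCT block, made differentiable."""
    return soft_round(dct_coeffs / q_table) * q_table
```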
Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring
Deep Neural Networks have recently achieved considerable success, enabling
several breakthroughs in notoriously challenging problems. Training these
networks is computationally expensive and requires vast amounts of training
data. Selling such pre-trained models can, therefore, be a lucrative business
model. Unfortunately, once the models are sold they can be easily copied and
redistributed. To avoid this, a tracking mechanism to identify models as the
intellectual property of a particular vendor is necessary.
In this work, we present an approach for watermarking Deep Neural Networks in
a black-box way. Our scheme works for general classification tasks and can
easily be combined with current learning algorithms. We show experimentally
that such a watermark has no noticeable impact on the primary task that the
model is designed for and evaluate the robustness of our proposal against a
multitude of practical attacks. Moreover, we provide a theoretical analysis,
relating our approach to previous work on backdooring.
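Verification in such a backdoor-based watermarking scheme can be sketched in a few lines. The query interface and the 0.9 threshold below are illustrative assumptions, not the paper's exact protocol.

```python
def verify_ownership(model_predict, trigger_set, threshold=0.9):
    """Claim ownership if the suspect model agrees with the owner's secret
    trigger labels far more often than chance would allow.

    trigger_set: list of (image, secret_label) pairs known only to the owner;
    a model trained with the backdoor is expected to reproduce these labels.
    """
    agreements = sum(
        1 for image, secret_label in trigger_set
        if model_predict(image) == secret_label
    )
    return agreements / len(trigger_set) >= threshold
```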
Robust Watermarking of Neural Network with Exponential Weighting
Deep learning has been achieving top performance in many tasks. Since
training a deep learning model is very costly, neural network models should be
treated as valuable intellectual property. One concern in
such a situation is that some malicious user might redistribute the model or
provide a prediction service using the model without permission. One promising
solution is digital watermarking, to embed a mechanism into the model so that
the owner of the model can verify the ownership of the model externally. In
this study, we present a novel attack method against watermarks, query
modification, and demonstrate that all existing watermarking methods are
vulnerable to either query modification or an existing attack method (model
modification). To overcome this vulnerability, we present a novel watermarking
method, exponential weighting. We experimentally show that our watermarking
method achieves high watermark verification performance even under malicious
attempts by unauthorized service providers, such as model modification and
query modification, without sacrificing the predictive performance of the
neural network model.
Comment: 13 pages
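As a rough sketch of the exponential-weighting idea, one can rescale each weight by an exponential of its own magnitude so that predictions come to rely on large-magnitude weights, which small model modifications cannot easily disturb. The transform below, including the temperature `t` and the normalization, is our assumption about the general shape; consult the paper for the actual scheme.

```python
import torch

def exponential_weighting(w: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Amplify large-magnitude weights relative to small ones (sketch only)."""
    scores = torch.exp(torch.abs(w) * t)   # grows fast with |w|
    return w * scores / scores.max()       # normalize by the layer maximum
```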
Local Gradients Smoothing: Defense against localized adversarial attacks
Deep neural networks (DNNs) have shown vulnerability to adversarial attacks,
i.e., carefully perturbed inputs designed to mislead the network at inference
time. Recently introduced localized attacks, Localized and Visible Adversarial
Noise (LaVAN) and Adversarial patch, pose a new challenge to deep learning
security by adding adversarial noise only within a specific region without
affecting the salient objects in an image. Driven by the observation that such
attacks introduce concentrated high-frequency changes at a particular image
location, we develop an effective method that estimates the noise location in
the gradient domain and transforms the high-activation regions caused by
adversarial noise in the image domain, while having minimal effect on the
salient object that is important for correct classification. Our proposed Local
Gradients Smoothing (LGS) scheme achieves this by regularizing gradients in the
estimated noisy region before feeding the image to the DNN for inference. We have
shown the effectiveness of our method in comparison to other defense methods
including Digital Watermarking, JPEG compression, Total Variance Minimization
(TVM), and Feature Squeezing on the ImageNet dataset. In addition, we systematically
study the robustness of the proposed defense mechanism against Backward Pass
Differentiable Approximation (BPDA), a state-of-the-art attack recently
developed to break defenses that transform an input sample to minimize the
adversarial effect. Compared to other defense mechanisms, LGS is by far the
most resistant to BPDA in the localized adversarial attack setting.
Comment: Accepted at WACV 2019
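The core of LGS can be sketched as follows; the global thresholding and the constants here are simplifying assumptions rather than the paper's tuned, window-based procedure.

```python
import numpy as np

def local_gradients_smoothing(img: np.ndarray, thresh: float = 0.3,
                              lam: float = 0.9) -> np.ndarray:
    """img: float grayscale array in [0, 1]. Returns the defended image."""
    gy, gx = np.gradient(img)                 # first-order image gradients
    mag = np.hypot(gx, gy)
    mag = mag / (mag.max() + 1e-8)            # normalized gradient magnitude
    mask = (mag > thresh).astype(img.dtype)   # high-activation (noisy) regions
    return img * (1.0 - lam * mag * mask)     # damp gradients only in those regions
```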
Performance Comparison of Contemporary DNN Watermarking Techniques
DNNs should be considered the intellectual property (IP) of the model
builder due to the substantial cost of designing and training a highly accurate model.
Research attempts have been made to protect the authorship of the trained model
and prevent IP infringement using DNN watermarking techniques. In this paper,
we provide a comprehensive performance comparison of the state-of-the-art DNN
watermarking methodologies according to the essential requisites for an
effective watermarking technique. We identify the pros and cons of each scheme
and provide insights into the underlying rationale. Empirical results
corroborate that the DeepSigns framework proposed in [4] has the best overall
performance in terms of the evaluation metrics. Our comparison facilitates the
development of future watermarking approaches and enables the model owner to
deploy the watermarking scheme that satisfies her requirements.
HiDDeN: Hiding Data With Deep Networks
Recent work has shown that deep neural networks are highly sensitive to tiny
perturbations of input images, giving rise to adversarial examples. Though this
property is usually considered a weakness of learned models, we explore whether
it can be beneficial. We find that neural networks can learn to use invisible
perturbations to encode a rich amount of useful information. In fact, one can
exploit this capability for the task of data hiding. We jointly train encoder
and decoder networks, where given an input message and cover image, the encoder
produces a visually indistinguishable encoded image, from which the decoder can
recover the original message. We show that these encodings are competitive with
existing data hiding algorithms, and further that they can be made robust to
noise: our models learn to reconstruct hidden information in an encoded image
despite the presence of Gaussian blurring, pixel-wise dropout, cropping, and
JPEG compression. Even though JPEG is non-differentiable, we show that a robust
model can be trained using differentiable approximations. Finally, we
demonstrate that adversarial training improves the visual quality of encoded
images.
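A single training step of such an encoder/noise/decoder pipeline might look like the sketch below. The modules and loss weights are placeholders standing in for the paper's architecture, and the adversarial term is omitted.

```python
import torch.nn.functional as F

def hidden_training_step(encoder, decoder, noise_layer, cover, message,
                         w_img: float = 0.7, w_msg: float = 1.0):
    """cover: image batch; message: float tensor of {0, 1} bits."""
    encoded = encoder(cover, message)          # should look like the cover image
    noised = noise_layer(encoded)              # e.g. blur, dropout, crop, JPEG approx.
    decoded = decoder(noised)                  # recovered message logits
    loss_img = F.mse_loss(encoded, cover)      # keep the encoding imperceptible
    loss_msg = F.binary_cross_entropy_with_logits(decoded, message)
    return w_img * loss_img + w_msg * loss_msg # (paper adds an adversarial term)
```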
StegaStamp: Invisible Hyperlinks in Physical Photographs
Printed and digitally displayed photos have the ability to hide imperceptible
digital data that can be accessed through internet-connected imaging systems.
Another way to think about this is as physical photographs that have unique QR
codes invisibly embedded within them. This paper presents an architecture,
algorithms, and a prototype implementation addressing this vision. Our key
technical contribution is StegaStamp, a learned steganographic algorithm to
enable robust encoding and decoding of arbitrary hyperlink bitstrings into
photos in a manner that approaches perceptual invisibility. StegaStamp
comprises a deep neural network that learns an encoding/decoding algorithm
robust to image perturbations approximating the space of distortions resulting
from real printing and photography. We demonstrate real-time decoding of
hyperlinks in photos from in-the-wild videos that contain variation in
lighting, shadows, perspective, occlusion and viewing distance. Our prototype
system robustly retrieves 56-bit hyperlinks after error correction, sufficient
to embed a unique code within every photo on the internet.
Comment: CVPR 2020, Project page: http://www.matthewtancik.com/stegastam
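As a toy stand-in for the error-correction stage (this is not the code StegaStamp actually uses), a repetition code with majority-vote decoding shows how redundancy lets the payload survive a few bit errors introduced by the physical channel.

```python
import numpy as np

def repeat_encode(bits: np.ndarray, r: int = 3) -> np.ndarray:
    """Repeat each payload bit r times (r odd) before embedding."""
    return np.repeat(bits, r)

def repeat_decode(received: np.ndarray, r: int = 3) -> np.ndarray:
    """Recover each bit by majority vote over its r noisy copies."""
    groups = received.reshape(-1, r)
    return (groups.sum(axis=1) > r // 2).astype(int)
```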
Deep Learning in steganography and steganalysis from 2015 to 2018
For almost 10 years, the detection of a hidden message in an image has been
mainly carried out by the computation of Rich Models (RM), followed by
classification using an Ensemble Classifier (EC). In 2015, the first study
using a convolutional neural network (CNN) obtained deep-learning steganalysis
results approaching the performance of the two-step approach (EC + RM).
Between 2015 and 2018, numerous publications have shown that it
is possible to obtain improved performance, notably in spatial steganalysis,
JPEG steganalysis, Selection-Channel-Aware steganalysis, and in quantitative
steganalysis. This chapter deals with deep learning in steganalysis from the
point of view of current methods, by presenting different neural networks from
the period 2015-2018 that have been evaluated with a methodology specific to
the discipline of steganalysis. The chapter is not intended to repeat the basic
concepts of machine learning or deep learning. Instead, we present the
structure of a deep neural network in a generic way, survey the networks
proposed in the existing literature for the different scenarios of
steganalysis, and finally discuss steganography by deep learning.
Comment: Book chapter, final version (October 2019). This chapter will appear
in 2020 in the book titled "Digital Media Steganography: Principles,
Algorithms, Advances", Book Editor: M. Hassaballah. 46 pages
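As a generic illustration of the network structure such chapters survey, and not any specific published steganalyzer, a minimal CNN for cover-versus-stego classification might look like the following.

```python
import torch.nn as nn

class TinySteganalyzer(nn.Module):
    """Skeleton: preprocessing/feature convolutions, then a binary head."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 5, padding=2),   # often initialized as high-pass filters
            nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),         # global pooling over the image
        )
        self.classifier = nn.Linear(16, 2)   # cover vs. stego

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.classifier(h)
```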