MEA-Defender: A Robust Watermark against Model Extraction Attack
Recently, numerous highly-valuable Deep Neural Networks (DNNs) have been
trained using deep learning algorithms. To protect the Intellectual Property
(IP) of the original owners over such DNN models, backdoor-based watermarks
have been extensively studied. However, most such watermarks fail under model
extraction attacks, which query the target model with input samples, collect
the corresponding outputs, and train a substitute model on these input-output
pairs. In this paper, we propose a novel watermark to protect the IP
of DNN models against model extraction, named MEA-Defender. In particular, we
obtain the watermark by combining two samples from two source classes in the
input domain and design a watermark loss function that keeps the output domain
of the watermark within that of the main task samples. Since both the input
domain and the output domain of our watermark are indispensable parts of those
of the main task samples, the watermark will be extracted into the stolen model
along with the main task during model extraction. We conduct extensive
experiments on four model extraction attacks, using five datasets and six
models trained with supervised learning and self-supervised learning
algorithms. The experimental results demonstrate that MEA-Defender is highly
robust against different model extraction attacks, and various watermark
removal/detection approaches.
Comment: To appear in IEEE Symposium on Security and Privacy 2024 (IEEE S&P 2024), May 20-23, 2024, San Francisco, CA, USA.
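A minimal sketch of the combination idea described above, assuming a PyTorch image classifier; the half-and-half splicing rule, the hypothetical make_watermark_sample helper, and the fixed target class are illustrative assumptions, not the paper's exact construction:

    import torch
    import torch.nn.functional as F

    def make_watermark_sample(x_a, x_b):
        # Hypothetical combiner: splice the left half of a sample from
        # source class A with the right half of a sample from source
        # class B, so the watermark input stays inside the main-task
        # input domain.
        w = x_a.clone()
        half = x_a.shape[-1] // 2
        w[..., half:] = x_b[..., half:]
        return w

    def watermark_loss(model, x_a, x_b, target_class):
        # Push the combined sample toward an ordinary target class, so
        # the watermark's outputs remain within the main task's output
        # domain and get carried into a substitute model during extraction.
        logits = model(make_watermark_sample(x_a, x_b).unsqueeze(0))
        return F.cross_entropy(logits, torch.tensor([target_class]))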
A Systematic Review on Model Watermarking for Neural Networks
Machine learning (ML) models are applied in an increasing variety of domains.
The availability of large amounts of data and computational resources encourages the development of ever more complex and valuable models.
These models are considered intellectual property of the legitimate parties who have trained them, which makes their protection against stealing, illegitimate redistribution, and unauthorized application an urgent need.
Digital watermarking presents a strong mechanism for marking model ownership and, thereby, offers protection against those threats.
This work presents a taxonomy identifying and analyzing different classes of watermarking schemes for ML models.
It introduces a unified threat model to allow structured reasoning on and comparison of the effectiveness of watermarking methods in different scenarios.
Furthermore, it systematizes desired security requirements and attacks against ML model watermarking.
Based on that framework, representative literature from the field is surveyed to illustrate the taxonomy.
Finally, shortcomings and general limitations of existing approaches are discussed, and an outlook on future research directions is given.
Deep Intellectual Property: A Survey
With the widespread application in industrial manufacturing and commercial
services, well-trained deep neural networks (DNNs) are becoming increasingly
valuable and crucial assets due to the tremendous training cost and excellent
generalization performance. Thanks to the emerging "Machine Learning as a
Service" (MLaaS) paradigm, these trained models can be utilized by users
without much expert knowledge. However, this paradigm also exposes the
expensive models to various potential threats like model stealing and abuse. As
an urgent defense against these threats, Deep Intellectual
Property (DeepIP) protection, covering private training data, painstakingly
tuned hyperparameters, and costly learned model weights, has become a
consensus goal of both industry and academia. To this end, numerous approaches
have been proposed
to achieve this goal in recent years, especially to prevent or discover model
stealing and unauthorized redistribution. Given this period of rapid evolution,
the goal of this paper is to provide a comprehensive survey of the recent
achievements in this field. More than 190 research contributions are included
in this survey, covering many aspects of Deep IP Protection:
challenges/threats, invasive solutions (watermarking), non-invasive solutions
(fingerprinting), evaluation metrics, and performance. We finish the survey by
identifying promising directions for future research.
Comment: 38 pages, 12 figures.
Towards Code Watermarking with Dual-Channel Transformations
The expansion of the open source community and the rise of large language
models have raised ethical and security concerns on the distribution of source
code, such as misconduct on copyrighted code, distributions without proper
licenses, or misuse of the code for malicious purposes. Hence it is important
to track the ownership of source code, for which watermarking is a major
technique. Yet, unlike natural language, source code watermarking must satisfy
far stricter and more complicated rules to preserve both the readability and
the functionality of the code. We therefore introduce
SrcMarker, a watermarking system to unobtrusively encode ID bitstrings into
source code, without affecting the usage and semantics of the code. To this
end, SrcMarker performs transformations on an AST-based intermediate
representation that enables unified transformations across different
programming languages. The core of the system utilizes learning-based embedding
and extraction modules to select rule-based transformations for watermarking.
In addition, a novel feature-approximation technique is designed to tackle the
inherent non-differentiability of rule selection, thus seamlessly integrating
the rule-based transformations and learning-based networks into an
interconnected system to enable end-to-end training. Extensive experiments
demonstrate the superiority of SrcMarker over existing methods in various
watermarking requirements.
Comment: 16 pages.
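As a toy illustration of encoding an ID bitstring through semantics-preserving transformations (SrcMarker itself operates on an AST-based intermediate representation with learned rule selection; the one-bit-per-site scheme below is an assumption for exposition):

    # Two semantically equivalent forms of an increment statement; the
    # chosen form at each eligible site encodes one bit of the ID.
    FORMS = {
        0: lambda var: f"{var} = {var} + 1",
        1: lambda var: f"{var} += 1",
    }

    def embed_bits(site_vars, bits):
        # site_vars: variable names at eligible increment statements.
        return [FORMS[b](v) for v, b in zip(site_vars, bits)]

    def extract_bits(lines):
        # Recover the bitstring by recognising which form was used.
        return [1 if "+=" in line else 0 for line in lines]

    stego_code = embed_bits(["i", "count", "total"], [1, 0, 1])
    assert extract_bits(stego_code) == [1, 0, 1]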
Ownership Protection of Generative Adversarial Networks
Generative adversarial networks (GANs) have shown remarkable success in image
synthesis, making GAN models themselves commercially valuable to legitimate
model owners. Therefore, it is critical to technically protect the intellectual
property of GANs. Prior works need to tamper with the training set or training
process, and they are not robust to emerging model extraction attacks. In this
paper, we propose a new ownership protection method based on the common
characteristics of a target model and its stolen models. Our method is
directly applicable to all well-trained GANs as it does not require retraining
target models. Extensive experimental results show that our new method can
achieve the best protection performance compared to the state-of-the-art
methods. Finally, we demonstrate the effectiveness of our method with respect
to the number of generations of model extraction attacks, the number of
generated samples, different datasets, as well as adaptive attacks.
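A rough sketch of verification via shared characteristics of the target and a suspect generator; the abstract does not specify which characteristics are compared, so the mean-output fingerprint and cosine score below are purely illustrative assumptions:

    import numpy as np

    def fingerprint(generate, latents):
        # Hypothetical fingerprint: the mean image produced over a fixed
        # probe set of latent vectors (generate: latent -> image array).
        return np.stack([generate(z) for z in latents]).mean(axis=0).ravel()

    def ownership_score(target_gen, suspect_gen, latents):
        # Cosine similarity between fingerprints; a stolen (extracted)
        # model is expected to score markedly higher than an
        # independently trained one. No retraining of the target needed.
        f_t = fingerprint(target_gen, latents)
        f_s = fingerprint(suspect_gen, latents)
        return float(f_t @ f_s /
                     (np.linalg.norm(f_t) * np.linalg.norm(f_s) + 1e-12))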
ClearMark: Intuitive and Robust Model Watermarking via Transposed Model Training
Due to costly efforts during data acquisition and model training, Deep Neural
Networks (DNNs) belong to the intellectual property of the model creator.
Hence, unauthorized use, theft, or modification may lead to legal
repercussions. Existing DNN watermarking methods for ownership proof are often
non-intuitive, embed human-invisible marks, require trust in algorithmic
assessment that lacks human-understandable attributes, and rely on rigid
thresholds, making them susceptible to failure in cases of partial watermark
erasure.
This paper introduces ClearMark, the first DNN watermarking method designed
for intuitive human assessment. ClearMark embeds visible watermarks, enabling
human decision-making without rigid value thresholds while allowing
technology-assisted evaluations. ClearMark defines a transposed model
architecture that allows the model to be used in a backward fashion,
interweaving the watermark with the main task across all model parameters.
Compared to
existing watermarking methods, ClearMark produces visual watermarks that are
easy for humans to understand without requiring complex verification algorithms
or strict thresholds. The watermark is embedded within all model parameters and
entangled with the main task, exhibiting superior robustness. It shows an
8,544-bit watermark capacity comparable to the strongest existing work.
Crucially, ClearMark's effectiveness is model and dataset-agnostic, and
resilient against adversarial model manipulations, as demonstrated in a
comprehensive study performed with four datasets and seven architectures.
Comment: 20 pages, 18 figures, 4 tables.
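A minimal sketch of a transposed path that reuses a forward layer's weights to run the model "backward" into input space, where a visible mark could be rendered after training; the single-layer setup is an assumed simplification of ClearMark's architecture:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyNet(nn.Module):
        # One conv layer whose weight tensor serves both the forward
        # (main-task) path and a transposed (watermark-rendering) path,
        # entangling the mark with the task parameters.
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(1, 8, 3, padding=1)

        def forward(self, x):
            return self.conv(x)

        def transposed(self, y):
            # Run the same kernel as a transposed convolution, mapping
            # activations back to input space.
            return F.conv_transpose2d(y, self.conv.weight, padding=1)

    net = TinyNet()
    act = net(torch.randn(1, 1, 28, 28))
    recon = net.transposed(act)  # after training, this would show the visible mark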
Identifying Appropriate Intellectual Property Protection Mechanisms for Machine Learning Models: A Systematization of Watermarking, Fingerprinting, Model Access, and Attacks
The commercial use of Machine Learning (ML) is spreading; at the same time,
ML models are becoming more complex and more expensive to train, which makes
Intellectual Property Protection (IPP) of trained models a pressing issue.
Unlike other domains that can build on a solid understanding of the threats,
attacks and defenses available to protect their IP, the ML-related research in
this regard is still very fragmented. This is due in part to the lack of a
unified view and a common taxonomy of these aspects.
In this paper, we systematize our findings on IPP in ML, while focusing on
threats and attacks identified and defenses proposed at the time of writing. We
develop a comprehensive threat model for IP in ML, categorizing attacks and
defenses within a unified and consolidated taxonomy, thus bridging research
from both the ML and security communities.
Data Hiding with Deep Learning: A Survey Unifying Digital Watermarking and Steganography
Data hiding is the process of embedding information into a noise-tolerant
signal such as a piece of audio, video, or image. Digital watermarking is a
form of data hiding where identifying data is robustly embedded so that it can
resist tampering and be used to identify the original owners of the media.
Steganography, another form of data hiding, embeds data for the purpose of
secure and secret communication. This survey summarises recent developments in
deep learning techniques for data hiding for the purposes of watermarking and
steganography, categorising them based on model architectures and noise
injection methods. The objective functions, evaluation metrics, and datasets
used for training these data hiding models are comprehensively summarised.
Finally, we propose and discuss possible future directions for research into
deep data hiding techniques.
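For concreteness, a minimal encoder / noise-layer / decoder pipeline of the kind many of the surveyed deep watermarking models follow (e.g. HiDDeN-style); the layer sizes, one-bit message, and Gaussian noise layer are illustrative assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Encoder embeds a message into the cover image; decoder recovers it
    # after a differentiable noise layer simulates tampering.
    encoder = nn.Sequential(nn.Conv2d(3 + 1, 16, 3, padding=1), nn.ReLU(),
                            nn.Conv2d(16, 3, 3, padding=1))
    decoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                            nn.Linear(16, 1))

    def noise_layer(x):
        # Stand-in for distortion; real systems also model JPEG
        # compression, cropping, blurring, etc.
        return x + 0.05 * torch.randn_like(x)

    cover = torch.rand(1, 3, 32, 32)
    msg = torch.ones(1, 1, 32, 32)                # one message bit, spatially tiled
    stego = encoder(torch.cat([cover, msg], 1))   # embed the bit
    pred = decoder(noise_layer(stego))            # extract after distortion
    loss = F.mse_loss(stego, cover) + \
           F.binary_cross_entropy_with_logits(pred, torch.ones(1, 1))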
- …