Privacy Risks of Securing Machine Learning Models against Adversarial Examples
The arms race between attacks and defenses for machine learning models has
come to the forefront in recent years, in both the security community and the
privacy community. However, one big limitation of previous research is that the
security domain and the privacy domain have typically been considered
separately. It is thus unclear whether the defense methods in one domain will
have any unexpected impact on the other domain.
In this paper, we take a step towards resolving this limitation by combining
the two domains. In particular, we measure the success of membership inference
attacks against six state-of-the-art defense methods that mitigate the risk of
adversarial examples (i.e., evasion attacks). Membership inference attacks
determine whether or not an individual data record has been part of a model's
training set. The accuracy of such attacks reflects the information leakage of
training algorithms about individual members of the training set. Defense
methods against adversarial examples influence the model's decision
boundaries such that model predictions remain unchanged for a small area around
each input. However, this objective is optimized on training data. Thus,
individual data records in the training set have a significant influence on
robust models. This makes the models more vulnerable to inference attacks.
To perform the membership inference attacks, we leverage the existing
inference methods that exploit model predictions. We also propose two new
inference methods that exploit structural properties of robust models on
adversarially perturbed data. Our experimental evaluation demonstrates that
compared with the natural training (undefended) approach, adversarial defense
methods can indeed increase the target model's risk against membership
inference attacks.
Comment: ACM CCS 2019, code is available at
https://github.com/inspire-group/privacy-vs-robustnes
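As an illustration of the prediction-based membership inference methods the abstract refers to, a minimal confidence-thresholding sketch could look as follows; the function names and the toy stand-in model are assumptions for illustration, not the paper's released code:

```python
def confidence_attack(model_confidence, record, threshold=0.9):
    """Guess 'member' when the model is unusually confident on `record`.

    Overfit (and, per the paper, robustly trained) models tend to be more
    confident on their training points, so high confidence is evidence
    of membership.
    """
    return model_confidence(record) >= threshold

# Toy stand-in for a trained model: confident near its "training" points.
train_set = [(0.1, 0.2), (0.8, 0.9)]

def toy_confidence(record):
    # Confidence decays with distance to the nearest training point.
    d = min(sum((a - b) ** 2 for a, b in zip(record, t)) ** 0.5
            for t in train_set)
    return max(0.0, 1.0 - d)

print(confidence_attack(toy_confidence, (0.1, 0.2)))  # True: training member
print(confidence_attack(toy_confidence, (0.5, 0.5)))  # False: non-member
```

The paper's stronger attacks additionally probe the model on adversarially perturbed inputs; this sketch shows only the confidence-based baseline.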
Crowd-ML: A Privacy-Preserving Learning Framework for a Crowd of Smart Devices
Smart devices with built-in sensors, computational capabilities, and network
connectivity have become increasingly pervasive. Crowds of smart devices
offer opportunities to collectively sense and perform computing tasks at an
unprecedented scale. This paper presents Crowd-ML, a privacy-preserving machine
learning framework for a crowd of smart devices, which can solve a wide range
of learning problems for crowdsensing data with differential privacy
guarantees. Crowd-ML endows a crowdsensing system with an ability to learn
classifiers or predictors online from crowdsensing data privately, with minimal
computational overhead on devices and servers, making the framework suitable
for practical, large-scale deployment. We analyze the performance and
scalability of Crowd-ML, and implement the system with off-the-shelf
smartphones as a proof of concept. We demonstrate the advantages of Crowd-ML
with real and simulated experiments under various conditions.
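The on-device private learning loop that Crowd-ML describes can be sketched roughly as follows; the simple linear model, noise scale, and function names are illustrative assumptions, not the framework's actual API:

```python
import random

def local_noisy_gradient(w, data, sigma=0.1, rng=random.Random(0)):
    # Squared-loss gradient of a linear model y ~ w * x on this device's data,
    # perturbed with Gaussian noise before it ever leaves the device.
    g = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return g + rng.gauss(0.0, sigma)

def server_step(w, device_grads, lr=0.05):
    # The server only sees privatized gradients; it averages and updates.
    return w - lr * sum(device_grads) / len(device_grads)

w = 0.0
devices = [[(1.0, 2.0)], [(2.0, 4.0)], [(0.5, 1.0)]]  # each device fits y = 2x
for _ in range(200):
    w = server_step(w, [local_noisy_gradient(w, d) for d in devices])
print(round(w, 1))  # converges near the true slope 2.0
```

The point of the sketch is the division of labor: raw data stays on the device, and only noisy model updates reach the server.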
Privacy-preserving Distributed Machine Learning via Local Randomization and ADMM Perturbation
With the proliferation of training data, distributed machine learning (DML)
has become increasingly capable of large-scale learning tasks. However, privacy
concerns have to be given priority in DML, since training data may contain
sensitive information of users. In this paper, we propose a privacy-preserving
ADMM-based DML framework with two novel features: First, we remove the
assumption commonly made in the literature that the users trust the server
collecting their data. Second, the framework provides heterogeneous privacy for
users depending on data's sensitive levels and servers' trust degrees. The
challenging issue is to keep the accumulation of privacy losses over ADMM
iterations minimal. In the proposed framework, a local randomization approach,
which is differentially private, is adopted to provide users with
a self-controlled privacy guarantee for the most sensitive information. Further,
the ADMM algorithm is perturbed through a combined noise-adding method, which
simultaneously preserves privacy for users' less sensitive information and
strengthens the privacy protection of the most sensitive information. We
provide detailed analyses of the performance of the trained model in terms of
its generalization error. Finally, we conduct extensive experiments using
real-world datasets to validate the theoretical results and evaluate the
classification performance of the proposed framework.
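The two-layer design described above (local randomization for the most sensitive attributes, plus a noise-added ADMM update for the rest) can be sketched as follows; all names, constants, and the debiasing demo are illustrative assumptions:

```python
import random

def randomized_response(bit, p_truth=0.75, rng=random.Random(1)):
    # Layer 1, local randomization: report the true bit with prob p_truth,
    # flip it otherwise (a classic local-DP mechanism).
    return bit if rng.random() < p_truth else 1 - bit

def perturbed_admm_update(x, grad, rho, z, u, sigma, rng=random.Random(2)):
    # Layer 2, perturbed ADMM: one noise-added proximal-gradient step of a
    # primal update  x <- x - step * (grad(x) + rho * (x - z + u)) + noise.
    step = 0.1
    return x - step * (grad(x) + rho * (x - z + u)) + rng.gauss(0.0, sigma)

# The server can still debias aggregates of the randomized bits:
bits = [1] * 3000 + [0] * 7000                    # true mean 0.3
reports = [randomized_response(b) for b in bits]
est = (sum(reports) / len(reports) - 0.25) / 0.5  # unbias: E[report] = 0.5*m + 0.25
print(round(est, 2))  # close to 0.3
```

The randomization gives users a privacy guarantee they control locally, while the update noise protects the less sensitive information during the iterations.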
Towards Plausible Differentially Private ADMM Based Distributed Machine Learning
The Alternating Direction Method of Multipliers (ADMM) and its distributed
version have been widely used in machine learning. In the iterations of ADMM,
model updates using local private data and model exchanges among agents raise
critical privacy concerns. Despite some pioneering works addressing such
concerns, differentially private ADMM still confronts many research challenges.
For example, the guarantee of differential privacy (DP) relies on the premise
that the optimality of each local problem can be perfectly attained in each
ADMM iteration, which may never happen in practice. The model trained by DP
ADMM may have low prediction accuracy. In this paper, we address these concerns
by proposing two novel plausible differentially private ADMM algorithms,
PP-ADMM and its improved variant IPP-ADMM. In PP-ADMM, each agent approximately
solves a perturbed optimization problem that is formulated from its local
private data in an iteration, and then perturbs the approximate solution with
Gaussian noise to provide the DP guarantee. To further improve the model
accuracy and convergence, the improved version, IPP-ADMM, adopts the sparse vector
technique (SVT) to determine if an agent should update its neighbors with the
current perturbed solution. The agent calculates the difference of the current
solution from that in the last iteration, and if the difference is larger than
a threshold, it passes the solution to its neighbors; otherwise the solution
is discarded. Moreover, we propose to track the total privacy loss under
the zero-concentrated DP (zCDP) and provide a generalization performance
analysis. Experiments on real-world datasets demonstrate that under the same
privacy guarantee, the proposed algorithms are superior to the state of the art
in terms of model accuracy and convergence rate.
Comment: Accepted for publication in CIKM'2
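The SVT-style broadcast decision can be sketched as follows; the noise scales, threshold values, and names are illustrative assumptions, not the paper's calibration:

```python
import math
import random

def laplace(scale, rng):
    # Inverse-CDF sample from a centered Laplace distribution.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def should_broadcast(new_sol, old_sol, threshold, eps, rng=random.Random(0)):
    # SVT-style noisy-threshold test: only the yes/no outcome is released,
    # so skipping negligible updates saves privacy budget.
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_sol, old_sol)))
    return diff + laplace(4.0 / eps, rng) > threshold + laplace(2.0 / eps, rng)

print(should_broadcast([10.0, 10.0], [0.0, 0.0], 0.5, eps=10.0))  # big change: share it
print(should_broadcast([0.0, 0.0], [0.0, 0.0], 30.0, eps=10.0))   # change below threshold
```

Because agents stay silent on small updates, communication and cumulative privacy loss both drop, which is the mechanism behind IPP-ADMM's improved accuracy budget.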
Flow-based Distributionally Robust Optimization
We present a computationally efficient framework
for solving flow-based distributionally robust optimization (DRO) problems with
Wasserstein uncertainty sets while aiming to find a continuous worst-case
distribution (also called the Least Favorable Distribution, LFD) and to sample
from it. Requiring the LFD to be continuous lets the algorithm scale to
problems with larger sample sizes and gives the induced robust algorithms
better generalization capability. To tackle the
computationally challenging infinite-dimensional optimization problem, we
leverage flow-based models and continuous-time invertible transport maps
between the data distribution and the target distribution and develop a
Wasserstein proximal gradient flow type algorithm. In theory, we establish the
equivalence of the optimal-transport-map solution to the original formulation,
and derive the dual form of the problem through Wasserstein calculus and
Brenier's theorem. In practice, we parameterize the transport maps
by a sequence of neural networks progressively trained in blocks by gradient
descent. We demonstrate its usage in adversarial learning, distributionally
robust hypothesis testing, and a new mechanism for data-driven distribution
perturbation differential privacy, where the proposed method gives strong
empirical performance on high-dimensional real data.
Comment: IEEE Journal on Selected Areas in Information Theory (JSAIT). Accepted. 202
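A toy, particle-based reading of the Wasserstein proximal gradient step might look like this; the quadratic loss, step sizes, and the per-particle update are illustrative assumptions, not the paper's neural-network parameterization:

```python
def wasserstein_proximal_flow(data, loss_grad, lam=0.25, step=0.1, iters=200):
    # Gradient ascent on  loss(x) - ||x - x0||^2 / (2 * lam)  per particle:
    # the proximal term is a discrete Wasserstein-2 penalty keeping the
    # worst-case distribution close to the empirical data distribution.
    particles = list(data)
    for _ in range(iters):
        particles = [x + step * (loss_grad(x) - (x - x0) / lam)
                     for x, x0 in zip(particles, data)]
    return particles

# Worst case of loss(x) = x^2: particles are pushed away from the origin
# until the proximal pull balances the loss gradient (fixed point x = 2*x0).
data = [1.0, -1.0]
adv = wasserstein_proximal_flow(data, lambda x: 2 * x)
print([round(x, 2) for x in adv])  # [2.0, -2.0]
```

In the paper, the continuous LFD is obtained by training invertible flow networks rather than moving a fixed set of particles; the sketch only illustrates the proximal-ascent objective.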
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.
Comment: 47 pages, 9 figures; chapter accepted into the book 'Support Vector Machine Applications'
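A minimal white-box evasion sketch on a linear SVM illustrates the attack pattern discussed in the chapter; the weights below are assumed for illustration, not learned:

```python
def evade(x, w, b, step=0.1, max_iter=100):
    # Move x along -w, the steepest direction to lower the SVM score,
    # until the decision value w.x + b becomes negative ("benign").
    norm = sum(wi * wi for wi in w) ** 0.5
    for _ in range(max_iter):
        if sum(wi * xi for wi, xi in zip(w, x)) + b < 0:
            break
        x = [xi - step * wi / norm for xi, wi in zip(x, w)]
    return x

w, b = [1.0, 2.0], -1.0        # assumed linear SVM: f(x) = x1 + 2*x2 - 1
malicious = [2.0, 2.0]         # score = 5.0 > 0: flagged as malicious
evaded = evade(malicious, w, b)
print(sum(wi * xi for wi, xi in zip(w, evaded)) + b < 0)  # True: now evades
```

Gradient-based evasion of nonlinear SVMs follows the same idea with the kernel gradient in place of w; the chapter's adversary-aware designs aim to make such minimal perturbations insufficient.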
Differentially Private Multi-class Classification Using Kernel Supports and Equilibrium Points
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, 2022.2. Advisor: ์ด์ฌ์ฑ.
In this paper, we propose a multi-class classification method using kernel supports and a dynamic system under differential privacy. We find support vector machine (SVM) algorithms have a fundamental weakness in implementing differential privacy because the decision function depends on some subset of the training data called the support vectors. Therefore, we develop a method using interior points called equilibrium points (EPs) without relying on the decision boundary. To construct EPs, we utilize a dynamic system with a new differentially private support vector data description (SVDD) by perturbing the sphere center in the kernel space. Empirical results show that the proposed method achieves better performance even on small-sized datasets where differential privacy performs poorly.
This thesis proposes a differentially private multi-class classification method that utilizes kernel supports and equilibrium points. Support vector classifiers are widely used in data analysis and machine learning, so it is essential to train them while protecting users' data. The most popular of these methods, the support vector machine (SVM), bases its decisions on only a small subset of the data, the support vectors, which makes differential privacy hard to apply: although differential privacy requires that the output change little when a single record changes, removing a single support vector can change the decision boundary substantially. To resolve this, we propose a differentially private multi-class classification method that instead uses points in the interior of each cluster, called equilibrium points. We first construct a support vector data description (SVDD) that satisfies differential privacy by perturbing the sphere center in the kernel space, then use its level sets with a dynamical system to obtain the equilibrium points. For inference with the trained model, we propose two release mechanisms: (1) publishing the support function and (2) releasing the equilibrium points. Experiments on eight diverse datasets show that, by exploiting noise-robust interior points, the proposed method outperforms existing differentially private support vector machines and remains usable even on small datasets where differential privacy is otherwise hard to apply.
Chapter 1 Introduction
1.1 Problem Description: Data Privacy
1.2 The Privacy of Support Vector Methods
1.3 Research Motivation and Contribution
1.4 Organization of the Thesis
Chapter 2 Literature Review
2.1 Differentially Private Empirical Risk Minimization
2.2 Differentially Private Support Vector Machine
Chapter 3 Preliminaries
3.1 Differential Privacy
Chapter 4 Differentially Private Support Vector Data Description
4.1 Support Vector Data Description
4.2 Differentially Private Support Vector Data Description
Chapter 5 Differentially Private Multi-class Classification Utilizing SVDD
5.1 Phase I: Constructing a Private Support Level Function
5.2 Phase II: Differentially Private Clustering on the Data Space via a Dynamical System
5.3 Phase III: Classifying the Decomposed Regions under Differential Privacy
Chapter 6 Inference Scenarios and Releasing the Differentially Private Model
6.1 Publishing Support Function
6.2 Releasing Equilibrium Points
6.3 Comparison to Previous Methods
Chapter 7 Experiments
7.1 Models and Scenario Setting
7.2 Datasets
7.3 Experimental Settings
7.4 Empirical Results on Various Datasets under Publishing Support Function
7.5 Evaluating Robustness under Diverse Data Size
7.6 Inference through Equilibrium Points
Chapter 8 Conclusion
8.1 Conclusion
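A much-simplified sketch of the center-perturbation idea, using the data mean in input space as a stand-in for the kernel-space sphere center; the clipping bound and noise calibration are illustrative assumptions, not the thesis's construction:

```python
import math
import random

def private_center(points, eps, delta=1e-5, clip=1.0, rng=random.Random(0)):
    # Gaussian-mechanism release of the data mean (a stand-in for SVDD's
    # kernel-space sphere center). Clipping bounds each record's influence.
    n, d = len(points), len(points[0])
    clipped = []
    for p in points:
        norm = math.sqrt(sum(x * x for x in p))
        scale = min(1.0, clip / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in p])
    center = [sum(p[j] for p in clipped) / n for j in range(d)]
    # L2 sensitivity of the clipped mean is 2*clip/n; calibrate Gaussian noise.
    sigma = (2 * clip / n) * math.sqrt(2 * math.log(1.25 / delta)) / eps
    return [c + rng.gauss(0.0, sigma) for c in center]

pts = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15]]
print(private_center(pts, eps=50.0))  # noisy estimate of the mean [0.15, 0.15]
```

Because the center is an average over all points rather than a function of a few support vectors, one record's removal moves it only slightly, which is what makes the perturbation-based guarantee workable.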