939 research outputs found

    Privacy Risks of Securing Machine Learning Models against Adversarial Examples

    Full text link
    The arms race between attacks and defenses for machine learning models has come to the forefront in recent years, in both the security community and the privacy community. However, one major limitation of previous research is that the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain have any unexpected impact on the other domain. In this paper, we take a step towards resolving this limitation by combining the two domains. In particular, we measure the success of membership inference attacks against six state-of-the-art defense methods that mitigate the risk of adversarial examples (i.e., evasion attacks). Membership inference attacks determine whether or not an individual data record was part of a model's training set; the accuracy of such attacks reflects how much information the training algorithm leaks about individual members of the training set. Defenses against adversarial examples shape the model's decision boundaries so that predictions remain unchanged in a small region around each input. However, this objective is optimized on the training data, so individual records in the training set have a significant influence on robust models, which makes these models more vulnerable to inference attacks. To perform the membership inference attacks, we leverage existing inference methods that exploit model predictions, and we also propose two new inference methods that exploit structural properties of robust models on adversarially perturbed data. Our experimental evaluation demonstrates that, compared with natural (undefended) training, adversarial defense methods can indeed increase the target model's risk against membership inference attacks. Comment: ACM CCS 2019; code is available at https://github.com/inspire-group/privacy-vs-robustnes
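    As a concrete illustration of the prediction-based inference methods this paper builds on, the sketch below implements a generic confidence-thresholding membership inference baseline; the paper's two new attacks additionally exploit model behavior on adversarially perturbed inputs. The function name, array shapes, and the 0.9 threshold are illustrative assumptions, not details from the paper.

```python
import numpy as np

def confidence_attack(model_probs, true_labels, threshold=0.9):
    """Guess 'training member' for records on which the model assigns high
    probability to the true class; models are typically more confident on
    data they were trained on.

    model_probs : (n, n_classes) array of predicted class probabilities
    true_labels : (n,) array of ground-truth class indices
    Returns a boolean array where True means 'predicted member'.
    """
    conf_on_true = model_probs[np.arange(len(true_labels)), true_labels]
    return conf_on_true >= threshold
```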

    Crowd-ML: A Privacy-Preserving Learning Framework for a Crowd of Smart Devices

    Full text link
    Smart devices with built-in sensors, computational capabilities, and network connectivity have become increasingly pervasive. Crowds of smart devices offer opportunities to collectively sense and perform computing tasks at an unprecedented scale. This paper presents Crowd-ML, a privacy-preserving machine learning framework for a crowd of smart devices, which can solve a wide range of learning problems for crowdsensing data with differential privacy guarantees. Crowd-ML endows a crowdsensing system with the ability to learn classifiers or predictors online from crowdsensing data privately, with minimal computational overhead on devices and servers, making it suitable for practical, large-scale deployment. We analyze the performance and the scalability of Crowd-ML, and implement the system with off-the-shelf smartphones as a proof of concept. We demonstrate the advantages of Crowd-ML with real and simulated experiments under various conditions.
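    The sketch below illustrates the general pattern of such a framework under simplifying assumptions of mine: each device clips a local gradient and adds Laplace noise before reporting it, and the server averages the noisy reports into one online update. The function names, the logistic-loss choice, and the noise calibration are illustrative, not Crowd-ML's actual protocol or privacy accounting.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_device_gradient(w, x, y, epsilon, clip=1.0):
    """One device's private contribution: a clipped logistic-loss gradient
    perturbed with Laplace noise before it leaves the device (illustrative
    mechanism only)."""
    margin = y * np.dot(w, x)                                  # y in {-1, +1}
    grad = -y * x / (1.0 + np.exp(margin))                     # logistic-loss gradient
    grad *= min(1.0, clip / (np.linalg.norm(grad) + 1e-12))    # bound the sensitivity
    return grad + rng.laplace(scale=2.0 * clip / epsilon, size=grad.shape)

def server_step(w, device_grads, lr=0.1):
    """Server aggregates the noisy device gradients and takes one online SGD step."""
    return w - lr * np.mean(device_grads, axis=0)
```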

    Privacy-preserving Distributed Machine Learning via Local Randomization and ADMM Perturbation

    Full text link
    With the proliferation of training data, distributed machine learning (DML) is becoming increasingly capable of handling large-scale learning tasks. However, privacy has to be given priority in DML, since training data may contain sensitive information about users. In this paper, we propose a privacy-preserving ADMM-based DML framework with two novel features. First, we remove the assumption, commonly made in the literature, that users trust the server collecting their data. Second, the framework provides heterogeneous privacy for users depending on the sensitivity levels of their data and the trust degrees of the servers. The challenge is to keep the accumulation of privacy losses over ADMM iterations minimal. In the proposed framework, a local randomization approach, which is differentially private, is adopted to provide users with a self-controlled privacy guarantee for their most sensitive information. Further, the ADMM algorithm is perturbed through a combined noise-adding method, which simultaneously preserves privacy for users' less sensitive information and strengthens the privacy protection of the most sensitive information. We provide a detailed analysis of the performance of the trained model in terms of its generalization error. Finally, we conduct extensive experiments using real-world datasets to validate the theoretical results and evaluate the classification performance of the proposed framework.
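    The sketch below shows, under simplifying assumptions of mine, where such perturbation enters a consensus ADMM iteration: Gaussian noise is added to an inexact primal update as a stand-in for the paper's combined noise-adding method (the local randomization of the most sensitive information would happen before this step and is not shown), and the noise scale is illustrative rather than calibrated.

```python
import numpy as np

rng = np.random.default_rng(1)

def perturbed_primal_step(x_i, z, u_i, grad_f_i, rho=1.0, lr=0.1, sigma=0.05):
    """One simplified, noise-perturbed update for user i in the consensus
    problem  min sum_i f_i(x_i)  s.t.  x_i = z  (scaled-form ADMM).

    x_i : local model          z : global consensus variable
    u_i : scaled dual variable grad_f_i : gradient of f_i at x_i
    """
    # Inexact primal update: one gradient step on the augmented Lagrangian.
    x_i = x_i - lr * (grad_f_i + rho * (x_i - z + u_i))
    x_i = x_i + rng.normal(scale=sigma, size=x_i.shape)   # privacy perturbation
    # Scaled dual update; the z-update (averaging x_i + u_i) runs at the server.
    u_i = u_i + x_i - z
    return x_i, u_i
```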

    Towards Plausible Differentially Private ADMM Based Distributed Machine Learning

    Full text link
    The Alternating Direction Method of Multipliers (ADMM) and its distributed version have been widely used in machine learning. In the iterations of ADMM, model updates computed from local private data and model exchanges among agents raise critical privacy concerns. Despite some pioneering work to relieve such concerns, differentially private ADMM still faces many research challenges. For example, the guarantee of differential privacy (DP) relies on the premise that the optimality of each local problem can be perfectly attained in each ADMM iteration, which may never happen in practice, and the model trained by DP ADMM may have low prediction accuracy. In this paper, we address these concerns by proposing a Plausible differentially Private ADMM algorithm, PP-ADMM, and an improved variant, IPP-ADMM. In PP-ADMM, each agent approximately solves a perturbed optimization problem formulated from its local private data in each iteration, and then perturbs the approximate solution with Gaussian noise to provide the DP guarantee. To further improve model accuracy and convergence, the improved version IPP-ADMM adopts the sparse vector technique (SVT) to determine whether an agent should update its neighbors with the current perturbed solution: the agent computes the difference between the current solution and that of the last iteration, and if the difference exceeds a threshold, it passes the solution to its neighbors; otherwise, the solution is discarded. Moreover, we track the total privacy loss under zero-concentrated DP (zCDP) and provide a generalization performance analysis. Experiments on real-world datasets demonstrate that, under the same privacy guarantee, the proposed algorithms are superior to the state of the art in terms of model accuracy and convergence rate. Comment: Accepted for publication in CIKM'2
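    The SVT gate described above can be sketched as follows; the noise scales and the sensitivity handling are illustrative assumptions, not the paper's calibration.

```python
import numpy as np

rng = np.random.default_rng(2)

def svt_should_broadcast(current, previous, threshold, eps_svt, sensitivity=1.0):
    """Sparse-vector-technique style check for whether an agent shares its
    newly perturbed solution with its neighbors (the IPP-ADMM idea): both
    the threshold and the query (how much the solution changed) are
    perturbed with Laplace noise, and only the comparison result is used."""
    noisy_threshold = threshold + rng.laplace(scale=2.0 * sensitivity / eps_svt)
    change = np.linalg.norm(current - previous)
    noisy_change = change + rng.laplace(scale=4.0 * sensitivity / eps_svt)
    return noisy_change >= noisy_threshold
```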

    Flow-based Distributionally Robust Optimization

    Full text link
    We present a computationally efficient framework, called FlowDRO, for solving flow-based distributionally robust optimization (DRO) problems with Wasserstein uncertainty sets, which aims to find a continuous worst-case distribution (also called the least favorable distribution, LFD) and to sample from it. We require the LFD to be continuous so that the algorithm scales to problems with larger sample sizes and the induced robust algorithms generalize better. To tackle the computationally challenging infinite-dimensional optimization problem, we leverage flow-based models and continuous-time invertible transport maps between the data distribution and the target distribution, and develop a Wasserstein proximal gradient flow type algorithm. In theory, we establish the equivalence of the solution given by the optimal transport map to the original formulation, as well as the dual form of the problem, through Wasserstein calculus and Brenier's theorem. In practice, we parameterize the transport maps by a sequence of neural networks progressively trained in blocks by gradient descent. We demonstrate its usage in adversarial learning, distributionally robust hypothesis testing, and a new mechanism for data-driven distribution perturbation differential privacy, where the proposed method gives strong empirical performance on high-dimensional real data. Comment: IEEE Journal on Selected Areas in Information Theory (JSAIT). Accepted. 202
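    A minimal single-block sketch of the idea, under assumptions of mine, is to train a residual transport map that maximizes the downstream loss minus a squared transport cost, i.e., one Wasserstein proximal step against a fixed classifier `f`; FlowDRO itself trains a sequence of flow blocks and also supports sampling from the resulting LFD, which is omitted here. The architecture and hyperparameters are illustrative.

```python
from itertools import cycle
import torch
import torch.nn as nn

def worst_case_transport(f, data_loader, dim, reg_lambda=1.0, steps=200, lr=1e-3):
    """Train a residual map T so that x + T(x) approximates a least favorable
    distribution for a fixed classifier f. `data_loader` yields (x, y) batches
    with integer class labels y."""
    T = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, dim))
    opt = torch.optim.Adam(T.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _, (x, y) in zip(range(steps), cycle(data_loader)):
        x_adv = x + T(x)                                        # pushed-forward sample
        transport_cost = ((x_adv - x) ** 2).sum(dim=1).mean()   # squared transport cost
        objective = loss_fn(f(x_adv), y) - transport_cost / (2.0 * reg_lambda)
        opt.zero_grad()
        (-objective).backward()                                 # gradient ascent on the objective
        opt.step()
    return T
```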

    Security Evaluation of Support Vector Machines in Adversarial Environments

    Full text link
    Support Vector Machines (SVMs) are among the most popular classification techniques adopted in security applications like malware detection, intrusion detection, and spam filtering. However, if SVMs are to be incorporated in real-world security systems, they must be able to cope with attack patterns that can either mislead the learning algorithm (poisoning), evade detection (evasion), or gain information about their internal parameters (privacy breaches). The main contributions of this chapter are twofold. First, we introduce a formal general framework for the empirical evaluation of the security of machine-learning systems. Second, according to our framework, we demonstrate the feasibility of evasion, poisoning and privacy attacks against SVMs in real-world security problems. For each attack technique, we evaluate its impact and discuss whether (and how) it can be countered through an adversary-aware design of SVMs. Our experiments are easily reproducible thanks to open-source code that we have made available, together with all the employed datasets, on a public repository. Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector Machine Applications'
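    As one concrete instance of an evasion attack against a kernel SVM, the sketch below performs plain gradient descent on the decision function of a trained scikit-learn RBF SVC; the attacks evaluated in the chapter are more elaborate, and the function name, step size, and stopping rule are my own illustrative choices.

```python
import numpy as np
from sklearn.svm import SVC

def evade_rbf_svm(clf, x, gamma, step=0.1, n_steps=50):
    """Move a positively classified point x along the negative gradient of the
    RBF-SVC decision function g(x) = sum_j c_j * K(sv_j, x) + b until it
    crosses the decision boundary or the step budget runs out."""
    sv = clf.support_vectors_
    coef = clf.dual_coef_.ravel()                   # c_j = alpha_j * y_j
    x = x.astype(float).copy()
    for _ in range(n_steps):
        k = np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))             # K(sv_j, x)
        grad = np.sum((coef * k)[:, None] * (-2.0 * gamma) * (x - sv), axis=0)
        x -= step * grad / (np.linalg.norm(grad) + 1e-12)              # normalized descent step
        if clf.decision_function(x[None, :])[0] < 0:                   # boundary crossed
            break
    return x
```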

    Differentially Private Multi-class Classification Using Kernel Supports and Equilibrium Points

    Get PDF
    Master's thesis -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, 2022.2. Jaewook Lee.
    In this paper, we propose a multi-class classification method using kernel supports and a dynamic system under differential privacy. We find that support vector machine (SVM) algorithms have a fundamental weakness with respect to differential privacy, because the decision function depends on a subset of the training data called the support vectors. Therefore, we develop a method that uses interior points called equilibrium points (EPs), without relying on the decision boundary. To construct EPs, we utilize a dynamic system together with a new differentially private support vector data description (SVDD), obtained by perturbing the sphere center in the kernel space. Empirical results show that the proposed method achieves better performance even on small datasets, where differential privacy typically performs poorly.
    Korean abstract: This thesis presents a differentially private multi-class classification method based on kernel supports and equilibrium points. Support vector methods are widely used in data analysis and machine learning, so learning while protecting users' data is essential. The most popular of these methods, the support vector machine (SVM), relies for classification only on a subset of the data called the support vectors, which makes it difficult to apply differential privacy: in the differentially private setting, where changing a single data record should change the output only slightly, the decision boundary is highly sensitive to the removal of a single support vector. To address this problem, this work proposes a differentially private multi-class classification method that uses points lying inside each cluster, called equilibrium points. To this end, we first obtain a support vector data description (SVDD) that satisfies differential privacy by perturbing the sphere center in the kernel space, use it as a level set, and compute local minima via a dynamical system. Using the equilibrium points, or hypercubes constructed for high-dimensional data, we present two ways to release the learned model for inference: (1) publishing the support function and (2) releasing the equilibrium points. Experimental results on eight datasets show that the proposed method exploits noise-robust interior points to outperform existing differentially private SVMs and can be applied even to small datasets, where differential privacy is otherwise hard to apply.
    Table of contents: Chapter 1 Introduction (1.1 Problem Description: Data Privacy; 1.2 The Privacy of Support Vector Methods; 1.3 Research Motivation and Contribution; 1.4 Organization of the Thesis). Chapter 2 Literature Review (2.1 Differentially Private Empirical Risk Minimization; 2.2 Differentially Private Support Vector Machine). Chapter 3 Preliminaries (3.1 Differential Privacy). Chapter 4 Differentially Private Support Vector Data Description (4.1 Support Vector Data Description; 4.2 Differentially Private Support Vector Data Description). Chapter 5 Differentially Private Multi-class Classification Utilizing SVDD (5.1 Phase I: Constructing a Private Support Level Function; 5.2 Phase II: Differentially Private Clustering on the Data Space via a Dynamical System; 5.3 Phase III: Classifying the Decomposed Regions under Differential Privacy). Chapter 6 Inference Scenarios and Releasing the Differentially Private Model (6.1 Publishing Support Function; 6.2 Releasing Equilibrium Points; 6.3 Comparison to Previous Methods). Chapter 7 Experiments (7.1 Models and Scenario Setting; 7.2 Datasets; 7.3 Experimental Settings; 7.4 Empirical Results on Various Datasets under Publishing Support Function; 7.5 Evaluating Robustness under Diverse Data Size; 7.6 Inference through Equilibrium Points). Chapter 8 Conclusion (8.1 Conclusion).
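    A hedged sketch of the sphere-center perturbation idea is given below: Gaussian-mechanism noise is added to the SVDD coefficient vector as a stand-in for the thesis's exact mechanism (its sensitivity analysis and noise calibration are not reproduced), and the resulting private support function could then be published for inference as in the "publishing support function" scenario. Function names and parameter choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def rbf_kernel(A, B, gamma):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

def private_svdd_score(X, alpha, gamma, epsilon, sensitivity, delta=1e-5):
    """Return x -> squared kernel distance to a perturbed SVDD sphere center
    a = sum_i alpha_i * phi(x_i). Clipping and renormalizing the noisy alpha
    is post-processing and does not affect the privacy guarantee."""
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon   # Gaussian mechanism scale
    alpha_priv = np.clip(alpha + rng.normal(scale=sigma, size=alpha.shape), 0.0, None)
    alpha_priv /= alpha_priv.sum()
    k_ss = rbf_kernel(X, X, gamma)
    center_norm_sq = alpha_priv @ k_ss @ alpha_priv

    def score(x_new):
        k_xs = rbf_kernel(np.atleast_2d(x_new), X, gamma).ravel()
        return 1.0 - 2.0 * alpha_priv @ k_xs + center_norm_sq            # K(x, x) = 1 for RBF

    return score
```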