2,169 research outputs found
A Differentially Private Multi-class Classification Method Using Kernel Supports and Equilibrium Points
Thesis (M.S.) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, 2022. 2. Jaewook Lee.
In this paper, we propose a multi-class classification method using kernel supports and a dynamic system under differential privacy. We find that support vector machine (SVM) algorithms have a fundamental weakness in implementing differential privacy, because the decision function depends on a subset of the training data called the support vectors. Therefore, we develop a method using interior points called equilibrium points (EPs) without relying on the decision boundary. To construct EPs, we utilize a dynamic system with a new differentially private support vector data description (SVDD) by perturbing the sphere center in the kernel space. Empirical results show that the proposed method achieves better performance even on small-sized datasets where differential privacy performs poorly.
In this thesis, we propose a differentially private multi-class classification method that uses kernel supports and equilibrium points. Support vector methods are widely used in data analysis and machine learning, so it is essential to train them while protecting user data. The most popular such method, the support vector machine (SVM), is difficult to combine with differential privacy because its classification depends only on a subset of the data called the support vectors. Under differential privacy, where the output must change little when a single record changes, the decision boundary is highly vulnerable to the removal of a single support vector. To resolve this problem, we propose a differentially private multi-class classification method that instead uses points in the interior of each cluster, called equilibrium points. To this end, we first construct a support vector data description (SVDD) that satisfies differential privacy by perturbing the sphere center in the kernel space, use it as a level set, and obtain the equilibrium points through a dynamical system. We then propose two ways to use the trained model for inference: (1) publishing the support function and (2) releasing the equilibrium points. Experimental results on eight diverse datasets show that the proposed method exploits noise-robust interior points to outperform existing differentially private support vector machines, and that it remains applicable to small datasets where differential privacy is otherwise difficult to apply.
Chapter 1 Introduction 1
1.1 Problem Description: Data Privacy 1
1.2 The Privacy of Support Vector Methods 2
1.3 Research Motivation and Contribution 4
1.4 Organization of the Thesis 5
Chapter 2 Literature Review 6
2.1 Differentially private empirical risk minimization 6
2.2 Differentially private support vector machine 7
Chapter 3 Preliminaries 9
3.1 Differential privacy 9
Chapter 4 Differentially private support vector data description 12
4.1 Support vector data description 12
4.2 Differentially private support vector data description 13
Chapter 5 Differentially private multi-class classification utilizing SVDD 19
5.1 Phase I: Constructing a private support level function 20
5.2 Phase II: Differentially private clustering on the data space via a dynamical system 21
5.3 Phase III: Classifying the decomposed regions under differential privacy 22
Chapter 6 Inference scenarios and releasing the differentially private model 25
6.1 Publishing support function 26
6.2 Releasing equilibrium points 26
6.3 Comparison to previous methods 27
Chapter 7 Experiments 28
7.1 Models and Scenario setting 28
7.2 Datasets 29
7.3 Experimental settings 29
7.4 Empirical results on various datasets under publishing support function 30
7.5 Evaluating robustness under diverse data size 33
7.6 Inference through equilibrium points 33
Chapter 8 Conclusion 34
8.1 Conclusion 34
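The mechanism named in the abstract above, a differentially private SVDD obtained by perturbing the sphere center in kernel space, can be sketched minimally as follows. This is not the thesis's actual algorithm: the sketch assumes uniform SVDD weights alpha_i = 1/n instead of the solved dual coefficients, an RBF kernel (so every feature vector has unit norm and replacing one of n points moves the center by at most 2/n, its L2 sensitivity), and the standard Gaussian mechanism; all function names are illustrative.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Pairwise RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def private_svdd_center(n, epsilon=2.0, delta=1e-5, seed=None):
    """Gaussian-mechanism perturbation of uniform SVDD center weights.

    With an RBF kernel, ||phi(x)|| = 1, so the uniform-weight center
    a = (1/n) * sum_i phi(x_i) moves by at most 2/n in feature-space norm
    when one point is replaced; that bound is used as the L2 sensitivity.
    """
    rng = np.random.default_rng(seed)
    sensitivity = 2.0 / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return np.full(n, 1.0 / n) + rng.normal(0.0, sigma, size=n)

def distance_to_center(x, X, alpha, gamma=0.5):
    # ||phi(x) - a||^2 = k(x,x) - 2 * sum_i alpha_i k(x, x_i) + alpha^T K alpha
    kx = rbf_kernel(x[None, :], X, gamma)[0]
    K = rbf_kernel(X, X, gamma)
    return 1.0 - 2.0 * (alpha @ kx) + alpha @ K @ alpha
```

Inliers score a smaller kernel distance to the noisy center than far-away points, so the perturbed sphere still describes the data when n is large enough for the 2/n sensitivity to keep the noise small.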
A Manifest-Based Framework for Organizing the Management of Personal Data at the Edge of the Network
Smart disclosure initiatives and new regulations such as GDPR allow individuals to take back control of their data by gathering their entire digital life in a Personal Data Management System (PDMS). Multiple PDMS architectures exist, from centralized web hosting solutions to self-data hosting at home. These solutions differ strongly in their ability to preserve data privacy and to perform collective computations crossing the data of multiple individuals (e.g., epidemiological or social studies), but none of them satisfies both objectives. The emergence of Trusted Execution Environments (TEE) changes the game. We propose a solution called Trusted PDMS, combining the TEE and PDMS properties to manage the data of each individual, and a Manifest-based framework to securely execute collective computations on top of them. We demonstrate the practicality of the solution through a real case study conducted over 10,000 patients in the healthcare field.
Heavy Hitters and the Structure of Local Privacy
We present a new locally differentially private algorithm for the heavy
hitters problem which achieves optimal worst-case error as a function of all
standardly considered parameters. Prior work obtained error rates which depend
optimally on the number of users, the size of the domain, and the privacy
parameter, but depend sub-optimally on the failure probability.
We strengthen existing lower bounds on the error to incorporate the failure
probability, and show that our new upper bound is tight with respect to this
parameter as well. Our lower bound is based on a new understanding of the
structure of locally private protocols. We further develop these ideas to
obtain the following general results beyond heavy hitters.
Advanced Grouposition: In the local model, group privacy for k
users degrades proportionally to √k, instead of linearly in k
as in the central model. Stronger group privacy yields improved max-information
guarantees, as well as stronger lower bounds (via "packing arguments"), over
the central model.
Building on a transformation of Bassily and Smith (STOC 2015), we
give a generic transformation from any non-interactive approximate-private
local protocol into a pure-private local protocol. Again in contrast with the
central model, this shows that we cannot obtain more accurate algorithms by
moving from pure to approximate local privacy.
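The optimal protocol in the abstract above is involved, but the baseline that locally private frequency estimation (and hence heavy-hitter detection over a small domain) builds on, generalized randomized response with an unbiased decoding step, can be sketched as follows. This is an illustrative baseline, not the paper's algorithm; all names are assumptions.

```python
import numpy as np

def rr_response(item, domain_size, epsilon, rng):
    """Generalized randomized response: report the true item with
    probability e^eps / (e^eps + d - 1), otherwise a uniformly
    random other item. Each user's report is epsilon-locally private."""
    p_true = np.exp(epsilon) / (np.exp(epsilon) + domain_size - 1)
    if rng.random() < p_true:
        return item
    other = rng.integers(domain_size - 1)  # uniform over the d-1 other items
    return other if other < item else other + 1

def estimate_frequencies(reports, domain_size, epsilon):
    """Unbiased frequency estimates obtained by inverting the RR channel."""
    n = len(reports)
    p = np.exp(epsilon) / (np.exp(epsilon) + domain_size - 1)
    q = (1.0 - p) / (domain_size - 1)
    counts = np.bincount(reports, minlength=domain_size)
    return (counts / n - q) / (p - q)
```

Taking the argmax of the decoded frequencies recovers the heavy hitter; the error of this baseline grows with the domain size, which is what the more refined protocols in the paper avoid.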
Privacy in characterizing and recruiting patients for IoHT-aided digital clinical trials
Nowadays there is a tremendous number of smart and connected devices that produce data. The so-called IoT is so pervasive that its devices (in particular the ones we carry with us throughout the day: wearables, smartphones...) often provide third parties with insights into our lives. People habitually exchange some of their private data in order to obtain services, discounts and advantages. Sharing personal data is commonly accepted in contexts like social networks, but individuals suddenly become more than concerned if a third party is interested in accessing personal health data. Healthcare systems worldwide, however, have begun to take advantage of the data produced by eHealth solutions. It is clear that while on one hand the technology has proved to be a great ally of modern medicine and can lead to notable benefits, on the other hand these processes pose serious threats to our privacy. The process of testing, validating and putting on the market a new drug or medical treatment is called a clinical trial. These trials are deeply impacted by technological advancements and greatly benefit from the use of eHealth solutions. The clinical research institutes are the entities in charge of leading the trials and need to access as much health data of the patients as possible. However, at any phase of a clinical trial, the personal information of the participants should be preserved and kept private as long as possible. In this thesis, we introduce an architecture that protects the privacy of personal data during the first phases of digital clinical trials (namely the characterization phase and the recruiting phase), allowing potential participants to freely join trials without disclosing their personal health information without a proper reward and/or prior agreement.
We illustrate the trusted environment, which is the most used approach in eHealth, and then dig into the untrusted environment, where the concept of privacy is more challenging to protect while maintaining the usability of data. Our architecture keeps individuals in full control over the flow of their personal health data. Moreover, the architecture allows the clinical research institutes to characterize the population of potential users without direct access to their personal data. We validated our architecture with a proof of concept that includes all the involved entities, from the low-level hardware up to the end application. We designed and realized hardware capable of sensing, processing and transmitting personal health data in a privacy-preserving fashion that requires little to no maintenance.
A Survey on Differential Privacy with Machine Learning and Future Outlook
Nowadays, machine learning models and applications have become increasingly
pervasive. With this rapid increase in the development and employment of
machine learning models, a concern regarding privacy has risen. Thus, there is
a legitimate need to protect the data from leaking and from any attacks. One of
the strongest and most prevalent privacy models that can be used to protect
machine learning models from any attacks and vulnerabilities is differential
privacy (DP). DP is a strict and rigorous definition of privacy that guarantees
an adversary cannot reliably determine whether or not a specific participant is
included in the dataset. It works by injecting noise into the data, whether into
the inputs, the outputs, the ground-truth labels, the objective function, or
even the gradients, to alleviate the privacy issue and protect the data. To
this end, this survey paper presents different
differentially private machine learning algorithms categorized into two main
categories (traditional machine learning models vs. deep learning models).
Moreover, future research directions for differential privacy with machine
learning algorithms are outlined.
Comment: 12 pages, 3 figures
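As a concrete instance of the output-noise injection this survey describes, the classic Laplace mechanism for a counting query can be sketched as follows (a toy illustration, not taken from the survey; the function name is ours):

```python
import numpy as np

def laplace_count(data, predicate, epsilon, rng):
    """Epsilon-DP count via the Laplace mechanism.

    A counting query has L1 sensitivity 1: adding or removing one record
    changes the count by at most 1. Adding Laplace noise with scale
    1/epsilon therefore yields epsilon-differential privacy.
    """
    true_count = sum(1 for record in data if predicate(record))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)
```

The same recipe, with the noise scale set by the query's sensitivity divided by epsilon, underlies many of the input-, objective-, and gradient-perturbation schemes the survey categorizes.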
- …