2,169 research outputs found
A Differentially Private Multi-class Classification Method Using Kernel Supports and Equilibrium Points
Thesis (M.S.) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, 2022. 2. Jaewook Lee.
In this paper, we propose a multi-class classification method using kernel supports and a dynamic system under differential privacy. We find that support vector machine (SVM) algorithms have a fundamental weakness in implementing differential privacy, because the decision function depends on a subset of the training data called the support vectors. Therefore, we develop a method using interior points called equilibrium points (EPs) without relying on the decision boundary. To construct EPs, we utilize a dynamic system with a new differentially private support vector data description (SVDD) by perturbing the sphere center in the kernel space. Empirical results show that the proposed method achieves better performance even on small-sized datasets where differential privacy performs poorly.
In this thesis, we propose a differentially private multi-class classification method that uses kernel supports and equilibrium points. Support vector methods are widely used in data analysis and machine learning, so it is essential to train them while protecting user data. The most popular such method, the support vector machine (SVM), is difficult to combine with differential privacy because its classification depends only on a subset of the data called the support vectors. Under differential privacy, where the output must change little when a single record changes, the decision boundary is highly vulnerable to the removal of a single support vector. To resolve this problem, we propose a differentially private multi-class classification method that instead uses points in the interior of each cluster, called equilibrium points. To this end, we first construct a support vector data description (SVDD) that satisfies differential privacy by perturbing the sphere center in the kernel space, use it as a level set, and obtain the equilibrium points through a dynamical system. We then propose two ways to use the trained model for inference: (1) publishing the support function and (2) releasing the equilibrium points. Experimental results on eight diverse datasets show that the proposed method exploits noise-robust interior points to outperform existing differentially private support vector machines, and that it remains applicable to small datasets where differential privacy is otherwise difficult to apply.
Chapter 1 Introduction 1
1.1 Problem Description: Data Privacy 1
1.2 The Privacy of Support Vector Methods 2
1.3 Research Motivation and Contribution 4
1.4 Organization of the Thesis 5
Chapter 2 Literature Review 6
2.1 Differentially private empirical risk minimization 6
2.2 Differentially private support vector machine 7
Chapter 3 Preliminaries 9
3.1 Differential privacy 9
Chapter 4 Differentially private support vector data description 12
4.1 Support vector data description 12
4.2 Differentially private support vector data description 13
Chapter 5 Differentially private multi-class classification utilizing SVDD 19
5.1 Phase I: Constructing a private support level function 20
5.2 Phase II: Differentially private clustering on the data space via a dynamical system 21
5.3 Phase III: Classifying the decomposed regions under differential privacy 22
Chapter 6 Inference scenarios and releasing the differentially private model 25
6.1 Publishing support function 26
6.2 Releasing equilibrium points 26
6.3 Comparison to previous methods 27
Chapter 7 Experiments 28
7.1 Models and Scenario setting 28
7.2 Datasets 29
7.3 Experimental settings 29
7.4 Empirical results on various datasets under publishing support function 30
7.5 Evaluating robustness under diverse data size 33
7.6 Inference through equilibrium points 33
Chapter 8 Conclusion 34
8.1 Conclusion 34
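The mechanism named in the abstract above, a differentially private SVDD obtained by perturbing the sphere center in kernel space, can be sketched minimally as follows. This is not the thesis's actual algorithm: the sketch assumes uniform SVDD weights alpha_i = 1/n instead of the solved dual coefficients, an RBF kernel (so every feature vector has unit norm and replacing one of n points moves the center by at most 2/n, its L2 sensitivity), and the standard Gaussian mechanism; all function names are illustrative.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Pairwise RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def private_svdd_center(n, epsilon=2.0, delta=1e-5, seed=None):
    """Gaussian-mechanism perturbation of uniform SVDD center weights.

    With an RBF kernel, ||phi(x)|| = 1, so the uniform-weight center
    a = (1/n) * sum_i phi(x_i) moves by at most 2/n in feature-space norm
    when one point is replaced; that bound is used as the L2 sensitivity.
    """
    rng = np.random.default_rng(seed)
    sensitivity = 2.0 / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return np.full(n, 1.0 / n) + rng.normal(0.0, sigma, size=n)

def distance_to_center(x, X, alpha, gamma=0.5):
    # ||phi(x) - a||^2 = k(x,x) - 2 * sum_i alpha_i k(x, x_i) + alpha^T K alpha
    kx = rbf_kernel(x[None, :], X, gamma)[0]
    K = rbf_kernel(X, X, gamma)
    return 1.0 - 2.0 * (alpha @ kx) + alpha @ K @ alpha
```

Inliers score a smaller kernel distance to the noisy center than far-away points, so the perturbed sphere still describes the data when n is large enough for the 2/n sensitivity to keep the noise small.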
A Manifest-Based Framework for Organizing the Management of Personal Data at the Edge of the Network
Smart disclosure initiatives and new regulations such as GDPR allow individuals to take back control of their data by gathering their entire digital life in a Personal Data Management System (PDMS). Multiple PDMS architectures exist, from centralized web hosting solutions to self-data hosting at home. These solutions differ strongly in their ability to preserve data privacy and to perform collective computations crossing the data of multiple individuals (e.g., epidemiological or social studies), but none of them satisfies both objectives. The emergence of Trusted Execution Environments (TEE) changes the game. We propose a solution called Trusted PDMS, combining the TEE and PDMS properties to manage the data of each individual, and a Manifest-based framework to securely execute collective computations on top of them. We demonstrate the practicality of the solution through a real case study conducted over 10,000 patients in the healthcare field.
Heavy Hitters and the Structure of Local Privacy
We present a new locally differentially private algorithm for the heavy
hitters problem which achieves optimal worst-case error as a function of all
standardly considered parameters. Prior work obtained error rates which depend
optimally on the number of users, the size of the domain, and the privacy
parameter, but depend sub-optimally on the failure probability.
We strengthen existing lower bounds on the error to incorporate the failure
probability, and show that our new upper bound is tight with respect to this
parameter as well. Our lower bound is based on a new understanding of the
structure of locally private protocols. We further develop these ideas to
obtain the following general results beyond heavy hitters.
Advanced Grouposition: In the local model, group privacy for k
users degrades proportionally to √k, instead of linearly in k
as in the central model. Stronger group privacy yields improved max-information
guarantees, as well as stronger lower bounds (via "packing arguments"), over
the central model.
Building on a transformation of Bassily and Smith (STOC 2015), we
give a generic transformation from any non-interactive approximate-private
local protocol into a pure-private local protocol. Again in contrast with the
central model, this shows that we cannot obtain more accurate algorithms by
moving from pure to approximate local privacy.
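The optimal protocol in the abstract above is involved, but the baseline that locally private frequency estimation (and hence heavy-hitter detection over a small domain) builds on, generalized randomized response with an unbiased decoding step, can be sketched as follows. This is an illustrative baseline, not the paper's algorithm; all names are assumptions.

```python
import numpy as np

def rr_response(item, domain_size, epsilon, rng):
    """Generalized randomized response: report the true item with
    probability e^eps / (e^eps + d - 1), otherwise a uniformly
    random other item. Each user's report is epsilon-locally private."""
    p_true = np.exp(epsilon) / (np.exp(epsilon) + domain_size - 1)
    if rng.random() < p_true:
        return item
    other = rng.integers(domain_size - 1)  # uniform over the d-1 other items
    return other if other < item else other + 1

def estimate_frequencies(reports, domain_size, epsilon):
    """Unbiased frequency estimates obtained by inverting the RR channel."""
    n = len(reports)
    p = np.exp(epsilon) / (np.exp(epsilon) + domain_size - 1)
    q = (1.0 - p) / (domain_size - 1)
    counts = np.bincount(reports, minlength=domain_size)
    return (counts / n - q) / (p - q)
```

Taking the argmax of the decoded frequencies recovers the heavy hitter; the error of this baseline grows with the domain size, which is what the more refined protocols in the paper avoid.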
Privacy in characterizing and recruiting patients for IoHT-aided digital clinical trials
Nowadays there is a tremendous number of smart and connected devices that produce data. The so-called IoT is so pervasive that its devices (in particular the ones we carry with us throughout the day: wearables, smartphones...) often provide third parties with insights into our lives. People habitually exchange some of their private data in order to obtain services, discounts and advantages. Sharing personal data is commonly accepted in contexts like social networks, but individuals suddenly become more than concerned if a third party is interested in accessing personal health data. Healthcare systems worldwide, however, have begun to take advantage of the data produced by eHealth solutions. It is clear that while on one hand the technology has proved to be a great ally of modern medicine and can lead to notable benefits, on the other hand these processes pose serious threats to our privacy. The process of testing, validating and putting on the market a new drug or medical treatment is called a clinical trial. These trials are deeply impacted by technological advancements and greatly benefit from the use of eHealth solutions. The clinical research institutes are the entities in charge of leading the trials and need to access as much health data of the patients as possible. However, at any phase of a clinical trial, the personal information of the participants should be preserved and kept private as long as possible. In this thesis, we introduce an architecture that protects the privacy of personal data during the first phases of digital clinical trials (namely the characterization phase and the recruiting phase), allowing potential participants to freely join trials without disclosing their personal health information without a proper reward and/or prior agreement.
We illustrate the trusted environment, which is the most used approach in eHealth, and then dig into the untrusted environment, where the concept of privacy is more challenging to protect while maintaining the usability of data. Our architecture keeps individuals in full control over the flow of their personal health data. Moreover, the architecture allows the clinical research institutes to characterize the population of potential users without direct access to their personal data. We validated our architecture with a proof of concept that includes all the involved entities, from the low-level hardware up to the end application. We designed and realized hardware capable of sensing, processing and transmitting personal health data in a privacy-preserving fashion that requires little to no maintenance.
A Survey on Differential Privacy with Machine Learning and Future Outlook
Nowadays, machine learning models and applications have become increasingly
pervasive. With this rapid increase in the development and employment of
machine learning models, a concern regarding privacy has risen. Thus, there is
a legitimate need to protect the data from leaking and from any attacks. One of
the strongest and most prevalent privacy models that can be used to protect
machine learning models from any attacks and vulnerabilities is differential
privacy (DP). DP is a strict and rigorous definition of privacy that guarantees
an adversary cannot reliably determine whether or not a specific participant is
included in the dataset. It works by injecting noise into the data, whether into
the inputs, the outputs, the ground-truth labels, the objective function, or
even the gradients, to alleviate the privacy issue and protect the data. To
this end, this survey paper presents different
differentially private machine learning algorithms categorized into two main
categories (traditional machine learning models vs. deep learning models).
Moreover, future research directions for differential privacy with machine
learning algorithms are outlined.
Comment: 12 pages, 3 figures
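As a concrete instance of the output-noise injection this survey describes, the classic Laplace mechanism for a counting query can be sketched as follows (a toy illustration, not taken from the survey; the function name is ours):

```python
import numpy as np

def laplace_count(data, predicate, epsilon, rng):
    """Epsilon-DP count via the Laplace mechanism.

    A counting query has L1 sensitivity 1: adding or removing one record
    changes the count by at most 1. Adding Laplace noise with scale
    1/epsilon therefore yields epsilon-differential privacy.
    """
    true_count = sum(1 for record in data if predicate(record))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)
```

The same recipe, with the noise scale set by the query's sensitivity divided by epsilon, underlies many of the input-, objective-, and gradient-perturbation schemes the survey categorizes.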
- …