13 research outputs found
Advancing Personalized Federated Learning: Group Privacy, Fairness, and Beyond
Federated learning (FL) is a framework for training machine learning models
in a distributed and collaborative manner. During training, a set of
participating clients process their data stored locally, sharing only the model
updates obtained by minimizing a cost function over their local inputs. FL was
proposed as a stepping-stone towards privacy-preserving machine learning, but
it has been shown vulnerable to issues such as leakage of private information,
lack of personalization of the model, and the possibility of having a trained
model that is fairer to some groups than to others. In this paper, we address
the triadic interaction among personalization, privacy guarantees, and fairness
attained by models trained within the FL framework. Differential privacy and
its variants have been studied and applied as cutting-edge standards for
providing formal privacy guarantees. However, clients in FL often hold very
diverse datasets representing heterogeneous communities, making it important to
protect their sensitive information while still ensuring that the trained
model remains fair across those communities. To attain this objective, we put
forth a method that introduces group privacy guarantees through the use of
d-privacy (also known as metric privacy), a localized form of differential
privacy that relies on a metric-oriented obfuscation approach to preserve the
topological distribution of the original data. Besides enabling personalized
model training in a federated setting and providing formal privacy guarantees,
the method achieves significantly better group fairness, measured under a
variety of standard metrics, than a global model trained within a classical FL
template. Theoretical justifications for its applicability are provided, as
well as experimental validation on real-world datasets to illustrate the
workings of the proposed method.
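As an illustration of the metric-based obfuscation underlying metric privacy, here is a minimal NumPy sketch of the planar Laplace mechanism, the geo-indistinguishability instance of this family. The function name and parameters are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def planar_laplace(point, eps, rng=None):
    """Obfuscate a 2-D point with the planar Laplace mechanism, an
    instance of metric (d-)privacy.

    The output density is proportional to exp(-eps * ||z - point||),
    so nearby inputs remain statistically indistinguishable while the
    overall spatial distribution of the data is roughly preserved.
    """
    if rng is None:
        rng = np.random.default_rng()
    theta = rng.uniform(0.0, 2.0 * np.pi)      # direction: uniform angle
    r = rng.gamma(shape=2.0, scale=1.0 / eps)  # radius: Gamma(2, 1/eps)
    return np.asarray(point, float) + r * np.array([np.cos(theta), np.sin(theta)])

# Smaller eps means more noise and hence stronger indistinguishability.
noisy = planar_laplace([2.0, 3.0], eps=0.5)
```

The Gamma(2, 1/eps) radial law arises because the planar density exp(-eps*r), integrated over circles of radius r, leaves a marginal proportional to r*exp(-eps*r).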
Causal Discovery Under Local Privacy
Differential privacy is a widely adopted framework designed to safeguard the
sensitive information of data providers within a data set. It is based on the
application of controlled noise at the interface between the server that stores
and processes the data, and the data consumers. Local differential privacy is a
variant that allows data providers to apply the privatization mechanism
themselves, on their data individually. It therefore provides protection also
in contexts where the server, or even the data collector, cannot be trusted.
The introduction of noise, however, inevitably affects the utility of the data,
particularly by distorting the correlations between individual data components.
This distortion can prove detrimental to tasks such as causal discovery. In
this paper, we consider various well-known locally differentially private
mechanisms and compare the trade-off between the privacy they provide, and the
accuracy of the causal structure produced by algorithms for causal learning
when applied to data obfuscated by these mechanisms. Our analysis yields
valuable insights for selecting appropriate local differentially private
protocols for causal discovery tasks. We foresee that our findings will aid
researchers and practitioners in conducting locally private causal discovery.
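One of the simplest locally differentially private protocols such a comparison would cover is binary randomized response. The sketch below (illustrative names; not necessarily among the exact mechanisms the paper studies) shows both the obfuscation step and the debiasing step an analyst must apply before estimating any statistic:

```python
import numpy as np

def randomized_response(bit, eps, rng=None):
    """Binary randomized response: report the true bit with probability
    e^eps / (e^eps + 1) and the flipped bit otherwise; satisfies eps-LDP."""
    if rng is None:
        rng = np.random.default_rng()
    p_true = np.exp(eps) / (np.exp(eps) + 1.0)
    return bit if rng.random() < p_true else 1 - bit

def debias_mean(reports, eps):
    """Unbiased estimate of the true mean of the bits from noisy reports:
    E[report] = (2p - 1) * mean + (1 - p), so invert that affine map."""
    p = np.exp(eps) / (np.exp(eps) + 1.0)
    return (np.mean(reports) - (1.0 - p)) / (2.0 * p - 1.0)
```

Marginals can be debiased this way, but the flipping acts independently per attribute, which is exactly why cross-attribute correlations, and hence causal-structure signals, get attenuated.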
An Adaptive Grid and Incentive Mechanism for Personalized Differentially Private Location Data in the Local Setting
With the proliferation of wireless communication and mobile devices, various location-based services are emerging. For these services to grow, more accurate and more varied types of personal location data are required. However, concerns about privacy violations are a significant obstacle to obtaining personal location data. In this paper, we propose a local differential privacy scheme for environments where there is no trusted third party to implement privacy protection techniques, together with incentive mechanisms that motivate users to provide more accurate location data. The proposed local differential privacy scheme allows a user to set a personalized safe region that he/she is willing to disclose and then perturbs the user's location within that safe region. This satisfies users' varied privacy requirements while improving data utility. The proposed incentive mechanism has two models, and both pay incentives according to the size of the user's safe region, motivating users to set a more precise safe region. We verify through experiments that the proposed local differential privacy algorithm and incentive mechanism satisfy the required privacy protection level while achieving the desired utility.
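A hypothetical sketch of the safe-region idea (illustrative only; the function and the clipping rule are assumptions, not the authors' exact mechanism): perturb the location with metric-privacy-style planar noise, then keep the reported point inside the user-chosen disk:

```python
import numpy as np

def perturb_in_safe_region(loc, center, radius, eps, rng=None):
    """Sketch of personalized location perturbation: add planar-Laplace-style
    noise to the true location, then project the result back into the
    user-chosen safe region (a disk of `radius` around `center`). A larger
    safe region permits larger perturbations: stronger privacy, lower utility."""
    if rng is None:
        rng = np.random.default_rng()
    c = np.asarray(center, float)
    theta = rng.uniform(0.0, 2.0 * np.pi)      # noise direction
    r = rng.gamma(2.0, 1.0 / eps)              # noise radius
    noisy = np.asarray(loc, float) + r * np.array([np.cos(theta), np.sin(theta)])
    offset = noisy - c
    dist = np.linalg.norm(offset)
    if dist > radius:                          # clip onto the region boundary
        noisy = c + offset * (radius / dist)
    return noisy
```

The clipping step is what makes the guarantee personalized: whatever the noise draws, nothing outside the disclosed region is ever reported.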
A Fast Sampler for the von Mises-Fisher Distribution in Python
This paper implements a method for sampling from the d-dimensional von Mises-Fisher distribution using NumPy, focusing on speed and readability. The complexity of the algorithm is O(nd) for n samples, which is theoretically optimal given that nd is the output size.
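A standard construction for this task samples the component along the mean direction by rejection (Ulrich/Wood style) and rotates the result with a Householder reflection. The sketch below follows that recipe and is not necessarily the paper's exact algorithm:

```python
import numpy as np

def sample_vmf(mu, kappa, n, rng=None):
    """Draw n samples from the von Mises-Fisher distribution on the unit
    sphere in R^d, with mean direction mu and concentration kappa."""
    if rng is None:
        rng = np.random.default_rng()
    mu = np.asarray(mu, float)
    d = mu.size
    mu = mu / np.linalg.norm(mu)

    # Rejection sampling for w = <x, mu> (Ulrich 1984 / Wood 1994).
    b = (-2.0 * kappa + np.sqrt(4.0 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (d - 1) * np.log(1.0 - x0**2)
    w = np.empty(n)
    for i in range(n):
        while True:
            z = rng.beta((d - 1) / 2.0, (d - 1) / 2.0)
            wi = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
            if kappa * wi + (d - 1) * np.log(1.0 - x0 * wi) - c >= np.log(rng.uniform()):
                break
        w[i] = wi

    # Uniform directions in the hyperplane orthogonal to the north pole e1.
    v = rng.standard_normal((n, d - 1))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    x = np.concatenate([w[:, None], np.sqrt(1.0 - w**2)[:, None] * v], axis=1)

    # Householder reflection mapping e1 onto mu (a norm-preserving rotation).
    e1 = np.zeros(d); e1[0] = 1.0
    u = e1 - mu
    norm_u = np.linalg.norm(u)
    if norm_u > 1e-12:
        u /= norm_u
        x = x - 2.0 * np.outer(x @ u, u)
    return x
```

The per-sample rejection loop has O(1) expected iterations for fixed kappa and d, and every other step is vectorized, which is consistent with the O(nd) cost claimed above.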
An Adaptive Window Size Selection Method for Differentially Private Data Publishing over Infinite Trajectory Stream
Recently, with the development of wireless Internet and sensor technology, various services based on users' locations are emerging. VANET (vehicular ad hoc network), in which a large number of vehicles communicate wirelessly, is highlighted as one of these services. A VANET periodically collects and analyzes traffic data to provide a traffic information service. The problem is that traffic data contain users' sensitive location information, which can lead to privacy violations. Differential privacy techniques are used as a de facto standard to prevent such privacy violations caused by data analysis. However, applying differential privacy to a traffic data stream, which has infinite size over time, makes the data useless because too much noise must be inserted to protect privacy. To overcome this limitation, existing studies set a window of a certain range and apply differential privacy to the windowed data. However, previous studies set a fixed window size and do not consider properties of the traffic data such as road structure and time-based traffic variation. This may lead to insufficient privacy protection and unnecessary degradation of data utility. In this paper, we propose an adaptive window size selection method that considers the correlation between road networks and time-based traffic variation to solve the fixed-window-size problem. We also suggest an adjustable privacy budget allocation technique corresponding to the adaptive window size selection. Through experiments designed on a real-world road network, we show that the proposed method improves data utility while satisfying the same level of differential privacy as the existing method.
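A minimal sketch of windowed budget allocation over a stream of counts (a fixed window size is shown for simplicity; the function and parameter names are illustrative assumptions, while the paper's method adapts the window and per-step budget to road structure and traffic variation):

```python
import numpy as np

def publish_stream(counts, eps, window, rng=None):
    """Release a stream of counts under windowed differential privacy.

    Each timestamp receives an equal budget share eps/window, so any
    `window` consecutive releases jointly satisfy eps-DP (w-event-style
    accounting for sensitivity-1 count queries). Shrinking the window
    lowers the noise per release but shortens the protected horizon.
    """
    if rng is None:
        rng = np.random.default_rng()
    per_step_eps = eps / window        # uniform budget allocation
    scale = 1.0 / per_step_eps         # Laplace scale for a count query
    return [c + rng.laplace(0.0, scale) for c in counts]
```

The fixed `window` here is precisely the limitation the abstract targets: when traffic varies over time, a static split wastes budget in quiet periods and under-protects busy ones.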
Establishing the Price of Privacy in Federated Data Trading
An Incentive Mechanism for Trading Personal Data in Data Markets
Group privacy for personalized federated learning
Federated learning is a type of collaborative machine learning, where
participating clients process their data locally, sharing only updates to the
collaborative model. This enables building privacy-aware distributed machine
learning models, among other benefits. The goal is the optimization of a statistical
model's parameters by minimizing a cost function of a collection of datasets
which are stored locally by a set of clients. This process exposes the clients
to two issues: leakage of private information and lack of personalization of
the model. On the other hand, with the recent advancements in techniques to
analyze data, there is a surge of concern for the privacy violation of the
participating clients. To mitigate this, differential privacy and its variants
serve as a standard for providing formal privacy guarantees. Often the clients
represent very heterogeneous communities and hold data which are very diverse.
Therefore, aligned with the recent focus of the FL community to build a
framework of personalized models for the users representing their diversity, it
is also of utmost importance to protect against potential threats against the
sensitive and personal information of the clients. d-privacy, a
generalization of geo-indistinguishability (the recently popularized paradigm
of location privacy), uses a metric-based obfuscation technique that preserves
the spatial distribution of the original data. To address the issue of
protecting the privacy of the clients while allowing for personalized model
training to enhance the fairness and utility of the system, we propose a
method that provides group privacy guarantees by exploiting some key
properties of d-privacy, which enables personalized models under the framework
of FL. We provide theoretical justifications for the applicability, as well as
experimental validation on real-world datasets, to illustrate the workings of
the proposed method.
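The group guarantee alluded to here follows from the metric property: a metric-private mechanism with parameter eps distinguishes two inputs x, y by at most a factor exp(eps * d(x, y)), so a whole group of records is protected up to its metric diameter. A hypothetical helper illustrating that scaling (not the paper's method):

```python
import numpy as np

def group_guarantee(eps, points):
    """Effective indistinguishability level for a group of records under a
    metric (d-)private mechanism: eps scaled by the group's metric diameter,
    here the maximum pairwise Euclidean distance among the records."""
    pts = np.asarray(points, float)
    diff = pts[:, None, :] - pts[None, :, :]        # all pairwise differences
    diameter = np.linalg.norm(diff, axis=-1).max()  # group diameter
    return eps * diameter
```

Tighter groups (smaller diameter) therefore enjoy stronger joint protection at the same noise level, which is what makes per-community guarantees possible.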
Group privacy for personalized federated learning
Federated learning (FL) is a type of collaborative machine learning where participating peers/clients process their data locally, sharing only updates to the collaborative model. This enables building privacy-aware distributed machine learning models, among other benefits. The goal is the optimization of a statistical model's parameters by minimizing a cost function of a collection of datasets which are stored locally by a set of clients. This process exposes the clients to two issues: leakage of private information and lack of personalization of the model. On the other hand, with the recent advancements in various techniques to analyze data, there is a surge of concern about privacy violations of the participating clients. To mitigate this, differential privacy and its variants serve as a standard for providing formal privacy guarantees. Often the clients represent very heterogeneous communities and hold data which are very diverse. Therefore, aligned with the recent focus of the FL community on building a framework of personalized models for the users representing their diversity, it is also of utmost importance to protect the clients' sensitive and personal information against potential threats. To address this goal we consider d-privacy, also known as metric privacy, a variant of local differential privacy that uses a metric-based obfuscation technique to preserve the topological distribution of the original data. To cope with the issue of protecting the privacy of the clients while allowing for personalized model training to enhance the fairness and utility of the system, we propose a method that provides group privacy guarantees by exploiting some key properties of d-privacy, which enables personalized models under the framework of FL. We provide theoretical justifications for the applicability and experimental validation on real datasets to illustrate the working of our method.