OpBoost: A Vertical Federated Tree Boosting Framework Based on Order-Preserving Desensitization
Vertical Federated Learning (FL) is a new paradigm that enables users with
non-overlapping attributes of the same data samples to jointly train a model
without directly sharing the raw data. Nevertheless, recent work shows that
this alone is not sufficient to prevent privacy leakage from the training process
or the trained model. This paper focuses on privacy-preserving
tree boosting algorithms under vertical FL. The existing solutions based on
cryptography involve heavy computation and communication overhead and are
vulnerable to inference attacks. Although the solution based on Local
Differential Privacy (LDP) addresses the above problems, it yields low
accuracy for the trained model.
This paper explores how to improve the accuracy of widely deployed tree
boosting algorithms that satisfy differential privacy under vertical FL.
Specifically, we introduce a framework called OpBoost. Three order-preserving
desensitization algorithms satisfying a variant of LDP called distance-based
LDP (dLDP) are designed to desensitize the training data. In particular, we
optimize the dLDP definition and study efficient sampling distributions to
further improve the accuracy and efficiency of the proposed algorithms. The
proposed algorithms provide a trade-off between the privacy of pairs with large
distance and the utility of desensitized values. Comprehensive evaluations show
that models trained with OpBoost achieve better prediction accuracy
than those trained with existing LDP approaches under reasonable settings. Our code is open
source.
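To make the idea of distance-based LDP more concrete, the following is a minimal sketch of one well-known way to obtain such a guarantee: an exponential mechanism over a bounded integer domain, where outputs close to the true value are more likely. It is not taken from the paper and is not one of OpBoost's three algorithms; the function names `dldp_perturb` and `output_probabilities`, the integer domain, and the scaling factor `epsilon / 2` are all assumptions made for illustration.

```python
import math
import random


def output_probabilities(value, domain, epsilon):
    """Return P(output = y | input = value) for every y in `domain`.

    Probabilities are proportional to exp(-epsilon * |value - y| / 2), so
    outputs near the true value are favored (illustrative choice, not the
    paper's sampling distribution).
    """
    weights = [math.exp(-epsilon * abs(value - y) / 2.0) for y in domain]
    total = sum(weights)
    return [w / total for w in weights]


def dldp_perturb(value, domain, epsilon, rng=random):
    """Sample one desensitized value from `domain` for the input `value`."""
    probs = output_probabilities(value, domain, epsilon)
    return rng.choices(domain, weights=probs, k=1)[0]


if __name__ == "__main__":
    domain = list(range(10))          # hypothetical discretized feature values
    rng = random.Random(0)            # fixed seed only for reproducibility
    print([dldp_perturb(x, domain, 1.0, rng) for x in (1, 5, 9)])
```

Under these assumptions, for any two inputs `x` and `x2` the output probabilities differ by at most a factor of `exp(epsilon * |x - x2|)`, so pairs of values that are far apart are protected less strongly than close pairs, which is the trade-off the abstract describes. Order is preserved only in a probabilistic sense in this sketch.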
A Generalized Look at Federated Learning: Survey and Perspectives
Federated learning (FL) refers to a distributed machine learning framework
involving learning from several decentralized edge clients without sharing
their local datasets. This distributed strategy prevents data leakage and enables
on-device training as it updates the global model based on the local model
updates. Despite offering several advantages, including data privacy and
scalability, FL poses challenges such as statistical and system heterogeneity
of data in federated networks, communication bottlenecks, and privacy and security
issues. This survey provides a systematic summary of previous work,
studies, and experiments on FL and presents a list of possibilities for FL
across a range of applications and use cases. In addition, various
challenges of implementing FL, and promising directions for addressing
them, are provided.
Comment: 9 pages, 2 figures
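The statement that the global model is updated from local model updates can be illustrated with a small sketch of sample-weighted averaging, in the style of the commonly used FedAvg aggregation rule. The survey abstract does not name a specific rule, so the choice of FedAvg, the function name `federated_average`, and the flat-list representation of model weights are assumptions for illustration only.

```python
def federated_average(client_updates):
    """Combine client models into one global model.

    `client_updates` is a list of (weights, num_samples) pairs, where
    `weights` is a flat list of floats of the same length for every client.
    Each client's contribution is weighted by its number of local samples.
    """
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("at least one client must report samples")

    dim = len(client_updates[0][0])
    global_weights = [0.0] * dim
    for weights, n in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must send weights of equal length")
        for i, w in enumerate(weights):
            global_weights[i] += w * n / total_samples
    return global_weights


if __name__ == "__main__":
    # Two hypothetical clients: only model weights and sample counts are
    # sent to the server, never the raw local data.
    updates = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
    print(federated_average(updates))
```

Weighting by sample count is one simple design choice; under statistical heterogeneity, which the survey lists as a challenge, other aggregation schemes may behave better.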
Contributions to the privacy provisioning for federated identity management platforms
Identity information, personal data and user profiles are key assets for organizations
and companies, and the use of identity management (IdM) infrastructures has become a prerequisite
for most of them, since IdM systems allow them to carry out their business
transactions by sharing information and customizing services for several purposes in more
efficient and effective ways.
Due to the importance of the identity management paradigm, a lot of work has been done
so far, resulting in a set of standards and specifications. According to them, under the
umbrella of the IdM paradigm a person’s digital identity can be shared, linked and reused
across different domains, giving users simple session management, among other features. In this way,
users’ information is widely collected and distributed to offer new added-value services
and to enhance availability. While these new services have a positive impact on users’
lives, they also bring privacy problems.
To manage users’ personal data while protecting their privacy, IdM systems are the ideal
place to deploy privacy solutions, since they handle the exchange of users’ attributes.
Nevertheless, current IdM models and specifications do not sufficiently address comprehensive
privacy mechanisms or guidelines that give users better control over the
use, disclosure and revocation of their online identities. These are essential aspects, especially
in sensitive environments where incorrect and insecure management of users’ data
may lead to attacks, privacy breaches, identity misuse or fraud.
Today there are several approaches to IdM, each with benefits and shortcomings from
the privacy perspective.
In this thesis, the main goal is to contribute to privacy provisioning for federated
identity management platforms. For this purpose, we propose a generic architecture
that extends current federated IdM systems. We have mainly focused our contributions
on health care environments, given their particularly sensitive nature. The two main
pillars of the proposed architecture are the introduction of a selective privacy-enhanced
user profile management model and flexibility in consent revocation, achieved by incorporating an
event-based hybrid IdM approach that replaces time constraints and explicit
revocation with the activation and deactivation of authorization rights according to events. The
combination of both models makes it possible to deal with both online and offline scenarios, as well
as to empower the user by letting her bring together identity information from
different sources.
Regarding user consent revocation, we propose an implicit consent revocation mechanism
based on events, which enables a new concept, the sleepyhead credential, which
is issued only once and can be used at any time. Moreover, we integrate this concept
in IdM systems supporting a delegation protocol, and we contribute the definition
of a mathematical model to determine event arrivals to the IdM system and how they are
managed and forwarded to the corresponding entities, as well as its integration with the most widely
deployed specification, i.e., Security Assertion Markup Language (SAML).
With regard to user profile management, we define a privacy-aware user profile management
model to provide efficient selective information disclosure. With this contribution, a
service provider is able to access the specific personal information it needs without being
able to inspect any other details, while the user keeps control of her data by controlling
who can access it. The structure that we consider for user profile storage is based on
extensions of Merkle trees that allow for hash combining, which minimizes the need for
individual verification of elements along a path. An algorithm for sorting the tree is also
provided, since we envision frequently accessed attributes being closer to the root (minimizing
access time).
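Selective disclosure over a Merkle tree can be sketched as follows. This is a plain binary Merkle tree over profile attributes, not the extended, hash-combining structure or the sorting algorithm proposed in the thesis; the attribute names, the `name=value` leaf encoding, the domain-separation prefixes, and all function names are assumptions for illustration.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(name: str, value: str) -> bytes:
    # Prefix 0x00 marks leaves so they cannot be confused with inner nodes.
    return _h(b"\x00" + f"{name}={value}".encode("utf-8"))


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def _padded(level):
    # Duplicate the last hash when a level has an odd number of entries.
    return level + [level[-1]] if len(level) % 2 == 1 else level


def build_levels(leaves):
    """Return all tree levels, from the leaves up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = _padded(levels[-1])
        levels.append([node_hash(cur[i], cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """Collect the sibling hashes needed to recompute the root for one leaf."""
    proof = []
    for level in levels[:-1]:
        cur = _padded(level)
        sibling = index ^ 1
        proof.append((cur[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(root, name, value, proof):
    """Check one disclosed attribute against the published root."""
    h = leaf_hash(name, value)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root
```

In this sketch, a user would publish only the root, then reveal a single attribute together with its proof; the service provider sees the hashes of the other attributes but not their values. A practical scheme would also salt each leaf, since unsalted hashes of low-entropy values such as a blood type can be guessed.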
Formal validation of the above-mentioned ideas has been carried out through simulations
and the development of prototypes. In addition, dissemination activities were performed in
projects, journals and conferences.
Programa Oficial de Doctorado en Ingeniería Telemática
Chair: María Celeste Campo Vázquez. Secretary: María Francisca Hinarejos Campos. Member: Óscar Esparza Martí