1,138 research outputs found

    OpBoost: A Vertical Federated Tree Boosting Framework Based on Order-Preserving Desensitization

    Full text link
    Vertical Federated Learning (FL) is a new paradigm that enables users with non-overlapping attributes of the same data samples to jointly train a model without directly sharing the raw data. Nevertheless, recent works show that it's still not sufficient to prevent privacy leakage from the training process or the trained model. This paper focuses on studying the privacy-preserving tree boosting algorithms under the vertical FL. The existing solutions based on cryptography involve heavy computation and communication overhead and are vulnerable to inference attacks. Although the solution based on Local Differential Privacy (LDP) addresses the above problems, it leads to the low accuracy of the trained model. This paper explores to improve the accuracy of the widely deployed tree boosting algorithms satisfying differential privacy under vertical FL. Specifically, we introduce a framework called OpBoost. Three order-preserving desensitization algorithms satisfying a variant of LDP called distance-based LDP (dLDP) are designed to desensitize the training data. In particular, we optimize the dLDP definition and study efficient sampling distributions to further improve the accuracy and efficiency of the proposed algorithms. The proposed algorithms provide a trade-off between the privacy of pairs with large distance and the utility of desensitized values. Comprehensive evaluations show that OpBoost has a better performance on prediction accuracy of trained models compared with existing LDP approaches on reasonable settings. Our code is open source

    A Generalized Look at Federated Learning: Survey and Perspectives

    Full text link
    Federated learning (FL) refers to a distributed machine learning framework involving learning from several decentralized edge clients without sharing local dataset. This distributed strategy prevents data leakage and enables on-device training as it updates the global model based on the local model updates. Despite offering several advantages, including data privacy and scalability, FL poses challenges such as statistical and system heterogeneity of data in federated networks, communication bottlenecks, privacy and security issues. This survey contains a systematic summarization of previous work, studies, and experiments on FL and presents a list of possibilities for FL across a range of applications and use cases. Other than that, various challenges of implementing FL and promising directions revolving around the corresponding challenges are provided.Comment: 9 pages, 2 figure

    Contributions to the privacy provisioning for federated identity management platforms

    Get PDF
    Identity information, personal data and user’s profiles are key assets for organizations and companies by becoming the use of identity management (IdM) infrastructures a prerequisite for most companies, since IdM systems allow them to perform their business transactions by sharing information and customizing services for several purposes in more efficient and effective ways. Due to the importance of the identity management paradigm, a lot of work has been done so far resulting in a set of standards and specifications. According to them, under the umbrella of the IdM paradigm a person’s digital identity can be shared, linked and reused across different domains by allowing users simple session management, etc. In this way, users’ information is widely collected and distributed to offer new added value services and to enhance availability. Whereas these new services have a positive impact on users’ life, they also bring privacy problems. To manage users’ personal data, while protecting their privacy, IdM systems are the ideal target where to deploy privacy solutions, since they handle users’ attribute exchange. Nevertheless, current IdM models and specifications do not sufficiently address comprehensive privacy mechanisms or guidelines, which enable users to better control over the use, divulging and revocation of their online identities. These are essential aspects, specially in sensitive environments where incorrect and unsecured management of user’s data may lead to attacks, privacy breaches, identity misuse or frauds. Nowadays there are several approaches to IdM that have benefits and shortcomings, from the privacy perspective. In this thesis, the main goal is contributing to the privacy provisioning for federated identity management platforms. And for this purpose, we propose a generic architecture that extends current federation IdM systems. We have mainly focused our contributions on health care environments, given their particularly sensitive nature. The two main pillars of the proposed architecture, are the introduction of a selective privacy-enhanced user profile management model and flexibility in revocation consent by incorporating an event-based hybrid IdM approach, which enables to replace time constraints and explicit revocation by activating and deactivating authorization rights according to events. The combination of both models enables to deal with both online and offline scenarios, as well as to empower the user role, by letting her to bring together identity information from different sources. Regarding user’s consent revocation, we propose an implicit revocation consent mechanism based on events, that empowers a new concept, the sleepyhead credentials, which is issued only once and would be used any time. Moreover, we integrate this concept in IdM systems supporting a delegation protocol and we contribute with the definition of mathematical model to determine event arrivals to the IdM system and how they are managed to the corresponding entities, as well as its integration with the most widely deployed specification, i.e., Security Assertion Markup Language (SAML). In regard to user profile management, we define a privacy-awareness user profile management model to provide efficient selective information disclosure. With this contribution a service provider would be able to accesses the specific personal information without being able to inspect any other details and keeping user control of her data by controlling who can access. The structure that we consider for the user profile storage is based on extensions of Merkle trees allowing for hash combining that would minimize the need of individual verification of elements along a path. An algorithm for sorting the tree as we envision frequently accessed attributes to be closer to the root (minimizing the access’ time) is also provided. Formal validation of the above mentioned ideas has been carried out through simulations and the development of prototypes. Besides, dissemination activities were performed in projects, journals and conferences.Programa Oficial de Doctorado en Ingeniería TelemáticaPresidente: María Celeste Campo Vázquez.- Secretario: María Francisca Hinarejos Campos.- Vocal: Óscar Esparza Martí
    • …
    corecore