
    A Distributed Trust Framework for Privacy-Preserving Machine Learning

    When training a machine learning model, it is standard procedure for the researcher to have full knowledge of both the data and the model. However, this engenders a lack of trust between data owners and data scientists. Data owners are justifiably reluctant to relinquish control of private information to third parties. Privacy-preserving techniques distribute computation so that data remains under the owner's control while learning takes place. However, architectures distributed amongst multiple agents introduce an entirely new set of security and trust complications, including data poisoning and model theft. This paper outlines a distributed infrastructure used to facilitate peer-to-peer trust between distributed agents collaboratively performing a privacy-preserving workflow. Our prototype sets industry gatekeepers and governance bodies as credential issuers: agents must first obtain valid credentials before participating in the distributed learning workflow, raising the barrier to entry for malicious actors. We detail a proof of concept using Hyperledger Aries, Decentralised Identifiers (DIDs) and Verifiable Credentials (VCs) to establish a distributed trust architecture during a privacy-preserving machine learning experiment. Specifically, we utilise secure and authenticated DID communication channels to facilitate a federated learning workflow on mental health care data.
    Comment: To be published in the proceedings of the 17th International Conference on Trust, Privacy and Security in Digital Business (TrustBus 2020).
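    The core of the workflow is that an agent's model updates are accepted into a federated averaging round only if the agent holds a valid credential. The following is a minimal Python sketch of that gating idea; the credential check is a placeholder (the paper's prototype uses Hyperledger Aries DID channels and Verifiable Credentials, which are not reproduced here), and all names and values are illustrative.

        import numpy as np

        # Stand-in registry of credentials issued by a governance body.
        VALID_CREDENTIALS = {"hospital-a": "vc-123", "hospital-b": "vc-456"}

        def check_credential(agent_id, credential):
            """Placeholder for verifying a Verifiable Credential with its issuer."""
            return VALID_CREDENTIALS.get(agent_id) == credential

        def local_update(w, X, y, lr=0.1):
            """One local gradient step on a linear least-squares model."""
            grad = X.T @ (X @ w - y) / len(y)
            return w - lr * grad

        def federated_round(w, clients):
            """FedAvg round: average updates from credentialed agents only."""
            updates = [local_update(w, X, y)
                       for agent_id, cred, X, y in clients
                       if check_credential(agent_id, cred)]
            return np.mean(updates, axis=0)

        rng = np.random.default_rng(0)
        w_true = np.array([2.0, -1.0])

        def make_client(agent_id, cred):
            X = rng.normal(size=(32, 2))
            return (agent_id, cred, X, X @ w_true + rng.normal(scale=0.01, size=32))

        clients = [make_client("hospital-a", "vc-123"),
                   make_client("hospital-b", "vc-456"),
                   make_client("mallory", "forged")]  # excluded: invalid credential

        w = np.zeros(2)
        for _ in range(100):
            w = federated_round(w, clients)
        print(w)  # approaches w_true using only the two credentialed agents

    In the paper's setting the check is cryptographic rather than a dictionary lookup, and it happens when the DID communication channel is established, before any model traffic flows.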

    Advancements in privacy enhancing technologies for machine learning

    The field of privacy-preserving machine learning is still in its infancy and has been growing in popularity since 2019. Privacy-enhancing technologies in the context of machine learning comprise a set of core techniques relating to cryptography, distributed computation (federated learning), differential privacy, and methods for managing distributed identity. Alongside these, the notion of contextual integrity exists to quantify the appropriate flow of information. The aim of this work is to advance a vision of a privacy-compatible infrastructure, in which Web 3.0 exists as a decentralised infrastructure that enshrines the user's right to privacy and consent over information concerning them on the Internet.

    This thesis contains a mix of experiments relating to privacy-enhancing technologies in the context of machine learning. A number of privacy-enhancing methods are advanced in these experiments, and a novel privacy-preserving flow is created. This includes the establishment of an open-source framework for vertically distributed federated learning and the advancement of a novel privacy-preserving machine learning framework which accommodates a core set of privacy-enhancing technologies. Along with this, the work advances a novel means of describing privacy-preserving information flows which extends the definition of contextual integrity.

    This thesis establishes a range of contributions to the advancement of privacy-enhancing technologies for privacy-preserving machine learning. A case study is evaluated, and a novel, heterogeneous stack classifier is built which predicts the presence of insider threat, demonstrating the efficacy of machine learning in this domain given access to real data; conclusions are also drawn about the applicability of federated learning to this use case. A novel framework is introduced that facilitates vertically distributed machine learning on data relating to the same subjects held by different hosts, which researchers can use to achieve vertically federated learning in practice. The weaknesses in the security of the split neural network (SplitNN) technique are discussed, and appropriate defences are explored in detail; these defences harden SplitNN against inversion attacks. A novel distributed trust framework is established which facilitates peer-to-peer access control without the need for a third party, putting forward a solution for fully privacy-preserving access control while interacting with privacy-preserving machine learning infrastructure. Finally, a novel framework for the implementation of structured transparency is given. This provides a cohesive way to manage information flows in the privacy-preserving machine learning and analytics space, offering a well-stocked toolkit which utilises the aforementioned technologies and exhibits homomorphically encrypted inference, fully hardening the SplitNN methodology against model inversion attacks.

    The most significant finding of this work is the production of an information flow which combines split neural networks, homomorphic encryption, zero-knowledge access control and elements of differential privacy. This flow facilitates homomorphic inference through split neural networks, advancing the state of the art in privacy-preserving machine learning.
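    Of the techniques listed, the split neural network is the most concrete protocol: the data owner computes the first few layers locally, and only the cut-layer activations (and their gradients) cross the trust boundary. Below is a minimal NumPy sketch of one such training loop under that assumption; the homomorphic encryption and inversion-attack defences described in the thesis are deliberately omitted, so this shows the basic information flow only.

        import numpy as np

        rng = np.random.default_rng(1)
        X = rng.normal(size=(64, 4))                  # private features, never shared
        y = (X.sum(axis=1, keepdims=True) > 0).astype(float)

        W1 = rng.normal(scale=0.5, size=(4, 8))       # data owner's segment
        W2 = rng.normal(scale=0.5, size=(8, 1))       # server's segment
        lr = 0.5

        for step in range(500):
            # Data owner: forward pass through the first segment.
            z = X @ W1
            h = np.maximum(z, 0)                      # only h crosses the boundary
            # Server: finish the forward pass and compute the loss gradient.
            p = 1.0 / (1.0 + np.exp(-(h @ W2)))       # sigmoid output
            dlogits = (p - y) / len(y)                # grad of BCE w.r.t. logits
            dW2 = h.T @ dlogits
            dh = dlogits @ W2.T                       # returned to the data owner
            W2 -= lr * dW2
            # Data owner: complete backpropagation locally.
            dz = dh * (z > 0)                         # ReLU gradient
            W1 -= lr * (X.T @ dz)

        print("train accuracy:", ((p > 0.5) == (y > 0.5)).mean())

    The inversion attacks discussed in the thesis exploit exactly the quantity this sketch leaks, the activations h; encrypting h homomorphically is what allows inference to proceed without exposing it.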

    Identity and identification in an information society: Augmenting formal systems of identification with technological artefacts

    Information and communication technologies (ICT) are transforming society's information flows. These new interactive environments decouple agents, information and actions from their original contexts, and this introduces challenges when evaluating trustworthiness and intelligently placing trust.

    This thesis develops methods that can extend institutional trust into digitally enhanced interactive settings. By applying privacy-preserving cryptographic protocols within a technical architecture, this thesis demonstrates how existing human systems of identification that support institutional trust can be augmented with ICT in ways that distribute trust, respect privacy and limit the potential for abuse. Importantly, identification systems are located within a sociologically informed framework of interaction where identity is more than a collection of static attributes.

    A synthesis of the evolution and systematisation of cryptographic knowledge is presented and juxtaposed against the ideas developed within the digital identity community. The credential mechanism, first conceptualised by David Chaum, has matured into a number of well-specified mathematical protocols. This thesis focuses on CL-RSA and BBS+, both signature schemes with efficient protocols that can instantiate a credential mechanism with strong privacy-preserving properties.

    The processes of managing the identification of healthcare professionals as they navigate their careers within the Scottish Healthcare Ecosystem provide a concrete case study for this work. The proposed architecture mediates the exchange of verifiable, integrity-assured evidence that has been cryptographically signed by the relevant healthcare institutions but is stored, managed and presented by the healthcare professionals to whom the evidence pertains.

    An evaluation of the integrity-assured transaction data produced by this architecture demonstrates how it could be integrated into digitally augmented identification processes, increasing the assurance that can be placed in those processes. The technical architecture is shown to be practical through a series of experiments run under realistic, production-like settings.

    This work demonstrates that designing decentralised, standards-based, privacy-preserving identification systems for trusted professionals within highly assured social contexts can distribute institutionalised trust to trustworthy individuals and empower those individuals to interface with society's increasingly socio-technical systems.
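    To make the credential mechanism concrete, the sketch below mimics the issue/hold/present flow in Python using salted hash commitments, with an HMAC standing in for the issuer's signature. This is a structural illustration only: CL-RSA and BBS+ provide public-key verification and zero-knowledge selective disclosure with unlinkable presentations, none of which a hash-based stand-in achieves. All attribute names and values are invented for the example.

        import hashlib, hmac, json, os

        ISSUER_KEY = os.urandom(32)  # stand-in for the issuer's signing key

        def commit(value, salt):
            """Salted hash commitment to a single attribute value."""
            return hashlib.sha256(salt + value.encode()).hexdigest()

        def issue(attributes):
            """Issuer: commit to each attribute and 'sign' the commitment set."""
            salts = {k: os.urandom(16) for k in attributes}
            commitments = {k: commit(v, salts[k]) for k, v in attributes.items()}
            payload = json.dumps(commitments, sort_keys=True).encode()
            signature = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
            return {"commitments": commitments, "salts": salts, "signature": signature}

        def present(credential, attributes, disclose):
            """Holder: reveal only the chosen attributes, plus their salts."""
            return {"commitments": credential["commitments"],
                    "signature": credential["signature"],
                    "revealed": {k: (attributes[k], credential["salts"][k])
                                 for k in disclose}}

        def verify(presentation):
            """Verifier: check the issuer's mark, then each revealed attribute."""
            payload = json.dumps(presentation["commitments"], sort_keys=True).encode()
            expected = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
            if not hmac.compare_digest(expected, presentation["signature"]):
                return False
            return all(presentation["commitments"][k] == commit(v, salt)
                       for k, (v, salt) in presentation["revealed"].items())

        attributes = {"name": "A. Clinician", "registration": "GMC-0000000",
                      "employer": "NHS Scotland"}
        credential = issue(attributes)
        # Disclose the registration while keeping name and employer hidden.
        print(verify(present(credential, attributes, {"registration"})))  # True

    The shape of the three messages, signed commitments from the issuer, a selective reveal from the holder, and a two-part check by the verifier, is what carries over to the real schemes; the cryptography inside each message is where CL-RSA and BBS+ do the heavy lifting.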