11 research outputs found

    Gradient Assisted Learning

    In distributed settings, collaborations between different entities, such as financial institutions, medical centers, and retail markets, are crucial to providing improved service and performance. However, the underlying entities may have little interest in sharing their private data, proprietary models, and objective functions. These privacy requirements have created new challenges for collaboration. In this work, we propose Gradient Assisted Learning (GAL), a new method for various entities to assist each other in supervised learning tasks without sharing data, models, or objective functions. In this framework, all participants collaboratively optimize the aggregate of local loss functions, and each participant autonomously builds its own model by iteratively fitting the gradients of the objective function. Experimental studies demonstrate that Gradient Assisted Learning can achieve performance close to that of centralized learning, in which all data, models, and objective functions are fully disclosed.
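
    As a rough illustration of the "fit the gradient" loop described above, the sketch below mimics the idea with two hypothetical participants who hold different feature columns of the same samples and a squared-error objective; only gradient vectors are exchanged, never raw data or models. The synthetic data, the choice of scikit-learn decision trees, and the assistance rate are all made up for illustration and are not taken from the paper.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)

    # Two hypothetical participants holding different feature columns of the same samples.
    n = 500
    X_a = rng.normal(size=(n, 3))              # features held by participant A
    X_b = rng.normal(size=(n, 2))              # features held by participant B
    y = X_a[:, 0] - 2 * X_b[:, 1] + rng.normal(scale=0.1, size=n)

    prediction = np.zeros(n)                   # shared running prediction (numbers only)
    lr = 0.5                                   # assistance rate, chosen arbitrarily here

    for _ in range(20):
        for X_local in (X_a, X_b):
            grad = prediction - y              # gradient of 0.5 * (prediction - y)^2
            # Each participant fits its own local model to the negative gradient
            # using only its own features, then contributes an update.
            model = DecisionTreeRegressor(max_depth=3).fit(X_local, -grad)
            prediction = prediction + lr * model.predict(X_local)

    print("final MSE:", float(np.mean((prediction - y) ** 2)))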

    SoK: Training Machine Learning Models over Multiple Sources with Privacy Preservation

    Nowadays, gathering high-quality training data from multiple data controllers while preserving privacy is a key challenge in training high-quality machine learning models. Potential solutions could dramatically break down the barriers among isolated data corpora and consequently enlarge the range of data available for processing. To this end, both academic researchers and industrial vendors have recently been strongly motivated to propose two mainstream families of solutions: 1) Secure Multi-party Learning (MPL for short); and 2) Federated Learning (FL for short). These two families have different advantages and limitations when evaluated in terms of privacy preservation, mode of communication, communication overhead, data format, accuracy of the trained models, and application scenarios. Motivated to present the research progress and discuss insights on future directions, we thoroughly investigate the protocols and frameworks of both MPL and FL. We first define the problem of training machine learning models over multiple data sources with privacy preservation (TMMPP for short). We then compare recent studies of TMMPP in terms of technical routes, number of parties supported, data partitioning, threat model, and supported machine learning models, to show their advantages and limitations. Next, we introduce state-of-the-art platforms that support online training over multiple data sources. Finally, we discuss potential directions for resolving the problem of TMMPP.
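
    To make the MPL side of the comparison concrete, the following is a minimal, generic sketch of additive secret sharing, the kind of cryptographic primitive that secure multi-party learning frameworks typically build secure aggregation on. It is a textbook illustration with made-up values, not a protocol from any framework surveyed in the paper.

    import secrets

    P = 2**61 - 1  # prime modulus for the finite field

    def share(value, n_parties):
        """Split an integer into n additive shares that sum to value mod P."""
        shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
        shares.append((value - sum(shares)) % P)
        return shares

    def reconstruct(shares):
        return sum(shares) % P

    # Three parties each secret-share a private value; any single share reveals
    # nothing, yet the sum of all values can still be reconstructed exactly.
    private_values = [12, 7, 30]
    all_shares = [share(v, 3) for v in private_values]
    summed_shares = [sum(col) % P for col in zip(*all_shares)]
    print(reconstruct(summed_shares))  # 49, without revealing any individual value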

    Software Architecture Design for Federated Learning Systems

    Advancements in deep learning and machine learning, as subdomains of AI, have been demonstrated across multiple industries. However, the data requirements of deep machine learning models have raised data privacy concerns. For instance, the EU's General Data Protection Regulation (GDPR) stipulates a range of data protection measures, which limits the data available for training. Furthermore, trustworthy and responsible AI have recently emerged as major topics owing to the new ethical, legal, social, and technological challenges brought on by the technology. All of this has led to the need for decentralised machine learning approaches. Federated learning is an emerging privacy-preserving AI technique that trains models locally and forms a global model without transferring local data externally. Being widely distributed with different components and stakeholders, federated learning requires software system design thinking and software engineering considerations. Nonetheless, the software engineering challenges and software architectural approaches of federated learning have not previously been conceptualised systematically in the software architecture literature. This thesis aims to address this software engineering research gap and to provide system-level solutions for achieving trustworthy and responsible federated learning by design. We first report the findings of a systematic literature review on federated learning from a software engineering perspective. The study shows that software architecture design concerns in building federated learning systems have been largely ignored. Thus, we present a collection of architectural patterns for the design challenges of federated learning systems, together with a set of decision models to assist software architects in pattern selection and in performing architecture validations. The evaluation results show that these approaches are feasible and useful as a guideline for federated learning software architecture design. We then propose FLRA, a reference architecture for federated learning systems, and adopt FLRA as the design basis for enhancing trust in federated learning software architectures. Finally, we evaluate the designed federated learning architecture; the results show that the approach is feasible to enable accountability and improve fairness. Ultimately, the proposed system-level solutions can achieve trustworthy and responsible federated learning.
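
    To illustrate the "train locally, aggregate globally" behaviour the abstract refers to, below is a bare-bones, FedAvg-style loop on synthetic data: clients take a few gradient steps on their private datasets and only their model weights are averaged by the server. All names and values are hypothetical and unrelated to the architectural contributions of the thesis.

    import numpy as np

    rng = np.random.default_rng(1)
    true_w = np.array([2.0, -1.0])

    # Three hypothetical clients, each with a private local dataset.
    clients = []
    for _ in range(3):
        X = rng.normal(size=(100, 2))
        y = X @ true_w + rng.normal(scale=0.1, size=100)
        clients.append((X, y))

    global_w = np.zeros(2)
    for _ in range(10):                              # communication rounds
        local_ws = []
        for X, y in clients:
            w = global_w.copy()
            for _ in range(5):                       # a few local SGD steps on private data
                grad = 2 * X.T @ (X @ w - y) / len(y)
                w -= 0.1 * grad
            local_ws.append(w)                       # only weights leave the client
        global_w = np.mean(local_ws, axis=0)         # server averages the local models

    print("global model:", global_w)                 # approaches [2, -1]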