    Achieving Lightweight Federated Advertising with Self-Supervised Split Distillation

    As an emerging secure learning paradigm for leveraging cross-agency private data, vertical federated learning (VFL) is expected to improve advertising models by enabling the joint learning of complementary user attributes privately owned by the advertiser and the publisher. However, there are two key challenges in applying it to advertising systems: a) the limited scale of labeled overlapping samples, and b) the high cost of real-time cross-agency serving. In this paper, we propose a semi-supervised split distillation framework, VFed-SSD, to address both challenges. We observe that: i) massive amounts of unlabeled overlapping data are available in advertising systems, and ii) decomposing the federated model makes it possible to balance model performance against inference cost. Specifically, we develop a self-supervised task, Matched Pair Detection (MPD), to exploit the vertically partitioned unlabeled data, and propose the Split Knowledge Distillation (SplitKD) schema to avoid cross-agency serving. Empirical studies on three industrial datasets demonstrate the effectiveness of our methods, with the median AUC over all datasets improved by 0.86% and 2.6% in the local deployment mode and the federated deployment mode, respectively. Overall, our framework provides an efficient federation-enhanced solution for real-time display advertising with minimal deployment cost and a significant performance lift.
    Comment: Accepted to the Trustworthy Federated Learning workshop of IJCAI2022 (FL-IJCAI22). 6 pages, 3 figures, 3 tables. Old title: Semi-Supervised Cross-Silo Advertising with Partial Knowledge Transfe

    Federated-ANN based Critical Path Analysis and Health Recommendations for MapReduce Workflows in Consumer Electronics Applications

    Although much research has been done to improve the performance of big data systems, predicting the performance degradation of these systems quickly and efficiently remains a significant challenge, largely because of their complexity. Performance degradation in big data systems is often discussed in terms of long execution time. This paper proposes MrPath, a federated AI-based critical path analysis approach for holistic performance prediction of MapReduce workflows for consumer electronics applications, which also enables root-cause analysis of various types of faults. We have implemented a federated artificial neural network (FANN) to predict the critical path in a MapReduce workflow. After the critical path components (e.g., mapper1, reducer2) are predicted or detected, root-cause analysis uses user-defined functions (UDFs) to pinpoint the most likely reasons for the observed performance problems. Finally, node health classification is performed using an ANN-based Self-Organising Map (SOM). The results show that the AI-based critical path analysis method can help explain the reasons behind long execution times in big data systems.

    Adaptability Approaches in Digital Libraries

    This paper examines some approaches for endowing digital libraries with adaptability capabilities in order to scaffold and enhance the end-user experience. The paper provides a general overview of techniques and methods commonly adopted for achieving adaptability. It also discusses how these can be implemented, illustrating specific examples and guidelines drawn from the practical experience that the authors are currently gaining in the Share.TEC European project, a context in which adaptability is key to managing and responding to considerable diversity in user requirements.

    Data Privacy in Multi-Cloud: An Enhanced Data Fragmentation Framework

    Data splitting preserves privacy by partitioning data into fragments that are stored remotely and shared. It supports most data operations because data can be stored in the clear, unlike methods that rely on cryptography. However, the majority of existing data splitting techniques do not consider data that is already in the multi-cloud, which leads to unnecessary use of resources to re-split data into fragments. This work proposes a data splitting framework that leverages existing data in the multi-cloud. It improves data splitting mechanisms by reducing the number of splitting operations and resulting fragments, thereby decreasing the number of storage locations a data owner manages. Broadcast queries locate third-party data fragments to avoid costly operations when splitting data. This work examines considerations for the use of third-party fragments and their application to existing data splitting techniques. The proposed framework was also applied to an existing data splitting mechanism to complement its capabilities.
    Comment: Keywords: Data Storage, Multi-Cloud, Cloud Security, Privacy Preservation, Privacy Enhancing, Data Splitting; https://ieeexplore.ieee.org/document/964774

    Federated Learning and Differential Privacy: Software tools analysis, the Sherpa.ai FL framework and methodological guidelines for preserving data privacy

    The high demand for artificial intelligence services at the edge that also preserve data privacy has pushed research on novel machine learning paradigms that fit these requirements. Federated learning aims to protect data privacy through distributed learning methods that keep the data in its storage silos. Likewise, differential privacy aims to improve the protection of data privacy by measuring the privacy loss in the communication among the elements of federated learning. The prospective match between federated learning and differential privacy and the challenges of data privacy protection has led to the release of several software tools that support their functionalities, but these tools lack a unified vision of the techniques and a methodological workflow that supports their usage. Hence, we present the Sherpa.ai Federated Learning framework, which is built upon a holistic view of federated learning and differential privacy. It results from both the study of how to adapt the machine learning paradigm to federated learning and the definition of methodological guidelines for developing artificial intelligence services based on federated learning and differential privacy. We show how to follow the methodological guidelines with the Sherpa.ai Federated Learning framework by means of a classification use case and a regression use case.
    Funding: SHERPA Europe S.L. OTRI-4137; Spanish Government; European Commission TIN2017-89517-P; Spanish Government fellowship programmes Formacion de Profesorado Universitario FPU18/04475, Juan de la Cierva Incorporacion IJC2018-036092-

    Recommendation Systems: An Insight Into Current Development and Future Research Challenges

    Research on recommendation systems is swiftly producing an abundance of novel methods, constantly challenging the current state of the art. Inspired by advancements in many related fields, like Natural Language Processing and Computer Vision, many hybrid approaches based on deep learning are being proposed, making solid improvements over traditional methods. On the downside, this flurry of research activity, often focused on improving over a small number of baselines, makes it hard to identify reference methods and standardized evaluation protocols. Furthermore, the traditional categorization of recommendation systems into content-based, collaborative filtering and hybrid systems lacks the informativeness it once had. With this work, we provide a gentle introduction to recommendation systems, describing the task they are designed to solve and the challenges faced in research. Building on previous work, an extension to the standard taxonomy is presented to better reflect the latest research trends, including the diverse use of content and temporal information. To make the technical methodologies recently proposed in this field more approachable, we review several representative methods selected primarily from top conferences and systematically describe their goals and novelty. We formalize the main evaluation metrics adopted by researchers and identify the most commonly used benchmarks. Lastly, we discuss issues in current research practices by analyzing experimental results reported on three popular datasets.