1,107 research outputs found

    Model-based data generation for the evaluation of functional reliability and resilience of distributed machine learning systems against abnormal cases

    Future production technologies will comprise a multitude of systems whose core functionality is closely related to machine-learned models. Such systems require reliable components to ensure the safety of workers and their trust in the systems. The evaluation of the functional reliability and resilience of systems based on machine-learned models is generally challenging. For this purpose, appropriate test data must be available, which also includes abnormal cases. These abnormal cases can be unexpected usage scenarios, erroneous inputs, accidents during operation or even the failure of certain subcomponents. In this work, approaches to the model-based generation of an arbitrary abundance of data representing such abnormal cases are explored. Such computer-based generation requires domain-specific approaches, especially with respect to the nature and distribution of the data, protocols used, or domain-specific communication structures. In previous work, we found that different use cases impose different requirements on synthetic data, and the requirements in turn imply different generation methods [1]. Based on this, various use cases are identified and different methods for computer-based generation of realistic data, as well as for the quality assessment of such data, are explored. Finally, we explore the use of Federated Learning (FL) to address data privacy and security challenges in Industrial Control Systems. FL enables local model training while keeping sensitive information decentralized and private to its owners. In detail, we investigate whether FL can benefit clients with limited knowledge by leveraging collaboratively trained models that aggregate client-specific knowledge distributions. We found that in such scenarios federated training results in a significant increase in classification accuracy of 31.3% compared to isolated local training. Furthermore, when we introduce Differential Privacy, the resulting model achieves an accuracy of 99.62%, on par with an idealized case in which data is independent and identically distributed across clients.
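The abstract above does not say which aggregation rule the authors use. As a minimal sketch of how a server might combine locally trained models without seeing raw data, the following assumes a sample-size-weighted average of client parameter vectors (FedAvg-style). The function name `federated_average` and the toy numbers are illustrative assumptions, not taken from the paper, and no differential-privacy noise is modelled here.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Return the sample-size-weighted mean of client parameter vectors.

    Assumption: FedAvg-style aggregation; the source abstract does not
    specify the aggregation rule it uses.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients: the second holds three times as much data,
# so its parameters pull the global model three times as hard.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_model)  # [2.5, 3.5]
```

Only the parameter vectors and sample counts reach the server in this sketch; the training data stays with each client.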

    Dynamic Circular Network-Based Federated Dual-View Learning for Multivariate Time Series Anomaly Detection

    Multivariate time-series data exhibit intricate correlations in both temporal and spatial dimensions. However, existing network architectures often overlook dependencies in the spatial dimension and struggle to strike a balance between long-term and short-term patterns when extracting features from the data. Furthermore, industries within the business community are hesitant to share their raw data, which hinders anomaly prediction accuracy and detection performance. To address these challenges, the authors propose a dynamic circular network-based federated dual-view learning approach. Experimental results on four open-source datasets demonstrate that the method outperforms existing methods in terms of accuracy, recall, and F1 score for anomaly detection.

    FedDP: A privacy-protecting theft detection scheme in smart grids using federated learning

    In smart grids (SGs), the systematic utilization of consumer energy data while maintaining its privacy is of paramount importance. This research addresses this problem by detecting energy theft while preserving the privacy of client data. In particular, this research identifies centralized models as more accurate in predicting energy theft in SGs but with little or no data protection. The current research proposes a novel federated learning (FL) framework, namely FedDP, to tackle this issue. The proposed framework enables various clients to benefit from on-device prediction with very little communication overhead and to learn from the experience of other clients with the help of a central server (CS). Furthermore, for the accurate identification of energy theft, the use of a novel federated voting classifier (FVC) is proposed. FVC uses the majority voting-based consensus of traditional machine learning (ML) classifiers, namely random forests (RF), k-nearest neighbors (KNN), and bagging classifiers (BG). To the best of our knowledge, conventional ML classifiers have never been used in a federated manner for energy theft detection in SGs. Finally, substantial experiments are performed on a real-world energy consumption dataset. Results illustrate that the proposed model can accurately and efficiently detect energy theft in SGs while guaranteeing the security of client data.
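The majority-voting step described above can be sketched as follows. This is an illustration under stated assumptions only: the three rule-based functions are hypothetical stand-ins for trained RF, KNN, and bagging classifiers, the thresholds are invented, and the abstract does not say how the votes are combined across federated clients.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A classifier maps a consumption profile to a label: 1 = theft, 0 = normal.
Classifier = Callable[[Sequence[float]], int]


def rf_like(sample: Sequence[float]) -> int:
    """Hypothetical stand-in for a random forest: flags low mean usage."""
    return 1 if sum(sample) / len(sample) < 1.0 else 0


def knn_like(sample: Sequence[float]) -> int:
    """Hypothetical stand-in for KNN: flags any zero reading."""
    return 1 if min(sample) == 0 else 0


def bagging_like(sample: Sequence[float]) -> int:
    """Hypothetical stand-in for a bagging classifier: flags large swings."""
    return 1 if max(sample) - min(sample) > 5 else 0


def majority_vote(classifiers: List[Classifier], sample: Sequence[float]) -> int:
    """Return the label predicted by most classifiers."""
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


CLASSIFIERS = [rf_like, knn_like, bagging_like]
suspicious = [0.0, 0.2, 0.1]  # votes 1, 1, 0
normal = [3.0, 4.0, 3.5]      # votes 0, 0, 0
print(majority_vote(CLASSIFIERS, suspicious), majority_vote(CLASSIFIERS, normal))
```

With an odd number of voters and two labels, as here, a tie cannot occur, so no tie-breaking rule is needed.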

    CyberForce: A Federated Reinforcement Learning Framework for Malware Mitigation

    Recent research has shown that the integration of Reinforcement Learning (RL) with Moving Target Defense (MTD) can enhance cybersecurity in Internet-of-Things (IoT) devices. Nevertheless, the practicality of existing work is hindered by data privacy concerns associated with centralized data processing in RL, and by the unsatisfactory time needed to learn the right MTD techniques against a rising number of heterogeneous zero-day attacks. Thus, this work presents CyberForce, a framework that combines Federated and Reinforcement Learning (FRL) to collaboratively and privately learn suitable MTD techniques for mitigating zero-day attacks. CyberForce integrates device fingerprinting and anomaly detection to reward or penalize MTD mechanisms chosen by an FRL-based agent. The framework has been deployed and evaluated in a scenario consisting of ten physical devices of a real IoT platform affected by heterogeneous malware samples. A pool of experiments has demonstrated that CyberForce learns the MTD technique mitigating each attack faster than existing RL-based centralized approaches. In addition, when various devices are exposed to different attacks, CyberForce benefits from knowledge transfer, leading to enhanced performance and reduced learning time in comparison to recent works. Finally, different aggregation algorithms used during the agent learning process provide CyberForce with notable robustness to malicious attacks. Comment: 11 pages, 8 figures

    GR-077 - A Robust Federated Machine Learning Framework for Security Analytics in Solar Farms


    Suppressing Poisoning Attacks on Federated Learning for Medical Imaging

    Collaboration among multiple data-owning entities (e.g., hospitals) can accelerate the training process and yield better machine learning models due to the availability and diversity of data. However, privacy concerns make it challenging to exchange data while preserving confidentiality. Federated Learning (FL) is a promising solution that enables collaborative training through exchange of model parameters instead of raw data. However, most existing FL solutions work under the assumption that participating clients are honest and thus can fail against poisoning attacks from malicious parties, whose goal is to deteriorate the global model performance. In this work, we propose a robust aggregation rule called Distance-based Outlier Suppression (DOS) that is resilient to Byzantine failures. The proposed method computes the distance between local parameter updates of different clients and obtains an outlier score for each client using Copula-based Outlier Detection (COPOD). The resulting outlier scores are converted into normalized weights using a softmax function, and a weighted average of the local parameters is used for updating the global model. DOS aggregation can effectively suppress parameter updates from malicious clients without the need for any hyperparameter selection, even when the data distributions are heterogeneous. Evaluation on two medical imaging datasets (CheXpert and HAM10000) demonstrates the higher robustness of the DOS method against a variety of poisoning attacks in comparison to other state-of-the-art methods. The code is available at https://github.com/Naiftt/SPAFD
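The pipeline described above (pairwise distances, a per-client outlier score, softmax weights, weighted average) can be sketched roughly as below. This is not the authors' implementation: COPOD is replaced here by a much simpler stand-in, the mean Euclidean distance from each client to all others, and every function name is an illustrative assumption.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def outlier_scores(updates: Sequence[Sequence[float]]) -> List[float]:
    """Stand-in for COPOD (assumption): mean distance to the other clients."""
    n = len(updates)
    if n < 2:
        raise ValueError("need at least two client updates")
    return [
        sum(euclidean(updates[i], updates[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def softmax(values: Sequence[float]) -> List[float]:
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def suppression_weights(updates: Sequence[Sequence[float]]) -> List[float]:
    """Higher outlier score -> lower weight, via softmax over negated scores."""
    return softmax([-s for s in outlier_scores(updates)])


def robust_aggregate(updates: Sequence[Sequence[float]]) -> List[float]:
    """Weighted average of client updates using the suppression weights."""
    weights = suppression_weights(updates)
    dim = len(updates[0])
    return [sum(w * u[i] for w, u in zip(weights, updates)) for i in range(dim)]


# Three similar hypothetical updates and one poisoned update far from the rest.
updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [50.0, -50.0]]
weights = suppression_weights(updates)
aggregated = robust_aggregate(updates)
print(weights, aggregated)
```

In this toy run the poisoned client receives a weight close to zero, so the aggregate stays near the three similar updates. Like the rule described in the abstract, the sketch has no tunable threshold; how closely the mean-distance score tracks COPOD in practice is not something this example establishes.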

    Autonomic care platform for optimizing query performance

    Background: As the amount of information in electronic health care systems increases, data operations get more complicated and time-consuming. Intensive Care platforms require timely processing of data retrievals to guarantee the continuous display of recent patient data. Physicians and nurses rely on this data for their decision making. Manual optimization of query executions has become difficult to handle due to the increased number of queries across multiple sources. Hence, more automated management is necessary to increase the performance of database queries. The autonomic computing paradigm promises an approach in which the system adapts itself and acts as a self-managing entity that takes actions on its own, thereby limiting human intervention. Despite the usage of autonomic control loops in network and software systems, this approach has not so far been applied to health information systems. Methods: We extend the COSARA architecture, an infection surveillance and antibiotic management service platform for the Intensive Care Unit (ICU), with self-managed components to increase the performance of data retrievals. We used real-life ICU COSARA queries to analyse slow performance and measure the impact of optimizations. Each day more than 2 million COSARA queries are executed. Three control loops, which monitor the executions and take action, have been proposed: reactive, deliberative and reflective control loops. We focus on improvements of the execution time of microbiology queries directly related to the visual displays of patients' data on the bedside screens. Results: The results show that autonomic control loops are beneficial for optimizing data executions in the ICU. Applying the reactive control loop alone reduces the average execution time of microbiology results by 8.61%. The combined application of the reactive and deliberative control loops results in an average query time reduction of 10.92%, and the combination of reactive, deliberative and reflective control loops provides a reduction of 13.04%. Conclusions: We found that a controlled reduction of query executions can improve performance for the end user. The implementation of autonomic control loops in an existing health platform, COSARA, has a positive effect on timely data visualization for the physician and nurse.