Model-based data generation for the evaluation of functional reliability and resilience of distributed machine learning systems against abnormal cases
Future production technologies will comprise a multitude of systems whose core functionality is closely related to machine-learned models. Such systems require reliable components to ensure the safety of workers and their trust in the systems. The evaluation of the functional reliability and resilience of systems based on machine-learned models is generally challenging. For this purpose, appropriate test data must be available, which also includes abnormal cases. These abnormal cases can be unexpected usage scenarios, erroneous inputs, accidents during operation or even the failure of certain subcomponents. In this work, approaches to the model-based generation of an arbitrary abundance of data representing such abnormal cases are explored. Such computer-based generation requires domain-specific approaches, especially with respect to the nature and distribution of the data, protocols used, or domain-specific communication structures. In previous work, we found that different use cases impose different requirements on synthetic data, and the requirements in turn imply different generation methods [1]. Based on this, various use cases are identified and different methods for computer-based generation of realistic data, as well as for the quality assessment of such data, are explored. Ultimately we explore the use of Federated Learning (FL) to address data privacy and security challenges in Industrial Control Systems. FL enables local model training while keeping sensitive information decentralized and private to their owners. In detail, we investigate whether FL can benefit clients with limited knowledge by leveraging collaboratively trained models that aggregate client-specific knowledge distributions. We found that in such scenarios federated training results in a significant increase in classification accuracy by 31.3% compared to isolated local training. 
Furthermore, when we introduce Differential Privacy, the resulting model achieves an accuracy of 99.62%, on par with an idealized case where data is independent and identically distributed across clients.
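The abstract does not say which aggregation rule or privacy mechanism was used. As a rough illustration of how a server might combine client updates, the following is a minimal sketch that assumes sample-size-weighted averaging (FedAvg-style) with optional norm clipping and Gaussian noise. The function names, parameters and noise scale are hypothetical, and the noise is not calibrated to any formal privacy budget.

```python
import random

def clip(update, max_norm):
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = sum(x * x for x in update) ** 0.5
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]

def federated_average(client_updates, client_sizes,
                      max_norm=None, noise_std=0.0, rng=None):
    """Average client updates weighted by local sample counts.

    Assumption: optional clipping and Gaussian noise stand in for a
    differential-privacy mechanism; the source does not describe its own.
    """
    rng = rng or random.Random(0)
    if max_norm is not None:
        client_updates = [clip(u, max_norm) for u in client_updates]
    total = sum(client_sizes)
    dim = len(client_updates[0])
    averaged = [
        sum(u[i] * n for u, n in zip(client_updates, client_sizes)) / total
        for i in range(dim)
    ]
    if noise_std > 0:
        averaged = [a + rng.gauss(0.0, noise_std) for a in averaged]
    return averaged

# Two hypothetical clients holding 1 and 3 samples respectively.
global_update = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_update)
```

Only the updates and sample counts reach the server in this sketch; the raw client data never leaves the client, which is the property the abstract relies on.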
Dynamic Circular Network-Based Federated Dual-View Learning for Multivariate Time Series Anomaly Detection
Multivariate time-series data exhibit intricate correlations in both temporal and spatial dimensions. However, existing network architectures often overlook dependencies in the spatial dimension and struggle to strike a balance between long-term and short-term patterns when extracting features from the data. Furthermore, industries within the business community are hesitant to share their raw data, which hinders anomaly prediction accuracy and detection performance. To address these challenges, the authors propose a dynamic circular network-based federated dual-view learning approach. Experimental results on four open-source datasets demonstrate that the method outperforms existing methods in terms of accuracy, recall, and F1 score for anomaly detection.
FedDP: A privacy-protecting theft detection scheme in smart grids using federated learning
In smart grids (SGs), the systematic utilization of consumer energy data while maintaining its privacy is of paramount importance. This research addresses the problem by detecting energy theft while preserving the privacy of client data. In particular, it identifies centralized models as more accurate in predicting energy theft in SGs, but with significantly less, or no, data protection. The current research proposes a novel federated learning (FL) framework, namely FedDP, to tackle this issue. The proposed framework enables various clients to benefit from on-device prediction with very little communication overhead and to learn from the experience of other clients with the help of a central server (CS). Furthermore, for the accurate identification of energy theft, the use of a novel federated voting classifier (FVC) is proposed. FVC uses the majority voting-based consensus of traditional machine learning (ML) classifiers, namely random forests (RF), k-nearest neighbors (KNN), and bagging classifiers (BG). To the best of our knowledge, conventional ML classifiers have never been used in a federated manner for energy theft detection in SGs. Finally, substantial experiments are performed on a real-world energy consumption dataset. Results illustrate that the proposed model can accurately and efficiently detect energy theft in SGs while guaranteeing the security of client data.
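The majority-voting consensus described for FVC can be illustrated with a minimal sketch. The three rule functions below are hypothetical stand-ins for trained RF, KNN and bagging models, and the thresholds are invented for illustration. The federated part of FedDP (how clients and the central server exchange models) is not modelled here.

```python
from collections import Counter

def majority_vote(labels):
    """Return the label that most classifiers predicted for one sample."""
    return Counter(labels).most_common(1)[0][0]

def voting_predict(classifiers, sample):
    """Ask every classifier for a label and return the majority label."""
    return majority_vote([clf(sample) for clf in classifiers])

# Hypothetical stand-ins for trained RF, KNN and bagging models.
# Each maps a list of daily readings (kWh) to 1 (theft suspected) or 0 (normal).
def low_mean(readings):
    return int(sum(readings) / len(readings) < 1.0)

def has_zero_day(readings):
    return int(min(readings) == 0.0)

def large_swing(readings):
    return int(max(readings) - min(readings) > 5.0)

classifiers = [low_mean, has_zero_day, large_swing]
suspicious = [0.0, 0.2, 0.1, 0.4]  # low mean and a zero reading: two of three vote 1
normal = [3.0, 3.5, 2.8, 3.2]      # no rule fires: all three vote 0
print(voting_predict(classifiers, suspicious), voting_predict(classifiers, normal))
```

With three binary voters a tie cannot occur, which is one practical reason to combine an odd number of classifiers under hard voting.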
CyberForce: A Federated Reinforcement Learning Framework for Malware Mitigation
Recent research has shown that the integration of Reinforcement Learning (RL)
with Moving Target Defense (MTD) can enhance cybersecurity in
Internet-of-Things (IoT) devices. Nevertheless, the practicality of existing
work is hindered by data privacy concerns associated with centralized data
processing in RL, and the unsatisfactory time needed to learn the right MTD
techniques that are effective against a rising number of heterogeneous zero-day
attacks. Thus, this work presents CyberForce, a framework that combines
Federated and Reinforcement Learning (FRL) to collaboratively and privately
learn suitable MTD techniques for mitigating zero-day attacks. CyberForce
integrates device fingerprinting and anomaly detection to reward or penalize
MTD mechanisms chosen by an FRL-based agent. The framework has been deployed
and evaluated in a scenario consisting of ten physical devices of a real IoT
platform affected by heterogeneous malware samples. A pool of experiments has
demonstrated that CyberForce learns the MTD technique mitigating each attack
faster than existing RL-based centralized approaches. In addition, when various
devices are exposed to different attacks, CyberForce benefits from knowledge
transfer, leading to enhanced performance and reduced learning time in
comparison to recent works. Finally, different aggregation algorithms used
during the agent learning process provide CyberForce with notable robustness to
malicious attacks.
Comment: 11 pages, 8 figures
Suppressing Poisoning Attacks on Federated Learning for Medical Imaging
Collaboration among multiple data-owning entities (e.g., hospitals) can
accelerate the training process and yield better machine learning models due to
the availability and diversity of data. However, privacy concerns make it
challenging to exchange data while preserving confidentiality. Federated
Learning (FL) is a promising solution that enables collaborative training
through exchange of model parameters instead of raw data. However, most
existing FL solutions work under the assumption that participating clients are
honest and thus can fail against poisoning attacks from malicious
parties, whose goal is to deteriorate the global model performance. In this
work, we propose a robust aggregation rule called Distance-based Outlier
Suppression (DOS) that is resilient to Byzantine failures. The proposed method
computes the distance between local parameter updates of different clients and
obtains an outlier score for each client using Copula-based Outlier Detection
(COPOD). The resulting outlier scores are converted into normalized weights
using a softmax function, and a weighted average of the local parameters is
used for updating the global model. DOS aggregation can effectively suppress
parameter updates from malicious clients without the need for any
hyperparameter selection, even when the data distributions are heterogeneous.
Evaluation on two medical imaging datasets (CheXpert and HAM10000) demonstrates
the higher robustness of the DOS method against a variety of poisoning attacks in
comparison to other state-of-the-art methods. The code can be found at
https://github.com/Naiftt/SPAFD
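The aggregation pipeline described above (pairwise distances, per-client outlier score, softmax weights, weighted average) can be sketched as follows. This is not the authors' implementation: the mean distance to the other clients is swapped in as a simple substitute for COPOD, and applying the softmax to negated scores, so that higher outlier scores receive lower weights, is an assumption. All function names are hypothetical.

```python
import math

def pairwise_distances(updates):
    """Euclidean distance between every pair of client updates."""
    return [[math.dist(u, v) for v in updates] for u in updates]

def outlier_scores(dist_matrix):
    """Mean distance to the other clients (a simple substitute for COPOD)."""
    n = len(dist_matrix)
    return [sum(row) / (n - 1) for row in dist_matrix]

def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def robust_aggregate(updates):
    """Weighted average of updates; outlying clients get small weights."""
    scores = outlier_scores(pairwise_distances(updates))
    # Assumption: negate the scores so a higher outlier score means a lower weight.
    weights = softmax([-s for s in scores])
    dim = len(updates[0])
    aggregated = [sum(w * u[i] for w, u in zip(weights, updates))
                  for i in range(dim)]
    return aggregated, weights

# Three similar updates and one hypothetical poisoned update.
updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [50.0, -50.0]]
aggregated, weights = robust_aggregate(updates)
print(aggregated, weights)
```

In this toy example the poisoned update sits far from the other three, so its weight is close to zero and the aggregate stays near the honest updates.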
Autonomic care platform for optimizing query performance
Background: As the amount of information in electronic health care systems increases, data operations become more complicated and time-consuming. Intensive care platforms require timely processing of data retrievals to guarantee the continuous display of recent patient data. Physicians and nurses rely on this data for their decision making. Manual optimization of query executions has become difficult to handle due to the increased number of queries across multiple sources. Hence, more automated management is necessary to increase the performance of database queries. The autonomic computing paradigm promises an approach in which the system adapts itself and acts as a self-managing entity, taking actions itself and thereby limiting human intervention. Despite the usage of autonomic control loops in network and software systems, this approach has not so far been applied to health information systems.
Methods: We extend the COSARA architecture, an infection surveillance and antibiotic management service platform for the Intensive Care Unit (ICU), with self-managed components to increase the performance of data retrievals. We used real-life ICU COSARA queries to analyse slow performance and measure the impact of optimizations. Each day more than 2 million COSARA queries are executed. Three control loops, which monitor the executions and take action, have been proposed: reactive, deliberative and reflective control loops. We focus on improvements of the execution time of microbiology queries directly related to the visual displays of patients' data on the bedside screens.
Results: The results show that autonomic control loops are beneficial for optimizing data executions in the ICU. Applying the reactive control loop reduces the average execution time of microbiology results by 8.61%. The combined application of the reactive and deliberative control loops results in an average query time reduction of 10.92%, and the combination of reactive, deliberative and reflective control loops provides a reduction of 13.04%.
Conclusions: We found that a controlled reduction of query executions can improve performance for the end user. The implementation of autonomic control loops in an existing health platform, COSARA, has a positive effect on the timely data visualization for the physician and nurse.