Try to Avoid Attacks: A Federated Data Sanitization Defense for Healthcare IoMT Systems
Healthcare Internet of Medical Things (IoMT) systems are becoming intelligent,
miniaturized, and increasingly integrated into daily life. For the distributed
devices in the IoMT, federated learning has become a topical approach to
cloud-based training that preserves data security. However, the distributed
nature of the IoMT leaves it exposed to data poisoning attacks, in which
poisoned samples are fabricated by falsifying medical data; this motivates a
security defense for IoMT systems. Because no labels mark the malicious
samples, filtering them is an inherently unsupervised problem, and a key
challenge is finding filtering methods that remain robust across diverse
poisoning attacks. This paper introduces Federated Data Sanitization Defense,
a novel approach to protect the system from data poisoning attacks. To solve
this unsupervised problem, we first use federated learning to project all data
into a shared subspace, establishing a unified feature mapping even though the
data remains stored locally. We then apply federated clustering to re-group
the features and expose the poisoned data; the clustering relies on the
consistent association between data and its semantics. Once the private data
is clustered, sanitization is performed with a simple yet efficient strategy,
so that each device in the distributed IoMT can filter malicious data
accordingly. Extensive experiments evaluate the efficacy of the proposed
defense against data poisoning attacks. We further evaluate the approach under
different poisoning ratios, achieving high accuracy and a low attack success
rate.
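The abstract gives no code, but the core sanitization idea can be illustrated with a minimal, hypothetical sketch: cluster the feature vectors, then drop samples whose label disagrees with the majority label of their cluster. The k-means routine, synthetic data, and cluster count below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain Lloyd's k-means with a deterministic farthest-point init."""
    centers = [X[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        assign = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = X[assign == j].mean(axis=0)
    return assign

def sanitize(X, y, k):
    """Keep only samples whose label matches their cluster's majority label."""
    assign = kmeans(X, k)
    keep = np.ones(len(X), dtype=bool)
    for j in range(k):
        idx = np.where(assign == j)[0]
        if idx.size == 0:
            continue
        labels, counts = np.unique(y[idx], return_counts=True)
        keep[idx] = y[idx] == labels[counts.argmax()]
    return keep

# Two well-separated classes plus three poisoned samples: class-0 features
# deliberately relabelled as class 1 (a simple label-falsification attack).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (20, 2)),
               rng.normal(5, 0.3, (20, 2)),
               rng.normal(0, 0.3, (3, 2))])
y = np.concatenate([np.zeros(20, int), np.ones(20, int), np.ones(3, int)])
keep = sanitize(X, y, k=2)  # the three poisoned samples are filtered out
```

In the federated setting described above, the clustering would run collaboratively over locally stored data; this sketch only shows the label-consistency filter applied on a single device.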
Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks
The performance of machine learning models depends on the quality of the
underlying data. Malicious actors can attack a model by poisoning its training
data. Current detectors are tied to specific data types, models, or attacks,
and therefore have limited applicability in real-world scenarios. This paper
presents DIVA (Detecting InVisible Attacks), a novel fully-agnostic framework
that detects attacks solely by analyzing the potentially poisoned data set.
DIVA is based on the idea that a poisoning attack can be detected by comparing
the classifier's accuracy on poisoned and clean data; it pre-trains a
meta-learner on Complexity Measures to estimate the otherwise unknown accuracy
on a hypothetical clean dataset. The framework applies to generic poisoning
attacks. For evaluation purposes, we test DIVA on label-flipping attacks.
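DIVA's full meta-learner is beyond the scope of an abstract, but the signal it exploits can be sketched: label flipping measurably changes dataset Complexity Measures. The toy measure below (the leave-one-out 1-NN disagreement rate, one classic complexity measure) and the synthetic data are illustrative assumptions; DIVA pre-trains a meta-learner over such measures rather than thresholding a single one.

```python
import numpy as np

def nn_disagreement(X, y):
    """Leave-one-out 1-NN label disagreement rate, a simple dataset
    complexity measure: it rises when training labels are flipped."""
    d = np.linalg.norm(X[:, None] - X[None], axis=2)
    np.fill_diagonal(d, np.inf)  # exclude each point from its own neighbourhood
    return float((y != y[d.argmin(axis=1)]).mean())

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(4, 0.3, (30, 2))])
y = np.repeat([0, 1], 30)

clean = nn_disagreement(X, y)  # near zero on well-separated clean data
y_poisoned = y.copy()
y_poisoned[rng.choice(60, 12, replace=False)] ^= 1  # flip 20% of the labels
poisoned = nn_disagreement(X, y_poisoned)  # measurably higher than clean
```

A detector in this spirit would compare the measured value against the value a pre-trained estimator predicts for a clean dataset, flagging the set when the gap is large.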
Explainable Data Poison Attacks on Human Emotion Evaluation Systems based on EEG Signals
The major aim of this paper is to explain, from the attacker's perspective,
data poisoning attacks that use label flipping during the training stage of
electroencephalogram (EEG) signal-based human emotion evaluation systems
deploying machine learning models. Human emotion evaluation using EEG signals
has consistently attracted substantial research attention, and identifying
human emotional states from EEG signals is effective for detecting potential
internal threats caused by insider individuals. Nevertheless, EEG signal-based
human emotion evaluation systems have shown several vulnerabilities to data
poisoning attacks. The experimental findings demonstrate that the proposed
data poisoning attacks succeed independently of the model, although different
models exhibit varying levels of resilience to the attacks. In addition, the
data poisoning attacks on the EEG signal-based human emotion evaluation
systems are explained with several Explainable Artificial Intelligence (XAI)
methods, including Shapley Additive Explanations (SHAP) values, Local
Interpretable Model-agnostic Explanations (LIME), and Generated Decision
Trees. The code for this paper is publicly available on GitHub.
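As a simplified illustration of the attack class studied here (not the paper's EEG pipeline), the sketch below flips a fraction of training labels and measures the accuracy drop of a 1-nearest-neighbour classifier on synthetic features; the data, classifier, and flip rate are all assumptions for demonstration.

```python
import numpy as np

def flip_labels(y, rate, rng):
    """Label-flipping poisoning: invert a random fraction of binary labels."""
    y = y.copy()
    y[rng.choice(len(y), int(rate * len(y)), replace=False)] ^= 1
    return y

def nn1_accuracy(Xtr, ytr, Xte, yte):
    """Test accuracy of a 1-nearest-neighbour classifier."""
    d = np.linalg.norm(Xte[:, None] - Xtr[None], axis=2)
    return float((ytr[d.argmin(axis=1)] == yte).mean())

rng = np.random.default_rng(0)
Xtr = np.vstack([rng.normal(0, 1.0, (100, 4)), rng.normal(3, 1.0, (100, 4))])
ytr = np.repeat([0, 1], 100)
Xte = np.vstack([rng.normal(0, 1.0, (50, 4)), rng.normal(3, 1.0, (50, 4))])
yte = np.repeat([0, 1], 50)

acc_clean = nn1_accuracy(Xtr, ytr, Xte, yte)
acc_poisoned = nn1_accuracy(Xtr, flip_labels(ytr, 0.4, rng), Xte, yte)
# Flipping 40% of the training labels sharply degrades 1-NN accuracy.
```

The paper's point that different models show different resilience can be seen even in this toy: a centroid-based classifier, for instance, tolerates symmetric random flips far better than 1-NN does.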
Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning
The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model's performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the last 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in poisoning, and shed light on the current limitations and open questions in this research field.
A policy compliance detection architecture for data exchange infrastructures
Data sharing and federation can significantly increase efficiency and lower the cost of digital collaborations. It is important to convince data owners that their outsourced data will be used in a secure and controlled manner. To achieve this goal, constructing a policy that governs concrete data usage rules among all parties is essential; more importantly, we need to establish digital infrastructures that can enforce the policy. In this thesis, we investigate how to select optimal application-tailored infrastructures and enhance policy compliance capabilities. First, we introduce a component linking the policy to infrastructure patterns. The mechanism selects the digital infrastructure patterns that satisfy a collaboration request to the maximal degree, using modelling and closeness identification. Second, we present a threat-analysis-driven risk assessment framework that quantitatively assesses the residual risk of an application delegated to a digital infrastructure. The optimal digital infrastructure for a specific data federation application is the one that supports the requested collaboration model and provides the best security guarantee. Finally, we present a distributed architecture that detects policy compliance while an algorithm executes on the data. A profile and an IDS model are built for each containerized algorithm and distributed to endpoint execution platforms via a secure channel, where syscall traces are monitored and analysed. The machine-learning-based IDS is retrained periodically to improve generalization, and a sanitization algorithm filters out malicious samples to further defend the architecture against adversarial machine learning attacks.
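The syscall-based detection with sanitized retraining described above might be sketched, under heavy simplification, as a centroid profile over syscall frequency vectors: traces far from the profile are flagged, and flagged traces are excluded before the profile is refreshed. The syscall set, threshold, and traces below are invented for illustration and are not the thesis's actual IDS.

```python
import numpy as np

SYSCALLS = ["read", "write", "open", "close", "socket", "execve"]

def featurize(trace):
    """Bag-of-syscalls frequency vector for one trace (a list of names)."""
    v = np.array([trace.count(s) for s in SYSCALLS], float)
    return v / max(v.sum(), 1.0)

class SyscallIDS:
    """Toy centroid-profile IDS with sanitized periodic retraining."""
    def __init__(self, traces, threshold=0.35):
        self.threshold = threshold
        self.profile = np.mean([featurize(t) for t in traces], axis=0)

    def is_anomalous(self, trace):
        return float(np.linalg.norm(featurize(trace) - self.profile)) > self.threshold

    def retrain(self, traces):
        # Sanitize first: drop traces the current model flags, so adversarial
        # samples cannot poison the refreshed profile.
        clean = [t for t in traces if not self.is_anomalous(t)]
        if clean:
            self.profile = np.mean([featurize(t) for t in clean], axis=0)

# Benign traces are file-I/O heavy; the malicious one spawns processes
# and opens sockets, so its frequency vector sits far from the profile.
normal = [["open", "read", "read", "write", "close"],
          ["open", "read", "write", "write", "close"],
          ["open", "read", "read", "read", "close"]]
malicious = ["socket", "execve", "execve", "socket", "read"]

ids = SyscallIDS(normal)
flagged = ids.is_anomalous(malicious)  # the malicious trace is flagged
ids.retrain(normal + [malicious])      # sanitization keeps it out of the profile
```

After retraining, the malicious trace is still flagged, because the sanitization step prevented it from shifting the profile toward itself.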