The Concept of Identifiability in ML Models
Recent research indicates that the machine learning process can be reversed by adversarial attacks. These attacks can be used to derive personal information from the training data. The supposedly anonymising machine learning process in fact constitutes a process of pseudonymisation and is, therefore, subject to technical and organisational measures. Consequently, the unexamined belief in anonymisation as a guarantor of privacy cannot easily be upheld. It is, therefore, crucial to measure privacy through the lens of adversarial attacks, to distinguish precisely between personal data and non-personal data, and, above all, to determine whether ML models constitute pseudonyms of their training data.
Membership Leakage in Label-Only Exposures
Machine learning (ML) has been widely adopted in various privacy-critical applications, e.g., face recognition and medical image analysis. However, recent research has shown that ML models are vulnerable to attacks against their training data. Membership inference is one major attack in this domain: given a data sample and a model, an adversary aims to determine whether the sample is part of the model's training set. Existing membership inference attacks leverage the confidence scores returned by the model as their inputs (score-based attacks). However, these attacks can be easily mitigated if the model only exposes the predicted label, i.e., the final model decision. In this paper, we propose decision-based membership inference attacks and demonstrate that label-only exposures are also vulnerable to membership leakage. In particular, we develop two types of decision-based attacks, namely transfer-attack and boundary-attack. Empirical evaluation shows that our decision-based attacks achieve remarkable performance and can even outperform previous score-based attacks. We further present new insights into the success of membership inference based on quantitative and qualitative analysis: member samples of a model lie farther from the model's decision boundary than non-member samples. Finally, we evaluate multiple defense mechanisms against our decision-based attacks and show that both types of attacks can bypass most of these defenses.
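The boundary-distance intuition described in this abstract lends itself to a small illustration. Below is a minimal, hypothetical sketch (not the authors' implementation) of a label-only membership inference heuristic: it approximates each sample's distance to the decision boundary by the smallest Gaussian noise scale that flips the model's predicted label, then flags samples with a large flip radius as likely members. The model, dataset, noise grid, and threshold calibration are all illustrative assumptions.

```python
# Hedged sketch of a label-only ("decision-based") membership inference heuristic.
# Assumption: members tend to need more input noise before their predicted label flips.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def label_flip_radius(model, x, sigmas=np.linspace(0.05, 2.0, 40), trials=20, seed=0):
    """Smallest Gaussian noise scale that changes the predicted label (a proxy
    for distance to the decision boundary, using labels only)."""
    rng = np.random.default_rng(seed)
    base = model.predict(x.reshape(1, -1))[0]
    for sigma in sigmas:
        noisy = x + rng.normal(scale=sigma, size=(trials, x.shape[0]))
        if (model.predict(noisy) != base).any():
            return sigma
    return sigmas[-1]

# Illustrative target model: half the data is "member" (training) data.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_member, X_nonmember, y_member, _ = train_test_split(X, y, test_size=0.5, random_state=0)
target = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0).fit(X_member, y_member)

# Estimate flip radii for a subset of members and non-members.
r_member = np.array([label_flip_radius(target, x) for x in X_member[:100]])
r_nonmember = np.array([label_flip_radius(target, x) for x in X_nonmember[:100]])

# Naive threshold calibration: samples above the pooled median are called "members".
threshold = np.median(np.concatenate([r_member, r_nonmember]))
print("mean flip radius (members):    ", r_member.mean())
print("mean flip radius (non-members):", r_nonmember.mean())
print("flagged as members among members:    ", (r_member > threshold).mean())
print("flagged as members among non-members:", (r_nonmember > threshold).mean())
```

In this toy setup the gap between the two mean radii is what a real decision-based attack would exploit; the paper's boundary-attack estimates this distance far more carefully than the crude noise search used here.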
Preserving data privacy in machine learning systems
The wide adoption of Machine Learning to solve a large set of real-life problems came with the need to collect and process large volumes of data, some of which are considered personal and sensitive, raising serious concerns about data protection. Privacy-enhancing technologies (PETs) are often indicated as a solution to protect personal data and to achieve the general trustworthiness required by current EU regulations on data protection and AI. However, an off-the-shelf application of PETs is insufficient to ensure high-quality data protection, and one needs to understand why. This work systematically discusses the risks against data protection in modern Machine Learning systems, taking the original perspective of the data owners: those who hold the various data sets, the models, or both, throughout the machine learning life cycle and across the different Machine Learning architectures. It argues that the origin of the threats, the risks against the data, and the level of protection offered by PETs depend on the data processing phase, the role of the parties involved, and the architecture in which the machine learning systems are deployed. By offering a framework in which to discuss privacy and confidentiality risks for data owners and by identifying and assessing privacy-preserving countermeasures for machine learning, this work could facilitate the discussion about compliance with EU regulations and directives.
We discuss current challenges and research questions that remain unsolved in the field. In this respect, this paper provides researchers and developers working on machine learning with a comprehensive body of knowledge to help them advance the science of data protection in the machine learning field as well as in closely related fields such as Artificial Intelligence.