Local Differential Privacy In Smart Manufacturing: Application Scenario, Mechanisms and Tools
To utilize the potential of machine learning and deep learning, enormous amounts of data are required. To find the optimal solution, it is beneficial to share and publish data sets. Because publicly released datasets have leaked private data and attackers have exposed sensitive information about individuals, the research field of differential privacy develops solutions to prevent such leaks in the future. Compared to other domains, the application of differential privacy in the manufacturing context is very challenging. Manufacturing data contains sensitive information about the companies and their process knowledge, products, and orders. Furthermore, data of individuals operating machines could be exposed and thus their performance evaluated. This paper describes scenarios of how differential privacy can be used in the manufacturing context. In particular, it addresses the potential threats that arise when sharing manufacturing data by identifying different manufacturing parameters and their variable types. Simplified examples show how differentially private mechanisms can be applied to binary, numeric, and categorical variables and to time series. Finally, libraries are presented which enable the productive use of differential privacy.
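To make the idea of local mechanisms for binary and numeric variables concrete, the following is a minimal sketch of two textbook mechanisms: randomized response for a binary value and the Laplace mechanism for a bounded numeric value. It is not taken from the paper. The function names, the value bounds, and the choice of epsilon are assumptions made for illustration only.

```python
import math
import random


def randomized_response(bit: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true bit with probability e^eps / (e^eps + 1), otherwise flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else not bit


def estimate_true_rate(reports, epsilon: float) -> float:
    """Debias the observed share of True reports to estimate the real share."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


def laplace_mechanism(value: float, lower: float, upper: float,
                      epsilon: float, rng: random.Random) -> float:
    """Clamp the value to [lower, upper], then add Laplace noise.

    In the local setting any two inputs may differ by the full range,
    so the noise scale is (upper - lower) / epsilon.
    """
    clamped = min(max(value, lower), upper)
    scale = (upper - lower) / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clamped + noise
```

A single randomized report reveals little about one record, while the debiased estimate over many reports can still recover the overall rate. Smaller epsilon values give stronger privacy and noisier results.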
Balancing Data Protection and Model Accuracy : An Investigation of Protection Methods on Machine Learning Model Performance for a Bank Marketing Dataset
The practice of sharing customer data among companies for marketing purposes is becoming
increasingly common. However, sharing customer-level data poses potential risks and serious
problems for businesses, such as substantial declines in brand value, erosion of customer trust,
loss of competitive advantage, and the imposition of legal penalties (Schneider et al. 2017).
These may eventually lead to financial loss and reputation damage for the companies. With the
growing awareness of the value of personal information, more companies and customers are
concerned about protecting data privacy.
In this paper, we used marketing data from a Portuguese bank to explore methods for balancing
prediction accuracy and customer data privacy using various machine learning and data privacy
techniques. The dataset includes observations from 45211 respondents and the observation
period is from May 2008 to November 2010. Our goal is to find a method that enables third
parties to share data with the bank while safeguarding customer privacy and maintaining
accuracy in predicting customer behaviour.
We tested several machine learning models (Logistic Regression, Random Forest, and a
feedforward Neural Network) on the original data and selected Random Forest, which gave the
best prediction performance, for further analysis. After applying two data privacy methods
(Sampling and Random Noise) to the original data, we found that the Random Forest model
achieves accuracy very close to its accuracy on the unprotected data. In doing so, we
demonstrate a method for companies to protect customer
data privacy without sacrificing predictive accuracy. The results of this study will have
significant implications for companies that seek to share customer data while maintaining high
levels of privacy and accuracy.
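The abstract does not specify how Sampling and Random Noise were implemented. The sketch below shows one plausible reading under stated assumptions: sampling releases a random subset of records, and random noise adds zero-mean Gaussian noise to numeric columns, scaled to each column's standard deviation. The function names, the column names, and the noise calibration are hypothetical and are not taken from the paper.

```python
import random
import statistics


def sample_rows(rows, fraction, rng):
    """Release only a random subset of the records (sampling without replacement)."""
    k = max(1, round(len(rows) * fraction))
    return rng.sample(rows, k)


def add_random_noise(rows, numeric_keys, relative_sd, rng):
    """Add zero-mean Gaussian noise to each numeric column.

    The noise standard deviation is relative_sd times the column's own
    standard deviation (an assumed calibration, not the paper's).
    """
    spread = {key: statistics.pstdev([row[key] for row in rows])
              for key in numeric_keys}
    noisy = []
    for row in rows:
        protected = dict(row)  # copy so the original records stay untouched
        for key in numeric_keys:
            protected[key] = row[key] + rng.gauss(0.0, relative_sd * spread[key])
        noisy.append(protected)
    return noisy


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical records; the column names are illustrative only.
    customers = [{"age": 25 + i, "balance": 150.0 * i, "job": "admin"}
                 for i in range(8)]
    released = add_random_noise(sample_rows(customers, 0.5, rng),
                                ["age", "balance"], 0.1, rng)
    print(released)
```

A model trained on the released records can then be compared against one trained on the original records to see how much accuracy the protection costs.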
Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach
We examine counterfactual explanations for explaining the decisions made by
model-based AI systems. The counterfactual approach we consider defines an
explanation as a set of the system's data inputs that causally drives the
decision (i.e., changing the inputs in the set changes the decision) and is
irreducible (i.e., changing any subset of the inputs does not change the
decision). We (1) demonstrate how this framework may be used to provide
explanations for decisions made by general, data-driven AI systems that may
incorporate features with arbitrary data types and multiple predictive models,
and (2) propose a heuristic procedure to find the most useful explanations
depending on the context. We then contrast counterfactual explanations with
methods that explain model predictions by weighting features according to their
importance (e.g., SHAP, LIME) and present two fundamental reasons why we should
carefully consider whether importance-weight explanations are well-suited to
explain system decisions. Specifically, we show that (i) features that have a
large importance weight for a model prediction may not affect the corresponding
decision, and (ii) importance weights are insufficient to communicate whether
and how features influence decisions. We demonstrate this with several concise
examples and three detailed case studies that compare the counterfactual
approach with SHAP to illustrate various conditions under which counterfactual
explanations explain data-driven decisions better than importance weights.
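The definition of a counterfactual explanation given above (a set of inputs whose change alters the decision, with no smaller subset doing so) can be sketched as an exhaustive search. This is a simplified illustration, not the heuristic procedure proposed in the paper. It assumes that "changing" an input means replacing it with a value from a caller-supplied baseline, and the names `counterfactual_explanations`, `baseline`, and `decide` are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Optional


def counterfactual_explanations(
    instance: Dict[str, float],
    baseline: Dict[str, float],
    decide: Callable[[Dict[str, float]], str],
    max_size: Optional[int] = None,
) -> List[FrozenSet[str]]:
    """Return every irreducible feature set whose change flips the decision.

    Candidate sets are visited from smallest to largest, so a set is
    irreducible exactly when no previously found explanation is a proper
    subset of it.
    """
    original = decide(instance)
    features = sorted(instance)
    limit = len(features) if max_size is None else max_size
    found: List[FrozenSet[str]] = []
    for size in range(1, limit + 1):
        for subset in combinations(features, size):
            candidate = frozenset(subset)
            # Skip supersets of known explanations: they are reducible.
            if any(known < candidate for known in found):
                continue
            changed = dict(instance)
            for name in subset:
                changed[name] = baseline[name]  # assumed meaning of "changing" an input
            if decide(changed) != original:
                found.append(candidate)
    return found


if __name__ == "__main__":
    # Hypothetical credit decision used only to exercise the search.
    def decide(x):
        score = 2 * x["income"] + x["savings"] - 3 * x["debt"]
        return "approve" if score >= 10 else "deny"

    applicant = {"income": 2, "savings": 1, "debt": 1}
    reference = {"income": 5, "savings": 5, "debt": 0}
    print(counterfactual_explanations(applicant, reference, decide))
```

The search is exponential in the number of features, which is one reason a heuristic procedure is attractive for realistic systems. The `max_size` parameter caps the explanation size to keep the sketch tractable.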
Interviewer Effects on Nonresponse
In face-to-face surveys, interviewers play a crucial role in making contact with and gaining cooperation from sample units. While some analyses investigate the influence of interviewers on nonresponse, they are typically restricted to single-country studies. However, interviewer training, contacting and cooperation strategies, and survey climates may differ across countries. Combining call-record data from the European Social Survey (ESS) with data from a detailed interviewer questionnaire on attitudes and doorstep behavior, we find systematic country differences in nonresponse processes, which can in part be explained by differences in interviewer characteristics, such as contacting strategies and avowed doorstep behavior.
Privacy Measurement in Tabular Synthetic Data: State of the Art and Future Research Directions
Synthetic data (SD) have garnered attention as a privacy enhancing
technology. Unfortunately, there is no standard for quantifying their degree of
privacy protection. In this paper, we discuss proposed quantification
approaches. This contributes to the development of SD privacy standards;
stimulates multi-disciplinary discussion; and helps SD researchers make
informed modeling and evaluation decisions.
Comment: 20 pages, 4 tables, 8 figures; NeurIPS 2023 Workshop on Synthetic
Data Generation with Generative A
Application Of Blockchain Technology And Integration Of Differential Privacy: Issues In E-Health Domains
A systematic and comprehensive review of critical applications of Blockchain Technology with Differential Privacy integration lies within privacy and security enhancement. This paper aims to highlight the research issues in the e-Health domain (e.g., EMR) and to review the current research directions in Differential Privacy integration with Blockchain Technology. Firstly, the current concerns in the e-Health domain are identified as follows: (a) healthcare information poses a high level of security and privacy concerns due to its sensitivity; (b) due to vulnerabilities surrounding the healthcare system, data breaches are common and pose a risk of attack by an adversary; and (c) the current privacy and security apparatus needs further fortification. Secondly, Blockchain Technology (BT) is one of the approaches to address these privacy and security issues. The alternative solution is the integration of Differential Privacy (DP) with Blockchain Technology. Thirdly, collections of scientific journals and research papers on the e-Health domain, published between 2015 and 2022 and drawn from IEEE, Science Direct, Google Scholar, ACM, and PubMed, are summarized in terms of security and privacy. The methodology uses a systematic mapping study (SMS) to identify and select relevant research papers and academic journals regarding DP and BT. With this understanding of the current privacy issues in EMR, this paper focuses on three categories: (a) e-Health Record Privacy, (b) Real-Time Health Data, and (c) Health Survey Data Protection. The study finds evidence of inherent issues and technical challenges associated with the integration of Differential Privacy and Blockchain Technology.