Towards federated multivariate statistical process control (FedMSPC)
The ongoing transition from a linear (produce-use-dispose) to a circular
economy poses significant challenges to current state-of-the-art information
and communication technologies. In particular, the derivation of integrated,
high-level views on material, process, and product streams from (real-time)
data produced along value chains is challenging for several reasons. Most
importantly, sufficiently rich data is often available but not shared across
company borders due to privacy concerns, which makes it impossible to build
integrated process models that capture the interrelations between input
materials, process parameters, and key performance indicators along value
chains. In the current contribution, we propose a privacy-preserving, federated
multivariate statistical process control (FedMSPC) framework based on Federated
Principal Component Analysis (PCA) and Secure Multiparty Computation to foster
the incentive for closer collaboration of stakeholders along value chains. We
tested our approach on two industrial benchmark data sets: SECOM and ST-AWFD.
Our empirical results demonstrate the superior fault detection capability of
the proposed approach compared to standard, single-party (multiway) PCA.
Furthermore, we showcase our framework's ability to provide privacy-preserving
fault diagnosis to each data holder in the value chain, underpinning the
benefits of secure data sharing and federated process modeling.
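The fault-detection idea behind MSPC-style monitoring can be illustrated with a minimal single-party sketch: fit PCA on in-control data and flag samples whose Hotelling's T² statistic exceeds an empirical control limit. This is not the federated protocol proposed above; the function names, data, and thresholds below are illustrative assumptions.

```python
import numpy as np

def pca_t2_monitor(X_train, X_test, n_components=2, alpha=0.99):
    """Fit PCA on in-control data; flag test samples whose Hotelling's
    T^2 statistic exceeds an empirical alpha-quantile control limit."""
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    # Loadings and per-component variances from the SVD of the centered data
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T
    lam = s[:n_components] ** 2 / (len(X_train) - 1)

    def t2(X):
        T = (X - mu) @ P                   # scores in the PCA subspace
        return np.sum(T**2 / lam, axis=1)  # Hotelling's T^2 statistic

    limit = np.quantile(t2(X_train), alpha)  # empirical control limit
    return t2(X_test) > limit

rng = np.random.default_rng(0)
scale = np.array([5.0, 4.0, 1.0, 1.0, 1.0])
normal = rng.normal(size=(500, 5)) * scale    # in-control process data
faulty = rng.normal(size=(10, 5)) * scale
faulty[:, 0] += 50.0                          # large mean shift -> fault
flags = pca_t2_monitor(normal, faulty)
print(int(flags.sum()))                       # most faulty samples flagged
```

A federated variant would compute the loadings `P` jointly over vertically partitioned features without revealing any party's raw data, which is where federated PCA and secure multiparty computation come in.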
A Survey on Differential Privacy with Machine Learning and Future Outlook
Nowadays, machine learning models and applications have become increasingly
pervasive. With this rapid increase in the development and employment of
machine learning models, a concern regarding privacy has risen. Thus, there is
a legitimate need to protect the data from leaking and from any attacks. One of
the strongest and most prevalent privacy models that can be used to protect
machine learning models from any attacks and vulnerabilities is differential
privacy (DP). DP is a strict and rigorous definition of privacy that can
guarantee an adversary cannot reliably infer whether a specific participant
is included in the dataset. It works by injecting noise into the data,
whether into the inputs, the outputs, the ground-truth labels, the objective
function, or even the gradients, to mitigate the privacy risk and protect the
data. To this end, this survey paper presents different
differentially private machine learning algorithms categorized into two main
categories (traditional machine learning models vs. deep learning models).
Moreover, future research directions for differential privacy with machine
learning algorithms are outlined.
Comment: 12 pages, 3 figures
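As a concrete instance of the noise-injection idea, here is a sketch of the Laplace mechanism applied to a counting query. This is a standard textbook construction, not an algorithm from the survey; the names and data below are assumptions.

```python
import numpy as np

def laplace_count(data, predicate, epsilon, rng):
    """epsilon-DP count query: a counting query has sensitivity 1
    (adding or removing one record changes the count by at most 1),
    so Laplace noise of scale 1/epsilon suffices for epsilon-DP."""
    true_count = sum(1 for x in data if predicate(x))
    return true_count + rng.laplace(scale=1.0 / epsilon)

rng = np.random.default_rng(42)
ages = [23, 35, 41, 52, 29, 61, 38]
# Noisy answer to "how many participants are 40 or older?" (true answer: 3)
noisy = laplace_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(round(noisy, 2))
```

Smaller `epsilon` means stronger privacy but a noisier, less useful answer; this is the privacy-utility tradeoff the surveyed algorithms navigate for model inputs, labels, objectives, and gradients.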
Data Valuation for Vertical Federated Learning: A Model-free and Privacy-preserving Method
Vertical federated learning (VFL) is a promising paradigm for predictive
analytics, empowering an organization (i.e., task party) to enhance its
predictive models through collaborations with multiple data suppliers (i.e.,
data parties) in a decentralized and privacy-preserving way. Despite the
fast-growing interest in VFL, the lack of effective and secure tools for
assessing the value of data owned by data parties hinders the application of
VFL in business contexts. In response, we propose FedValue, a
privacy-preserving, task-specific but model-free data valuation method for VFL,
which consists of a data valuation metric and a federated computation method.
Specifically, we first introduce a novel data valuation metric, namely
MShapley-CMI. The metric evaluates a data party's contribution to a predictive
analytics task without executing a machine learning model, making it
well-suited for real-world applications of VFL. Next, we develop an
innovative federated computation method that calculates the MShapley-CMI value
for each data party in a privacy-preserving manner. Extensive experiments
conducted on six public datasets validate the efficacy of FedValue for data
valuation in the context of VFL. In addition, we illustrate the practical
utility of FedValue with a case study involving federated movie
recommendations.
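The Shapley-value machinery underlying such valuation metrics can be sketched with an exact (exponential-time) computation over a toy additive utility. MShapley-CMI and its privacy-preserving federated computation are not reproduced here; the party names and utility values below are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(parties, utility):
    """Exact Shapley value of each party for a coalition utility function.
    Exponential in the number of parties, so only viable for small n;
    FedValue-style methods instead pair a model-free utility metric with
    a federated computation over the parties' private data."""
    n = len(parties)
    phi = {p: 0.0 for p in parties}
    for p in parties:
        others = [q for q in parties if q != p]
        for k in range(n):
            for S in combinations(others, k):
                # Weight of a coalition of size k in the Shapley average
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[p] += w * (utility(set(S) | {p}) - utility(set(S)))
    return phi

# Toy additive utility: each data party contributes a fixed (assumed) amount
contrib = {"A": 0.30, "B": 0.15, "C": 0.05}
u = lambda S: sum(contrib[p] for p in S)
res = shapley_values(["A", "B", "C"], u)
print(res)
```

For an additive utility the Shapley value of each party equals its individual contribution, which is a useful sanity check; the interesting cases arise when parties' features interact, so coalitions are worth more (or less) than the sum of their parts.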
Near-Optimal Algorithms for Differentially-Private Principal Components
Principal components analysis (PCA) is a standard tool for identifying good
low-dimensional approximations to data in high dimension. Many data sets of
interest contain private or sensitive information about individuals. Algorithms
which operate on such data should be sensitive to the privacy risks in
publishing their outputs. Differential privacy is a framework for developing
tradeoffs between privacy and the utility of these outputs. In this paper we
investigate the theory and empirical performance of differentially private
approximations to PCA and propose a new method which explicitly optimizes the
utility of the output. We show that the sample complexity of the proposed
method differs from the existing procedure in the scaling with the data
dimension, and that our method is nearly optimal in terms of this scaling. We
furthermore illustrate our results, showing that on real data there is a large
performance gap between the existing method and our method.
Comment: 37 pages, 8 figures; final version to appear in the Journal of
Machine Learning Research, preliminary version was at NIPS 201
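A simple baseline in this space (not the paper's near-optimal method) perturbs the sample second-moment matrix with symmetric Gaussian noise before eigendecomposition. The sketch below assumes rows with L2 norm at most 1 and uses the standard Gaussian-mechanism noise scale; all parameter choices are illustrative.

```python
import numpy as np

def dp_pca(X, k, epsilon, delta, rng):
    """(epsilon, delta)-DP PCA baseline via covariance perturbation.
    Assumes each row has L2 norm <= 1, so replacing one row changes the
    second-moment matrix by at most 2/n in Frobenius norm."""
    n, d = X.shape
    A = X.T @ X / n
    # Gaussian-mechanism scale for sensitivity 2/n
    sigma = 2 * np.sqrt(2 * np.log(1.25 / delta)) / (n * epsilon)
    E = rng.normal(scale=sigma, size=(d, d))
    A_noisy = A + (E + E.T) / 2                 # symmetrized noise
    vals, vecs = np.linalg.eigh(A_noisy)
    return vecs[:, np.argsort(vals)[::-1][:k]]  # top-k noisy eigenvectors

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 4))
X[:, 0] *= 3.0                                  # dominant direction: feature 0
X /= np.linalg.norm(X, axis=1, keepdims=True)   # enforce the row-norm bound
V = dp_pca(X, k=1, epsilon=1.0, delta=1e-5, rng=rng)
print(round(abs(V[0, 0]), 2))                   # alignment with true top direction
```

With enough samples the noise barely rotates the top eigenvector; the paper's contribution is characterizing how the required sample size scales with the data dimension and giving a method that is nearly optimal in that scaling.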