α-Mutual Information: A Tunable Privacy Measure for Privacy Protection in Data Sharing
This paper adopts Arimoto's α-Mutual Information as a tunable privacy
measure, in a privacy-preserving data release setting that aims to prevent
disclosing private data to adversaries. By fine-tuning the privacy metric, we
demonstrate that our approach yields superior models that effectively thwart
attackers across various performance dimensions. We formulate a general
distortion-based mechanism that manipulates the original data to offer privacy
protection. The distortion metrics are determined according to the data
structure of a specific experiment. We confront the problem expressed in the
formulation by employing a general adversarial deep learning framework that
consists of a releaser and an adversary, trained with opposite goals. This
study conducts empirical experiments on images and time-series data to verify
the functionality of α-Mutual Information. We evaluate the
privacy-utility trade-off of customized models and compare them to mutual
information as the baseline measure. Finally, we analyze the consequence of an
attacker's access to side information about private data and find that
adapting the privacy measure yields a model more resilient to side information
than the state-of-the-art.
Comment: 2023 22nd IEEE International Conference on Machine Learning and Applications (ICMLA)
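For reference, Arimoto's α-mutual information for discrete variables can be sketched as follows (a minimal plug-in computation, not the paper's neural estimator; the function names are illustrative):

```python
import math

def renyi_entropy(p, alpha):
    # Rényi entropy H_α(X) = (1/(1-α)) · log Σ_x p(x)^α, for α > 0, α ≠ 1
    return math.log(sum(px ** alpha for px in p)) / (1.0 - alpha)

def arimoto_mi(joint, alpha):
    """Arimoto's α-mutual information I_α(X;Y) for a discrete joint pmf.

    joint[x][y] holds P(X=x, Y=y). Uses the standard decomposition
    I_α(X;Y) = H_α(X) - H_α^A(X|Y), with the Arimoto conditional entropy
    H_α^A(X|Y) = (α/(1-α)) · log Σ_y ( Σ_x P(x,y)^α )^(1/α).
    """
    px = [sum(row) for row in joint]          # marginal of X
    h_x = renyi_entropy(px, alpha)
    norm_sum = sum(
        sum(row[y] ** alpha for row in joint) ** (1.0 / alpha)
        for y in range(len(joint[0]))
    )
    h_cond = (alpha / (1.0 - alpha)) * math.log(norm_sum)
    return h_x - h_cond
```

Tuning α trades off which adversaries the measure penalizes most; as α → 1 it recovers the Shannon mutual information used as the baseline above.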
On the Impact of Side Information on Smart Meter Privacy-Preserving Methods
Smart meters (SMs) can pose privacy threats for consumers, an issue that has
received significant attention in recent years. This paper studies the impact
of Side Information (SI) on the performance of distortion-based real-time
privacy-preserving algorithms for SMs. In particular, we consider a deep
adversarial learning framework, in which the desired releaser (a recurrent
neural network) is trained by fighting against an adversary network until
convergence. To define the loss functions, two different approaches are
considered: the Causal Adversarial Learning (CAL) and the Directed Information
(DI)-based learning. The main difference between these approaches is in how the
privacy term is measured during the training process. On the one hand, the
releaser in the CAL method, by getting supervision from the actual values of
the private variables and feedback from the adversary performance, tries to
minimize the adversary log-likelihood. On the other hand, the releaser in the
DI approach completely relies on the feedback received from the adversary and
is optimized to maximize its uncertainty. The performance of these two
algorithms is evaluated empirically using real-world SMs data, considering an
attacker with access to SI (e.g., the day of the week) that tries to infer the
occupancy status from the released SMs data. The results show that, although
they perform similarly when the attacker does not exploit the SI, in general,
the CAL method is less sensitive to the inclusion of SI. However, in both
cases, privacy levels are significantly affected, particularly when multiple
sources of SI are included
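The difference between the two training objectives can be written schematically as follows (notation assumed for illustration: $S_t$ is the private variable, $Y^t$ the released sequence up to time $t$, $P_A$ the adversary's predictive distribution, $d$ a distortion measure, and $\lambda$ a trade-off weight; this is a reading of the abstract, not the paper's exact formulation):

```latex
% CAL releaser: minimize the adversary's log-likelihood of the true S_t,
% plus a distortion penalty (supervision from the actual private values):
\mathcal{L}_R^{\mathrm{CAL}}
  = \mathbb{E}\!\left[\log P_A(S_t \mid Y^t)\right]
  + \lambda\,\mathbb{E}\!\left[d(X_t, Y_t)\right]
% DI releaser: rely only on adversary feedback and maximize its
% predictive uncertainty (entropy) instead:
\mathcal{L}_R^{\mathrm{DI}}
  = -\,\mathbb{H}\!\left[P_A(\,\cdot \mid Y^t)\right]
  + \lambda\,\mathbb{E}\!\left[d(X_t, Y_t)\right]
```

The CAL term uses the true private labels directly, which is consistent with its lower sensitivity to side information reported above.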
Deep Directed Information-Based Learning for Privacy-Preserving Smart Meter Data Release
The explosion of data collection has raised serious privacy concerns in users
due to the possibility that sharing data may also reveal sensitive information.
The main goal of a privacy-preserving mechanism is to prevent a malicious third
party from inferring sensitive information while keeping the shared data
useful. In this paper, we study this problem in the context of time series data
and smart meters (SMs) power consumption measurements in particular. Although
Mutual Information (MI) between private and released variables has been used as
a common information-theoretic privacy measure, it fails to capture the causal
time dependencies present in the power consumption time series data. To
overcome this limitation, we introduce the Directed Information (DI) as a more
meaningful measure of privacy in the considered setting and propose a novel
loss function. The optimization is then performed using an adversarial
framework where two Recurrent Neural Networks (RNNs), referred to as the
releaser and the adversary, are trained with opposite goals. Our empirical
studies on real-world data sets from SMs measurements in the worst-case
scenario where an attacker has access to all the training data set used by the
releaser, validate the proposed method and show the existing trade-offs between
privacy and utility.
Comment: to appear in IEEE SmartGridComm 2019. arXiv admin note: substantial
text overlap with arXiv:1906.0642
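The causal distinction that motivates replacing MI with DI can be stated with Massey's standard identities (notation assumed: $X^T$ is the private sequence, $Y^T$ the released sequence, $X^t = (X_1,\dots,X_t)$):

```latex
% Mutual information conditions each release on the *entire* private sequence:
I(X^T; Y^T) = \sum_{t=1}^{T} I\bigl(X^T;\, Y_t \mid Y^{t-1}\bigr)
% Directed information conditions only on the past and present of X,
% matching the causal structure of real-time release:
I(X^T \to Y^T) = \sum_{t=1}^{T} I\bigl(X^t;\, Y_t \mid Y^{t-1}\bigr)
```

Because $Y_t$ cannot depend on future consumption in an online setting, penalizing $I(X^T \to Y^T)$ rather than $I(X^T; Y^T)$ avoids charging the releaser for leakage no causal attacker could exploit.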
Cardiotocography Signal Abnormality Detection based on Deep Unsupervised Models
Cardiotocography (CTG) is a key element when it comes to monitoring fetal
well-being. Obstetricians use it to observe the fetal heart rate (FHR) and the
uterine contraction (UC). The goal is to determine how the fetus reacts to the
contraction and whether it is receiving adequate oxygen. If a problem occurs,
the physician can then respond with an intervention. Unfortunately, the
interpretation of CTGs is highly subjective and there is a low inter- and
intra-observer agreement rate among practitioners. This can lead to unnecessary
medical intervention that represents a risk for both the mother and the fetus.
Recently, computer-assisted diagnosis techniques, especially based on
artificial intelligence models (mostly supervised), have been proposed in the
literature. However, many of these models generalize poorly to unseen test
data due to overfitting. Moreover, the unsupervised models were applied to a
very small portion of the CTG samples where the normal and abnormal classes are
highly separable. In this work, deep unsupervised learning approaches, trained
in a semi-supervised manner, are proposed for anomaly detection in CTG signals.
The GANomaly framework, modified to capture the underlying distribution of data
samples, is used as our main model and is applied to the CTU-UHB dataset.
Unlike recent studies, all the CTG data samples, without any selective
filtering, are used in our work. The experimental results show that our
modified GANomaly model outperforms the state-of-the-art. This study
demonstrates the superiority of deep unsupervised models over supervised ones
in CTG abnormality detection
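The GANomaly scoring idea, encode, reconstruct, re-encode, and score by the latent-space gap, can be sketched with toy stand-ins (these closed-form `encode`/`decode` functions are illustrative assumptions, not the paper's trained networks; the clipped decoder mimics a generator that only inverts "normal" inputs):

```python
# Toy GANomaly-style anomaly scoring. The decoder only inverts inputs whose
# latent code maps back into the "normal" range [-1, 1], so normal samples
# reconstruct exactly (score 0) while out-of-distribution samples do not.

def encode(x):
    return [v * 0.5 for v in x]                          # stand-in encoder E

def decode(z):
    return [min(max(2.0 * v, -1.0), 1.0) for v in z]     # stand-in generator G

def anomaly_score(x):
    z = encode(x)                 # latent code of the input
    z_hat = encode(decode(z))     # latent code of the reconstruction
    return sum(abs(a - b) for a, b in zip(z, z_hat))     # ||z - z_hat||_1
```

A sample inside the normal range scores 0, while an out-of-range sample such as `[5.0]` receives a strictly positive score, which is the signal thresholded for abnormality detection.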
Privacy-Cost Management in Smart Meters with Mutual Information-Based Reinforcement Learning
The rapid development and expansion of the Internet of Things (IoT) paradigm
has drastically increased the collection and exchange of data between sensors
and systems, a phenomenon that raises serious privacy concerns in some domains.
In particular, Smart Meters (SMs) share fine-grained electricity consumption of
households with utility providers that can potentially violate users' privacy
as sensitive information is leaked through the data. In order to enhance
privacy, the electricity consumers can exploit the availability of physical
resources such as a rechargeable battery (RB) to shape their power demand as
dictated by a Privacy-Cost Management Unit (PCMU). In this paper, we present a
novel method to learn the PCMU policy using Deep Reinforcement Learning (DRL).
We adopt the mutual information (MI) between the user's demand load and the
masked load seen by the power grid as a reliable and general privacy measure.
Unlike previous studies, we model the whole temporal correlation in the data to
learn the MI in its general form and use a neural network to estimate the
MI-based reward signal to guide the PCMU learning process. This approach is
combined with a model-free DRL algorithm known as the Deep Double Q-Learning
(DDQL) method. The performance of the complete DDQL-MI algorithm is assessed
empirically using an actual SMs dataset and compared with simpler privacy
measures. Our results show significant improvements over state-of-the-art
privacy-aware demand shaping methods
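The MI-based reward signal can be illustrated with a discrete plug-in estimator over paired (demand, masked-load) samples (a toy stand-in for the paper's neural estimator; the function name and the plug-in approach are assumptions for illustration):

```python
import math
from collections import Counter

def empirical_mi(xs, ys):
    """Plug-in estimate of Shannon MI I(X;Y) in nats from paired samples.

    xs: observed demand-load symbols; ys: corresponding masked-load symbols.
    The DRL reward guiding the PCMU would then be the *negative* of this
    leakage estimate (lower MI = better privacy).
    """
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # I(X;Y) = sum_{x,y} p(x,y) log( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )
```

When the masked load perfectly mirrors the demand the estimate approaches log 2 per binary symbol (maximal leakage), and it vanishes when the two are independent, which is the regime the PCMU is rewarded for reaching.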
Privacy-Cost Management in Smart Meters Using Deep Reinforcement Learning
Smart meters (SMs) play a pivotal role in the smart grid by reporting the
electricity usage of consumers to the utility provider (UP) almost
in real-time. However, this could leak sensitive information about the
consumers to the UP or a third-party. Recent works have leveraged the
availability of energy storage devices, e.g., a rechargeable battery (RB), in
order to provide privacy to the consumers with minimal additional energy cost.
In this paper, a privacy-cost management unit (PCMU) is proposed based on a
model-free deep reinforcement learning algorithm, called deep double Q-learning
(DDQL). Empirical results evaluated on actual SMs data are presented to compare
DDQL with the state-of-the-art, i.e., classical Q-learning (CQL). Additionally,
the performance of the method is investigated for two concrete cases where
attackers aim to infer the actual demand load and the occupancy status of
dwellings. Finally, an abstract information-theoretic characterization is
provided
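The DDQL update at the heart of the PCMU can be sketched in tabular form (a simplified stand-in for the deep agent; state/action names and hyperparameters are illustrative):

```python
# One tabular double Q-learning step. Decoupling action *selection* (q_sel)
# from action *evaluation* (q_eval) is what curbs classical Q-learning's
# overestimation bias; in the deep variant the tables become networks.

def double_q_update(q_sel, q_eval, s, a, r, s_next, lr=0.1, gamma=0.9):
    best = max(q_sel[s_next], key=q_sel[s_next].get)   # argmax under q_sel
    target = r + gamma * q_eval[s_next][best]          # value under q_eval
    q_sel[s][a] += lr * (target - q_sel[s][a])
```

On each environment step the agent flips a fair coin to decide which table plays the `q_sel` role and which plays `q_eval`; note that even when `q_eval` holds an inflated value for some action, it only enters the target if `q_sel` also selects that action.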
Acoustic Emission Signal Entropy as a Means to Estimate Loads in Fiber Reinforced Polymer Rods
Fiber reinforced polymer (FRP) rods are widely used as corrosion-resistant reinforcement in civil structures. However, developing a method to determine the loads on in-service FRP rods remains a challenge. In this study, the entropy of acoustic emission (AE) emanating from FRP rods is used to estimate the applied loads. As loads increased, the fraction of AE hits with higher entropy also increased. High entropy AE hits are defined using the one-sided Chebyshev's inequality with parameter k = 2, where the histogram of AE entropy up to 10–15% of ultimate load was used as a baseline. AE hits that fall further than two standard deviations above the baseline mean are classified as high entropy events; according to the one-sided Chebyshev's inequality, when more than 20% (k = 2) of hits exceed this threshold, a new distribution of high entropy AE hits is assumed to exist. We found that the fraction of high entropy AE hits increased with applied load. In glass FRP and carbon FRP rods, a high entropy AE hit fraction of 20% was exceeded at approximately 40% and 50% of the ultimate load, respectively. This work demonstrates that monitoring high entropy AE hits may provide a useful means to estimate the loads on FRP rods
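The classification rule above can be sketched as follows (an assumed reading of the method, not the authors' code; function names are illustrative):

```python
import statistics

# Baseline AE-entropy hits (up to 10-15% of ultimate load) fix a mean and
# standard deviation. Hits more than k = 2 sigma above the mean are "high
# entropy". The one-sided Chebyshev bound P(X - mu >= k*sigma) <= 1/(1+k^2)
# caps their fraction at 20% for the baseline distribution, so a measured
# fraction above 20% signals a new distribution, i.e. a higher load.

def high_entropy_fraction(baseline_entropies, entropies, k=2.0):
    mu = statistics.mean(baseline_entropies)
    sigma = statistics.pstdev(baseline_entropies)   # population std dev
    threshold = mu + k * sigma
    return sum(1 for e in entropies if e >= threshold) / len(entropies)

chebyshev_bound = 1.0 / (1.0 + 2.0 ** 2)   # = 0.2, the 20% criterion
```

With k = 2 the bound evaluates to exactly 20%, matching the exceedance criterion used to flag the ~40% (glass FRP) and ~50% (carbon FRP) load levels reported above.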
Real-Time Privacy-Preserving Data Release for Smart Meters
Smart Meters (SMs) are a fundamental component of smart grids, but they carry sensitive information about users, such as the occupancy status of houses, and therefore have raised serious concerns about leakage of consumers' private information. In particular, we focus on real-time privacy threats, i.e., potential attackers that try to infer sensitive data from the data reported by SMs in an online fashion. We adopt an information-theoretic privacy measure and show that it effectively limits the performance of any real-time attacker. Using this privacy measure, we propose a general formulation to design a privatization mechanism that can provide a target level of privacy by adding a minimal amount of distortion to the SMs measurements. On the other hand, to cope with different applications, a flexible distortion measure is considered. This formulation leads to a general loss function, which is optimized using a deep learning adversarial framework, where two neural networks, referred to as the releaser and the adversary, are trained with opposite goals. An exhaustive empirical study is then performed to validate the performance of the proposed approach for the occupancy detection privacy problem, assuming the attacker has either limited or full access to the training dataset