Quantifying Differential Privacy in Continuous Data Release under Temporal Correlations
Differential Privacy (DP) has received increasing attention as a rigorous
privacy framework. Many existing studies employ traditional DP mechanisms
(e.g., the Laplace mechanism) as primitives to continuously release private
data for protecting privacy at each time point (i.e., event-level privacy).
These approaches assume that the data at different time points are independent,
or that adversaries have no knowledge of the correlations between data. However,
continuously generated data tend to be temporally correlated, and such
correlations can be acquired by adversaries. In this paper, we investigate the
potential privacy loss of a traditional DP mechanism under temporal
correlations. First, we analyze the privacy leakage of a DP mechanism under
temporal correlations that can be modeled by a Markov chain. Our analysis
reveals that the event-level privacy loss of a DP mechanism may
\textit{increase over time}. We call this unexpected privacy loss
\textit{temporal privacy leakage} (TPL). Although TPL may increase over time,
we find that its supremum may exist in some cases. Second, we design efficient
algorithms for calculating TPL. Third, we propose data releasing mechanisms
that convert any existing DP mechanism into one against TPL. Experiments
confirm that our approach is efficient and effective.
Comment: accepted in TKDE special issue "Best of ICDE 2017". arXiv admin note: substantial text overlap with arXiv:1610.0754
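The abstract above cites the Laplace mechanism as the traditional event-level DP primitive whose guarantee erodes under temporal correlations. A minimal sketch of that primitive (not the paper's TPL analysis or its conversion mechanisms) is a per-timestamp Laplace release of a count stream, where each timestamp independently receives epsilon-DP noise:

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Draw Laplace(0, scale) noise via inverse-CDF sampling.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def release_stream(true_counts, epsilon, sensitivity=1.0):
    # Event-level DP: each timestamp gets its own independent
    # epsilon-DP Laplace release (scale = sensitivity / epsilon).
    # Under temporal correlations this per-timestamp guarantee can
    # leak more than epsilon over time -- the TPL the paper quantifies.
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale) for c in true_counts]
```

The function names and the count-stream framing are illustrative assumptions; the paper's contribution is precisely that this independent-noise release is insufficient when the adversary knows the Markov correlation model.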
Measuring Membership Privacy on Aggregate Location Time-Series
While location data is extremely valuable for various applications,
disclosing it prompts serious threats to individuals' privacy. To limit such
concerns, organizations often provide analysts with aggregate time-series that
indicate, e.g., how many people are in a location at a time interval, rather
than raw individual traces. In this paper, we perform a measurement study to
understand Membership Inference Attacks (MIAs) on aggregate location
time-series, where an adversary tries to infer whether a specific user
contributed to the aggregates.
We find that the volume of contributed data, as well as the regularity and
particularity of users' mobility patterns, play a crucial role in the attack's
success. We experiment with a wide range of defenses based on generalization,
hiding, and perturbation, and evaluate their ability to thwart the attack
vis-a-vis the utility loss they introduce for various mobility analytics tasks.
Our results show that some defenses fail across the board, while others work
for specific tasks on aggregate location time-series. For instance, suppressing
small counts can be used for ranking hotspots, data generalization for
forecasting traffic, hotspot discovery, and map inference, while sampling is
effective for location labeling and anomaly detection when the dataset is
sparse. Differentially private techniques provide reasonable accuracy only in
very specific settings, e.g., discovering hotspots and forecasting their
traffic, and more so when using weaker privacy notions like crowd-blending
privacy. Overall, our measurements show that there does not exist a unique
generic defense that can preserve the utility of the analytics for arbitrary
applications, and provide useful insights regarding the disclosure of sanitized
aggregate location time-series.
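One of the defenses the study evaluates, suppressing small counts, can be sketched in a few lines: counts below a threshold are zeroed out before release, which the abstract reports is usable for ranking hotspots. The threshold value and function shape here are illustrative assumptions, not the paper's exact parameters:

```python
def suppress_small_counts(series, threshold=5):
    # Hiding-style defense: zero out aggregate counts below `threshold`
    # so that the presence of a few contributing users in a sparse
    # location/time cell is not disclosed. `threshold=5` is an
    # assumed value for illustration.
    return [c if c >= threshold else 0 for c in series]
```

For example, `suppress_small_counts([0, 3, 7, 12])` hides the cell with 3 users while leaving the larger aggregates intact, which preserves relative hotspot rankings.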
Entropy-based privacy against profiling of user mobility
Location-based services (LBSs) flood mobile phones nowadays, but their use poses an evident privacy risk. The locations accompanying the LBS queries can be exploited by the LBS provider to build a profile of the user's visited locations, which might disclose sensitive data, such as work or home locations. The classic concept of entropy is widely used to evaluate privacy in these scenarios, where the information is represented as a sequence of independent samples of categorized data. However, since LBS queries might be sent very frequently, location profiles can be improved by adding temporal dependencies, thus becoming mobility profiles, where location samples are no longer independent and might disclose the user's mobility patterns. Since the time dimension is factored in, the classic entropy concept falls short of evaluating the real privacy level, which also depends on the time component. Therefore, we propose to extend the entropy-based privacy metric to the entropy rate in order to evaluate mobility profiles. Then, two perturbative mechanisms are considered to protect locations and mobility profiles under gradual utility constraints. We further use the proposed privacy metric, comparing it to classic ones, to evaluate both synthetic and real mobility profiles when the proposed perturbative methods are applied. The results prove the usefulness of the proposed metric for mobility profiles and the need to tailor the perturbative methods to the features of mobility profiles in order to improve privacy without completely losing utility.
This work is partially supported by the Spanish Ministry of Science and Innovation through the CONSEQUENCE (TEC2010-20572-C02-01/02) and EMRISCO (TEC2013-47665-C4-4-R) projects. The work of Das was partially supported by NSF Grants IIS-1404673, CNS-1355505, CNS-1404677 and DGE-1433659.
Part of the work by Rodriguez-Carrion was conducted while she was visiting the Computer Science Department at Missouri University of Science and Technology in 2013–2014.
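The metric the abstract proposes, the entropy rate of a mobility profile, has a standard closed form for a first-order Markov chain: H = -Σᵢ πᵢ Σⱼ Pᵢⱼ log₂ Pᵢⱼ, where π is the stationary distribution of the transition matrix P. A minimal sketch contrasting it with the classic (memoryless) entropy, using only the standard formula rather than the paper's specific estimator:

```python
import math

def entropy(p):
    # Classic Shannon entropy (bits) of a location distribution,
    # treating samples as independent.
    return -sum(x * math.log2(x) for x in p if x > 0)

def stationary(P, iters=1000):
    # Stationary distribution pi = pi P by power iteration
    # (assumes an ergodic chain).
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def entropy_rate(P):
    # H = -sum_i pi_i sum_j P_ij log2 P_ij for a first-order Markov chain.
    pi = stationary(P)
    n = len(P)
    return -sum(pi[i] * P[i][j] * math.log2(P[i][j])
                for i in range(n) for j in range(n) if P[i][j] > 0)
```

For a sticky two-location chain such as P = [[0.9, 0.1], [0.1, 0.9]], the classic entropy of the stationary profile is 1 bit, while the entropy rate is only about 0.47 bits: the temporal dependencies make the user more predictable, which is exactly the privacy gap the classic metric misses.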
Health privacy: methods for privacy-preserving data sharing of methylation, microbiome and eye tracking data
This thesis studies the privacy risks of biomedical data and develops mechanisms for privacy-preserving data sharing. The contribution of this work is two-fold. First, we demonstrate privacy risks for a variety of biomedical data types, such as DNA methylation data, microbiome data and eye tracking data. Despite these data being less stable than well-studied genome data and more prone to environmental changes, well-known privacy attacks can be adapted to them and threaten the privacy of data donors. Nevertheless, data sharing is crucial to advance biomedical research, given that collecting the data of a sufficiently large population is complex and costly. Therefore, as a second step, we equip researchers with privacy-preserving tools that enable them to share such biomedical data. These tools are mostly based on differential privacy, machine learning techniques and adversarial examples, and are carefully tuned to the concrete use case to maintain data utility while preserving privacy.
Protecting Locations with Differential Privacy under Temporal Correlations
Concerns on location privacy frequently arise with the rapid development of
GPS enabled devices and location-based applications. While spatial
transformation techniques such as location perturbation or generalization have
been studied extensively, most techniques rely on syntactic privacy models
without rigorous privacy guarantee. Many of them only consider static scenarios
or perturb the location at single timestamps without considering temporal
correlations of a moving user's locations, and hence are vulnerable to various
inference attacks. While differential privacy has been accepted as a standard
for privacy protection, applying differential privacy in location based
applications presents new challenges, as the protection needs to be enforced on
the fly for a single user and needs to incorporate temporal correlations
between a user's locations.
In this paper, we propose a systematic solution to preserve location privacy
with rigorous privacy guarantee. First, we propose a new definition,
"δ-location set" based differential privacy, to account for the temporal
correlations in location data. Second, we show that the well-known
ℓ1-norm sensitivity fails to capture the geometric sensitivity in
multidimensional space and propose a new notion, sensitivity hull, based on
which the error of differential privacy is bounded. Third, to obtain the
optimal utility we present a planar isotropic mechanism (PIM) for location
perturbation, which is the first mechanism achieving the lower bound of
differential privacy. Experiments on real-world datasets also demonstrate that
PIM significantly outperforms baseline approaches in data utility.
Comment: Final version Nov-04-201
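One plausible reading of the "δ-location set" idea, protecting the true location only among the set of locations that are probable given the temporal correlations, can be sketched as follows. This is an illustrative assumption about the construction, not the paper's exact algorithm, and the function name, posterior representation, and δ value are all hypothetical:

```python
def delta_location_set(posterior, delta=0.05):
    # Hypothetical sketch: given a posterior probability over candidate
    # locations (inferred from the user's previous, correlated locations),
    # keep the most probable locations until their cumulative probability
    # reaches 1 - delta. A perturbation mechanism would then only need to
    # hide the true location within this plausible set.
    ranked = sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for loc, p in ranked:
        kept.append(loc)
        mass += p
        if mass >= 1.0 - delta:
            break
    return set(kept)
```

With a posterior of {a: 0.6, b: 0.3, c: 0.08, d: 0.02} and δ = 0.05, the set {a, b, c} already covers 98% of the probability mass, so the highly implausible location d can be excluded from the protection domain, which is what lets noise be calibrated to the geometry of the remaining set rather than the whole map.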