Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models
Motivated by a real-life problem of sharing social network data that contain
sensitive personal information, we propose a novel approach to release and
analyze synthetic graphs in order to protect privacy of individual
relationships captured by the social network while maintaining the validity of
statistical results. A case study using a version of the Enron e-mail corpus
dataset demonstrates the application and usefulness of the proposed techniques
in solving the challenging problem of maintaining privacy and supporting
open access to network data to ensure reproducibility of existing studies and
discovering new scientific insights that can be obtained by analyzing such
data. We use a simple yet effective randomized response mechanism to generate
synthetic networks under \epsilon-edge differential privacy, and then use
likelihood-based inference for missing data and Markov chain Monte Carlo
techniques to fit exponential-family random graph models to the generated
synthetic networks. Comment: Updated, 39 pages
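The randomized response step mentioned in this abstract can be illustrated with a minimal sketch. It assumes an undirected graph stored as a 0/1 adjacency matrix and uses one common parameterization, in which each dyad keeps its true value with probability e^epsilon / (1 + e^epsilon); the paper's exact mechanism may differ. The function name `randomized_response_graph` and its parameters are hypothetical.

```python
import math
import random


def randomized_response_graph(adj, epsilon, rng=None):
    """Release a noisy copy of an undirected 0/1 adjacency matrix.

    Assumption: every dyad (i, j) with i < j is treated independently and
    keeps its true value with probability e^eps / (1 + e^eps); otherwise
    it is flipped. The keep/flip probability ratio is e^eps, which is the
    bound required for eps-edge differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    n = len(adj)
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    noisy = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            bit = adj[i][j]
            if rng.random() >= keep:  # flip with probability 1 - keep
                bit = 1 - bit
            noisy[i][j] = noisy[j][i] = bit  # preserve symmetry
    return noisy


# Illustrative input: a 3-node path graph.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
synthetic = randomized_response_graph(path, epsilon=1.0, rng=random.Random(7))
```

Smaller values of epsilon flip more dyads, so any model fitted to the synthetic graph has to account for the known flipping probability, which is the role of the likelihood-based inference described in the abstract.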
Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. Differential privacy has recently emerged
as the de facto standard for private data analysis due to its provable
privacy guarantee. In this paper we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques on differentially private discovery of
frequent itemsets cannot be applied to mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling
based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantee of our algorithm and
propose an efficient neighboring pattern counting technique as well.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision.
Who Tracks Who? A Surveillance Capitalist Examination of Commercial Bluetooth Tracking Networks
Object and person tracking networks powered by Bluetooth and mobile devices
have become increasingly popular for purposes of public safety and individual
concerns. This essay examines popular commercial tracking networks and their
campaigns from Apple, Samsung and Tile with reference to surveillance
capitalism and digital privacy, discovering the hidden assets commodified
through said networks, and their potential of turning users into unregulated
digital labour while leaving individual privacy at risk. Comment: 14 pages
Measuring Membership Privacy on Aggregate Location Time-Series
While location data is extremely valuable for various applications,
disclosing it prompts serious threats to individuals' privacy. To limit such
concerns, organizations often provide analysts with aggregate time-series that
indicate, e.g., how many people are in a location at a time interval, rather
than raw individual traces. In this paper, we perform a measurement study to
understand Membership Inference Attacks (MIAs) on aggregate location
time-series, where an adversary tries to infer whether a specific user
contributed to the aggregates.
We find that the volume of contributed data, as well as the regularity and
particularity of users' mobility patterns, play a crucial role in the attack's
success. We experiment with a wide range of defenses based on generalization,
hiding, and perturbation, and evaluate their ability to thwart the attack
vis-a-vis the utility loss they introduce for various mobility analytics tasks.
Our results show that some defenses fail across the board, while others work
for specific tasks on aggregate location time-series. For instance, suppressing
small counts can be used for ranking hotspots, data generalization for
forecasting traffic, hotspot discovery, and map inference, while sampling is
effective for location labeling and anomaly detection when the dataset is
sparse. Differentially private techniques provide reasonable accuracy only in
very specific settings, e.g., discovering hotspots and forecasting their
traffic, and more so when using weaker privacy notions like crowd-blending
privacy. Overall, our measurements show that there does not exist a unique
generic defense that can preserve the utility of the analytics for arbitrary
applications, and provide useful insights regarding the disclosure of sanitized
aggregate location time-series.
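One of the defenses named in this abstract, suppressing small counts, can be sketched as follows. The data layout (a dict mapping a location to its per-interval counts), the threshold, and the choice to replace suppressed counts with zero are all assumptions made for illustration; the abstract does not specify them.

```python
def suppress_small_counts(series, threshold):
    """Zero out every count below `threshold` (hypothetical parameter).

    `series` maps a location id to a list of per-interval visitor counts.
    """
    return {
        loc: [c if c >= threshold else 0 for c in counts]
        for loc, counts in series.items()
    }


def rank_hotspots(series):
    """Rank locations by total count, busiest first."""
    return sorted(series, key=lambda loc: sum(series[loc]), reverse=True)


# Illustrative aggregates for three locations over three intervals.
raw = {"A": [120, 95, 3], "B": [2, 1, 4], "C": [40, 0, 55]}
released = suppress_small_counts(raw, threshold=5)
```

In this toy input the hotspot ranking is the same before and after suppression, which is consistent with the abstract's remark that the defense suits ranking tasks; it says nothing about how well the defense resists membership inference.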
Review on Present State-of-the-Art of Secure and Privacy Preserving Data Mining Techniques
As people from every walk of life use the Internet for various purposes, there is growing evidence of the proliferation of sensitive information. The security and privacy of data have become an important concern. For this reason, privacy preserving data mining (PPDM) has been an active research area. PPDM is the process of discovering knowledge from voluminous data while protecting sensitive information. In this paper we explore the present state-of-the-art of secure and privacy preserving data mining algorithms and techniques, which will help in real-world usage of enterprise applications. The techniques discussed include the randomized method, k-Anonymity, l-Diversity, t-Closeness, m-Privacy and other PPDM approaches. This paper also focuses on SQL injection attacks and prevention measures. The paper provides research insights into secure and privacy preserving data mining techniques and algorithms, and presents gaps in the research that can be used to plan future work.
The effectiveness of backward contact tracing in networks
Discovering and isolating infected individuals is a cornerstone of epidemic
control. Because many infectious diseases spread through close contacts,
contact tracing is a key tool for case discovery and control. However, although
contact tracing has been performed widely, the mathematical understanding of
contact tracing has not been fully established and it has not been clearly
understood what determines the efficacy of contact tracing. Here, we reveal
that, compared with "forward" tracing (tracing to whom disease spreads),
"backward" tracing (tracing from whom disease spreads) is profoundly more
effective. The effectiveness of backward tracing is due to simple but
overlooked biases arising from the heterogeneity in contacts. Using simulations
on both synthetic and high-resolution empirical contact datasets, we show that
even at a small probability of detecting infected individuals, strategically
executed contact tracing can prevent a significant fraction of further
transmissions. We also show that, in terms of the number of prevented
transmissions per isolation, case isolation combined with a small amount of
contact tracing is more efficient than case isolation alone. By demonstrating
that backward contact tracing is highly effective at discovering
super-spreading events, we argue that the potential effectiveness of contact
tracing has been underestimated. Therefore, there is a critical need for
revisiting current contact tracing strategies so that they leverage all forms
of biases. Our results also have important consequences for digital contact
tracing because it will be crucial to incorporate the capability for backward
and deep tracing while adhering to the privacy-preserving requirements of these
new platforms. Comment: 15 pages, 4 figures
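A well-known example of a bias arising from heterogeneous contacts is the friendship paradox: a person reached by following a contact has, on average, more contacts than a randomly chosen person. The sketch below computes both averages from a degree list. It assumes the traced person is reached through a uniformly random edge end, which is a simplification and not the paper's full model; the function names are hypothetical.

```python
def mean_degree(degrees):
    """Average number of contacts of a uniformly random individual."""
    return sum(degrees) / len(degrees)


def mean_neighbor_degree(degrees):
    """Average number of contacts of an individual reached via a random edge.

    Sampling by edge weights each individual by degree, which gives
    <k^2> / <k>; this is never smaller than <k>.
    """
    return sum(k * k for k in degrees) / sum(degrees)


# Illustrative heterogeneous population: four low-contact people, one hub.
degrees = [1, 1, 1, 1, 10]
print(mean_degree(degrees), mean_neighbor_degree(degrees))
```

The more heterogeneous the contact distribution, the larger the gap between the two averages, which is one way to see why tracing along contacts tends to land on highly connected individuals.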
DP-LTOD: Differential Privacy Latent Trajectory Community Discovering Services over Location-Based Social Networks
Community detection for Location-based Social Networks (LBSNs) has received great attention, mainly in the field of large-scale Wireless Communication Networks. In this paper, we present a Differential Privacy Latent Trajectory cOmmunity Discovering (DP-LTOD) scheme, which obfuscates original trajectory sequences into differential privacy-guaranteed trajectory sequences for trajectory privacy preservation, and discovers latent trajectory communities by clustering the uploaded trajectory sequences. Different from traditional trajectory privacy-preserving methods, we first partition the original trajectory sequence into different segments. Then, suitable locations and segments are selected to constitute the obfuscated trajectory sequence. Specifically, we formulate the trajectory obfuscation problem as selecting an optimal trajectory sequence that has the smallest difference from the original trajectory sequence. In order to prevent privacy leakage, we add Laplace noise and exponential noise to the outputs during the stages of location obfuscation matrix generation and trajectory sequence function generation, respectively. Through formal privacy analysis, we prove that the DP-LTOD scheme guarantees \epsilon-differential privacy. Moreover, we develop a trajectory clustering algorithm to classify the trajectories into different kinds of clusters according to semantic distance and geographical distance. Extensive experiments on two real-world datasets illustrate that our DP-LTOD scheme can not only discover latent trajectory communities, but also protect user privacy from leaking.
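The Laplace noise mentioned in this abstract is typically added through the generic Laplace mechanism, sketched below for a single numeric value. This is not the DP-LTOD construction itself: the function names, the scalar input, and the `sensitivity` parameter are assumptions for illustration. The noise is drawn as the difference of two exponential variates, which follows a Laplace distribution with the given scale.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Return `value` plus Laplace noise with scale sensitivity / epsilon.

    `sensitivity` is the assumed maximum change in `value` when one
    record changes; it must be supplied by the caller.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    return value + laplace_noise(sensitivity / epsilon, rng)


# Illustrative use: perturb a visit count of 10 with sensitivity 1.
noisy_count = laplace_mechanism(10.0, sensitivity=1.0, epsilon=0.5)
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy.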
How Far Removed Are You? Scalable Privacy-Preserving Estimation of Social Path Length with Social PaL
Social relationships are a natural basis on which humans make trust
decisions. Online Social Networks (OSNs) are increasingly often used to let
users base trust decisions on the existence and the strength of social
relationships. While most OSNs allow users to discover the length of the social
path to other users, they do so in a centralized way, thus requiring them to
rely on the service provider and reveal their interest in each other. This
paper presents Social PaL, a system supporting the privacy-preserving discovery
of arbitrary-length social paths between any two social network users. We
overcome the bootstrapping problem encountered in all related prior work,
demonstrating that Social PaL allows its users to find all paths of length two
and to discover a significant fraction of longer paths, even when only a small
fraction of OSN users is in the Social PaL system - e.g., discovering 70% of
all paths with only 40% of the users. We implement Social PaL using a scalable
server-side architecture and a modular Android client library, allowing
developers to seamlessly integrate it into their apps. Comment: A preliminary version of this paper appears in ACM WiSec 2015. This
is the full version.