114,114 research outputs found

    Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models

    Motivated by a real-life problem of sharing social network data that contain sensitive personal information, we propose a novel approach to release and analyze synthetic graphs in order to protect privacy of individual relationships captured by the social network while maintaining the validity of statistical results. A case study using a version of the Enron e-mail corpus dataset demonstrates the application and usefulness of the proposed techniques in solving the challenging problem of maintaining privacy and supporting open access to network data to ensure reproducibility of existing studies and discovering new scientific insights that can be obtained by analyzing such data. We use a simple yet effective randomized response mechanism to generate synthetic networks under ε-edge differential privacy, and then use likelihood based inference for missing data and Markov chain Monte Carlo techniques to fit exponential-family random graph models to the generated synthetic networks.
    Comment: Updated, 39 pages
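
    The randomized response step mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes an undirected graph stored as a 0/1 adjacency matrix and a single symmetric flip probability of 1/(1 + e^ε) for every dyad; the abstract does not state the mechanism's exact parameters, so the function name `randomized_response_graph` and this choice of probability are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def randomized_response_graph(adj, epsilon, rng=None):
    """Return a synthetic undirected graph built by per-dyad randomized response.

    Assumption (not from the abstract): every dyad is flipped independently
    with probability 1 / (1 + e^epsilon), the standard symmetric choice that
    gives epsilon-edge differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    n = len(adj)
    flip_prob = 1.0 / (1.0 + math.exp(epsilon))
    synthetic = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            bit = adj[i][j]
            # Flip the edge indicator with probability flip_prob.
            if rng.random() < flip_prob:
                bit = 1 - bit
            # Keep the released matrix symmetric with a zero diagonal.
            synthetic[i][j] = synthetic[j][i] = bit
    return synthetic
```

    A smaller ε flips more dyads (more privacy, less fidelity); as ε grows, the flip probability approaches zero and the synthetic graph approaches the original.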

    Mining Frequent Graph Patterns with Differential Privacy

    Discovering frequent graph patterns in a graph database offers valuable information in a variety of applications. However, if the graph dataset contains sensitive data of individuals such as mobile phone-call graphs and web-click graphs, releasing discovered frequent patterns may present a threat to the privacy of individuals. Differential privacy has recently emerged as the de facto standard for private data analysis due to its provable privacy guarantee. In this paper we propose the first differentially private algorithm for mining frequent graph patterns. We first show that previous techniques on differentially private discovery of frequent itemsets cannot apply in mining frequent graph patterns due to the inherent complexity of handling structural information in graphs. We then address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling based algorithm. Unlike previous work on frequent itemset mining, our techniques do not rely on the output of a non-private mining algorithm. Instead, we observe that both frequent graph pattern mining and the guarantee of differential privacy can be unified into an MCMC sampling framework. In addition, we establish the privacy and utility guarantee of our algorithm and propose an efficient neighboring pattern counting technique as well. Experimental results show that the proposed algorithm is able to output frequent patterns with good precision.

    Who Tracks Who? A Surveillance Capitalist Examination of Commercial Bluetooth Tracking Networks

    Object and person tracking networks powered by Bluetooth and mobile devices have become increasingly popular for purposes of public safety and individual concerns. This essay examines popular commercial tracking networks and their campaigns from Apple, Samsung and Tile with reference to surveillance capitalism and digital privacy, discovering the hidden assets commodified through said networks, and their potential of turning users into unregulated digital labour while leaving individual privacy at risk.
    Comment: 14 pages

    Measuring Membership Privacy on Aggregate Location Time-Series

    While location data is extremely valuable for various applications, disclosing it prompts serious threats to individuals' privacy. To limit such concerns, organizations often provide analysts with aggregate time-series that indicate, e.g., how many people are in a location at a time interval, rather than raw individual traces. In this paper, we perform a measurement study to understand Membership Inference Attacks (MIAs) on aggregate location time-series, where an adversary tries to infer whether a specific user contributed to the aggregates. We find that the volume of contributed data, as well as the regularity and particularity of users' mobility patterns, play a crucial role in the attack's success. We experiment with a wide range of defenses based on generalization, hiding, and perturbation, and evaluate their ability to thwart the attack vis-a-vis the utility loss they introduce for various mobility analytics tasks. Our results show that some defenses fail across the board, while others work for specific tasks on aggregate location time-series. For instance, suppressing small counts can be used for ranking hotspots, data generalization for forecasting traffic, hotspot discovery, and map inference, while sampling is effective for location labeling and anomaly detection when the dataset is sparse. Differentially private techniques provide reasonable accuracy only in very specific settings, e.g., discovering hotspots and forecasting their traffic, and more so when using weaker privacy notions like crowd-blending privacy. Overall, our measurements show that there does not exist a unique generic defense that can preserve the utility of the analytics for arbitrary applications, and provide useful insights regarding the disclosure of sanitized aggregate location time-series.

    Review on Present State-of-the-Art of Secure and Privacy Preserving Data Mining Techniques

    As people from every walk of life use the Internet for various purposes, there is growing evidence of the proliferation of sensitive information. Security and privacy of data have become an important concern. For this reason, privacy preserving data mining (PPDM) has been an active research area. PPDM is the process of discovering knowledge from voluminous data while protecting sensitive information. In this paper we explore the present state-of-the-art of secure and privacy preserving data mining algorithms and techniques, which will help in real-world usage of enterprise applications. The techniques discussed include the randomized method, k-Anonymity, l-Diversity, t-Closeness, m-Privacy and other PPDM approaches. This paper also focuses on SQL injection attacks and prevention measures. The paper provides research insights into secure and privacy preserving data mining techniques and algorithms, and presents gaps in the research that can be used to plan future work.

    The effectiveness of backward contact tracing in networks

    Discovering and isolating infected individuals is a cornerstone of epidemic control. Because many infectious diseases spread through close contacts, contact tracing is a key tool for case discovery and control. However, although contact tracing has been performed widely, the mathematical understanding of contact tracing has not been fully established and it has not been clearly understood what determines the efficacy of contact tracing. Here, we reveal that, compared with "forward" tracing---tracing to whom disease spreads, "backward" tracing---tracing from whom disease spreads---is profoundly more effective. The effectiveness of backward tracing is due to simple but overlooked biases arising from the heterogeneity in contacts. Using simulations on both synthetic and high-resolution empirical contact datasets, we show that even at a small probability of detecting infected individuals, strategically executed contact tracing can prevent a significant fraction of further transmissions. We also show that---in terms of the number of prevented transmissions per isolation---case isolation combined with a small amount of contact tracing is more efficient than case isolation alone. By demonstrating that backward contact tracing is highly effective at discovering super-spreading events, we argue that the potential effectiveness of contact tracing has been underestimated. Therefore, there is a critical need for revisiting current contact tracing strategies so that they leverage all forms of biases. Our results also have important consequences for digital contact tracing because it will be crucial to incorporate the capability for backward and deep tracing while adhering to the privacy-preserving requirements of these new platforms.
    Comment: 15 pages, 4 figures
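
    One well-known bias of the kind this abstract alludes to is degree-biased sampling: a person reached by following a contact has, on average, more contacts than a person picked uniformly at random. The Python sketch below computes both averages for a given degree sequence. Treating this "friendship paradox" effect as representative of the paper's biases is an assumption made for illustration; the abstract does not spell out its analysis, and the function names are hypothetical.

```python
def mean_degree(degrees):
    """Average number of contacts of a uniformly chosen individual."""
    return sum(degrees) / len(degrees)


def mean_degree_via_contact(degrees):
    """Average number of contacts of an individual reached along a random contact.

    An individual with k contacts is k times as likely to be reached,
    so the average is sum(k^2) / sum(k).
    """
    return sum(k * k for k in degrees) / sum(degrees)


# Hypothetical example: one hub with four contacts, each of whom knows only the hub.
star = [4, 1, 1, 1, 1]
uniform_avg = mean_degree(star)              # 8 / 5
contact_avg = mean_degree_via_contact(star)  # 20 / 8
```

    In this example, following a contact leads to the hub half of the time, which is why tracing along contacts tends to surface highly connected individuals.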

    DP-LTOD: Differential Privacy Latent Trajectory Community Discovering Services over Location-Based Social Networks

    Community detection for Location-based Social Networks (LBSNs) has received great attention, mainly in the field of large-scale Wireless Communication Networks. In this paper, we present a Differential Privacy Latent Trajectory cOmmunity Discovering (DP-LTOD) scheme, which obfuscates original trajectory sequences into differential privacy-guaranteed trajectory sequences for trajectory privacy preservation, and discovers latent trajectory communities by clustering the uploaded trajectory sequences. Unlike traditional trajectory privacy-preserving methods, we first partition the original trajectory sequence into different segments. Then, suitable locations and segments are selected to constitute the obfuscated trajectory sequence. Specifically, we formulate the trajectory obfuscation problem as selecting an optimal trajectory sequence that has the smallest difference from the original trajectory sequence. In order to prevent privacy leakage, we add Laplace noise and exponential noise to the outputs during the stages of location obfuscation matrix generation and trajectory sequence function generation, respectively. Through formal privacy analysis, we prove that the DP-LTOD scheme guarantees ε-differential privacy. Moreover, we develop a trajectory clustering algorithm to classify the trajectories into different kinds of clusters according to semantic distance and geographical distance. Extensive experiments on two real-world datasets illustrate that our DP-LTOD scheme can not only discover latent trajectory communities, but also protect user privacy from leaking.
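
    The Laplace noise mentioned in this abstract is commonly added through the generic Laplace mechanism, sketched below in Python. This is not the DP-LTOD scheme itself: the abstract does not describe how its obfuscation matrix or sensitivities are defined, so the function names and the `sensitivity` parameter here are assumptions for illustration only.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Release `value` with Laplace noise of scale sensitivity / epsilon.

    `sensitivity` is the largest change in `value` that a single
    individual's data can cause (an assumed input, not from the abstract).
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    return value + laplace_noise(sensitivity / epsilon, rng)
```

    For a counting query, where one individual changes the result by at most 1, a call such as `laplace_mechanism(count, 1.0, 0.5)` would add noise of scale 2.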

    How Far Removed Are You? Scalable Privacy-Preserving Estimation of Social Path Length with Social PaL

    Social relationships are a natural basis on which humans make trust decisions. Online Social Networks (OSNs) are increasingly often used to let users base trust decisions on the existence and the strength of social relationships. While most OSNs allow users to discover the length of the social path to other users, they do so in a centralized way, thus requiring them to rely on the service provider and reveal their interest in each other. This paper presents Social PaL, a system supporting the privacy-preserving discovery of arbitrary-length social paths between any two social network users. We overcome the bootstrapping problem encountered in all related prior work, demonstrating that Social PaL allows its users to find all paths of length two and to discover a significant fraction of longer paths, even when only a small fraction of OSN users is in the Social PaL system - e.g., discovering 70% of all paths with only 40% of the users. We implement Social PaL using a scalable server-side architecture and a modular Android client library, allowing developers to seamlessly integrate it into their apps.
    Comment: A preliminary version of this paper appears in ACM WiSec 2015. This is the full version.