Search CORE

356 research outputs found

Anonymizing Social Graphs via Uncertainty Semantics

Author: Imine Abdessamad
Nguyen Hiep H.
Rusinowitch Michaël
Publication venue
Publication date: 06/08/2014
Field of study

Rather than anonymizing social graphs by generalizing them to super nodes/edges or adding/removing nodes and edges to satisfy given privacy parameters, recent methods exploit the semantics of uncertain graphs to achieve privacy protection of participating entities and their relationship. These techniques anonymize a deterministic graph by converting it into an uncertain form. In this paper, we propose a generalized obfuscation model based on uncertain adjacency matrices that keep expected node degrees equal to those in the unanonymized graph. We analyze two recently proposed schemes and show their fitting into the model. We also point out disadvantages in each method and present several elegant techniques to fill the gap between them. Finally, to support fair comparisons, we develop a new tradeoff quantifying framework by leveraging the concept of incorrectness in location privacy research. Experiments on large social graphs demonstrate the effectiveness of our schemes

arXiv.org e-Print Archive

HAL - Université de Franche-Comté

Crossref

INRIA a CCSD electronic archive server

A Maximum Variance Approach for Graph Anonymization

Author: Imine Abdessamad
Nguyen Huu Hiep
Rusinowitch Michael
Publication venue: HAL CCSD
Publication date: 03/11/2014
Field of study

Best Paper AwardInternational audienceUncertain graphs, a form of uncertain data, have recently attracted a lot of attention as they can represent inherent uncertainty in collected data. The uncertain graphs pose challenges to conventional data processing techniques and open new research directions. Going in the reserve direction, this paper focuses on the problem of anonymizing a deterministic graph by converting it into an uncertain form. The paper first analyzes drawbacks in a recent uncertainty-based anonymization scheme and then proposes Maximum Variance, a novel approach that provides better tradeoff between privacy and utility. Towards a fair com-parison between the anonymization schemes on graphs, the second con-tribution of this paper is to describe a quantifying framework for graph anonymization by assessing privacy and utility scores of typical schemes in a unified space. The extensive experiments show the effectiveness and efficiency of Maximum Variance on three large real graphs

HAL - Université de Franche-Comté

INRIA a CCSD electronic archive server

Kronecker Graphs: An Approach to Modeling Networks

Author: Chakrabarti D
Faloutsos C
Ghahramani Z
Kleinberg J
Leskovec J
Publication venue
Publication date: 21/08/2009
Field of study

How can we model networks with a mathematically tractable model that allows for rigorous analysis of network properties? Networks exhibit a long list of surprising properties: heavy tails for the degree distribution; small diameters; and densification and shrinking diameters over time. Most present network models either fail to match several of the above properties, are complicated to analyze mathematically, or both. In this paper we propose a generative model for networks that is both mathematically tractable and can generate networks that have the above mentioned properties. Our main idea is to use the Kronecker product to generate graphs that we refer to as "Kronecker graphs". First, we prove that Kronecker graphs naturally obey common network properties. We also provide empirical evidence showing that Kronecker graphs can effectively model the structure of real networks. We then present KronFit, a fast and scalable algorithm for fitting the Kronecker graph generation model to large real networks. A naive approach to fitting would take super- exponential time. In contrast, KronFit takes linear time, by exploiting the structure of Kronecker matrix multiplication and by using statistical simulation techniques. Experiments on large real and synthetic networks show that KronFit finds accurate parameters that indeed very well mimic the properties of target networks. Once fitted, the model parameters can be used to gain insights about the network structure, and the resulting synthetic graphs can be used for null- models, anonymization, extrapolations, and graph summarization

arXiv.org e-Print Archive

CiteSeerX

CUED - Cambridge University Engineering Department

Privacy preserving linkage and sharing of sensitive data

Author: Lazrig Ibrahim Meftah
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2018
Field of study

2018 Summer.Includes bibliographical references.Sensitive data, such as personal and business information, is collected by many service providers nowadays. This data is considered as a rich source of information for research purposes that could benet individuals, researchers and service providers. However, because of the sensitivity of such data, privacy concerns, legislations, and con ict of interests, data holders are reluctant to share their data with others. Data holders typically lter out or obliterate privacy related sensitive information from their data before sharing it, which limits the utility of this data and aects the accuracy of research. Such practice will protect individuals' privacy; however it prevents researchers from linking records belonging to the same individual across dierent sources. This is commonly referred to as record linkage problem by the healthcare industry. In this dissertation, our main focus is on designing and implementing ecient privacy preserving methods that will encourage sensitive information sources to share their data with researchers without compromising the privacy of the clients or aecting the quality of the research data. The proposed solution should be scalable and ecient for real-world deploy- ments and provide good privacy assurance. While this problem has been investigated before, most of the proposed solutions were either considered as partial solutions, not accurate, or impractical, and therefore subject to further improvements. We have identied several issues and limitations in the state of the art solutions and provided a number of contributions that improve upon existing solutions. Our rst contribution is the design of privacy preserving record linkage protocol using semi-trusted third party. The protocol allows a set of data publishers (data holders) who compete with each other, to share sensitive information with subscribers (researchers) while preserving the privacy of their clients and without sharing encryption keys. Our second contribution is the design and implementation of a probabilistic privacy preserving record linkage protocol, that accommodates discrepancies and errors in the data such as typos. This work builds upon the previous work by linking the records that are similar, where the similarity range is formally dened. Our third contribution is a protocol that performs information integration and sharing without third party services. We use garbled circuits secure computation to design and build a system to perform the record linkages between two parties without sharing their data. Our design uses Bloom lters as inputs to the garbled circuits and performs a probabilistic record linkage using the Dice coecient similarity measure. As garbled circuits are known for their expensive computations, we propose new approaches that reduce the computation overhead needed, to achieve a given level of privacy. We built a scalable record linkage system using garbled circuits, that could be deployed in a distributed computation environment like the cloud, and evaluated its security and performance. One of the performance issues for linking large datasets is the amount of secure computation to compare every pair of records across the linked datasets to nd all possible record matches. To reduce the amount of computations a method, known as blocking, is used to lter out as much as possible of the record pairs that will not match, and limit the comparison to a subset of the record pairs (called can- didate pairs) that possibly match. Most of the current blocking methods either require the parties to share blocking keys (called blocks identiers), extracted from the domain of some record attributes (termed blocking variables), or share reference data points to group their records around these points using some similarity measures. Though these methods reduce the computation substantially, they leak too much information about the records within each block. Toward this end, we proposed a novel privacy preserving approximate blocking scheme that allows parties to generate the list of candidate pairs with high accuracy, while protecting the privacy of the records in each block. Our scheme is congurable such that the level of performance and accuracy could be achieved according to the required level of privacy. We analyzed the accuracy and privacy of our scheme, implemented a prototype of the scheme, and experimentally evaluated its accuracy and performance against dierent levels of privacy

Mountain Scholar (Digital Collections of Colorado and Wyoming)

k-Anonymity on Graphs using the Szemerédi Regularity Lemma

Author: Minello G
Rossi L
Torsello A
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

Graph anonymisation aims at reducing the ability of an attacker to identify the nodes of a graph by obfuscating its structural information. In k-anonymity, this means making each node indistinguishable from at least other k-1 nodes. Simply stripping the nodes of a graph of their identifying label is insufficient, as with enough structural knowledge an attacker can still recover the nodes identities. We propose an algorithm to enforce k-anonymity based on the Szemerédi regularity lemma. Given a graph, we start by computing a regular partition of its nodes. The Szemerédi regularity lemma ensures that such a partition exists and that the edges between the sets of nodes behave quasi-randomly. With this partition to hand, we anonymize the graph by randomizing the edges within each set, obtaining a graph that is structurally similar to the original one yet the nodes within each set are structurally indistinguishable. Unlike other k-anonymisation methods, our approach does not consider a single type of attack, but instead it aims to prevent any structure-based de-anonymisation attempt. We test our framework on a wide range of real-world networks and we compare it against another simple yet widely used k-anonymisation technique demonstrating the effectiveness of our approach

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Queen Mary Research Online

A Comprehensive Bibliometric Analysis on Social Network Anonymization: Current Approaches and Future Directions

Author: Gandomi Amir H.
Gharoun Hassan
Khorshidi Mohammad Sadegh
Rakhshaninejad Morteza
Yazdanjouei Hossein
Yazdanjue Navid
Publication venue
Publication date: 24/07/2023
Field of study

In recent decades, social network anonymization has become a crucial research field due to its pivotal role in preserving users' privacy. However, the high diversity of approaches introduced in relevant studies poses a challenge to gaining a profound understanding of the field. In response to this, the current study presents an exhaustive and well-structured bibliometric analysis of the social network anonymization field. To begin our research, related studies from the period of 2007-2022 were collected from the Scopus Database then pre-processed. Following this, the VOSviewer was used to visualize the network of authors' keywords. Subsequently, extensive statistical and network analyses were performed to identify the most prominent keywords and trending topics. Additionally, the application of co-word analysis through SciMAT and the Alluvial diagram allowed us to explore the themes of social network anonymization and scrutinize their evolution over time. These analyses culminated in an innovative taxonomy of the existing approaches and anticipation of potential trends in this domain. To the best of our knowledge, this is the first bibliometric analysis in the social network anonymization field, which offers a deeper understanding of the current state and an insightful roadmap for future research in this domain.Comment: 73 pages, 28 figure

arXiv.org e-Print Archive

Extending the Exposure Score of Web Browsers by Incorporating CVSS

Author: Mohammad Lena
Mohsen Fadi
Naser Riham
Shtayyeh Adel
Struijk Marten
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022
Field of study

When browsing the Internet, HTTP headers enable both clients and servers send extra data in their requests or responses such as the User-Agent string. This string contains information related to the sender’s device, browser, and operating system. Yet its content differs from one browser to another. Despite the privacy and security risks of User-Agent strings, very few works have tackled this problem. Our previous work proposed giving Internet browsers exposure relative scores to aid users to choose less intrusive ones. Thus, the objective of this work is to extend our previous work through: first, conducting a user study to identify its limitations. Second, extending the exposure score via incorporating data from the NVD. Third, providing a full implementation, instead of a limited prototype. The proposed system: assigns scores to users’ browsers upon visiting our website. It also suggests alternative safe browsers, and finally it allows updating the back-end database with a click of a button. We applied our method to a data set of more than 52 thousand unique browsers. Our performance and validation analysis show that our solution is accurate and efficient. The source code and data set are publicly available here [4].</p

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Differentially Private Mixture of Generative Neural Networks

Author: Acs Gergely
Castelluccia Claude
De Cristofaro Emiliano
Melis Luca
Publication venue
Publication date: 18/11/2017
Field of study

Generative models are used in a wide range of applications building on large amounts of contextually rich information. Due to possible privacy violations of the individuals whose data is used to train these models, however, publishing or sharing generative models is not always viable. In this paper, we present a novel technique for privately releasing generative models and entire high-dimensional datasets produced by these models. We model the generator distribution of the training data with a mixture of

k

generative neural networks. These are trained together and collectively learn the generator distribution of a dataset. Data is divided into

k

clusters, using a novel differentially private kernel

k

-means, then each cluster is given to separate generative neural networks, such as Restricted Boltzmann Machines or Variational Autoencoders, which are trained only on their own cluster using differentially private gradient descent. We evaluate our approach using the MNIST dataset, as well as call detail records and transit datasets, showing that it produces realistic synthetic samples, which can also be used to accurately compute arbitrary number of counting queries.Comment: A shorter version of this paper appeared at the 17th IEEE International Conference on Data Mining (ICDM 2017). This is the full version, published in IEEE Transactions on Knowledge and Data Engineering (TKDE

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

UCL Discovery