1,107 research outputs found
A Comprehensive Bibliometric Analysis on Social Network Anonymization: Current Approaches and Future Directions
In recent decades, social network anonymization has become a crucial research
field due to its pivotal role in preserving users' privacy. However, the high
diversity of approaches introduced in relevant studies poses a challenge to
gaining a profound understanding of the field. In response to this, the current
study presents an exhaustive and well-structured bibliometric analysis of the
social network anonymization field. To begin our research, related studies from
the period 2007-2022 were collected from the Scopus database and then
pre-processed. Following this, VOSviewer was used to visualize the network
of authors' keywords. Subsequently, extensive statistical and network analyses
were performed to identify the most prominent keywords and trending topics.
Additionally, the application of co-word analysis through SciMAT and the
Alluvial diagram allowed us to explore the themes of social network
anonymization and scrutinize their evolution over time. These analyses
culminated in an innovative taxonomy of the existing approaches and
anticipation of potential trends in this domain. To the best of our knowledge,
this is the first bibliometric analysis in the social network anonymization
field, which offers a deeper understanding of the current state and an
insightful roadmap for future research in this domain. Comment: 73 pages, 28 figures
Injecting Uncertainty in Graphs for Identity Obfuscation
Data collected nowadays by social-networking applications create fascinating
opportunities for building novel services, as well as expanding our
understanding about social structures and their dynamics. Unfortunately,
publishing social-network graphs is considered an ill-advised practice due to
privacy concerns. To alleviate this problem, several anonymization methods have
been proposed, aiming at reducing the risk of a privacy breach on the published
data while still allowing it to be analyzed and relevant conclusions drawn. In
this paper we introduce a new anonymization approach that is based on injecting
uncertainty in social graphs and publishing the resulting uncertain graphs.
While existing approaches obfuscate graph data by adding or removing edges
entirely, we propose using a finer-grained perturbation that adds or removes
edges partially: this way we can achieve the same desired level of obfuscation
with smaller changes in the data, thus maintaining higher utility. Our
experiments on real-world networks confirm that at the same level of identity
obfuscation our method provides higher usefulness than existing randomized
methods that publish standard graphs. Comment: VLDB201
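The partial edge perturbation described in this abstract can be sketched in a few lines. This is a minimal illustration of the uncertain-graph idea, not the paper's actual injection scheme: the perturbation bound `sigma`, the non-edge sampling rate, and all names are assumptions.

```python
import random

def inject_uncertainty(nodes, edges, sigma=0.2, nonedge_rate=0.1, seed=0):
    # Toy sketch of publishing an uncertain graph: instead of adding or
    # removing edges outright, each pair gets an existence probability.
    # Existing edges are weakened by a small random amount r <= sigma,
    # and a sampled subset of non-edges is strengthened by r, so the
    # perturbation is partial rather than all-or-nothing.
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    uncertain = {}
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            r = rng.uniform(0.0, sigma)
            if frozenset((u, v)) in edge_set:
                uncertain[(u, v)] = 1.0 - r   # existing edge, kept with high probability
            elif rng.random() < nonedge_rate:
                uncertain[(u, v)] = r         # sampled non-edge, low probability
    return uncertain
```

Because each edge's probability moves by at most `sigma`, the same level of identity obfuscation can be reached with smaller distortion than deleting or inserting whole edges.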
GraphSE: An Encrypted Graph Database for Privacy-Preserving Social Search
In this paper, we propose GraphSE, an encrypted graph database for online
social network services to address massive data breaches. GraphSE preserves
the functionality of social search, a key enabler for quality social network
services, where social search queries are conducted on a large-scale social
graph and meanwhile perform set and computational operations on user-generated
contents. To enable efficient privacy-preserving social search, GraphSE
provides an encrypted structural data model to facilitate parallel and
encrypted graph data access. It is also designed to decompose complex social
search queries into atomic operations and realise them via interchangeable
protocols in a fast and scalable manner. We build GraphSE to support various
queries from the Facebook graph search engine and implement a
full-fledged prototype. Extensive evaluations on Azure Cloud demonstrate that
GraphSE is practical for querying a social graph with a million users. Comment: This is the full version of our AsiaCCS paper "GraphSE: An
Encrypted Graph Database for Privacy-Preserving Social Search". It includes
the security proof of the proposed scheme. If you want to cite our work,
please cite the conference version of it.
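The query decomposition the abstract mentions can be shown in the clear. This is a plaintext sketch with hypothetical names and data; GraphSE's contribution is realising each of these atomic steps over encrypted structures, which is not attempted here.

```python
def neighbors(graph, user):
    # Atomic operation: fetch one user's adjacency set.
    return set(graph.get(user, ()))

def friends_of_friends_who_like(graph, likes, user, item):
    # Plaintext sketch of how a complex social search query breaks
    # down into atomic neighbor-fetch, union, and intersection steps.
    fof = set()
    for f in neighbors(graph, user):
        fof |= neighbors(graph, f)              # union per friend
    fof -= neighbors(graph, user) | {user}      # exclude direct friends and self
    likers = {u for u, items in likes.items() if item in items}
    return fof & likers                         # final set intersection
```

Each set operation here is one of the interchangeable atomic protocols the system executes over encrypted data.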
Towards Data Privacy and Utility in the Applications of Graph Neural Networks
Graph Neural Networks (GNNs) are essential for handling graph-structured data, often containing sensitive information. It’s vital to maintain a balance between data privacy and usability. To address this, this dissertation introduces three studies aimed at enhancing privacy and utility in GNN applications, particularly in node classification, link prediction, and graph classification. The first work tackles celebrity privacy in social networks. We develop a novel framework using adversarial learning for link-privacy preserved graph embedding, which effectively safeguards sensitive links without compromising the graph’s structure and node attributes. This approach is validated using real social network data. In the second work, we confront challenges in federated graph learning with non-independent and identically distributed (non-IID) data. We introduce PPFL-GNN, a privacy-preserving federated graph neural network framework that mitigates overfitting on the client side and inefficient aggregation on the server side. It leverages local graph data for embeddings and employs embedding alignment techniques for enhanced privacy, addressing the hurdles in federated learning on non-IID graph data. The third work explores Few-Shot graph classification, which aims to classify novel graph types with limited labeled data. We propose a unique framework combining Meta-learning and contrastive learning to better utilize graph structures in molecular and social network datasets. Additionally, we offer benchmark graph datasets with extensive node-attribute dimensions for future research. These studies collectively advance the field of graph-based machine learning by addressing critical issues of data privacy and utility in GNN applications
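The embedding-alignment idea from the second study can be illustrated with a toy update rule. Everything below, including the function names, the shared anchor-node setup, and the plain interpolation update, is an assumption for illustration; PPFL-GNN's actual alignment is learned jointly with the GNN.

```python
def align_embeddings(client_emb, anchor_emb, lr=0.5, steps=20):
    # Toy sketch of embedding alignment in federated graph learning:
    # each client nudges its embeddings of shared anchor nodes toward
    # a common reference, so clients converge on a comparable space
    # without ever exchanging their raw (non-IID) local graphs.
    emb = {n: list(v) for n, v in client_emb.items()}
    for _ in range(steps):
        for n, target in anchor_emb.items():
            if n in emb:
                emb[n] = [x + lr * (t - x) for x, t in zip(emb[n], target)]
    return emb
```

Only the anchor embeddings move; non-anchor nodes stay local, which is the privacy-relevant point of aligning through embeddings rather than through raw graph data.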
Differentially Private Link Prediction With Protected Connections
Link prediction (LP) algorithms propose to each node a ranked list of nodes
that are currently non-neighbors, as the most likely candidates for future
linkage. Owing to increasing concerns about privacy, users (nodes) may prefer
to keep some of their connections protected or private. Motivated by this
observation, our goal is to design a differentially private LP algorithm, which
trades off between privacy of the protected node-pairs and the link prediction
accuracy. More specifically, we first propose a form of differential privacy on
graphs, which models the privacy loss only of those node-pairs which are marked
as protected. Next, we develop DPLP, a learning-to-rank algorithm, which
applies a monotone transform to base scores from a non-private LP system, and
then adds noise. DPLP is trained with a privacy induced ranking loss, which
optimizes the ranking utility for a given maximum allowed level of privacy
leakage of the protected node-pairs. Under a recently-introduced latent node
embedding model, we present a formal trade-off between privacy and LP utility.
Extensive experiments with several real-life graphs and several LP heuristics
show that DPLP can trade off between privacy and predictive performance more
effectively than several alternatives.
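The mechanism shape described above, a monotone transform of base scores followed by calibrated noise, can be sketched directly. DPLP learns its transform with a privacy-induced ranking loss; the fixed log transform below is a stand-in, and the names and parameters are assumptions.

```python
import math
import random

def dp_rank(base_scores, epsilon, sensitivity=1.0, seed=0):
    # Sketch: transform each non-private link-prediction score with a
    # monotone function (so the base ranking is preserved in
    # expectation), then add Laplace noise scaled to the privacy
    # budget epsilon. Smaller epsilon means stronger privacy for the
    # protected node-pairs and a noisier ranking.
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    noisy = {}
    for pair, s in base_scores.items():
        t = math.log1p(s)                       # monotone transform (fixed here)
        u = rng.random() - 0.5                  # u in [-0.5, 0.5)
        noise = -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
        noisy[pair] = t + noise                 # Laplace(scale) sample added
    return sorted(noisy, key=noisy.get, reverse=True)
```

With a large epsilon (weak privacy) the noise is negligible and the base ordering survives; shrinking epsilon scrambles the ranked list, which is exactly the privacy/utility trade-off the paper formalizes.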
Comparative Analysis of Privacy Preservation Mechanism: Assessing Trustworthy Cloud Services with a Hybrid Framework and Swarm Intelligence
Cloud computing has emerged as a prominent field in modern computational technology, offering diverse services and resources. However, it has also raised pressing concerns regarding data privacy and the trustworthiness of cloud service providers. Previous works have grappled with these challenges, but many have fallen short in providing comprehensive solutions. In this context, this research proposes a novel framework designed to address the issues of maintaining data privacy and fostering trust in cloud computing services. The primary objective of this work is to develop a robust and integrated solution that safeguards sensitive data and enhances trust in cloud service providers. The proposed architecture encompasses a series of key components, including data collection and preprocessing with k-anonymity, trust generation using the Firefly Algorithm, Ant Colony Optimization for task scheduling and resource allocation, hybrid framework integration, and privacy-preserving computation. The scientific contribution of this work lies in the integration of multiple optimization techniques, such as the Firefly Algorithm and Ant Colony Optimization, to select reliable cloud service providers while considering trust factors and task/resource allocation. Furthermore, the proposed framework ensures data privacy through k-anonymity compliance, dynamic resource allocation, and privacy-preserving computation techniques such as differential privacy and homomorphic encryption. The outcomes of this research provide a comprehensive solution to the complex challenges of data privacy and trust in cloud computing services. By combining these techniques into a hybrid framework, this work contributes to the advancement of secure and effective cloud-based operations, offering a substantial step forward in addressing the critical issues faced by organizations and individuals in an increasingly interconnected digital landscape
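The k-anonymity compliance step in the preprocessing stage described above can be made concrete. The check below is the standard k-anonymity condition; the `age` field and the fixed-width binning are hypothetical examples of one common generalization, not the framework's specific procedure.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    # k-anonymity condition: every combination of quasi-identifier
    # values must be shared by at least k records, so no individual
    # can be singled out by those attributes alone.
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

def generalize_age(records, width=10):
    # One common way to reach k-anonymity: coarsen an exact value
    # (here a hypothetical "age" field) into fixed-width bins.
    out = []
    for r in records:
        g = dict(r)
        lo = (r["age"] // width) * width
        g["age"] = f"{lo}-{lo + width - 1}"
        out.append(g)
    return out
```

Generalization trades precision for anonymity, which is why the framework pairs it with utility-aware steps such as dynamic resource allocation.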
SoK: Chasing Accuracy and Privacy, and Catching Both in Differentially Private Histogram Publication
Histograms and synthetic data are of key importance in data analysis. However, researchers have shown that even aggregated data such as histograms, containing no obvious sensitive attributes, can result in privacy leakage. To enable data analysis, a strong notion of privacy is required to avoid risking unintended privacy violations. Such a strong notion of privacy is differential privacy, a statistical notion of privacy that makes privacy leakage quantifiable. The caveat regarding differential privacy is that while it has strong guarantees for privacy, privacy comes at the cost of accuracy. Despite this trade-off being a central and important issue in the adoption of differential privacy, there exists a gap in the literature in providing an understanding of the trade-off and how to address it appropriately. Through a systematic literature review (SLR), we investigate the state of the art in accuracy-improving differentially private algorithms for histogram and synthetic data publishing. Our contribution is two-fold: 1) we identify trends and connections in the contributions to the field of differential privacy for histograms and synthetic data and 2) we provide an understanding of the privacy/accuracy trade-off challenge by crystallizing different dimensions to accuracy improvement. Accordingly, we position and visualize the ideas in relation to each other and external work, and deconstruct each algorithm to examine the building blocks separately with the aim of pinpointing which dimension of accuracy improvement each technique/approach is targeting. Hence, this systematization of knowledge (SoK) provides an understanding of the dimensions in which, and the means by which, accuracy improvement can be pursued without sacrificing privacy.
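The baseline whose accuracy the surveyed algorithms try to improve is the plain Laplace-mechanism histogram, sketched below. This is the textbook mechanism, not any particular algorithm from the SoK, and the function name and interface are assumptions.

```python
import math
import random

def dp_histogram(values, bins, epsilon, seed=0):
    # Laplace-noised histogram: one individual can change one bin
    # count by at most one (sensitivity 1), so adding independent
    # Laplace(1/epsilon) noise to each bin satisfies
    # epsilon-differential privacy. Accuracy degrades as epsilon
    # shrinks -- the trade-off the SoK systematizes.
    rng = random.Random(seed)
    counts = {b: 0 for b in bins}
    for v in values:
        counts[v] += 1
    scale = 1.0 / epsilon
    noisy = {}
    for b in bins:
        u = rng.random() - 0.5   # u in [-0.5, 0.5)
        noisy[b] = counts[b] - scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
    return noisy
```

The accuracy-improving techniques the review deconstructs (clustering bins, exploiting structure, post-processing) all start from this per-bin noise baseline and reduce its error without weakening the epsilon guarantee.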