17 research outputs found

A Comprehensive Bibliometric Analysis on Social Network Anonymization: Current Approaches and Future Directions

Author: Gandomi Amir H.
Gharoun Hassan
Khorshidi Mohammad Sadegh
Rakhshaninejad Morteza
Yazdanjouei Hossein
Yazdanjue Navid
Publication date: 24/07/2023

In recent decades, social network anonymization has become a crucial research field due to its pivotal role in preserving users' privacy. However, the high diversity of approaches introduced in relevant studies poses a challenge to gaining a profound understanding of the field. In response to this, the current study presents an exhaustive and well-structured bibliometric analysis of the social network anonymization field. To begin our research, related studies from the period of 2007-2022 were collected from the Scopus database and then pre-processed. Following this, VOSviewer was used to visualize the network of authors' keywords. Subsequently, extensive statistical and network analyses were performed to identify the most prominent keywords and trending topics. Additionally, the application of co-word analysis through SciMAT and the Alluvial diagram allowed us to explore the themes of social network anonymization and scrutinize their evolution over time. These analyses culminated in an innovative taxonomy of the existing approaches and anticipation of potential trends in this domain. To the best of our knowledge, this is the first bibliometric analysis in the social network anonymization field, which offers a deeper understanding of the current state and an insightful roadmap for future research in this domain. Comment: 73 pages, 28 figures

arXiv.org e-Print Archive

Alignment and Assembly: Inferring Networks from Noisy Observations

Author: Yartseva Lyudmila
Publication venue: Lausanne, EPFL
Publication date: 28/03/2017

Over recent years, many large network datasets have become available, giving rise to novel and valuable applications of data mining and machine learning techniques. These datasets include social networks, the structure of the Internet, and protein-interaction networks, to name just a few. Graph mining exploits information hidden in these data to shed light on such problems as finding relevant pages on the web, or identifying communities of strongly connected individuals. Clearly, to address such problems, we first need the complete and reliable network graph. In many real-world scenarios, the full graph is not available for free. For example, data-collection processes may be noisy and unreliable or node identifiers may be hidden for privacy protection. Therefore, we cannot rely on the node labels to infer the full graph. In this thesis, we address fundamental and practical questions of inferring a true full network from multiple ambiguous observations. We formulate two variations of this problem: network alignment and network assembly. In each variant, we address two types of questions: first, we characterize how graph features impact the fundamental feasibility of reconstruction; second, we seek efficient algorithms that can scale to very large networks. In the first part of this thesis, we consider network alignment. We assume two large, noisy observations of the true network that are not labeled. Network alignment refers to the problem of aligning the vertices of the two networks using only structural cues and it can be viewed as a generalization of the classic graph-isomorphism problem. We make the following contributions. First, we introduce a random bigraph model with parameters p, t and s that generates two correlated graphs. We characterize conditions on p, t and s for the feasibility of alignment of two graphs. Second, we create an algorithm named percolation graph-matching (PGM) that builds an alignment from a small set of pre-matched nodes S. We prove conditions on the parameters p, t, s and r for which PGM succeeds, and we establish a phase transition in |S|. In the second part of this thesis, we consider network assembly. We assume many small, noisy observations of the true network, called patches. The node labels are either absent or not unique. The network assembly problem consists in reconstructing the true graph from these patches. We make the following contributions. First, we introduce a novel random-graph model with parameters p and q that generates a network with high clustering. We characterize conditions on p and q for feasibility of assembly. Second, we propose a heuristic assembly algorithm to reconstruct the true graph from arbitrary patches with label ambiguity.
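
The seed-based percolation idea described in the abstract above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function name, the adjacency-dictionary input format, and the reading of r as the number of marks a candidate pair must collect before it is matched are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def percolation_graph_matching(adj1, adj2, seeds, r=2):
    """Grow a node matching from seed pairs (illustrative sketch only).

    adj1, adj2: dict mapping each node to the set of its neighbours.
    seeds: list of (node_in_graph1, node_in_graph2) pre-matched pairs.
    r: assumed threshold, the marks a candidate pair needs to be matched.
    """
    matched1 = {}              # graph-1 node -> graph-2 node
    matched2 = {}              # graph-2 node -> graph-1 node
    marks = defaultdict(int)   # candidate pair -> marks received so far
    queue = []

    for u, v in seeds:
        matched1[u] = v
        matched2[v] = u
        queue.append((u, v))

    while queue:
        u, v = queue.pop()
        # A matched pair gives one mark to every pair of its neighbours.
        for a, b in product(adj1[u], adj2[v]):
            if a in matched1 or b in matched2:
                continue       # each node is matched at most once
            marks[(a, b)] += 1
            if marks[(a, b)] >= r:
                matched1[a] = b
                matched2[b] = a
                queue.append((a, b))

    return matched1
```

With a threshold that is too high for the available seeds, the matching in this sketch never spreads beyond the seed pairs, which is one informal way to picture why the seed-set size matters.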

Infoscience - École polytechnique fédérale de Lausanne

Predictive Privacy: Modeling Privacy Harms

Author: Radway Sarah
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2021

This work examines how we can account for harm arising from innovation in machine learning and database matching, and proposes a prefatory method for the quantification of harm. The work's primary contribution is to propose a method of modeling the harm of a dataset, analyzing the risk of deanonymization given these new factors. We carry out two experiments in order to demonstrate the harm accompanying modern dataset applications -- the first exploring machine learning's applications to WhatsApp social network analysis, the second exploring the potential for large-scale database matching between University directories and TinderU profiles. We apply our model to these examples to demonstrate its efficacy in a real-world setting. We show that this model can be applied as a general framework to guide both legal regulation of data release and implementation of current anonymization methods.

Columbia University Academic Commons

Network Alignment: Theory, Algorithms, and Applications

Author: Kazemi Ehsan
Publication venue: Lausanne, EPFL
Publication date: 25/10/2016

Networks are central in the modeling and analysis of many large-scale human and technical systems, and they have applications in diverse fields such as computer science, biology, social sciences, and economics. Recently, network mining has been an active area of research. In this thesis, we study several related network-mining problems, from three different perspectives: the modeling and theory perspective, the computational perspective, and the application perspective. In the bulk of this thesis, we focus on network alignment, where the data provides two (or more) partial views of the network, and where the node labels are sometimes ambiguous. Network alignment has applications in social-network reconciliation and de-anonymization, protein-network alignment in biology, and computer vision. In the first part of this thesis, we investigate the feasibility of network alignment with a random-graph model. This random-graph model generates two (or several) correlated networks, and allows the two networks to overlap only partially. For a particular alignment, we define a cost function for structural mismatch. We show that the minimization of the proposed cost function (assuming that we have access to infinite computational power), with high probability, results in an alignment that recovers the set of shared nodes between the two networks, and that also recovers the true matching between the shared nodes. The most scalable network-alignment approaches use ideas from percolation theory, where a matched node-couple infects its neighboring couples that are additional potential matches. In the second part of this thesis, we propose a new percolation-based network-alignment algorithm that can match large networks by using only the network structure and a handful of initially pre-matched node-couples called the seed set. We characterize a phase transition in matching performance as a function of the seed-set size. In the third part of this thesis, we consider two important application areas of network mining in biology and public health. The first application area is percolation-based network alignment of protein-protein interaction (PPI) networks in biology. The alignment of biological networks has many uses, such as the detection of conserved biological network motifs, the prediction of protein interactions, and the reconstruction of phylogenetic trees. Network alignment can be used to transfer biological knowledge between species. We introduce a new global pairwise-network alignment algorithm for PPI networks, called PROPER. The PROPER algorithm shows higher accuracy and speed compared to other global network-alignment methods. We also extend PROPER to the global multiple-network alignment problem. We introduce a new algorithm, called MPROPER, for matching multiple networks. Finally, we explore IsoRank, one of the first and most referenced global pairwise-network alignment algorithms. Our second application area is the control of epidemic processes. We develop and model strategies for mitigating an epidemic in a large-scale dynamic contact network. More precisely, we study epidemics of infectious diseases by (i) modeling the spread of epidemics on a network by using many pieces of information about the mobility and behavior of a population; and by (ii) designing personalized behavioral recommendations for individuals, in order to mitigate the effect of epidemics on that network.
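
The notion of a cost function for structural mismatch, mentioned in the abstract above, can be pictured with a small Python sketch. The abstract does not state the thesis's actual cost function, so the version below is a deliberately simplified stand-in: it only counts edges that disagree between the two graphs under a given partial node mapping. The function name and input format are hypothetical.

```python
def edge_mismatch_cost(edges1, edges2, mapping):
    """Count edges that disagree under a partial node mapping.

    edges1, edges2: iterables of undirected edges given as (u, v) pairs.
    mapping: dict from graph-1 nodes to graph-2 nodes; unmapped nodes and
             the edges touching them are ignored (a simplifying assumption).
    """
    # Graph-1 edges between mapped nodes, relabelled into graph-2 names.
    mapped_edges = {
        frozenset((mapping[u], mapping[v]))
        for u, v in edges1
        if u in mapping and v in mapping
    }
    # Graph-2 edges whose endpoints are both images of mapped nodes.
    image_nodes = set(mapping.values())
    induced_edges = {
        frozenset(e) for e in edges2 if set(e) <= image_nodes
    }
    # Edges present on exactly one side count as mismatches.
    return len(mapped_edges ^ induced_edges)
```

A mapping that preserves every edge has cost zero in this sketch, while swapping two nodes with different neighbourhoods raises the cost.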

Infoscience - École polytechnique fédérale de Lausanne

A survey of results on mobile phone datasets analysis

Author: A Amini
A Bogomolov
A Clauset
A Kuusik
A Narayanan
A Noulas
A Stopczynski
A Wesolowski
AA Nanavati
AL Barabási
B Csáji
C Cortes
C Herrera-Yagüe
C Ratti
C Smith-Clarke
C Song
CA Hidalgo
CO Buckee
D Grady
D Lazer
D Liben-Nowell
D Naboulsi
D Quercia
D Wang
DJ Mir
DJ Watts
E Carolan
E Ferrara
E Frias-Martinez
E Katz
ED Fitkov-Norris
EU
F Baccelli
F Calabrese
F Manfredini
F Peruani
F Simini
FHZ Xavier
G Ghoshal
G Kossinets
G Krings
G Miritello
G Palla
G Ranjan
G Tibély
GK Zipf
H Mao
H Risselada
H Sterly
H Zang
H Zhang
H-H Jo
I Trestian
J Abello
J Candia
J Karikoski
J McInerney
J Park
J Reades
J Saramäki
J Steenbruggen
J Wiese
J-P Onnela
JE Blumenstock
JL Toole
JP Bagrow
JP Onnela
K Dasgupta
K Kianmehr
K Yu
KS Xu
L Backstrom
L Gao
L Kovanen
L Sweeney
L Tabourier
M Barthélemy
M Berlingerio
M Cebrian
M Karsai
M Kivelä
M Martino
M Nanni
M Pielot
M Rosvall
M Schläpfer
M Seshadri
M Tizzoni
M-X Li
MC González
MEJ Newman
MS Granovetter
N Aharony
N Du
N Eagle
O Bucicovschi
P Deville
P Expert
P Holme
P Wang
PJ Mucha
R Kwok
R Lambiotte
R Ling
R Trasarti
RD Malmgren
S Catanese
S Gambs
S Hill
S Isaacman
S Jiang
S Kirkpatrick
S Landau
S Motahari
SY Hung
T Aynaud
T Dierkes
T Louail
T Raeder
V Angelakis
V Blondel
V Frias-Martinez
V Palchykov
V Salnikov
V-P Backlund
VD Blondel
W Aiello
X Lu
Y Altshuler
Y Kim
Y Kryvasheyeu
Y Richter
Y Song
Y Wu
YA Montjoye de
YY Ahn
YY Liu
Z Huang
Z Smoreda
Publication venue: Springer Science and Business Media LLC

Crossref

Data mining applications of singular value decomposition

Author: Kurucz Miklós
Publication date: 01/01/2011

ELTE Digital Institutional Repository (EDIT)

Transforming Graph Representations for Statistical Relational Learning

Author: Aha David W.
McDowell Luke K.
Neville Jennifer
Rossi Ryan A.
Publication date: 01/01/2012

Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

arXiv.org e-Print Archive

CiteSeerX

On the Privacy and Utility of Social Networks

Author: SONG YI
Publication date: 04/08/2014

Ph.D, DOCTOR OF PHILOSOPHY

ScholarBank@NUS