Discovering Polarized Communities in Signed Networks
Signed networks contain edge annotations to indicate whether each interaction
is friendly (positive edge) or antagonistic (negative edge). The model is
simple but powerful and it can capture novel and interesting structural
properties of real-world phenomena. The analysis of signed networks has many
applications from modeling discussions in social media, to mining user reviews,
and to recommending products in e-commerce sites. In this paper we consider the
problem of discovering polarized communities in signed networks. In particular,
we search for two communities (subsets of the network vertices) where within
communities there are mostly positive edges while across communities there are
mostly negative edges. We formulate this novel problem as a "discrete
eigenvector" problem, which we show to be NP-hard. We then develop two
intuitive spectral algorithms: one deterministic, and one randomized with
a quality guarantee in terms of n (where n is the number of vertices in the
graph), tight up to constant factors. We validate our algorithms against
non-trivial baselines on real-world signed networks. Our experiments confirm
that our algorithms produce higher quality solutions, are much faster and can
scale to much larger networks than the baselines, and are able to detect
ground-truth polarized communities.
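
The "discrete eigenvector" formulation and the spectral rounding idea can be illustrated with a small sketch. It assumes the objective is the Rayleigh quotient x^T A x / x^T x of the signed adjacency matrix A, restricted to assignments x in {-1, 0, 1}^n (1 and -1 mark the two communities, 0 marks neutral vertices). The helper names `polarity`, `leading_eigenvector`, `round_by_sign`, and `toy_graph`, as well as the threshold `tau`, are hypothetical. The thresholded sign rounding is a simplified stand-in and is not claimed to be either of the paper's two algorithms.

```python
import math

def polarity(adj, x):
    """Rayleigh quotient x^T A x / x^T x for an assignment x in {-1, 0, 1}^n."""
    n = len(x)
    size = sum(1 for v in x if v != 0)  # x^T x counts the non-neutral vertices
    if size == 0:
        raise ValueError("x must place at least one vertex in a community")
    total = sum(adj[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return total / size

def leading_eigenvector(adj, iterations=200):
    """Power iteration on A + shift*I; the shift (a Gershgorin bound) makes
    every eigenvalue non-negative, so the iteration targets the largest
    eigenvalue of A itself."""
    n = len(adj)
    shift = max(sum(abs(a) for a in row) for row in adj) or 1.0
    vec = [1.0 + 0.01 * i for i in range(n)]  # fixed, non-uniform start vector
    for _ in range(iterations):
        nxt = [shift * vec[i] + sum(adj[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec

def round_by_sign(vec, tau=0.5):
    """Keep the sign of large entries; entries below tau * max|v| become 0."""
    cutoff = tau * max(abs(v) for v in vec)
    return [0 if abs(v) < cutoff else (1 if v > 0 else -1) for v in vec]

def toy_graph():
    """Two triangles joined by negative edges, plus one isolated vertex (6)."""
    groups = [0, 0, 0, 1, 1, 1]
    n = 7
    adj = [[0] * n for _ in range(n)]
    for i in range(6):
        for j in range(6):
            if i != j:
                adj[i][j] = 1 if groups[i] == groups[j] else -1
    return adj

adj = toy_graph()
x = round_by_sign(leading_eigenvector(adj))
print(x, polarity(adj, x))
```

On this toy graph the rounding separates the two triangles and leaves the isolated vertex neutral. Forcing that vertex into either community would lower the score, because it enlarges x^T x without adding any agreeing edges.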
Unsupervised Attributed Graph Learning: Models and Applications
Graphs are a ubiquitous data structure that appears in a broad range of real-world scenarios. Accordingly, there has been a surge of research on representing and learning from graphs in order to accomplish various machine learning and graph analysis tasks. However, most of these efforts utilize only the graph structure, while nodes in real-world graphs usually come with a rich set of attributes. Typical examples of such nodes and their attributes are users and their profiles in social networks, scientific articles and their content in citation networks, protein molecules and their gene sets in biological networks, and web pages and their content on the Web. Utilizing node features in such graphs (attributed graphs) can alleviate the graph sparsity problem and help explain various phenomena (e.g., the motives behind the formation of communities in social networks). Therefore, further study of attributed graphs is required to take full advantage of node attributes.
In the wild, attributed graphs are usually unlabeled. Moreover, annotating data is an expensive and time-consuming process, which suffers from many limitations such as annotators’ subjectivity, reproducibility, and consistency. The challenges of data annotation and the growing number of unlabeled attributed graphs in various real-world applications create a strong demand for unsupervised learning on attributed graphs.
In this dissertation, I propose a set of novel models to learn from attributed graphs in an unsupervised manner. To better understand and represent nodes and communities in attributed graphs, I present different models at the node and community levels. At the node level, I utilize node features as well as the graph structure in attributed graphs to learn distributed representations of nodes, which can be useful in a variety of downstream machine learning applications. At the community level, with a focus on social media, I take advantage of both node attributes and the graph structure to discover not only communities but also their sentiment-driven profiles and inter-community relations (i.e., alliance, antagonism, or no relation). The discovered community profiles and relations help to better understand the structure and dynamics of social media.
Dissertation/Thesis, Doctoral Dissertation, Computer Science 201
Regularized spectral methods for clustering signed networks
We study the problem of k-way clustering in signed graphs. Considerable
attention in recent years has been devoted to analyzing and modeling signed
graphs, where the affinity measure between nodes takes either positive or
negative values. Recently, Cucuringu et al. [CDGT 2019] proposed a spectral
method, namely SPONGE (Signed Positive over Negative Generalized Eigenproblem),
which casts the clustering task as a generalized eigenvalue problem optimizing
a suitably defined objective function. This approach is motivated by social
balance theory, where the clustering task aims to decompose a given network
into disjoint groups, such that individuals within the same group are connected
by as many positive edges as possible, while individuals from different groups
are mainly connected by negative edges. Through extensive numerical
simulations, SPONGE was shown to achieve state-of-the-art empirical
performance. On the theoretical front, [CDGT 2019] analyzed SPONGE and the
popular Signed Laplacian method under the setting of a Signed Stochastic Block
Model (SSBM), for equal-sized clusters, in the regime where the graph is
moderately dense.
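
For orientation, the generalized eigenvalue problem behind SPONGE can be written out as follows. This is a sketch of the formulation as we understand it from [CDGT 2019]; the symbols A^+, A^-, D^+, D^-, L^+, L^-, tau^+ and tau^- are notation assumed here, since the abstract does not define them.

```latex
% Assumed notation: A = A^+ - A^- is the signed adjacency matrix, split into
% its positive and negative parts; D^\pm are the degree matrices of A^\pm;
% L^\pm = D^\pm - A^\pm are the corresponding Laplacians; and
% \tau^+, \tau^- > 0 are trade-off parameters.
\[
  (L^{+} + \tau^{-} D^{-})\, v \;=\; \lambda \, (L^{-} + \tau^{+} D^{+})\, v
\]
% Equivalent ratio form for a single vector x:
\[
  \min_{x \neq 0} \;
  \frac{x^{\top} (L^{+} + \tau^{-} D^{-})\, x}
       {x^{\top} (L^{-} + \tau^{+} D^{+})\, x}
\]
```

A small numerator corresponds to few positive edges being cut, and a large denominator to many negative edges being cut, which matches the social-balance motivation above. The nodes are then embedded using the eigenvectors of the smallest generalized eigenvalues and clustered in that embedding, for example with k-means.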
In this work, we build on the results in [CDGT 2019] on two fronts for the
normalized versions of SPONGE and the Signed Laplacian. Firstly, for both
algorithms, we extend the theoretical analysis in [CDGT 2019] to the general
setting of unequal-sized clusters in the moderately dense regime.
Secondly, we introduce regularized versions of both methods to handle sparse
graphs -- a regime where standard spectral methods underperform -- and provide
theoretical guarantees under the same SSBM model. To the best of our knowledge,
regularized spectral methods have so far not been considered in the setting of
clustering signed graphs. We complement our theoretical results with an
extensive set of numerical experiments on synthetic data.
Comment: 55 pages, 5 figures
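
Synthetic signed graphs of the kind referred to above can be drawn from an SSBM. The sampler below is a minimal sketch under stated assumptions: each pair of nodes is connected independently with probability `p`; an edge is positive within a cluster and negative across clusters; and each sign is flipped independently with probability `eta`. The function name `sample_ssbm` and its parameters are hypothetical, and the exact parameterization used in the paper may differ.

```python
import random

def sample_ssbm(sizes, p, eta, seed=0):
    """Sample a signed adjacency matrix from a Signed Stochastic Block Model.

    sizes : cluster sizes (may be unequal), e.g. [5, 3]
    p     : probability that any given pair of nodes is connected
    eta   : probability that the sign of an existing edge is flipped (noise)
    """
    rng = random.Random(seed)  # local generator, so runs are reproducible
    labels = [c for c, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() >= p:
                continue  # this pair stays unconnected
            # Noise-free sign: +1 within a cluster, -1 across clusters.
            sign = 1 if labels[i] == labels[j] else -1
            if rng.random() < eta:
                sign = -sign  # noisy observation flips the sign
            adj[i][j] = adj[j][i] = sign  # keep the matrix symmetric
    return adj, labels

# Two unequal clusters, sparse edges, 10% sign noise.
adj, labels = sample_ssbm([5, 3], p=0.4, eta=0.1, seed=42)
num_edges = sum(1 for i in range(len(adj)) for j in range(i) if adj[i][j] != 0)
print(labels, num_edges)
```

Lowering `p` moves the sample toward the sparse regime mentioned above, and unequal entries in `sizes` give the unequal-sized cluster setting.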