A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem
Many graph mining applications rely on detecting subgraphs which are
near-cliques. There exists a dichotomy between the results in the existing work
related to this problem: on the one hand the densest subgraph problem (DSP)
which maximizes the average degree over all subgraphs is solvable in polynomial
time but for many networks fails to find subgraphs which are near-cliques. On
the other hand, formulations that are geared towards finding near-cliques are
NP-hard and frequently inapproximable due to connections with the Maximum
Clique problem.
In this work, we propose a formulation which combines the best of both
worlds: it is solvable in polynomial time and finds near-cliques when the DSP
fails. Surprisingly, our formulation is a simple variation of the DSP.
Specifically, we define the triangle densest subgraph problem (TDSP): given
a graph $G(V,E)$, find a subset of vertices $S^*$ such that
$\tau(S^*) = \max_{S \subseteq V} \frac{t(S)}{|S|}$, where $t(S)$ is the number of triangles induced
by the set $S$. We provide various exact and approximation algorithms which
solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to
the more general problem of maximizing the $k$-clique average density. Finally,
we provide empirical evidence that the TDSP should be used whenever the DSP
fails to output a near-clique.
Comment: 42 pages
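To make the contrast between the two objectives concrete, the following minimal sketch computes the edge density e(S)/|S| (half the average degree, so maximizing it is equivalent to the DSP) and the triangle density t(S)/|S| of a vertex set, and maximizes each by exhaustive search on a toy graph. The exhaustive search, the adjacency-dictionary representation, and all function names are illustrative assumptions. The search is exponential in the number of vertices and is not one of the paper's algorithms.

```python
from itertools import combinations


def edge_density(adj, S):
    """e(S) / |S|: half the average degree of the subgraph induced by S."""
    nodes = sorted(S)
    edges = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return edges / len(nodes)


def triangle_density(adj, S):
    """t(S) / |S|: number of triangles induced by S, divided by |S|."""
    nodes = sorted(S)
    triangles = sum(
        1
        for u, v, w in combinations(nodes, 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )
    return triangles / len(nodes)


def brute_force_densest(adj, density):
    """Try every non-empty vertex subset (toy graphs only; exponential time)."""
    nodes = sorted(adj)
    best_set, best_val = None, -1.0
    for size in range(1, len(nodes) + 1):
        for S in combinations(nodes, size):
            val = density(adj, S)
            if val > best_val:
                best_set, best_val = set(S), val
    return best_set, best_val


def toy_graph():
    """Hypothetical example: a 4-clique (0..3) plus a disjoint K_{4,4} (4..11)."""
    adj = {v: set() for v in range(12)}

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for u, v in combinations(range(4), 2):
        add_edge(u, v)
    for u in range(4, 8):
        for v in range(8, 12):
            add_edge(u, v)
    return adj


if __name__ == "__main__":
    adj = toy_graph()
    print("edge objective:    ", brute_force_densest(adj, edge_density))
    print("triangle objective:", brute_force_densest(adj, triangle_density))
```

On this toy graph the edge objective selects the triangle-free bipartite part (density 16/8), while the triangle objective selects the 4-clique (density 4/4), which mirrors the failure mode of the DSP described in the abstract.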
Towards Quantifying Vertex Similarity in Networks
Vertex similarity is a major problem in network science with a wide range of
applications. In this work we provide novel perspectives on finding
(dis)similar vertices within a network and across two networks with the same
number of vertices (graph matching). With respect to the former problem, we
propose to optimize a geometric objective which allows us to express each
vertex uniquely as a convex combination of a few extreme types of vertices. Our
method has the important advantage of efficiently supporting several types of
queries, such as "which other vertices are most similar to this vertex?", through
the use of appropriate data structures, and of mining interesting patterns in
the network. With respect to the latter problem (graph matching), we propose
the generalized condition number (a quantity widely used in numerical
analysis) of the Laplacian matrix representations of $G$ and $H$
as a measure of graph similarity, where $G$ and $H$ are the graphs of interest. We
show that this objective has a solid theoretical basis and propose a
deterministic and a randomized graph alignment algorithm. We evaluate our
algorithms on both synthetic and real data. We observe that our proposed
methods achieve high-quality results and provide us with significant insights
into the network structure.
Comment: 16 pages, 5 figures, 2 tables
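As a rough illustration of the graph-similarity measure, the sketch below evaluates the generalized condition number of two Laplacians for one fixed vertex labelling. It assumes NumPy, connected unweighted graphs on the same vertex set, and a restriction to the subspace orthogonal to the all-ones vector (where both Laplacians are invertible). These choices and the function names are assumptions made for the example. The alignment algorithms in the paper additionally search over vertex permutations, which this sketch does not attempt.

```python
import numpy as np


def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an undirected, unweighted graph."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L


def generalized_condition_number(L_G, L_H, tol=1e-9):
    """Ratio of the largest to the smallest generalized eigenvalue of
    L_G x = lambda * L_H x, restricted to vectors orthogonal to all-ones.

    Assumes both graphs are connected and share the same vertex labelling.
    """
    w, V = np.linalg.eigh(L_H)
    keep = w > tol                      # drop the null space of L_H
    P = V[:, keep] / np.sqrt(w[keep])   # columns span the range, scaled by w^{-1/2}
    M = P.T @ L_G @ P                   # symmetric matrix with the same eigenvalues
    vals = np.linalg.eigvalsh(M)        # ascending order
    return vals[-1] / vals[0]


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    cycle = path + [(3, 0)]
    L_path, L_cycle = laplacian(4, path), laplacian(4, cycle)
    print(generalized_condition_number(L_path, L_path))
    print(generalized_condition_number(L_cycle, L_path))
```

Identical graphs give a value of 1. For the 4-cycle against the 4-vertex path the value is 4, since the extra edge closes a path of effective resistance 3. Larger values indicate less similar graphs under the given labelling.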
Rainbow Connection of Random Regular Graphs
An edge-colored graph is rainbow edge connected if any two vertices are
connected by a path whose edges have distinct colors. The rainbow connection of
a connected graph $G$, denoted by $rc(G)$, is the smallest number of colors
that are needed in order to make $G$ rainbow connected.
In this work we study the rainbow connection of the random $r$-regular graph
$G = G(n,r)$ of order $n$, where $r \ge 4$ is a constant. We prove that with
probability tending to one as $n$ goes to infinity the rainbow connection of
$G$ satisfies $rc(G) = O(\log n)$, which is best possible up to a hidden
constant.
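The definition of rainbow edge connectivity can be checked directly on small examples. The sketch below is a brute-force checker, given as an assumption-laden illustration rather than anything from the paper: it explores (vertex, set of colors used) states from every source, which is exponential in the number of colors and only suitable for toy inputs. The function name and the `(u, v, color)` edge format are hypothetical.

```python
def is_rainbow_connected(n, colored_edges):
    """Return True if every pair of vertices 0..n-1 is joined by a path
    whose edges all have distinct colors.

    colored_edges is a list of (u, v, color) triples for an undirected graph.
    """
    adj = {v: [] for v in range(n)}
    for u, v, color in colored_edges:
        adj[u].append((v, color))
        adj[v].append((u, color))

    for source in range(n):
        start = (source, frozenset())
        seen = {start}
        stack = [start]
        reached = {source}
        while stack:
            u, used = stack.pop()
            for v, color in adj[u]:
                if color in used:
                    continue  # reusing a color would break the rainbow property
                state = (v, used | {color})
                if state not in seen:
                    seen.add(state)
                    stack.append(state)
                    reached.add(v)
        if len(reached) < n:
            return False
    return True


if __name__ == "__main__":
    four_cycle = [(0, 1, "a"), (1, 2, "b"), (2, 3, "a"), (3, 0, "b")]
    print(is_rainbow_connected(4, four_cycle))
```

Searching over walks with distinct colors is sufficient here, because any such walk can be shortcut to a simple path that uses a subset of its edges and therefore still has distinct colors.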
Minimizing Polarization and Disagreement in Social Networks
The rise of social media and online social networks has been a disruptive
force in society. Opinions are increasingly shaped by interactions on online
social media, and social phenomena including disagreement and polarization are
now tightly woven into everyday life. In this work we initiate the study of the
following question: given $n$ agents, each with its own initial opinion that
reflects its core value on a topic, and an opinion dynamics model, what is the
structure of a social network that minimizes polarization and
disagreement simultaneously?
This question is central to recommender systems: should a recommender system
prefer a link suggestion between two online users with similar mindsets in
order to keep disagreement low, or between two users with different opinions in
order to expose each to the other's viewpoint of the world, and decrease
overall levels of polarization? Our contributions include a mathematical
formalization of this question as an optimization problem and an exact,
time-efficient algorithm. We also prove that there always exists a network with
$O(n/\epsilon^2)$ edges that is a $(1+\epsilon)$-approximation to the optimum.
For a fixed graph, we additionally show how to optimize our objective function
over the agents' innate opinions in polynomial time.
We perform an empirical study of our proposed methods on synthetic and
real-world data that verifies their value as mining tools to better understand
the trade-off between disagreement and polarization. We find that there is
considerable room to reduce both polarization and disagreement in real-world
networks; for instance, on a Reddit network where users exchange comments on
politics, our methods achieve a $\sim 60\,000$-fold reduction in polarization
and disagreement.
Comment: 19 pages (accepted, WWW 2018)
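The abstract does not name the opinion dynamics model, so the following sketch makes explicit assumptions: it uses the Friedkin-Johnsen model, in which the equilibrium opinions z solve (I + L) z = s for mean-centered innate opinions s and graph Laplacian L. It measures polarization as the squared norm of z and disagreement as the weighted sum of squared opinion differences across edges. All function names are hypothetical, and the code only evaluates the objective for a given network; it is not the optimization algorithm from the paper.

```python
def solve_linear_system(A, b):
    """Gaussian elimination with partial pivoting; returns x with A x = b."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def polarization_and_disagreement(n, weighted_edges, innate):
    """Return (z, polarization, disagreement) under the assumed model.

    weighted_edges is a list of (u, v, weight) triples; innate has length n.
    """
    mean = sum(innate) / n
    s = [x - mean for x in innate]  # mean-center the innate opinions

    # Build I + L, where L is the weighted graph Laplacian.
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v, w in weighted_edges:
        A[u][u] += w
        A[v][v] += w
        A[u][v] -= w
        A[v][u] -= w

    z = solve_linear_system(A, s)  # equilibrium opinions
    polarization = sum(zi * zi for zi in z)
    disagreement = sum(w * (z[u] - z[v]) ** 2 for u, v, w in weighted_edges)
    return z, polarization, disagreement


if __name__ == "__main__":
    innate = [1.0, -1.0]
    print(polarization_and_disagreement(2, [], innate))
    print(polarization_and_disagreement(2, [(0, 1, 1.0)], innate))
```

For two agents with opposite innate opinions, the empty network gives polarization 2 and disagreement 0, while a single unit-weight edge gives polarization 2/9 and disagreement 4/9. This shows how the choice of network changes the sum of the two quantities.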
Optimal learning of joint alignments with a faulty oracle
We consider the following problem, which is useful in applications such as joint image and
shape alignment. The goal is to recover $n$ discrete variables $g_i \in \{0, \ldots, k-1\}$ (up to some
global offset) given noisy observations of a set of their pairwise differences $\{(g_i - g_j) \bmod k\}$;
specifically, with probability $\frac{1}{k} + \delta$ for some $\delta > 0$ one obtains the correct answer, and with
the remaining probability one obtains a uniformly random incorrect answer. We consider a
learning-based formulation where one can perform a query to observe a pairwise difference, and
the goal is to perform as few queries as possible while obtaining the exact joint alignment.
We provide an easy-to-implement, time-efficient algorithm that performs
$O\left(\frac{n \lg n}{k \delta^2}\right)$ queries, and
recovers the joint alignment with high probability. We also show that our algorithm is optimal
by proving a general lower bound that holds for all non-adaptive algorithms. Our work significantly
improves on recent work by Chen and Candès [CC16], who view the problem as a constrained
principal components analysis problem that can be solved using the power method. Specifically,
our approach is simpler both in the algorithm and the analysis, and provides additional insights
into the problem structure.
First author draft
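The query model in this abstract can be simulated in a few lines. The sketch below implements the faulty oracle as described (correct with probability 1/k + delta, otherwise a uniformly random incorrect value) together with a naive baseline that queries every variable against variable 0 several times and takes the plurality answer. The baseline, the parameter `repeats`, and all names are assumptions made for illustration. It is not the algorithm from the paper and makes no claim to match its query bound.

```python
import random
from collections import Counter


def noisy_difference(g, i, j, k, delta, rng):
    """Faulty oracle for (g[i] - g[j]) mod k.

    Returns the true difference with probability 1/k + delta, otherwise one
    of the k - 1 incorrect values chosen uniformly at random (requires k >= 2).
    """
    true_diff = (g[i] - g[j]) % k
    if rng.random() < 1.0 / k + delta:
        return true_diff
    wrong = rng.randrange(k - 1)  # index among the k - 1 incorrect values
    return wrong if wrong < true_diff else wrong + 1


def recover_by_plurality(n, k, query, repeats):
    """Naive baseline: estimate (g[i] - g[0]) mod k by a plurality vote."""
    offsets = [0] * n
    for i in range(1, n):
        votes = Counter(query(i, 0) for _ in range(repeats))
        offsets[i] = votes.most_common(1)[0][0]
    return offsets


if __name__ == "__main__":
    rng = random.Random(0)
    n, k, delta = 20, 5, 0.4
    g = [rng.randrange(k) for _ in range(n)]
    estimate = recover_by_plurality(
        n, k, lambda i, j: noisy_difference(g, i, j, k, delta, rng), repeats=200
    )
    matches = all((estimate[i] - (g[i] - g[0])) % k == 0 for i in range(n))
    print("recovered up to a global offset:", matches)
```

Because only differences are observed, the recovered values are the offsets relative to variable 0, which is what "up to some global offset" means in the problem statement.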