81 research outputs found

    A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem

    Many graph mining applications rely on detecting subgraphs which are near-cliques. There exists a dichotomy between the results in the existing work related to this problem: on the one hand, the densest subgraph problem (DSP), which maximizes the average degree over all subgraphs, is solvable in polynomial time but for many networks fails to find subgraphs which are near-cliques. On the other hand, formulations that are geared towards finding near-cliques are NP-hard and frequently inapproximable due to connections with the Maximum Clique problem. In this work, we propose a formulation which combines the best of both worlds: it is solvable in polynomial time and finds near-cliques when the DSP fails. Surprisingly, our formulation is a simple variation of the DSP. Specifically, we define the triangle densest subgraph problem (TDSP): given $G(V,E)$, find a subset of vertices $S^*$ such that $\tau(S^*)=\max_{S \subseteq V} \frac{t(S)}{|S|}$, where $t(S)$ is the number of triangles induced by the set $S$. We provide various exact and approximation algorithms which solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to the more general problem of maximizing the $k$-clique average density. Finally, we provide empirical evidence that the TDSP should be used whenever the DSP fails to output a near-clique. Comment: 42 pages
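To make the TDSP objective concrete, the following is a minimal sketch that evaluates the triangle density $t(S)/|S|$ and maximizes it by exhaustive search. It is not one of the exact or approximation algorithms the abstract refers to; the function names (`build_adj`, `triangle_count`, `triangle_density`, `brute_force_tdsp`) are hypothetical, and the search is exponential, so it only suits tiny graphs.

```python
from itertools import combinations


def build_adj(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_count(adj, S):
    """t(S): number of triangles induced by the vertex set S."""
    return sum(
        1
        for u, v, w in combinations(sorted(S), 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )


def triangle_density(adj, S):
    """The TDSP objective t(S) / |S| (0.0 for the empty set)."""
    S = set(S)
    return triangle_count(adj, S) / len(S) if S else 0.0


def brute_force_tdsp(adj):
    """Exhaustive search over all non-empty vertex subsets (illustration only)."""
    vertices = sorted(adj)
    best_set, best_val = set(), 0.0
    for size in range(1, len(vertices) + 1):
        for S in combinations(vertices, size):
            val = triangle_density(adj, S)
            if val > best_val:
                best_set, best_val = set(S), val
    return best_set, best_val


# A 4-clique on {0, 1, 2, 3} with a pendant path 3-4-5 attached.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = build_adj(edges)
best_set, best_val = brute_force_tdsp(adj)
print(best_set, best_val)
```

On this toy graph the 4-clique contains 4 triangles on 4 vertices, so its triangle density is 1, and adding either path vertex only lowers the ratio.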

    Towards Quantifying Vertex Similarity in Networks

    Vertex similarity is a major problem in network science with a wide range of applications. In this work we provide novel perspectives on finding (dis)similar vertices within a network and across two networks with the same number of vertices (graph matching). With respect to the former problem, we propose to optimize a geometric objective which allows us to express each vertex uniquely as a convex combination of a few extreme types of vertices. Our method has the important advantage of efficiently supporting several types of queries, such as "which other vertices are most similar to this vertex?", by the use of the appropriate data structures, and of mining interesting patterns in the network. With respect to the latter problem (graph matching), we propose the generalized condition number $\kappa(L_G,L_H)$ (a quantity widely used in numerical analysis) of the Laplacian matrix representations of $G,H$ as a measure of graph similarity, where $G,H$ are the graphs of interest. We show that this objective has a solid theoretical basis and propose a deterministic and a randomized graph alignment algorithm. We evaluate our algorithms on both synthetic and real data. We observe that our proposed methods achieve high-quality results and provide us with significant insights into the network structure. Comment: 16 pages, 5 figures, 2 tables

    Rainbow Connection of Random Regular Graphs

    An edge colored graph $G$ is rainbow edge connected if any two vertices are connected by a path whose edges have distinct colors. The rainbow connection of a connected graph $G$, denoted by $rc(G)$, is the smallest number of colors that are needed in order to make $G$ rainbow connected. In this work we study the rainbow connection of the random $r$-regular graph $G=G(n,r)$ of order $n$, where $r\ge 4$ is a constant. We prove that with probability tending to one as $n$ goes to infinity the rainbow connection of $G$ satisfies $rc(G)=O(\log n)$, which is best possible up to a hidden constant.
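The definition of rainbow connectivity above can be checked directly on small graphs. The sketch below is a brute-force checker, not the probabilistic argument of the paper: it searches over (vertex, set of used colors) states, which is exponential in the number of colors. The function name `is_rainbow_connected` and the input format (a dict from vertex pairs to colors) are assumptions made for illustration.

```python
from itertools import combinations


def is_rainbow_connected(vertices, coloring):
    """Return True if every pair of vertices is joined by a path with distinct edge colors.

    `coloring` maps undirected edges, given as (u, v) tuples, to colors.
    """
    vertices = list(vertices)
    adj = {v: [] for v in vertices}
    for (u, v), color in coloring.items():
        adj[u].append((v, color))
        adj[v].append((u, color))

    def has_rainbow_path(source, target):
        # Depth-first search over states (vertex, colors used so far).
        # A walk with pairwise distinct colors always contains a rainbow path.
        start = (source, frozenset())
        stack, seen = [start], {start}
        while stack:
            node, used = stack.pop()
            if node == target:
                return True
            for nxt, color in adj[node]:
                if color in used:
                    continue
                state = (nxt, used | {color})
                if state not in seen:
                    seen.add(state)
                    stack.append(state)
        return False

    return all(has_rainbow_path(s, t) for s, t in combinations(vertices, 2))


# A 4-cycle 0-1-2-3-0: two alternating colors suffice, one color does not.
two_colors = {(0, 1): "a", (1, 2): "b", (2, 3): "a", (3, 0): "b"}
one_color = {edge: "a" for edge in two_colors}
print(is_rainbow_connected(range(4), two_colors))
print(is_rainbow_connected(range(4), one_color))
```

The 4-cycle has diameter 2, so opposite vertices need a two-edge path with two different colors, which is why a single color fails.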

    Minimizing Polarization and Disagreement in Social Networks

    The rise of social media and online social networks has been a disruptive force in society. Opinions are increasingly shaped by interactions on online social media, and social phenomena including disagreement and polarization are now tightly woven into everyday life. In this work we initiate the study of the following question: given $n$ agents, each with its own initial opinion that reflects its core value on a topic, and an opinion dynamics model, what is the structure of a social network that minimizes polarization and disagreement simultaneously? This question is central to recommender systems: should a recommender system prefer a link suggestion between two online users with similar mindsets in order to keep disagreement low, or between two users with different opinions in order to expose each to the other's viewpoint of the world, and decrease overall levels of polarization? Our contributions include a mathematical formalization of this question as an optimization problem and an exact, time-efficient algorithm. We also prove that there always exists a network with $O(n/\epsilon^2)$ edges that is a $(1+\epsilon)$ approximation to the optimum. For a fixed graph, we additionally show how to optimize our objective function over the agents' innate opinions in polynomial time. We perform an empirical study of our proposed methods on synthetic and real-world data that verifies their value as mining tools to better understand the trade-off between disagreement and polarization. We find that there is a lot of room to reduce both polarization and disagreement in real-world networks; for instance, on a Reddit network where users exchange comments on politics, our methods achieve a $\sim 60\,000$-fold reduction in polarization and disagreement. Comment: 19 pages (accepted, WWW 2018)

    Optimal learning of joint alignments with a faulty oracle

    We consider the following problem, which is useful in applications such as joint image and shape alignment. The goal is to recover $n$ discrete variables $g_i \in \{0, \ldots, k-1\}$ (up to some global offset) given noisy observations of a set of their pairwise differences $\{(g_i - g_j) \bmod k\}$; specifically, with probability $\frac{1}{k} + \delta$ for some $\delta > 0$ one obtains the correct answer, and with the remaining probability one obtains a uniformly random incorrect answer. We consider a learning-based formulation where one can perform a query to observe a pairwise difference, and the goal is to perform as few queries as possible while obtaining the exact joint alignment. We provide an easy-to-implement, time-efficient algorithm that performs $O\left(\frac{n \lg n}{k \delta^2}\right)$ queries, and recovers the joint alignment with high probability. We also show that our algorithm is optimal by proving a general lower bound that holds for all non-adaptive algorithms. Our work significantly improves on recent work by Chen and Candès [CC16], who view the problem as a constrained principal components analysis problem that can be solved using the power method. Specifically, our approach is simpler both in the algorithm and the analysis, and provides additional insights into the problem structure. First author draft
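The noise model in this abstract can be simulated in a few lines. The sketch below only models the faulty oracle, not the recovery algorithm of the paper. It assumes the probability of a correct answer is 1/k plus a positive bias, called `delta` here; that parameter name and the helper `make_oracle` are hypothetical.

```python
import random


def make_oracle(g, k, delta, rng):
    """Return a query function simulating noisy pairwise differences (g[i] - g[j]) mod k.

    With probability 1/k + delta the true difference is returned; otherwise a
    uniformly random incorrect value in {0, ..., k-1} is returned.
    """
    def query(i, j):
        truth = (g[i] - g[j]) % k
        if rng.random() < 1.0 / k + delta:
            return truth
        wrong_answers = [x for x in range(k) if x != truth]
        return rng.choice(wrong_answers)

    return query


# Empirical check: the fraction of correct answers should be close to 1/k + delta.
rng = random.Random(0)
k, delta = 4, 0.5
g = [rng.randrange(k) for _ in range(10)]
query = make_oracle(g, k, delta, rng)
trials = 10_000
correct = sum(query(0, 1) == (g[0] - g[1]) % k for _ in range(trials))
print(correct / trials)
```

Each call draws fresh noise, so repeating a query and taking a majority vote would be one naive way to denoise a single difference; the abstract's point is that far fewer queries suffice overall.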