Detecting communities is Hard (And Counting Them is Even Harder)
We consider the algorithmic problem of community detection in networks. Given an undirected friendship graph G, a subset
S of vertices is an (a,b)-community if:
* every member of the community is friends with at least an a-fraction of the community; and
* every non-member is friends with at most a b-fraction of the community.
[Arora, Ge, Sachdeva, Schoenebeck 2012] gave a quasi-polynomial
time algorithm for enumerating all the (a,b)-communities
for any constants a>b.
Here, we prove that, assuming the Exponential Time Hypothesis (ETH),
quasi-polynomial time is in fact necessary - and even for a much weaker
approximation desideratum. Namely, distinguishing between:
* G contains a (1,o(1))-community; and
* G does not contain a (b,b+o(1))-community
for any b.
We also prove that counting the number of (1,o(1))-communities
requires quasi-polynomial time, assuming the weaker counting analogue #ETH.
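The definition above translates directly into a membership check. A minimal sketch (a hypothetical helper, not from the paper; note that verifying a given set S is the easy direction, while enumerating or even detecting communities is what the hardness result concerns) is shown below, taking the member fraction over the other |S|-1 members, which is one reasonable convention:

```python
def is_ab_community(adj, S, a, b):
    """Return True if S is an (a,b)-community in the graph given as an
    adjacency-set dict.  Fractions are relative to the community size;
    members are compared against the other |S|-1 members (a convention
    choice -- the paper's exact convention is not spelled out here)."""
    S = set(S)
    for v in S:                         # members: friends with >= a-fraction of S
        if len(adj[v] & S) < a * (len(S) - 1):
            return False
    for v in set(adj) - S:              # non-members: at most a b-fraction of S
        if len(adj[v] & S) > b * len(S):
            return False
    return True

# A triangle {0, 1, 2} with an extra vertex 3 attached to vertex 0:
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(is_ab_community(adj, {0, 1, 2}, a=1.0, b=0.5))   # the triangle qualifies
```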
Is It Easier to Count Communities Than Find Them?
Random graph models with community structure have been studied extensively in the literature. For both the problems of detecting and recovering community structure, an interesting landscape of statistical and computational phase transitions has emerged. A natural unanswered question is: might it be possible to infer properties of the community structure (for instance, the number and sizes of communities) even in situations where actually finding those communities is believed to be computationally hard? We show the answer is no. In particular, we consider certain hypothesis testing problems between models with different community structures, and we show (in the low-degree polynomial framework) that testing between two options is as hard as finding the communities.
In addition, our methods give the first computational lower bounds for testing between two different "planted" distributions, whereas previous results have considered testing between a planted distribution and an i.i.d. "null" distribution.
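As an illustrative toy (not the paper's construction), the low-degree polynomial viewpoint can be seen with the triangle count, a degree-3 polynomial in the edge-indicator variables: it can separate a two-community block model from a one-community model tuned to the same edge density, where a degree-1 statistic (the edge count) cannot. All parameters below are illustrative choices:

```python
import itertools, random

def sample_sbm(n, k, p_in, p_out, rng):
    """Adjacency sets for an n-vertex block model with k equal communities."""
    block = [v * k // n for v in range(n)]
    adj = {v: set() for v in range(n)}
    for u, v in itertools.combinations(range(n), 2):
        p = p_in if block[u] == block[v] else p_out
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def edges(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2

def triangles(adj):
    """Triangle count: a degree-3 polynomial in the edge indicators."""
    return sum(1 for u, v, w in itertools.combinations(sorted(adj), 3)
               if v in adj[u] and w in adj[u] and w in adj[v])

rng = random.Random(0)
planted = sample_sbm(60, 2, 0.6, 0.02, rng)   # two communities of 30
null    = sample_sbm(60, 1, 0.305, 0.0, rng)  # one block, matched edge density
# edges(planted) and edges(null) are close in expectation (~540 each),
# while triangles(planted) is markedly larger: the degree-3 statistic
# separates models that degree-1 statistics cannot.
```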
Detecting Communities under Differential Privacy
Complex networks usually exhibit community structure: groups of nodes that
share many links with other nodes in the same group and relatively few with
the rest of the network. This feature captures valuable information about
the organization and even the evolution of the network. Over the last decade, a
great number of community detection algorithms have been proposed to deal
with increasingly complex networks. However, the problem of doing so in a
private manner is rarely considered. In this paper, we solve this problem under
differential privacy, a prominent privacy concept for releasing private data.
We analyze the major challenges behind the problem and propose several schemes
to tackle them from two perspectives: input perturbation and algorithm
perturbation. We choose the Louvain method as the back-end community detector
for the input perturbation schemes and propose LouvainDP, which runs the Louvain
algorithm on a noisy super-graph. For algorithm perturbation, we design
ModDivisive, which uses the exponential mechanism with modularity as the score
function. We have thoroughly evaluated our techniques on real graphs of
different sizes and verified that they outperform the state of the art.
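A minimal sketch of the two ingredients named above, modularity as a quality score and the exponential mechanism, assuming a fixed list of candidate partitions and a given sensitivity bound (the paper's ModDivisive builds its candidates divisively over a tree, which is not reproduced here; the graph and parameters are illustrative):

```python
import math, random

def modularity(adj, parts):
    """Newman modularity of a partition `parts` of the graph `adj`."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    q = 0.0
    for part in parts:
        s = set(part)
        e_in = sum(1 for v in s for u in adj[v] if u in s) / 2
        deg = sum(len(adj[v]) for v in s)
        q += e_in / m - (deg / (2 * m)) ** 2
    return q

def exp_mechanism(adj, candidates, eps, sensitivity, rng):
    """Exponential mechanism: pick a candidate partition with probability
    proportional to exp(eps * modularity / (2 * sensitivity))."""
    weights = [math.exp(eps * modularity(adj, c) / (2 * sensitivity))
               for c in candidates]
    r = rng.random() * sum(weights)
    for c, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return c
    return candidates[-1]

# Two triangles joined by a bridge edge (2-3):
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
candidates = [[[0, 1, 2], [3, 4, 5]],      # the "natural" split
              [[0, 1, 2, 3, 4, 5]]]        # everything in one community
rng = random.Random(1)
chosen = exp_mechanism(adj, candidates, eps=50.0, sensitivity=1.0, rng=rng)
```

With a large privacy budget eps the mechanism concentrates on the higher-modularity partition; as eps shrinks, the choice becomes closer to uniform, which is the privacy/utility trade-off.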
On Efficiently Detecting Overlapping Communities over Distributed Dynamic Graphs
Modern networks are huge and highly dynamic, which challenges the efficiency
of community detection algorithms. In this paper, we study the
problem of overlapping community detection on distributed and dynamic graphs.
Given a distributed, undirected and unweighted graph, the goal is to detect
overlapping communities incrementally as the graph is dynamically changing. We
propose an efficient algorithm, called \textit{randomized Speaker-Listener
Label Propagation Algorithm} (rSLPA), based on the \textit{Speaker-Listener
Label Propagation Algorithm} (SLPA) by relaxing the probability distribution of
label propagation. Besides detecting high-quality communities, rSLPA can
incrementally update the detected communities after a batch of edge insertion
and deletion operations. To the best of our knowledge, rSLPA is the first
algorithm that can incrementally capture the same communities as those obtained
by applying the detection algorithm from scratch on the updated graph.
Extensive experiments are conducted on both synthetic and real-world datasets,
and the results show that our algorithm can achieve high accuracy and
efficiency at the same time.
Comment: A short version of this paper will be published as an ICDE'2018 poster.
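A minimal static SLPA-style sketch (the randomized relaxation and the incremental batch updates that distinguish rSLPA are not reproduced here; parameters are illustrative): each node keeps a memory of labels, every listener collects one label "spoken" by each neighbour, and labels retained above a frequency threshold define the overlapping communities.

```python
import random
from collections import Counter

def slpa(adj, iterations, threshold, rng):
    """Speaker-Listener Label Propagation sketch returning a list of
    (possibly overlapping) communities as sets of nodes."""
    memory = {v: [v] for v in adj}      # each node starts with its own label
    nodes = list(adj)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for listener in nodes:
            if not adj[listener]:
                continue
            # each neighbour "speaks" one label sampled from its memory;
            # the listener keeps the most frequently heard label
            heard = [rng.choice(memory[s]) for s in adj[listener]]
            memory[listener].append(Counter(heard).most_common(1)[0][0])
    comms = {}
    for v, mem in memory.items():
        for label, count in Counter(mem).items():
            if count / len(mem) >= threshold:   # post-processing threshold
                comms.setdefault(label, set()).add(v)
    return list(comms.values())

# Two triangles joined by a bridge edge (2-3):
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
comms = slpa(adj, iterations=20, threshold=0.3, rng=random.Random(42))
```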
Discovering Communities of Community Discovery
Discovering communities in complex networks means grouping nodes that are
similar to each other, in order to uncover latent information about them. There are hundreds of
different algorithms to solve the community detection task, each with its own
understanding and definition of what a "community" is. Dozens of review works
attempt to order such a diverse landscape -- classifying community discovery
algorithms by the process they employ to detect communities, by their
explicitly stated definition of community, or by their performance on a
standardized task. In this paper, we classify community discovery algorithms
according to a fourth criterion: the similarity of their results. We create an
Algorithm Similarity Network (ASN), whose nodes are the community detection
approaches, connected if they return similar groupings. We then perform
community detection on this network, grouping algorithms that consistently
return the same partitions or overlapping coverage over a span of more than one
thousand synthetic and real world networks. This paper is an attempt to create
a similarity-based classification of community detection algorithms based on
empirical data. It improves over the state of the art by comparing more than
seventy approaches and discovering that the ASN contains well-separated groups,
which makes it a sensible tool for practitioners, aiding their choice of
algorithms that fit their analytic needs.
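The ASN construction can be sketched with a pair-counting similarity. The sketch below uses the Rand index on a single network for simplicity; the paper's actual similarity measure may differ and is aggregated over a thousand-plus networks. Algorithm names and outputs are hypothetical:

```python
import itertools

def rand_index(p1, p2, nodes):
    """Fraction of node pairs on which two partitions agree
    (placed together in both, or separated in both)."""
    lab1 = {v: i for i, c in enumerate(p1) for v in c}
    lab2 = {v: i for i, c in enumerate(p2) for v in c}
    agree = total = 0
    for u, v in itertools.combinations(nodes, 2):
        total += 1
        agree += (lab1[u] == lab1[v]) == (lab2[u] == lab2[v])
    return agree / total

def similarity_network(results, nodes, tau):
    """ASN sketch: connect two algorithms if their groupings are similar."""
    edges = set()
    for a, b in itertools.combinations(list(results), 2):
        if rand_index(results[a], results[b], nodes) >= tau:
            edges.add((a, b))
    return edges

nodes = range(4)
results = {"alg_A": [[0, 1], [2, 3]],     # hypothetical algorithm outputs
           "alg_B": [[0, 1], [2, 3]],
           "alg_C": [[0, 2], [1, 3]]}
asn_edges = similarity_network(results, nodes, tau=0.9)
```

Community detection run on the resulting algorithm-level network then groups methods that consistently return the same partitions.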
Phase Transitions of the Typical Algorithmic Complexity of the Random Satisfiability Problem Studied with Linear Programming
Here we study the NP-complete K-SAT problem. Although the worst-case
complexity of NP-complete problems is conjectured to be exponential, there
exist parametrized random ensembles of problems where solutions can typically
be found in polynomial time for suitable ranges of the parameter. In fact,
random K-SAT, with the ratio of clauses to variables as control parameter,
can be solved quickly for small enough values of this ratio. It shows a phase
transition between a satisfiable phase and an unsatisfiable phase. For branch and bound algorithms,
which operate in the space of feasible Boolean configurations, the empirically
hardest problems are located only close to this phase transition. Here we study
K-SAT and the related optimization problem MAX-SAT by a linear
programming approach, which is widely used for practical problems and allows
for polynomial run time. In contrast to branch and bound it operates outside
the space of feasible configurations. On the other hand, finding a solution
within polynomial time is not guaranteed. We investigated several variants,
such as including artificial objective functions, so-called cutting-plane
approaches, and a mapping to the NP-complete vertex-cover problem. We observed
several easy-hard transitions, from regions where the problems are typically
solvable in polynomial time by the given algorithms to regions where they are
not solvable in polynomial time. For the related vertex-cover problem on random
graphs these easy-hard transitions can be identified with structural properties
of the graphs, like percolation transitions. For the present random K-SAT
problem we have investigated numerous structural properties that also exhibit
clear transitions, but these appear not to be correlated with the easy-hard
transitions observed here. This renders the behaviour of random K-SAT more
complex than that of, e.g., the vertex-cover problem.
Comment: 11 pages, 5 figures
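The linear programming approach above is, in its generic form, the standard LP relaxation of satisfiability. A textbook sketch under the usual conventions (not necessarily the paper's exact formulation): every Boolean variable is relaxed to a real x_i in [0,1], and every clause becomes a linear constraint.

```latex
% Clause c with positive literals P_c and negated literals N_c:
\sum_{i \in P_c} x_i \;+\; \sum_{j \in N_c} (1 - x_j) \;\ge\; 1,
\qquad 0 \le x_i \le 1 \quad \text{for all } i .
```

Note that whenever every clause contains at least two literals, the point x_i = 1/2 for all i satisfies all of these constraints, so the bare relaxation is always feasible and uninformative on its own; this is one reason artificial objective functions and cutting planes, as investigated above, are needed before the LP can discriminate satisfiable from unsatisfiable instances.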
Automatic Detection of Online Jihadist Hate Speech
We have developed a system that automatically detects online jihadist hate
speech with over 80% accuracy, by using techniques from Natural Language
Processing and Machine Learning. The system is trained on a corpus of 45,000
subversive Twitter messages collected from October 2014 to December 2016. We
present a qualitative and quantitative analysis of the jihadist rhetoric in the
corpus, examine the network of Twitter users, outline the technical procedure
used to train the system, and discuss examples of use.
Comment: 31 pages
Bridges of the BeltLine
As currently realized, the Atlanta BeltLine weaves under, over, and through a multitude of overpasses, footbridges, and tunnels. As in any city, this significant feature is simultaneously an asset and a potential hazard. These types of structures are "vulnerable critical facilities" that should be included in emergency risk assessments and mitigation planning (FEMA, 2013). As such, the Bridges of the BeltLine project was proposed as a mixed-methods study to understand how people's movement along the BeltLine can inform emergency management mitigation, planning, and response. Understanding pedestrian flow in cities has been underfunded and understudied but is nonetheless critical to city infrastructure monitoring and improvement projects. This study focused on developing inexpensive, low-power consumption sensors capable of detecting human presence while preserving privacy, as well as a survey designed to collect data that the sensors cannot. The survey data were intended to describe BeltLine users, querying on demographics, reasons, frequency, duration of use, and mode of travel to and on the BeltLine. After conferring with the Atlanta BeltLine, Inc. (ABI) leadership, it became apparent that ABI's primary interest is in understanding which communities are being served by the BeltLine and whether it has changed commuting and travel behaviors or created new demand. As a result, the project's original focus on emergency management was expanded to explore which communities are being served and for what kind of use. 
As such, the project's revised objective was two-fold: (a) to understand whether the BeltLine is serving the adjacent communities and for what purposes it is used, and (b) to inform emergency mitigation, planning, and response. This research was made possible by a grant from Georgia Tech's Executive Vice President of Research, Small Bets Seed Grants program, with supplemental funding from the Center for the Development and Application of Internet of Things Technologies (CDAIT).