14 research outputs found

    Exploring Communities in Large Profiled Graphs

    Given a graph $G$ and a vertex $q \in G$, the community search (CS) problem aims to efficiently find a subgraph of $G$ whose vertices are closely related to $q$. Communities are prevalent in social and biological networks, and can be used in product advertisement and social event recommendation. In this paper, we study profiled community search (PCS), where CS is performed on a profiled graph. This is a graph in which each vertex has labels arranged in a hierarchical manner. Extensive experiments show that PCS can identify communities with themes that are common to their vertices, and is more effective than existing CS approaches. As a naive solution for PCS is highly expensive, we have also developed a tree index, which facilitates efficient and online solutions for PCS.

    Core Decomposition in Multilayer Networks: Theory, Algorithms, and Applications

    Multilayer networks are a powerful paradigm to model complex systems, where multiple relations occur between the same entities. Despite the keen interest in a variety of tasks, algorithms, and analyses in this type of network, the problem of extracting dense subgraphs has remained largely unexplored so far. In this work we study the problem of core decomposition of a multilayer network. The multilayer context is much more challenging, as no total order exists among multilayer cores; rather, they form a lattice whose size is exponential in the number of layers. In this setting we devise three algorithms which differ in the way they visit the core lattice and in their pruning techniques. We then move a step forward and study the problem of extracting the inner-most (also known as maximal) cores, i.e., the cores that are not dominated by any other core in terms of their core index in all the layers. Inner-most cores are typically orders of magnitude fewer than all the cores. Motivated by this, we devise an algorithm that effectively exploits the maximality property and extracts inner-most cores directly, without first computing a complete decomposition. Finally, we showcase the multilayer core-decomposition tool in a variety of scenarios and problems. We start by considering the problem of densest-subgraph extraction in multilayer networks. We introduce a definition of multilayer densest subgraph that trades off between high density and the number of layers in which the high density holds, and exploit multilayer core decomposition to approximate this problem with quality guarantees. As further applications, we show how to utilize multilayer core decomposition to speed up the extraction of frequent cross-graph quasi-cliques and to generalize the community-search problem to the multilayer setting.
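
    The notion of a multilayer core can be illustrated with a minimal sketch. It assumes the common definition in which, for a vector of per-layer degree thresholds, the core is the maximal vertex set where every vertex has at least the required degree in each layer. The abstract does not spell this definition out, so it, the function name `multilayer_core`, and the adjacency format are illustrative assumptions rather than the authors' algorithms.

```python
from collections import deque


def multilayer_core(layers, k):
    """Return the vertex set of the multilayer core for threshold vector k.

    Assumptions (hypothetical input format, not from the paper):
      layers: list of dicts {vertex: set(neighbors)}, one per layer,
              each describing a symmetric (undirected) adjacency.
      k:      list of minimum degrees, one entry per layer.
    """
    vertices = set()
    for adj in layers:
        vertices.update(adj)
    alive = set(vertices)
    # Current degree of every vertex in every layer.
    deg = [{v: len(adj.get(v, ())) for v in vertices} for adj in layers]
    # Start with all vertices that violate at least one layer threshold.
    queue = deque(
        v for v in vertices
        if any(deg[i][v] < k[i] for i in range(len(layers)))
    )
    queued = set(queue)
    while queue:
        v = queue.popleft()
        alive.discard(v)
        # Removing v lowers the degree of its surviving neighbors.
        for i, adj in enumerate(layers):
            for u in adj.get(v, ()):
                if u in alive:
                    deg[i][u] -= 1
                    if u not in queued and deg[i][u] < k[i]:
                        queued.add(u)
                        queue.append(u)
    return alive
```

    Enumerating this function over all threshold vectors would visit the whole core lattice, which is exactly the exponential blow-up the abstract mentions; the sketch only computes a single lattice element.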

    DMCS : Density Modularity based Community Search

    Community Search, or finding a connected subgraph (known as a community) containing the given query nodes in a social network, is a fundamental problem. Most of the existing community search models only focus on the internal cohesiveness of a community. However, a high-quality community often has high modularity, which means dense connections inside the community and sparse connections to the nodes outside it. In this paper, we conduct a pioneering study on searching for a community with high modularity. We point out that while modularity has been widely used in community detection (without query nodes), it has, surprisingly, not been adopted for community search, and its application in community search (related to query nodes) brings new challenges. We address these challenges by designing a new graph modularity function named Density Modularity. To the best of our knowledge, this is the first work on the community search problem using graph modularity. Community search based on density modularity, termed DMCS, is to find a community in a social network that contains all the query nodes and has high density modularity. We prove that the DMCS problem is NP-hard. To efficiently address DMCS, we present new algorithms that run in log-linear time in the graph size. We conduct extensive experimental studies on real-world and synthetic networks, which offer insights into the efficiency and effectiveness of our algorithms. In particular, our algorithm achieves up to 8.5 times higher accuracy in terms of NMI than baseline algorithms.

    Towards a "Swiss Army Knife" for Scalable User-Defined Temporal $(k,\mathcal{X})$-Core Analysis

    Querying cohesive subgraphs on temporal graphs (e.g., social networks, finance networks, etc.) under various conditions has attracted intensive research interest recently. In this paper, we study a novel Temporal $(k,\mathcal{X})$-Core Query (TXCQ) that extends the fundamental Temporal $k$-Core Query (TCQ) proposed in our conference paper by optimizing or constraining an arbitrary metric $\mathcal{X}$ of the $k$-core, such as size, engagement, interaction frequency, time span, burstiness, periodicity, etc. Our objective is to address specific TXCQ instances with conditions on different $\mathcal{X}$ in a unified algorithm framework that guarantees scalability. To that end, this journal paper proposes a taxonomy of the measurement $\mathcal{X}(\cdot)$ and achieves our objective using a two-phase framework when $\mathcal{X}(\cdot)$ is time-insensitive or time-monotonic. Specifically, Phase 1 still leverages the query processing algorithm of TCQ to induce all distinct $k$-cores during a given time range, and meanwhile locates the "time zones" in which the cores emerge. Then, Phase 2 conducts fast local search and $\mathcal{X}$ evaluation in each time zone with respect to the time insensitivity or monotonicity of $\mathcal{X}(\cdot)$. By revealing two insightful concepts named tightest time interval and loosest time interval that bound time zones, the redundant core induction and unnecessary $\mathcal{X}$ evaluation in a zone can be reduced dramatically. Our experimental results demonstrate that TXCQ can be addressed as efficiently as TCQ, which achieves the latest state-of-the-art performance, by using a general algorithm framework that leaves $\mathcal{X}(\cdot)$ as a user-defined function.

    Span-core Decomposition for Temporal Networks: Algorithms and Applications

    When analyzing temporal networks, a fundamental task is the identification of dense structures (i.e., groups of vertices that exhibit a large number of links), together with their temporal span (i.e., the period of time for which the high density holds). In this paper we tackle this task by introducing a notion of temporal core decomposition where each core is associated with two quantities: its coreness, which quantifies how densely it is connected, and its span, which is a temporal interval. We call such cores span-cores. For a temporal network defined on a discrete temporal domain $T$, the total number of time intervals included in $T$ is quadratic in $|T|$, so that the total number of span-cores is potentially quadratic in $|T|$ as well. Our first main contribution is an algorithm that, by exploiting containment properties among span-cores, computes all the span-cores efficiently. Then, we focus on the problem of finding only the maximal span-cores, i.e., span-cores that are not dominated by any other span-core in both their coreness and their span. We devise a very efficient algorithm that exploits theoretical findings on the maximality condition to directly extract the maximal ones without computing all span-cores. Finally, as a third contribution, we introduce the problem of temporal community search, where a set of query vertices is given as input, and the goal is to find a set of densely connected subgraphs containing the query vertices and covering the whole underlying temporal domain $T$. We derive a connection between this problem and the problem of finding (maximal) span-cores. Based on this connection, we show how temporal community search can be solved in polynomial time via dynamic programming, and how the maximal span-cores can be profitably exploited to significantly speed up the basic algorithm.
    Comment: ACM Transactions on Knowledge Discovery from Data (TKDD), 2020. arXiv admin note: substantial text overlap with arXiv:1808.0937
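
    The quadratic bound on the number of intervals follows from a short count. It is sketched here under the assumption that $T$ is a set of consecutive discrete timestamps and that an interval is fixed by its endpoints $t_s \le t_e$; the names $t_s$ and $t_e$ are introduced only for this illustration.

```latex
\[
\bigl|\{[t_s, t_e] : t_s, t_e \in T,\ t_s \le t_e\}\bigr|
  = \binom{|T|}{2} + |T|
  = \frac{|T|\,(|T|+1)}{2}
  = O\bigl(|T|^2\bigr).
\]
```

    For example, $|T| = 4$ gives $10$ candidate intervals.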

    Yksityisyyden turvaavia protokollia verkkoliikenteen suojaamiseen (Privacy-Preserving Protocols for Protecting Network Traffic)

    Digital technologies have become an essential part of our lives. In many parts of the world, activities such as socializing, providing health care, leisure and education rely entirely or partially on the internet. Moreover, the COVID-19 pandemic has contributed significantly to our dependency on the online world. While the advancement of the internet brings many advantages, there are also disadvantages such as potential loss of privacy and security. While users enjoy surfing the web, service providers may collect a variety of information about them, such as their location, gender, and religion. Moreover, attackers may try to violate users' security, for example, by infecting their devices with malware. In this PhD dissertation, we propose several privacy-preserving protocols as a means to protect networking. Our protocols empower internet users to get a variety of services, while at the same time ensuring users' privacy and security in the digital world. In other words, we design our protocols such that users share with service providers only the information that is absolutely necessary to obtain the service. Moreover, our protocols add only minimal time and communication costs, while leveraging cryptographic schemes to ensure users' privacy and security. The dissertation contains two main themes of protocols: privacy-preserving set operations and privacy-preserving graph queries. These protocols can be applied to a variety of application areas. We delve deeper into three application areas: privacy-preserving technologies for malware protection, protection of remote access, and protecting minors.

    Identifying High-Coverage Communities in Edge-Weighted Networks

    Over the past decade, community search has attracted considerable attention in social and biological network analysis. Related studies have used unweighted graphs to represent the underlying structures and aim to find highly cohesive communities. In parallel, recent work has focused on the search for communities whose members 1) satisfy a set of predefined constraints and 2) collectively maximize the value of a score function. Although many real-world networks feature both weighted edges and node attributes, these efforts focus mostly on unweighted networks without node attributes. In this thesis, we investigate a variant of the community search problem for undirected, edge-weighted, and node-attributed networks modeled as graphs. Given a weighted graph G, a query set of seed nodes Q, a community size constraint h, and a connectivity constraint s, we aim to find a connected subgraph of G that 1) contains the seed nodes, 2) has size at most h, 3) has a cohesiveness measure of at least s, and 4) maximizes the total number of distinct attributes covered by its nodes. We term this problem Weighted Covering Community Search (WCCS). We exploit edge-weight information to quantify the minimum strength, i.e., the minimum sum of edge weights that any node has within each candidate subgraph considered. This minimum strength serves as our cohesiveness measure. We show that WCCS is an NP-hard problem on general networks and therefore propose three approaches to address it. Experimental results on six real-world datasets indicate that, despite the hardness of the problem, we can efficiently identify solutions that provide effective coverage.
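
    The four WCCS conditions can be made concrete with a small feasibility check. This is a minimal sketch, not the thesis's method: the function name `wccs_evaluate`, the dictionary-based graph format, measuring size as the number of nodes, and reading "minimum strength" as the smallest sum of incident edge weights inside the candidate are all assumptions made for illustration.

```python
def wccs_evaluate(weights, attrs, candidate, seeds, h, s):
    """Check a candidate community against the WCCS constraints.

    Hypothetical input format (an assumption, not from the thesis):
      weights: dict {frozenset({u, v}): weight}, undirected, no self-loops.
      attrs:   dict {node: set of attributes}.
    Returns (feasible, coverage), where coverage is the number of
    distinct attributes carried by the candidate's nodes.
    """
    candidate = set(candidate)
    # Weighted adjacency induced by the candidate nodes only.
    adj = {v: {} for v in candidate}
    for edge, w in weights.items():
        u, v = tuple(edge)
        if u in candidate and v in candidate:
            adj[u][v] = w
            adj[v][u] = w
    coverage = len(set().union(*(attrs.get(v, set()) for v in candidate)))
    # Constraints 1 and 2: contains the seeds, size (node count) at most h.
    if not candidate or not set(seeds) <= candidate or len(candidate) > h:
        return False, coverage
    # Connectivity via depth-first search from an arbitrary node.
    start = next(iter(candidate))
    seen, stack = {start}, [start]
    while stack:
        for nb in adj[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    if seen != candidate:
        return False, coverage
    # Constraint 3: minimum strength (assumed: min weighted degree) >= s.
    min_strength = min(sum(adj[v].values()) for v in candidate)
    return min_strength >= s, coverage
```

    A search procedure would call such a check on many candidates and keep the feasible one with the largest coverage; since the problem is NP-hard, exhaustive enumeration of candidates is only viable on very small graphs.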

    Attribute-Driven Community Search

    Recently, community search over graphs has gained significant interest. In applications such as analysis of protein-protein interaction (PPI) networks, citation graphs, and collaboration networks, nodes tend to have attributes. Unfortunately, most previous community search algorithms ignore attributes and result in communities with poor cohesion w.r.t. their node attributes. In this paper, we study the problem of attribute-driven community search, that is, given an undirected graph G where nodes are associated with attributes, and an input query Q consisting of nodes Vq and attributes Wq, find the communities containing Vq, in which most community members are densely inter-connected and have similar attributes. We formulate this problem as finding attributed truss communities (ATC), i.e., finding connected and close k-truss subgraphs containing Vq, with the largest attribute relevance score. We design a framework of desirable properties that a good score function should satisfy. We show that the problem is NP-hard. However, we develop an efficient greedy algorithmic framework to iteratively remove nodes with the least popular attributes, and shrink the graph into an ATC. In addition, we build an elegant index to maintain k-truss structure and attribute information, and propose efficient query processing algorithms. Extensive experiments on large real-world networks with ground-truth communities show that our algorithms significantly outperform the state of the art and demonstrate their efficiency and effectiveness.