22 research outputs found

    Community-aware network sparsification

    Network sparsification aims to reduce the number of edges of a network while maintaining its structural properties; such properties include shortest paths, cuts, spectral measures, or network modularity. Sparsification has multiple applications, such as speeding up graph-mining algorithms, graph visualization, and identifying the important network edges. In this paper we consider a novel formulation of the network-sparsification problem. In addition to the network, we also consider as input a set of communities. The goal is to sparsify the network so as to preserve the network structure with respect to the given communities. We introduce two variants of the community-aware sparsification problem, leading to sparsifiers that satisfy different community-connectedness properties. From the technical point of view, we prove hardness results and devise effective approximation algorithms. Our experimental results on a large collection of datasets demonstrate the effectiveness of our algorithms. https://epubs.siam.org/doi/10.1137/1.9781611974973.48 (accepted manuscript)

    The network-untangling problem: from interactions to activity timelines

    In this paper we study the problem of determining when entities are active based on their interactions with each other. We consider a set of entities V and a sequence of time-stamped edges E among the entities. Each edge (u, v, t) in E denotes an interaction between entities u and v at time t. We assume an activity model where each entity is active during at most k time intervals. An interaction (u, v, t) can be explained if at least one of u or v is active at time t. Our goal is to reconstruct the activity intervals for all entities in the network, so as to explain the observed interactions. This problem, the network-untangling problem, can be applied to discover event timelines from complex entity interactions. We provide two formulations of the network-untangling problem: (i) minimizing the total interval length over all entities (sum version), and (ii) minimizing the maximum interval length (max version). We study separately the two problems for k = 1 and k > 1 activity intervals per entity. For the case k = 1, we show that the sum problem is NP-hard, while the max problem can be solved optimally in linear time. For the sum problem we provide efficient algorithms motivated by realistic assumptions. For the case of k > 1, we show that both formulations are inapproximable. However, we propose efficient algorithms based on alternative optimization. We complement our study with an evaluation on synthetic and real-world datasets, which demonstrates the validity of our concepts and the good performance of our algorithms. Peer reviewed.
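The activity model in this abstract can be made concrete with a small sketch for the case k = 1. It only checks whether a candidate set of activity intervals explains every interaction and evaluates the sum and max objectives; it does not solve either optimization problem. The names `explains_all`, `sum_cost`, and `max_cost`, and the choice of closed intervals, are assumptions made for illustration and do not come from the paper.

```python
from typing import Dict, List, Tuple

Interaction = Tuple[str, str, float]   # (u, v, t)
Interval = Tuple[float, float]         # (start, end), assumed closed on both sides


def is_active(intervals: Dict[str, Interval], node: str, t: float) -> bool:
    """Return True if `node` has an activity interval that contains time t."""
    if node not in intervals:
        return False
    start, end = intervals[node]
    return start <= t <= end


def explains_all(interactions: List[Interaction],
                 intervals: Dict[str, Interval]) -> bool:
    """An interaction (u, v, t) is explained if u or v is active at time t."""
    return all(
        is_active(intervals, u, t) or is_active(intervals, v, t)
        for u, v, t in interactions
    )


def sum_cost(intervals: Dict[str, Interval]) -> float:
    """Sum version: total length of all activity intervals."""
    return sum(end - start for start, end in intervals.values())


def max_cost(intervals: Dict[str, Interval]) -> float:
    """Max version: length of the longest activity interval."""
    return max((end - start for start, end in intervals.values()), default=0.0)


interactions = [("a", "b", 1.0), ("a", "c", 3.0), ("b", "c", 5.0)]
candidate = {"a": (1.0, 3.0), "c": (5.0, 5.0)}
print(explains_all(interactions, candidate), sum_cost(candidate), max_cost(candidate))
```

In this toy instance node b needs no interval at all, because each of its interactions is already explained by the other endpoint.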

    Discovering dynamic communities in interaction networks

    Very often online social networks are defined by aggregating information regarding the interaction between the nodes of the network. For example, a call graph is defined by considering an edge for each pair of individuals who have called each other at least once, or at least k times. Similarly, an implicit social network in a social-media site is defined by considering an edge for each pair of users who have interacted in some way, e.g., have had a conversation, commented on each other's content, etc. Although this type of definition has been used to obtain many insights regarding the structure of social networks, it suffers from a severe limitation: it neglects the precise time at which the interaction between network nodes occurs. In this thesis we propose to study interaction networks, where one considers not only the underlying topology of the social network, but also the exact time instances at which nodes interact. In an interaction network an edge is associated with a time stamp, and multiple edges may occur for the same pair of nodes. Consequently, interaction networks offer a more fine-grained representation that can be used to reveal otherwise hidden dynamic phenomena in the network. In the context of interaction networks, we study the problem of discovering communities, which are dense in terms of the underlying network structure, and whose edges occur in short time intervals. Such communities represent groups of individuals who interact with each other in some specific time instances, for example, a group of employees who work on a project and whose interaction intensifies before certain project milestones. We prove that the problem we define is NP-hard, and we provide effective algorithms by adapting techniques used to find dense subgraphs. We perform extensive evaluation of the proposed methods on synthetic and real datasets, which demonstrates the validity of our concepts and the good performance of our algorithms.
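The abstract mentions adapting techniques used to find dense subgraphs without naming them. One well-known technique of that kind is greedy peeling for the static densest-subgraph problem, sketched below; it is shown only as background and is not claimed to be the thesis's algorithm. Density is taken to be the number of edges divided by the number of nodes, and the function name `densest_subgraph_peeling` is hypothetical.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Set, Tuple


def densest_subgraph_peeling(
    edges: Iterable[Tuple[Hashable, Hashable]]
) -> Tuple[Set[Hashable], float]:
    """Greedy peeling: repeatedly remove a minimum-degree node and keep the
    densest intermediate node set, with density = |E(S)| / |S|."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-loops in this sketch
            adj[u].add(v)
            adj[v].add(u)

    remaining = set(adj)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_nodes, best_density = set(remaining), 0.0

    while remaining:
        density = num_edges / len(remaining)
        if density > best_density:
            best_nodes, best_density = set(remaining), density
        # Linear scan for the minimum-degree node; a bucket queue would be
        # faster, but this keeps the sketch short.
        u = min(remaining, key=lambda x: len(adj[x]))
        for w in adj[u]:
            adj[w].discard(u)
        num_edges -= len(adj[u])
        adj[u].clear()
        remaining.remove(u)

    return best_nodes, best_density


# A 4-clique on nodes 1..4 with a pendant node 5 attached to node 4.
toy_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
nodes, density = densest_subgraph_peeling(toy_edges)
print(sorted(nodes), density)
```

On the toy graph the pendant node is peeled first, which leaves the 4-clique as the densest intermediate set.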

    Mining Dense Subgraphs with Similar Edges

    When searching for interesting structures in graphs, it is often important to take into account not only the graph connectivity, but also the metadata available, such as node and edge labels, or temporal information. In this paper we are interested in settings where such metadata is used to define a similarity between edges. We consider the problem of finding subgraphs that are dense and whose edges are similar to each other with respect to a given similarity function. Depending on the application, this function can be, for example, the Jaccard similarity between the edge label sets, or the temporal correlation of the edge occurrences in a temporal graph. We formulate a Lagrangian relaxation-based optimization problem to search for dense subgraphs with high pairwise edge similarity. We design a novel algorithm to solve the problem through parametric MinCut, and provide an efficient search scheme to iterate through the values of the Lagrangian multipliers. Our study is complemented by an evaluation on real-world datasets, which demonstrates the usefulness and efficiency of the proposed approach.
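As a small illustration of the edge-similarity notion mentioned in this abstract, the sketch below computes the Jaccard similarity between edge label sets and the average pairwise similarity over a set of edges. It does not reproduce the paper's Lagrangian formulation or the parametric MinCut algorithm. The function names, the toy labels, and the convention that two empty label sets have similarity 0 are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Tuple

Edge = Tuple[str, str]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; 0.0 for two empty sets (assumed convention)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def average_pairwise_similarity(edges: Iterable[Edge],
                                labels: Dict[Edge, FrozenSet[str]]) -> float:
    """Mean Jaccard similarity over all unordered pairs of the given edges."""
    pairs = list(combinations(list(edges), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(labels[e], labels[f]) for e, f in pairs) / len(pairs)


# Hypothetical edge labels, for illustration only.
edge_labels = {
    ("u", "v"): frozenset({"sports", "news"}),
    ("v", "w"): frozenset({"news", "music"}),
    ("u", "w"): frozenset({"news"}),
}
print(average_pairwise_similarity(edge_labels.keys(), edge_labels))
```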

    Discovering Dense Correlated Subgraphs in Dynamic Networks

    Given a dynamic network, where edges appear and disappear over time, we are interested in finding sets of edges that have similar temporal behavior and form a dense subgraph. Formally, we define the problem as the enumeration of the maximal subgraphs that satisfy specific density and similarity thresholds. To measure the similarity of the temporal behavior, we use the correlation between the binary time series that represent the activity of the edges. For the density, we study two variants based on the average degree. For these problem variants we enumerate the maximal subgraphs and compute a compact subset of subgraphs that have limited overlap. We propose an approximate algorithm that scales well with the size of the network, while achieving a high accuracy. We evaluate our framework on both real and synthetic datasets. The results on the synthetic data demonstrate the high accuracy of the approximation and show the scalability of the framework. Comment: Full version of the paper included in the proceedings of the PAKDD 2021 conference.
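The similarity measure described here, correlation between binary activity series of edges, can be sketched as follows. The code assumes Pearson correlation and returns 0.0 when either series is constant; both choices are assumptions, since the abstract does not state which correlation coefficient or edge-case convention the paper uses. The function name `pearson` is illustrative.

```python
import math
from typing import Sequence


def pearson(x: Sequence[int], y: Sequence[int]) -> float:
    """Pearson correlation of two equal-length series.
    Returns 0.0 if either series is constant (assumed convention)."""
    if len(x) != len(y) or not x:
        raise ValueError("series must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Each series marks the time steps in which an edge is present (1) or absent (0).
edge_a = [1, 1, 0, 0]
edge_b = [1, 1, 0, 0]
edge_c = [0, 0, 1, 1]
print(pearson(edge_a, edge_b), pearson(edge_a, edge_c))
```

Edges that always appear together score 1, and edges that never co-occur score -1, so a similarity threshold on this value selects edges with matching temporal behavior.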

    Finding events in temporal networks: Segmentation meets densest-subgraph discovery

    In this paper we study the problem of discovering a timeline of events in a temporal network. We model events as dense subgraphs that occur within intervals of network activity. We formulate the event-discovery task as an optimization problem, where we search for a partition of the network timeline into k non-overlapping intervals, such that the intervals span subgraphs with maximum total density. The output is a sequence of dense subgraphs along with corresponding time intervals, capturing the most interesting events during the network lifetime. A naïve solution to our optimization problem has polynomial but prohibitively high running time complexity. We adapt existing recent work on dynamic densest-subgraph discovery and approximate dynamic programming to design a fast approximation algorithm. Next, to ensure richer structure, we adjust the problem formulation to encourage coverage of a larger set of nodes. This problem is NP-hard even for static graphs. However, on static graphs a simple greedy algorithm leads to an approximate solution due to submodularity. We extend this greedy approach to the case of temporal networks. However, the approximation guarantee does not hold. Nevertheless, according to the experiments, the algorithm finds good-quality solutions.
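The segmentation side of this formulation can be illustrated with a plain dynamic program over the timeline, in the spirit of the naïve polynomial-time solution the abstract mentions. The sketch takes the per-interval score as a callback; in the paper that score would be the density of the densest subgraph spanned by the interval, but here `score` is a hypothetical stand-in and the toy example uses a simple average of activity values. The function name `best_segmentation` is also an assumption.

```python
from typing import Callable, List, Tuple


def best_segmentation(
    num_steps: int, k: int, score: Callable[[int, int], float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split time steps 0..num_steps-1 into exactly k consecutive half-open
    intervals [start, end) that maximize the sum of score(start, end)."""
    if not 1 <= k <= num_steps:
        raise ValueError("need 1 <= k <= num_steps")
    neg_inf = float("-inf")
    # best[p][e]: best total score for covering steps 0..e-1 with p intervals.
    best = [[neg_inf] * (num_steps + 1) for _ in range(k + 1)]
    back = [[-1] * (num_steps + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for parts in range(1, k + 1):
        for end in range(parts, num_steps + 1):
            for start in range(parts - 1, end):
                if best[parts - 1][start] == neg_inf:
                    continue
                cand = best[parts - 1][start] + score(start, end)
                if cand > best[parts][end]:
                    best[parts][end] = cand
                    back[parts][end] = start
    # Walk the back-pointers to recover the chosen intervals.
    intervals = []
    end = num_steps
    for parts in range(k, 0, -1):
        start = back[parts][end]
        intervals.append((start, end))
        end = start
    intervals.reverse()
    return best[k][num_steps], intervals


# Toy stand-in score: average activity inside the interval.
activity = [1, 1, 9, 9]
total, parts = best_segmentation(
    len(activity), 2, lambda i, j: sum(activity[i:j]) / (j - i)
)
print(total, parts)
```

The triple loop calls `score` a quadratic number of times per interval count, which hints at why an exact solution becomes expensive when each call is itself a densest-subgraph computation.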

    Finding events in temporal networks: segmentation meets densest subgraph discovery

    In this paper, we study the problem of discovering a timeline of events in a temporal network. We model events as dense subgraphs that occur within intervals of network activity. We formulate the event discovery task as an optimization problem, where we search for a partition of the network timeline into k non-overlapping intervals, such that the intervals span subgraphs with maximum total density. The output is a sequence of dense subgraphs along with corresponding time intervals, capturing the most interesting events during the network lifetime. A naïve solution to our optimization problem has polynomial but prohibitively high running time. We adapt existing recent work on dynamic densest subgraph discovery and approximate dynamic programming to design a fast approximation algorithm. Next, to ensure richer structure, we adjust the problem formulation to encourage coverage of a larger set of nodes. This problem is NP-hard; however, we show that on static graphs a simple greedy algorithm leads to an approximate solution due to submodularity. We extend this greedy approach to temporal networks, but we lose the approximation guarantee in the process. Finally, we demonstrate empirically that our algorithms recover solutions with good quality. Peer reviewed.

    Methods for analyzing temporal networks

    In this thesis we study networks with a temporal component. We use the interaction-network model to preserve the exact timestamp, and possibly other information, regarding each interaction between the nodes. In this model we study summarization, event detection, centrality measures, and infection spreading in temporal networks. First, we propose a novel generalization of the PageRank centrality measure, which produces realistic centrality estimates with respect to a history of interactions. It captures the temporal changes in the distribution of interactions. We show that if the distribution remains stable, temporal PageRank converges to static PageRank. Second, we consider the problem of reconstructing an activity propagation in an interaction network. We show that with access to a small noisy set of reported infections we can reconstruct the epidemic spread over time without any assumptions on the propagation model. Next, we consider two event-detection problems. In the first one we propose a novel model for topically and temporally coherent events in social networks. We use textual information and define events to be sets of interactions that are topically close and form a directed tree of information flow. In addition, we address and solve the problem of discovering top-k events. In the second event-detection problem we aggregate node activity over fixed time intervals. Given a graph snapshot we then define an event to be a set of nodes that is compact in the graph and has high total activity. We represent compactness in two ways: by total distance between nodes and by minimum spanning tree. To discover snapshots with prominent events in real-world interaction networks we apply a greedy heuristic to sweep through the sequence of snapshots. Summarization covers a wide range of problems. Here we study summarization from two different points of view. First, we formulate a novel problem to explain and summarize all interactions in the interaction network by identifying activity intervals of the nodes. We consider two variants of the summarization problem: minimization of the total length of activity intervals and minimization of the maximum interval length. Second, we consider a novel problem of discovering a set of nodes that forms a dense subgraph and whose interactions occur in short time intervals. This formulation provides an accurate representation of dense overlapping communities and their dynamics over the network history. For all proposed novel problems we present complexity analysis, develop novel or adapt existing algorithms, and prove quality guarantees. The algorithms are thoroughly evaluated on synthetic and real-world datasets and compared against baselines.