384 research outputs found

    Outlier Detection from Network Data with Subnetwork Interpretation

    Full text link
    Detecting a small number of outliers from a set of data observations is always challenging. This problem is more difficult in the setting of multiple network samples, where computing the anomalous degree of a network sample is generally not sufficient. In fact, explaining why the network is exceptional, expressed in the form of subnetwork, is also equally important. In this paper, we develop a novel algorithm to address these two key problems. We treat each network sample as a potential outlier and identify subnetworks that mostly discriminate it from nearby regular samples. The algorithm is developed in the framework of network regression combined with the constraints on both network topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus goes beyond subspace/subgraph discovery and we show that it converges to a global optimum. Evaluation on various real-world network datasets demonstrates that our algorithm not only outperforms baselines in both network and high dimensional setting, but also discovers highly relevant and interpretable local subnetworks, further enhancing our understanding of anomalous networks

    Anomaly Detection on Dynamic Graph

    Get PDF

    Discovering Dense Correlated Subgraphs in Dynamic Networks

    Full text link
    Given a dynamic network, where edges appear and disappear over time, we are interested in finding sets of edges that have similar temporal behavior and form a dense subgraph. Formally, we define the problem as the enumeration of the maximal subgraphs that satisfy specific density and similarity thresholds. To measure the similarity of the temporal behavior, we use the correlation between the binary time series that represent the activity of the edges. For the density, we study two variants based on the average degree. For these problem variants we enumerate the maximal subgraphs and compute a compact subset of subgraphs that have limited overlap. We propose an approximate algorithm that scales well with the size of the network, while achieving a high accuracy. We evaluate our framework on both real and synthetic datasets. The results of the synthetic data demonstrate the high accuracy of the approximation and show the scalability of the framework.Comment: Full version of the paper included in the proceedings of the PAKDD 2021 conferenc

    Laplacian Change Point Detection for Dynamic Graphs

    Full text link
    Dynamic and temporal graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real world applications such as intrusion identification in network systems, detection of ecosystem disturbances and detection of epidemic outbreaks. In this paper, we focus on change point detection in dynamic graphs and address two main challenges associated with this problem: I) how to compare graph snapshots across time, II) how to capture temporal dependencies. To solve the above challenges, we propose Laplacian Anomaly Detection (LAD) which uses the spectrum of the Laplacian matrix of the graph structure at each snapshot to obtain low dimensional embeddings. LAD explicitly models short term and long term dependencies by applying two sliding windows. In synthetic experiments, LAD outperforms the state-of-the-art method. We also evaluate our method on three real dynamic networks: UCI message network, US senate co-sponsorship network and Canadian bill voting network. In all three datasets, we demonstrate that our method can more effectively identify anomalous time points according to significant real world events.Comment: in KDD 2020, 10 page

    Risk assessment in centralized and decentralized online social network.

    Get PDF
    One of the main concerns in centralized and decentralized OSNs is related to the fact that OSNs users establish new relationships with unknown people with the result of exposing a huge amount of personal data. This can attract the variety of attackers that try to propagate malwares and malicious items in the network to misuse the personal information of users. Therefore, there have been several research studies to detect specific kinds of attacks by focusing on the topology of the graph [159, 158, 32, 148, 157]. On the other hand, there are several solutions to detect specific kinds of attackers based on the behavior of users. But, most of these approaches either focus on just the topology of the graph [159, 158] or the detection of anomalous users by exploiting supervised learning techniques [157, 47, 86, 125]. However, we have to note that the main issue of supervised learning is that they are not able to detect new attacker's behaviors, since the classifier is trained based on the known behavioral patterns. Literature also offers approaches to detect anomalous users in OSNs that use unsupervised learning approaches [150, 153, 36, 146] or a combination of supervised and unsupervised techniques [153]. But, existing attack defenses are designed to cope with just one specific type of attack. Although several solutions to detect specific kinds of attacks have been recently proposed, there is no general solution to cope with the main privacy/security attacks in OSNs. In such a scenario, it would be very beneficial to have a solution that can cope with the main privacy/security attacks that can be perpetrated using the social network graph. Our main contribution is proposing a unique unsupervised approach that helps OSNs providers and users to have a global understanding of risky users and detect them. We believe that the core of such a solution is a mechanism able to assign a risk score to each OSNs account. Over the last three years, we have done significant research efforts in analyzing user's behavior to detect risky users included some kinds of well known attacks in centralized and decentralized online social networks. Our research started by proposing a risk assessment approach based on the idea that the more a user behavior diverges from normal behavior, the more it should be considered risky. In our proposed approach, we monitor and analyze the combination of interaction or activity patterns and friendship patterns of users and build the risk estimation model in order to detect and identify those risky users who follow the behavioral patterns of attackers. Since, users in OSNs follow different behavioral patterns, it is not possible to define a unique standard behavioral model that fits all OSNs users' behaviors. Towards this goal, we propose a two-phase risk assessment approach by grouping users in the first phase to find similar users that share the same behavioral patterns and, then in the second phase, for each identified group, building some normal behavior models and compute for each user the level of divergency from these normal behaviors. Then, we extend this approach for Decentralized Online Social Networks (i.e., DOSNs). In the following of this approach, we propose a solution in defining a risk measure to help users in OSNs to judge their direct contacts so as to avoid friendship with malicious users. Finally, we monitor dynamically the friendship patterns of users in a large social graph over time for any anomalous changes reflecting attacker's behaviors. In this thesis, we will describe all the solutions that we proposed

    Span-core Decomposition for Temporal Networks: Algorithms and Applications

    Full text link
    When analyzing temporal networks, a fundamental task is the identification of dense structures (i.e., groups of vertices that exhibit a large number of links), together with their temporal span (i.e., the period of time for which the high density holds). In this paper we tackle this task by introducing a notion of temporal core decomposition where each core is associated with two quantities, its coreness, which quantifies how densely it is connected, and its span, which is a temporal interval: we call such cores \emph{span-cores}. For a temporal network defined on a discrete temporal domain TT, the total number of time intervals included in TT is quadratic in ∣T∣|T|, so that the total number of span-cores is potentially quadratic in ∣T∣|T| as well. Our first main contribution is an algorithm that, by exploiting containment properties among span-cores, computes all the span-cores efficiently. Then, we focus on the problem of finding only the \emph{maximal span-cores}, i.e., span-cores that are not dominated by any other span-core by both their coreness property and their span. We devise a very efficient algorithm that exploits theoretical findings on the maximality condition to directly extract the maximal ones without computing all span-cores. Finally, as a third contribution, we introduce the problem of \emph{temporal community search}, where a set of query vertices is given as input, and the goal is to find a set of densely-connected subgraphs containing the query vertices and covering the whole underlying temporal domain TT. We derive a connection between this problem and the problem of finding (maximal) span-cores. Based on this connection, we show how temporal community search can be solved in polynomial-time via dynamic programming, and how the maximal span-cores can be profitably exploited to significantly speed-up the basic algorithm.Comment: ACM Transactions on Knowledge Discovery from Data (TKDD), 2020. arXiv admin note: substantial text overlap with arXiv:1808.0937
    • …
    corecore