Outlier Detection from Network Data with Subnetwork Interpretation
Detecting a small number of outliers from a set of data observations is
always challenging. This problem is more difficult in the setting of multiple
network samples, where computing the anomalous degree of a network sample is
generally not sufficient. In fact, explaining why the network is exceptional,
expressed in the form of subnetwork, is also equally important. In this paper,
we develop a novel algorithm to address these two key problems. We treat each
network sample as a potential outlier and identify subnetworks that mostly
discriminate it from nearby regular samples. The algorithm is developed in the
framework of network regression combined with the constraints on both network
topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus
goes beyond subspace/subgraph discovery and we show that it converges to a
global optimum. Evaluation on various real-world network datasets demonstrates
that our algorithm not only outperforms baselines in both network and high
dimensional setting, but also discovers highly relevant and interpretable local
subnetworks, further enhancing our understanding of anomalous networks.
Discovering Dense Correlated Subgraphs in Dynamic Networks
Given a dynamic network, where edges appear and disappear over time, we are
interested in finding sets of edges that have similar temporal behavior and
form a dense subgraph. Formally, we define the problem as the enumeration of
the maximal subgraphs that satisfy specific density and similarity thresholds.
To measure the similarity of the temporal behavior, we use the correlation
between the binary time series that represent the activity of the edges. For
the density, we study two variants based on the average degree. For these
problem variants we enumerate the maximal subgraphs and compute a compact
subset of subgraphs that have limited overlap. We propose an approximate
algorithm that scales well with the size of the network, while achieving a high
accuracy. We evaluate our framework on both real and synthetic datasets. The
results on the synthetic data demonstrate the high accuracy of the
approximation and show the scalability of the framework.
Comment: Full version of the paper included in the proceedings of the PAKDD
2021 conference
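The two thresholds named in the abstract above can be illustrated with a minimal sketch. It assumes each edge carries a binary activity series, measures similarity as the minimum pairwise Pearson correlation, and measures density as the average degree of the subgraph induced by the edge set. The function names, the choice of the pairwise minimum, and the handling of constant series are assumptions made for illustration, not the paper's algorithm, and the check only tests one candidate edge set rather than enumerating maximal subgraphs.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: treat constant series as uncorrelated
    return cov / sqrt(vx * vy)


def average_degree(edges):
    """Average degree 2|E| / |V| of the subgraph induced by an edge set."""
    nodes = {u for edge in edges for u in edge}
    return 2 * len(edges) / len(nodes) if nodes else 0.0


def is_dense_correlated(edges, activity, min_density, min_corr):
    """True if the edge set meets both the density and the similarity threshold."""
    if average_degree(edges) < min_density:
        return False
    # assumption: similarity of a set = minimum correlation over all edge pairs
    return all(
        pearson(activity[e], activity[f]) >= min_corr
        for e, f in combinations(edges, 2)
    )
```

A candidate edge set passes only if every pair of its edges is active at similar times and the induced subgraph is dense enough; enumerating the maximal sets that pass is the hard part that the paper addresses.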
Laplacian Change Point Detection for Dynamic Graphs
Dynamic and temporal graphs are rich data structures that are used to model
complex relationships between entities over time. In particular, anomaly
detection in temporal graphs is crucial for many real world applications such
as intrusion identification in network systems, detection of ecosystem
disturbances and detection of epidemic outbreaks. In this paper, we focus on
change point detection in dynamic graphs and address two main challenges
associated with this problem: I) how to compare graph snapshots across time,
II) how to capture temporal dependencies. To solve the above challenges, we
propose Laplacian Anomaly Detection (LAD) which uses the spectrum of the
Laplacian matrix of the graph structure at each snapshot to obtain low
dimensional embeddings. LAD explicitly models short term and long term
dependencies by applying two sliding windows. In synthetic experiments, LAD
outperforms the state-of-the-art method. We also evaluate our method on three
real dynamic networks: UCI message network, US senate co-sponsorship network
and Canadian bill voting network. In all three datasets, we demonstrate that
our method can more effectively identify anomalous time points according to
significant real world events.
Comment: in KDD 2020, 10 pages
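The two-window idea in the abstract above can be sketched as follows. The sketch assumes each snapshot has already been summarized as a numeric signature vector (for instance, a Laplacian spectrum computed elsewhere), and it scores a snapshot by one minus its cosine similarity to the average of the preceding signatures in a short and in a long window, keeping the larger score. The averaging step, the cosine measure, the max aggregation, and all function names are illustrative assumptions, not the published LAD procedure.

```python
from math import sqrt


def unit(v):
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def window_score(signatures, t, w):
    """1 - cosine similarity between snapshot t and the mean of the previous w snapshots."""
    past = [unit(s) for s in signatures[max(0, t - w):t]]
    if not past:
        return 0.0  # no history yet, so nothing to compare against
    typical = unit([sum(col) / len(past) for col in zip(*past)])
    current = unit(signatures[t])
    return 1.0 - sum(a * b for a, b in zip(current, typical))


def anomaly_scores(signatures, short_w, long_w):
    """Per-snapshot score: the larger of the short-window and long-window scores."""
    return [
        max(window_score(signatures, t, short_w),
            window_score(signatures, t, long_w))
        for t in range(len(signatures))
    ]
```

The short window reacts to abrupt changes, while the long window keeps a slower baseline; time points with the highest scores would be the candidate change points.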
Risk assessment in centralized and decentralized online social network.
One of the main concerns in centralized and decentralized online social networks (OSNs) is that OSN users establish new relationships with unknown people and, as a result, expose a huge amount of personal data. This attracts a variety of attackers who try to propagate malware and malicious items in the network in order to misuse users' personal information. Several research studies therefore detect specific kinds of attacks by focusing on the topology of the graph [159, 158, 32, 148, 157]. Other solutions detect specific kinds of attackers based on user behavior. However, most of these approaches either focus on just the topology of the graph [159, 158] or detect anomalous users by exploiting supervised learning techniques [157, 47, 86, 125]. The main issue with supervised learning is that it cannot detect new attacker behaviors, since the classifier is trained on known behavioral patterns. The literature also offers approaches that detect anomalous users in OSNs using unsupervised learning [150, 153, 36, 146] or a combination of supervised and unsupervised techniques [153]. Still, existing attack defenses are designed to cope with just one specific type of attack. Although several solutions for detecting specific kinds of attacks have recently been proposed, there is no general solution that copes with the main privacy/security attacks in OSNs.
In such a scenario, it would be very beneficial to have a solution that can cope with the main privacy/security attacks that can be perpetrated using the social network graph. Our main contribution is an unsupervised approach that helps OSN providers and users gain a global understanding of risky users and detect them. We believe that the core of such a solution is a mechanism able to assign a risk score to each OSN account. Over the last three years, we have devoted significant research effort to analyzing user behavior in order to detect risky users, including those behind some kinds of well-known attacks, in centralized and decentralized online social networks.
Our research started by proposing a risk assessment approach based on the idea that the more a user's behavior diverges from normal behavior, the more it should be considered risky. In our proposed approach, we monitor and analyze the combination of users' interaction or activity patterns and friendship patterns, and build a risk estimation model to detect and identify risky users who follow the behavioral patterns of attackers. Since users in OSNs follow different behavioral patterns, it is not possible to define a single standard behavioral model that fits all OSN users' behaviors. To address this, we propose a two-phase risk assessment approach: in the first phase, we group users to find similar users that share the same behavioral patterns; in the second phase, for each identified group, we build normal behavior models and compute, for each user, the level of divergence from these normal behaviors. We then extend this approach to Decentralized Online Social Networks (DOSNs). Building on this approach, we propose a risk measure that helps OSN users judge their direct contacts so as to avoid friendship with malicious users. Finally, we dynamically monitor the friendship patterns of users in a large social graph over time for any anomalous changes reflecting attackers' behaviors. In this thesis, we describe all the solutions that we proposed.
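The two-phase idea described above can be sketched in a few lines. The sketch assumes the first phase has already assigned every user to a group (by any clustering method), models "normal behavior" in a group as the per-feature mean and standard deviation, and scores a user by the mean absolute z-score of their features. The feature vectors, the z-score divergence, and the function name are hypothetical choices made for illustration; the thesis does not specify them in this abstract.

```python
from collections import defaultdict
from statistics import mean, pstdev


def risk_scores(features, group_of):
    """Score each user by divergence from the normal behavior of their own group.

    features: user -> list of numeric behavioral features (hypothetical)
    group_of: user -> group label produced by a first-phase grouping step
    """
    members = defaultdict(list)
    for user in features:
        members[group_of[user]].append(user)

    scores = {}
    for users in members.values():
        # Second phase: a per-group normal model (mean and spread of each feature).
        columns = list(zip(*(features[u] for u in users)))
        mus = [mean(c) for c in columns]
        sigmas = [pstdev(c) for c in columns]
        for u in users:
            z = [
                abs(x - m) / s if s > 0 else 0.0
                for x, m, s in zip(features[u], mus, sigmas)
            ]
            scores[u] = mean(z)  # higher means further from the group norm
    return scores
```

Scoring within groups rather than globally means that a very active user is only flagged if they are unusual among similarly behaving users.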
Span-core Decomposition for Temporal Networks: Algorithms and Applications
When analyzing temporal networks, a fundamental task is the identification of
dense structures (i.e., groups of vertices that exhibit a large number of
links), together with their temporal span (i.e., the period of time for which
the high density holds). In this paper we tackle this task by introducing a
notion of temporal core decomposition where each core is associated with two
quantities, its coreness, which quantifies how densely it is connected, and its
span, which is a temporal interval: we call such cores span-cores.
For a temporal network defined on a discrete temporal domain T, the total
number of time intervals included in T is quadratic in |T|, so that the
total number of span-cores is potentially quadratic in |T| as well. Our first
main contribution is an algorithm that, by exploiting containment properties
among span-cores, computes all the span-cores efficiently. Then, we focus on
the problem of finding only the maximal span-cores, i.e., span-cores
that are not dominated by any other span-core by both their coreness property
and their span. We devise a very efficient algorithm that exploits theoretical
findings on the maximality condition to directly extract the maximal ones
without computing all span-cores.
Finally, as a third contribution, we introduce the problem of temporal
community search}, where a set of query vertices is given as input, and the
goal is to find a set of densely-connected subgraphs containing the query
vertices and covering the whole underlying temporal domain T. We derive a
connection between this problem and the problem of finding (maximal)
span-cores. Based on this connection, we show how temporal community search can
be solved in polynomial-time via dynamic programming, and how the maximal
span-cores can be profitably exploited to significantly speed up the basic
algorithm.
Comment: ACM Transactions on Knowledge Discovery from Data (TKDD), 2020. arXiv
admin note: substantial text overlap with arXiv:1808.0937
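To make the quadratic number of intervals concrete, here is a naive enumeration sketch. It assumes that the span-core of coreness k for an interval is the k-core of the graph whose edges are present in every snapshot of that interval; this reading of the definition, the function names, and the fixed k are assumptions for illustration. It is a baseline, not the efficient algorithms the paper describes, although it does stop extending an interval once the core becomes empty, since a longer interval can only have fewer common edges.

```python
def k_core(edges, k):
    """Vertex set of the k-core of an undirected graph given as (u, v) edge tuples."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    queue = [u for u in adj if len(adj[u]) < k]
    removed = set()
    while queue:
        u = queue.pop()
        if u in removed:
            continue
        removed.add(u)
        for v in adj[u]:
            if v not in removed:
                adj[v].discard(u)  # peeling u lowers the degree of its neighbors
                if len(adj[v]) < k:
                    queue.append(v)
    return set(adj) - removed


def span_cores(snapshots, k):
    """Map each interval (ts, te) to its non-empty core of coreness k.

    snapshots: list of edge sets, one per timestamp; each edge must use the
    same (u, v) orientation in every snapshot so that set intersection works.
    """
    result = {}
    for ts in range(len(snapshots)):
        common = set(snapshots[ts])
        for te in range(ts, len(snapshots)):
            common &= snapshots[te]  # edges present throughout [ts, te]
            core = k_core(common, k)
            if not core:
                break  # longer intervals from ts cannot have a non-empty core
            result[(ts, te)] = core
    return result
```

The double loop visits up to |T|(|T|+1)/2 intervals, which is the quadratic blow-up the abstract refers to and the cost that exploiting containment and maximality is meant to avoid.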