88,838 research outputs found
Anomaly detection in the dynamics of web and social networks
In this work, we propose a new, fast and scalable method for anomaly
detection in large time-evolving graphs. It may be a static graph with dynamic
node attributes (e.g. time-series), or a graph evolving in time, such as a
temporal network. We define an anomaly as a localized increase in temporal
activity in a cluster of nodes. The algorithm is unsupervised. It is able to
detect and track anomalous activity in a dynamic network despite the noise from
multiple interfering sources. We use the Hopfield network model of memory to
combine the graph and time information. We show that anomalies can be spotted
with good precision using a memory network. The presented approach is
scalable and we provide a distributed implementation of the algorithm. To
demonstrate its efficiency, we apply it to two datasets: the Enron Email dataset
and Wikipedia page views. We show that the anomalous spikes are triggered by
the real-world events that impact the network dynamics. Besides, the structure
of the clusters and the analysis of the time evolution associated with the
detected events reveal interesting facts on how humans interact, exchange and
search for information, opening the door to new quantitative studies on
collective and social behavior on large and dynamic datasets. Comment: The Web Conference 2019, 10 pages, 7 figures
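The abstract above describes combining graph structure with node time-series through a Hopfield-style memory. The following is only a minimal sketch of that general idea, not the authors' algorithm: it assumes a simple Hebbian rule in which an edge is reinforced whenever both endpoints burst at the same time step, and then reports connected groups of reinforced edges as candidate anomalous clusters. The function names, the burst threshold `k`, and the use of connected components are all assumptions made for illustration.

```python
from collections import defaultdict

def burst_mask(series, k=2.0):
    """Flag time steps where a node's activity exceeds its own mean + k * std."""
    n = len(series)
    mean = sum(series) / n
    std = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    return [x > mean + k * std for x in series]

def hebbian_weights(edges, activity, k=2.0):
    """Assumed Hebbian rule: edge weight = number of time steps at which
    both endpoints burst simultaneously. Edges with zero weight are dropped."""
    masks = {v: burst_mask(s, k) for v, s in activity.items()}
    weights = {}
    for u, v in edges:
        co_active = sum(1 for a, b in zip(masks[u], masks[v]) if a and b)
        if co_active > 0:
            weights[(u, v)] = co_active
    return weights

def anomalous_clusters(weights):
    """Connected components of the graph formed by the reinforced edges."""
    adj = defaultdict(set)
    for u, v in weights:
        adj[u].add(v)
        adj[v].add(u)
    seen, clusters = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            stack.extend(adj[x] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters
```

Because each edge is processed independently of the others, a rule of this kind is straightforward to distribute, which is in keeping with the scalability claim in the abstract.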
Network Partitioning in Distributed Agent-Based Models
Agent-Based Models (ABMs) are an emerging simulation paradigm for modeling complex systems composed of autonomous, possibly heterogeneous, interacting agents. The utility of ABMs lies in their ability to represent such complex systems as self-organizing networks of agents. Modeling and understanding the behavior of complex systems usually occurs at large and representative scales, and obtaining and visualizing simulation results in real time is often critical. The real-time requirement necessitates the use of in-memory computing, as it is difficult to handle the latency and unpredictability of disk accesses. Combining this observation with the scale requirement emphasizes the need to use parallel and distributed computing platforms, such as MPI-enabled CPU clusters. Consequently, the agent population must be partitioned across different CPUs in a cluster. Further, the typically high volume of interactions among agents can quickly become a significant bottleneck for real-time or large-scale simulations. The problem is exacerbated if the underlying ABM network is dynamic and the inter-process communication evolves over the course of the simulation. Therefore, it is critical to develop topology-aware partitioning mechanisms to support such large simulations. In this dissertation, we demonstrate that distributed agent-based model simulations benefit from the use of graph partitioning algorithms that involve a local, neighborhood-based perspective. Such methods do not rely on global accesses to the network and thus are more scalable. In addition, we propose two partitioning schemes that consider the bottom-up, individual-centric nature of agent-based modeling. The first technique utilizes label-propagation community detection to partition the dynamic agent network of an ABM. We propose a latency-hiding, seamless integration of community detection in the dynamics of a distributed ABM. 
To achieve this integration, we exploit the similarity in the process flow patterns of a label-propagation community-detection algorithm and self-organizing ABMs. In the second partitioning scheme, we apply a combination of the Guided Local Search (GLS) and Fast Local Search (FLS) metaheuristics in the context of graph partitioning. The main driving principle of GLS is the dynamic modification of the objective function to escape local optima. The algorithm augments the objective of a local search, thereby transforming the landscape structure and escaping a local optimum. FLS is a local search heuristic algorithm that is aimed at reducing the search space of the main search algorithm. It breaks down the space into sub-neighborhoods such that inactive sub-neighborhoods are removed from the search process. The combination of GLS and FLS allowed us to design a graph partitioning algorithm that is both scalable and sensitive to the inherent modularity of real-world networks.
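To make the first partitioning scheme more concrete, here is a minimal, single-process sketch of generic asynchronous label propagation, the family of community-detection methods the abstract refers to. It is not the dissertation's distributed, latency-hiding implementation; the function names, the tie-breaking rule, and the `edge_cut` helper are assumptions chosen for illustration. Each node only inspects its neighbors' labels, which is the local, neighborhood-based property the abstract highlights.

```python
import random
from collections import Counter

def label_propagation(adj, max_iters=100, seed=0):
    """Asynchronous label propagation on an undirected graph.

    adj maps each node to a set of neighbors. Every node starts in its own
    community and repeatedly adopts the most frequent label among its
    neighbors until no label changes (or max_iters passes are done).
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}
    nodes = list(adj)
    for _ in range(max_iters):
        rng.shuffle(nodes)  # random visiting order in each pass
        changed = False
        for v in nodes:
            if not adj[v]:
                continue  # isolated node keeps its own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[v] in candidates:
                continue  # on a tie, keep the current label for stability
            labels[v] = rng.choice(candidates)
            changed = True
        if not changed:
            break
    return labels

def edge_cut(adj, labels):
    """Number of undirected edges whose endpoints carry different labels."""
    cut = sum(1 for v in adj for u in adj[v] if labels[u] != labels[v])
    return cut // 2  # each undirected edge was counted from both ends
```

In a distributed setting, each label would be mapped to a process, and `edge_cut` gives a rough proxy for the resulting inter-process communication.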
Avatar: A Time- and Space-Efficient Self-Stabilizing Overlay Network
Overlay networks present an interesting challenge for fault-tolerant
computing. Many overlay networks operate in dynamic environments (e.g. the
Internet), where faults are frequent and widespread, and the number of
processes in a system may be quite large. Recently, self-stabilizing overlay
networks have been presented as a method for managing this complexity.
Self-stabilizing overlay networks promise that, starting from any
weakly-connected configuration, a correct overlay network will eventually be
built. To date, this guarantee has come at a cost: nodes may either have high
degree during the algorithm's execution, or the algorithm may take a long time
to reach a legal configuration. In this paper, we present the first
self-stabilizing overlay network algorithm that does not incur this penalty.
Specifically, we (i) present a new locally-checkable overlay network based upon
a binary search tree, and (ii) provide a randomized algorithm for
self-stabilization that terminates in an expected polylogarithmic number of
rounds and increases a node's degree by only a polylogarithmic factor in
expectation.
Bicriteria Network Design Problems
We study a general class of bicriteria network design problems. A generic
problem in this class is as follows: Given an undirected graph and two
minimization objectives (under different cost functions), with a budget
specified on the first, find a subgraph from a given subgraph class that
minimizes the second objective subject to the budget on the first. We consider
three different criteria - the total edge cost, the diameter and the maximum
degree of the network. Here, we present the first polynomial-time approximation
algorithms for a large class of bicriteria network design problems for the
above-mentioned criteria. The following general types of results are presented.
First, we develop a framework for bicriteria problems and their
approximations. Second, when the two criteria are the same (note that the cost
functions continue to be different), we present a "black box" parametric
search technique. This black box takes in as input an (approximation) algorithm
for the unicriterion situation and generates an approximation algorithm for the
bicriteria case with only a constant factor loss in the performance guarantee.
Third, when the two criteria are the diameter and the total edge costs we use a
cluster-based approach to devise approximation algorithms; the solutions
output violate both criteria by a logarithmic factor. Finally, for the
class of treewidth-bounded graphs, we provide pseudopolynomial-time algorithms
for a number of bicriteria problems using dynamic programming. We show how
these pseudopolynomial-time algorithms can be converted to fully
polynomial-time approximation schemes using a scaling technique. Comment: 24 pages, 1 figure
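The "black box" parametric search mentioned in the abstract can be illustrated with a small sketch. This is one plausible reading, not necessarily the paper's exact procedure: it assumes the subgraph class is spanning trees, both criteria are total edge cost under two positive integer cost functions `c` and `d`, and the unicriterion black box is an exact minimum spanning tree routine. For a guess `D` of the second objective, edges are weighted by `D*c + budget*d`; the smallest `D` whose tree has combined weight at most `2*budget*D` yields a tree with `c`-cost at most twice the budget and `d`-cost at most twice the optimum. All function names are hypothetical.

```python
def mst(n, edges, key):
    """Kruskal's algorithm; returns the tree's edges, or None if disconnected.
    Edges are tuples (u, v, c, d) over nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for e in sorted(edges, key=key):
        ru, rv = find(e[0]), find(e[1])
        if ru != rv:
            parent[ru] = rv
            tree.append(e)
    return tree if len(tree) == n - 1 else None

def bicriteria_tree(n, edges, budget):
    """Parametric search over the guess D for the second objective.

    Returns a spanning tree with c-cost <= 2*budget and d-cost <= 2*D_opt,
    where D_opt is the best d-cost among trees with c-cost <= budget,
    or None if the acceptance test fails even for the largest guess.
    """
    def solve(D):
        weight = lambda e: D * e[2] + budget * e[3]  # combined cost, scaled to stay integral
        tree = mst(n, edges, key=weight)
        if tree is None:
            return None
        return tree if sum(weight(e) for e in tree) <= 2 * budget * D else None

    lo, hi = 1, sum(e[3] for e in edges)
    best = solve(hi)
    if best is None:
        return None
    while lo < hi:  # the acceptance test is monotone in D, so binary search applies
        mid = (lo + hi) // 2
        tree = solve(mid)
        if tree is not None:
            best, hi = tree, mid
        else:
            lo = mid + 1
    return best
```

The binary search is valid because the minimum over trees of `c(T)/budget + d(T)/D` can only decrease as `D` grows, so once a guess is accepted every larger guess is accepted too.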