Locality statistics for anomaly detection in time series of graphs
The ability to detect change-points in a dynamic network or a time series of
graphs is an increasingly important task in many applications of the emerging
discipline of graph signal processing. This paper formulates change-point
detection as a hypothesis testing problem in terms of a generative latent
position model, focusing on the special case of the Stochastic Block Model time
series. We analyze two classes of scan statistics, based on distinct underlying
locality statistics presented in the literature. Our main contribution is the
derivation of the limiting distributions and power characteristics of the
competing scan statistics. Performance is compared theoretically, on synthetic
data, and on the Enron email corpus. We demonstrate that both statistics are
admissible in one simple setting, while one of the statistics is inadmissible
in a second setting.
Comment: 15 pages, 6 figures
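As a rough illustration of the kind of scan statistic this abstract discusses, the Python sketch below computes a first-order locality statistic (the number of edges in the subgraph induced by each vertex's closed neighborhood), standardizes it against a short window of past graphs, and takes the maximum over vertices. The function names, the window length, and the simplified standardization are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def locality_statistic(adj):
    """Count edges in the subgraph induced by each vertex's closed 1-neighborhood.

    `adj` is assumed to be a symmetric 0/1 adjacency matrix without self-loops.
    """
    n = adj.shape[0]
    stats = np.zeros(n)
    for v in range(n):
        # Closed neighborhood: the vertex itself plus its neighbors.
        nbhd = np.union1d(np.flatnonzero(adj[v]), [v])
        sub = adj[np.ix_(nbhd, nbhd)]
        # Each undirected edge appears twice in a symmetric matrix.
        stats[v] = sub.sum() / 2
    return stats


def scan_statistics(graphs, window=3):
    """Return, for each time t >= window, the maximum standardized locality statistic.

    The standardization (mean and standard deviation over the previous
    `window` graphs, with the deviation floored at 1) is a simplification
    chosen for this sketch.
    """
    psi = np.array([locality_statistic(g) for g in graphs])  # shape (T, n)
    out = []
    for t in range(window, len(graphs)):
        past = psi[t - window:t]
        mu = past.mean(axis=0)
        sd = np.maximum(past.std(axis=0), 1.0)
        out.append(float(np.max((psi[t] - mu) / sd)))
    return out
```

A large value at time t suggests that the neighborhood of at least one vertex became unusually dense relative to recent history, which is the kind of local change a scan statistic is meant to flag.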
Multiple Network Embedding for Anomaly Detection in Time Series of Graphs
This paper considers the graph signal processing problem of anomaly detection
in time series of graphs. We examine two related, complementary inference
tasks: the detection of anomalous graphs within a time series, and the
detection of temporally anomalous vertices. We approach these tasks via the
adaptation of statistically principled methods for joint graph inference,
specifically multiple adjacency spectral embedding (MASE) and omnibus embedding
(OMNI). We demonstrate that these two methods are effective for our inference
tasks. Moreover, we assess the performance of these methods in terms of the
underlying nature of detectable anomalies. Our results delineate the relative
strengths and limitations of these procedures, and provide insight into their
use. Applied to a time series of graphs from a large-scale commercial search
engine, our approaches prove applicable and identify anomalous vertices beyond
those that merely show a large change in degree.
Comment: 22 pages, 11 figures
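To make the omnibus embedding (OMNI) mentioned in this abstract more concrete, here is a minimal Python sketch. It builds the omnibus matrix, whose (i, j) block is the average of the i-th and j-th adjacency matrices, embeds it spectrally, and scores each vertex by how far its embedded position moves between consecutive graphs. The scoring rule and all function names are assumptions for illustration and are not necessarily the procedure used in the paper.

```python
import numpy as np


def omnibus_matrix(graphs):
    """Stack m aligned n-by-n adjacency matrices into an (m*n)-by-(m*n) matrix.

    Block (i, j) holds (A_i + A_j) / 2, so the diagonal blocks are the A_i.
    """
    m, n = len(graphs), graphs[0].shape[0]
    big = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(m):
            big[i * n:(i + 1) * n, j * n:(j + 1) * n] = (graphs[i] + graphs[j]) / 2
    return big


def omni_embed(graphs, d):
    """Return an array of shape (m, n, d): one d-dimensional point per vertex per graph."""
    m, n = len(graphs), graphs[0].shape[0]
    vals, vecs = np.linalg.eigh(omnibus_matrix(graphs))
    # Keep the d eigenpairs with the largest eigenvalue magnitudes.
    top = np.argsort(np.abs(vals))[::-1][:d]
    points = vecs[:, top] * np.sqrt(np.abs(vals[top]))
    # Rows are ordered graph by graph, so reshaping separates the time steps.
    return points.reshape(m, n, d)


def vertex_shifts(embedding):
    """Euclidean distance each vertex moves between consecutive graphs; shape (m-1, n)."""
    return np.linalg.norm(embedding[1:] - embedding[:-1], axis=2)
```

Vertices with unusually large shifts are candidates for temporally anomalous vertices, and aggregating the shifts over all vertices at a given time gives one possible score for an anomalous graph.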
Holistic Learning for Multi-Target and Network Monitoring Problems
Abstract: Technological advances have enabled the generation and collection of various data from complex systems, thus creating ample opportunity to integrate knowledge in many decision-making applications. This dissertation introduces holistic learning as the integration of a comprehensive set of relationships that are used toward the learning objective. The holistic view of the problem allows for richer learning from data and thereby improves decision making.
The first topic of this dissertation is the prediction of several target attributes using a common set of predictor attributes. In a holistic learning approach, the relationships between target attributes are embedded into the learning algorithm created in this dissertation. Specifically, a novel tree-based ensemble is proposed that leverages the relationships between target attributes to construct a diverse yet strong model. The method is justified through its connection to existing methods and through experimental evaluations on synthetic and real data.
The second topic pertains to monitoring complex systems that are modeled as networks. Such systems present a rich set of attributes and relationships for which holistic learning is important. In social networks, for example, in addition to friendship ties, various attributes concerning the users' gender, age, topic of messages, time of messages, etc. are collected. A restricted form of monitoring fails to take the relationships among multiple attributes into account, whereas the holistic view embeds such relationships in the monitoring methods. The focus is on the difficult task of detecting a change that might impact only a small subset of the network and occur only in a sub-region of the high-dimensional space of the network attributes. One contribution is a monitoring algorithm based on a network statistical model. Another contribution is a transactional model that transforms the task into an expedient structure for machine learning, along with a generalizable algorithm to monitor the attributed network. A learning step in this algorithm adapts to changes that may be local to sub-regions only (with broader potential for other learning tasks). Diagnostic tools to interpret the change are provided. This robust, generalizable, holistic monitoring method is demonstrated on synthetic and real networks.
Dissertation/Thesis: Doctoral Dissertation, Industrial Engineering 201