Detecting Flow Anomalies in Distributed Systems
Deep within the networks of distributed systems, one often finds anomalies
that affect their efficiency and performance. These anomalies are difficult to
detect because the distributed systems may not have sufficient sensors to
monitor the flow of traffic within the interconnected nodes of the networks.
Without early detection and correction, these anomalies may worsen over time
and could cause disastrous outcomes for the system in the future. Using only
coarse-grained information from the two end points of network flows, we
propose a network transmission model and a localization algorithm that detect
the location of anomalies within distributed systems and rank them using a
proposed metric. We evaluate our approach on
passengers' records of an urbanized city's public transportation system and
correlate our findings with passengers' postings on social media microblogs.
Our experiments show that the metric derived using our localization algorithm
gives a better ranking of anomalies as compared to standard deviation measures
from statistical models. Our case studies also demonstrate that transportation
events reported in social media microblogs match the locations of our detected
anomalies, suggesting that our algorithm performs well in locating the
anomalies within distributed systems.
Large Scale Spectral Clustering Using Approximate Commute Time Embedding
Spectral clustering is a novel clustering method which can detect complex
shapes of data clusters. However, it requires the eigendecomposition of the
graph Laplacian matrix, whose cost is proportional to O(n^3) in the number of
data points n, and thus is not
suitable for large scale systems. Recently, many methods have been proposed to
reduce the computation time of spectral clustering. These approximate
methods usually involve sampling techniques, through which much information
about the original data may be lost. In this work, we propose a fast and accurate
spectral clustering approach using an approximate commute time embedding, which
is similar to the spectral embedding. The method requires neither a sampling
technique nor the computation of any eigenvector. Instead, it uses random
projection and a linear time solver to find the approximate embedding.
Experiments on several synthetic and real datasets show that the proposed
approach has better clustering quality and is faster than the state-of-the-art
approximate spectral clustering methods.
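
The combination of random projection and a linear solver described in this abstract can be illustrated with a minimal sketch. It assumes the common construction in which each embedding row is obtained by solving one Laplacian system whose right-hand side is a random +-1 combination of weighted edge vectors. The function names are hypothetical, and plain Gaussian elimination stands in for the linear time solver, so this is an illustration under those assumptions rather than the authors' implementation.

```python
import math
import random


def solve_grounded(lap, rhs):
    """Solve lap @ z = rhs for a connected graph by fixing z[-1] = 0.

    The Laplacian is singular, so the last node is grounded and the reduced
    system is solved by Gaussian elimination (a stand-in for a fast solver).
    """
    m = len(lap) - 1
    # Augmented matrix of the reduced system (last row and column dropped).
    aug = [lap[i][:m] + [rhs[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * (m + 1)
    for i in reversed(range(m)):
        acc = aug[i][m] - sum(aug[i][c] * z[c] for c in range(i + 1, m))
        z[i] = acc / aug[i][i]
    return z


def approx_commute_embedding(n, edges, k, seed=0):
    """Return k rows of length n; column i is the embedding of node i.

    edges is a list of (u, v, weight) tuples of a connected undirected graph.
    """
    rng = random.Random(seed)
    lap = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        lap[u][u] += w
        lap[v][v] += w
        lap[u][v] -= w
        lap[v][u] -= w
    volume = sum(lap[i][i] for i in range(n))  # sum of weighted degrees
    rows = []
    for _ in range(k):
        # Random +-1/sqrt(k) combination of the weighted edge vectors.
        y = [0.0] * n
        for u, v, w in edges:
            q = rng.choice((-1.0, 1.0)) / math.sqrt(k)
            y[u] += q * math.sqrt(w)
            y[v] -= q * math.sqrt(w)
        z = solve_grounded(lap, y)
        rows.append([math.sqrt(volume) * x for x in z])
    return rows


def squared_distance(rows, i, j):
    """Squared Euclidean distance between the embeddings of nodes i and j."""
    return sum((row[i] - row[j]) ** 2 for row in rows)
```

Squared distances in this embedding approximate commute times, so a standard clustering method such as k-means could then be run on the columns. For a three-node path graph with unit weights and a large k, squared_distance(rows, 0, 2) should come out close to 8, the exact commute time between the two end nodes.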
NetLSD: Hearing the Shape of a Graph
Comparison among graphs is ubiquitous in graph analytics. However, it is a
hard task in terms of the expressiveness of the employed similarity measure and
the efficiency of its computation. Ideally, graph comparison should be
invariant to the order of nodes and the sizes of compared graphs, adaptive to
the scale of graph patterns, and scalable. Unfortunately, these properties have
not been addressed together. Graph comparisons still rely on direct approaches,
graph kernels, or representation-based methods, which are all inefficient and
impractical for large graph collections.
In this paper, we propose the Network Laplacian Spectral Descriptor (NetLSD):
the first, to our knowledge, permutation- and size-invariant, scale-adaptive,
and efficiently computable graph representation method that allows for
straightforward comparisons of large graphs. NetLSD extracts a compact
signature that inherits the formal properties of the Laplacian spectrum,
specifically its heat or wave kernel; thus, it hears the shape of a graph. Our
evaluation on a variety of real-world graphs demonstrates that it outperforms
previous works in both expressiveness and efficiency.
Comment: KDD '18: The 24th ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining, August 19-23, 2018, London, United Kingdom
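
To make the heat kernel signature mentioned in this abstract concrete, the following is a minimal sketch of a heat trace, tr(exp(-tL)), evaluated over a grid of time scales. The helper names, the time grid, and the choice of Laplacian normalization are assumptions made for illustration. The matrix exponential is approximated with a scaled Taylor series, which is only practical for small graphs and is not the NetLSD implementation.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def heat_trace(lap, t, terms=20, squarings=10):
    """Approximate tr(exp(-t * lap)) for a symmetric Laplacian matrix."""
    n = len(lap)
    scale = -t / (2 ** squarings)
    small = [[scale * x for x in row] for row in lap]
    # Taylor series for exp(small): I + small + small^2 / 2! + ...
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        term = mat_mul(term, small)
        term = [[x / order for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    # Undo the scaling: exp(A) = exp(A / 2^s) ^ (2^s).
    for _ in range(squarings):
        result = mat_mul(result, result)
    return sum(result[i][i] for i in range(n))


def log_spaced(low, high, count):
    """Return count values spaced evenly on a log scale from low to high."""
    step = (math.log(high) - math.log(low)) / (count - 1)
    return [math.exp(math.log(low) + i * step) for i in range(count)]


def heat_signature(lap, timescales):
    """Heat trace evaluated at each time scale, as a fixed-length vector."""
    return [heat_trace(lap, t) for t in timescales]
```

Because the trace depends only on the spectrum of the Laplacian, relabeling the nodes leaves the signature unchanged, and two graphs can be compared by a simple distance between their signature vectors, for example over log_spaced(0.01, 100.0, 50).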
Spectral Target Detection using Physics-Based Modeling and a Manifold Learning Technique
Identification of materials from calibrated radiance data collected by an airborne imaging spectrometer depends strongly on the atmospheric and illumination conditions at the time of collection. This thesis demonstrates a methodology for identifying material spectra using the assumption that each unique material class forms a lower-dimensional manifold (surface) in the higher-dimensional spectral radiance space and that all image spectra reside on, or near, these theoretic manifolds. Using a physical model, a manifold characteristic of the target material exposed to varying illumination and atmospheric conditions is formed. A graph-based model is then applied to the radiance data to capture the intricate structure of each material manifold, followed by the application of the commute time distance (CTD) transformation to separate the target manifold from the background. Detection algorithms are then applied in the CTD subspace. This nonlinear transformation is based on a random walk on a graph and is derived from an eigendecomposition of the pseudoinverse of the graph Laplacian matrix. This work provides a geometric interpretation of the CTD transformation, its algebraic properties, the atmospheric and illumination parameters varied in the physics-based model, and the influence the target manifold samples have on the orientation of the coordinate axes in the transformed space.
This thesis concludes by demonstrating improved detection results in the CTD subspace as compared to detection in the original spectral radiance space.
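
For reference, the commute time distance has a standard closed form in terms of the pseudoinverse of the graph Laplacian. The notation below is assumed for illustration and is not taken from the thesis: L^+ is the pseudoinverse, lambda_k and u_k are the eigenvalues and orthonormal eigenvectors of the Laplacian, and vol(G) is the sum of all node degrees.

```latex
\[
\mathrm{CTD}(i,j)
  = \mathrm{vol}(G)\,\bigl(L^{+}_{ii} + L^{+}_{jj} - 2L^{+}_{ij}\bigr)
  = \mathrm{vol}(G)\sum_{k:\,\lambda_k > 0}
      \frac{1}{\lambda_k}\,\bigl(u_k(i) - u_k(j)\bigr)^{2}
\]
```

Read this way, mapping node i to the coordinates sqrt(vol(G) / lambda_k) u_k(i) turns the commute time distance into a squared Euclidean distance, which is why detection can be carried out in the transformed subspace.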