A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs
Cyber security is one of the most significant technical challenges in current
times. Detecting adversarial activities and preventing the theft of intellectual
property and customer data are high priorities for corporations and government
agencies around the world. Cyber defenders need to analyze massive-scale,
high-resolution network flows to identify, categorize, and mitigate attacks
involving networks spanning institutional and national boundaries. Many of the
cyber attacks can be described as subgraph patterns, with prominent examples
being insider infiltrations (path queries), denial of service (parallel paths)
and malicious spreads (tree queries). This motivates us to explore subgraph
matching on streaming graphs in a continuous setting. The novelty of our work
lies in using the subgraph distributional statistics collected from the
streaming graph to determine the query processing strategy. We introduce a
"Lazy Search" algorithm where the search strategy is decided on a
vertex-to-vertex basis depending on the likelihood of a match in the vertex
neighborhood. We also propose a metric named "Relative Selectivity" that is
used to select between different query processing strategies. Our experiments,
performed on a real online news stream, a network traffic stream, and a synthetic
social network benchmark, demonstrate 10-100x speedups over selectivity-agnostic
approaches.
Comment: in 18th International Conference on Extending Database Technology (EDBT), 2015
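The abstract's per-vertex "Lazy Search" decision can be illustrated with a small sketch. This is a hypothetical interpretation, not the paper's algorithm: the function names, the statistics layout, and the threshold rule are all invented for illustration; the paper's exact definition of "Relative Selectivity" may differ.

```python
# Hypothetical sketch: pick a query-processing strategy per vertex by
# comparing how often each candidate sub-pattern occurs in the stream.

def relative_selectivity(freq_a, freq_b):
    """Ratio of the observed frequencies of two candidate sub-patterns
    (illustrative stand-in for the paper's "Relative Selectivity" metric)."""
    return freq_a / max(freq_b, 1)

def choose_strategy(vertex, pattern_stats, threshold=1.0):
    """Per-vertex 'lazy' decision: expand the search eagerly only when the
    vertex neighborhood makes a match likely; otherwise defer (lazy search)."""
    sel = relative_selectivity(
        pattern_stats.get((vertex, "eager"), 0),
        pattern_stats.get((vertex, "lazy"), 0),
    )
    return "eager" if sel > threshold else "lazy"

stats = {("v1", "eager"): 50, ("v1", "lazy"): 10,
         ("v2", "eager"): 2,  ("v2", "lazy"): 40}
print(choose_strategy("v1", stats))  # eager: matches likely in v1's neighborhood
print(choose_strategy("v2", stats))  # lazy: defer expensive exploration
```

The point of the sketch is only that the strategy choice is made vertex by vertex from collected distributional statistics, rather than fixed globally for the whole query.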
Methods for improved mapping of brain lesion connectivity
Advances in neuroimaging methods over the past two decades have enabled us to map the connectivity of the brain. In parallel, pathophysiological models of brain disease have shifted from an emphasis on understanding pathology in specific brain regions to characterizing disruptions to interconnected neural networks. Nevertheless, these methods for mapping brain connectivity are still under development. Every step of the mapping process, including segmentation, parcellation, registration, and tractography, is a potential source of additional error due to noise or artifacts that can affect final analyses. Moreover, mapping the connectivity of a lesioned brain is even more susceptible to errors in these steps. In this body of work, I describe multiple new methods for improving the accuracy of mapping lesion connectivity by reducing errors at the tractography stage, the most error-prone step. First, we develop an approach for directly normalizing streamlines into a template space that avoids performing tractography in the normalized template space, reducing the error of connectomes constructed in the template space with respect to the ground-truth native-space connectome. Second, we develop a rapid approach for performing shortest path tractography and constructing shortest-path probability-weighted connectomes, which increases connection specificity relative to local streamline tracking approaches. We then demonstrate how our shortest path tractography approach can be used to construct a disconnectome, a connectivity map of the proportion of connections lost due to intersecting a lesion. We then develop a fast, greedy graph-theoretic algorithm that extracts the maximally disconnected subgraph containing the brain regions with the greatest shared loss of connectivity.
Finally, we demonstrate how combining methods from diffusion-based image inpainting and optimal estimation can restore or inpaint corrupted fiber diffusion models in lesioned white matter tissue, enabling tractography, the study of lesion connectivity, and the modeling of microstructural measures in the patient's native space.
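The greedy extraction of a maximally disconnected subgraph described above can be sketched as follows. This is an illustrative reconstruction under assumed conventions, not the thesis's actual algorithm: the objective (total pairwise disconnection), the seeding rule, and the matrix layout are all assumptions.

```python
# Illustrative greedy sketch: from a region-by-region "disconnection" matrix
# (disc[i][j] = proportion of connections lost between regions i and j),
# grow a k-node set with the greatest shared loss of connectivity.
import itertools

def greedy_disconnected_subgraph(disc, k):
    """Seed with the most disconnected region pair, then repeatedly add the
    region with the largest total disconnection to the current set."""
    n = len(disc)
    i, j = max(itertools.combinations(range(n), 2),
               key=lambda p: disc[p[0]][p[1]])
    nodes = {i, j}
    while len(nodes) < k:
        best = max((v for v in range(n) if v not in nodes),
                   key=lambda v: sum(disc[v][u] for u in nodes))
        nodes.add(best)
    return sorted(nodes)

# Toy symmetric disconnection matrix over four regions:
disc = [[0.0, 0.9, 0.1, 0.0],
        [0.9, 0.0, 0.8, 0.1],
        [0.1, 0.8, 0.0, 0.2],
        [0.0, 0.1, 0.2, 0.0]]
print(greedy_disconnected_subgraph(disc, 3))  # [0, 1, 2]
```

Greedy growth like this is fast (no search over all subsets), which matches the abstract's emphasis on a fast algorithm, though it yields an approximation rather than a guaranteed optimum.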
Listing k-cliques in Sparse Real-World Graphs
Motivated by recent studies in the data mining community that require efficiently listing all k-cliques, we revisit the iconic algorithm of Chiba and Nishizeki and develop the most efficient parallel algorithm for this problem. Our theoretical analysis provides the best asymptotic upper bound on the running time of our algorithm when the input graph is sparse. Our experimental evaluation on large real-world graphs shows that our parallel algorithm is faster than state-of-the-art algorithms, while boasting an excellent degree of parallelism. In particular, we are able to list all k-cliques (for any k) in graphs containing up to tens of millions of edges, as well as all 10-cliques in graphs containing billions of edges, within a few minutes and a few hours respectively. Finally, we show how our algorithm can be employed as an effective subroutine for finding the k-clique core decomposition and approximate k-clique densest subgraphs in very large real-world graphs.
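The core recursion behind Chiba–Nishizeki-style clique listing can be shown in a minimal sequential sketch: order the vertices, direct each edge from lower to higher rank so every clique is discovered exactly once, and recurse on shrinking candidate sets. This is a generic textbook-style sketch, not the paper's parallel algorithm, which adds ordering heuristics and parallelism not shown here.

```python
# Minimal sequential k-clique listing via vertex ordering and
# out-neighborhood recursion (each clique is emitted exactly once).

def list_k_cliques(adj, k):
    """adj: dict mapping each vertex to its set of neighbors."""
    order = sorted(adj, key=lambda v: (len(adj[v]), v))  # degree order
    rank = {v: i for i, v in enumerate(order)}
    # Direct each edge from lower to higher rank to avoid duplicates.
    out = {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}

    def rec(clique, candidates):
        if len(clique) == k:
            yield tuple(clique)
            return
        for v in sorted(candidates, key=rank.get):
            # Only vertices adjacent to everything in `clique` survive.
            yield from rec(clique + [v], candidates & out[v])

    for v in order:
        yield from rec([v], out[v])

adj = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3}}
print(sorted(list_k_cliques(adj, 3)))  # the four triangles of K4
```

Intersecting the candidate set with `out[v]` at every level is what keeps the work proportional to the cliques actually present, which is the property the sparse-graph running-time analysis exploits.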
Application of Evolutionary Network Concept in Structuring Mathematics Curriculum
The phylogenetic tree and, more generally, the evolutionary network have found applications well beyond the biological fields and have even percolated into recent high-demand areas such as data mining and social media chain reactions. An extensive survey of their current applications is presented here. An attempt has been made to apply this concept to the mathematics course curriculum within a degree program. Various features of the tree structure are identified within the curriculum network. To highlight key components and to enhance the visual effect, several diagrams are presented. The combined effect of these diagrams provides a sense of the entire curriculum tree structure. The current study can be used as a potential tool for effective student advisement, student placement within the curriculum, efficient resource allocation, etc. Future work may encompass detailing and implementing these applications.
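The curriculum-as-network idea can be made concrete with a toy sketch: treat courses as nodes and prerequisites as directed edges, then derive an advisement ordering from the resulting tree-like network. The course names and the use of a topological sort for advisement are illustrative assumptions, not taken from the paper.

```python
# Toy curriculum network: each course maps to its set of prerequisites.
# A topological sort gives one valid study sequence for advisement.
from graphlib import TopologicalSorter

prereqs = {
    "Calculus II": {"Calculus I"},
    "Linear Algebra": {"Calculus I"},
    "Differential Equations": {"Calculus II", "Linear Algebra"},
}

order = list(TopologicalSorter(prereqs).static_order())
print(order)  # prerequisites always precede their dependents
```

A sequence like this is what would back the advisement and placement uses the abstract mentions: a student's completed courses determine which nodes of the network are currently reachable.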
Graph based Anomaly Detection and Description: A Survey
Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, graph data have become ubiquitous, and techniques for structured graph data have recently come into focus. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the "why", of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.