65 research outputs found

    Restart and Random Walk in Local Search for Maximum Vertex Weight Cliques with Evaluations in Clustering Aggregation

    Full text link

    Graph based Anomaly Detection and Description: A Survey

    Get PDF
    Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the ‘why’, of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field

    Unsupervised Structural Embedding Methods for Efficient Collective Network Mining

    Full text link
    How can we align accounts of the same user across social networks? Can we identify the professional role of an email user from their patterns of communication? Can we predict the medical effects of chemical compounds from their atomic network structure? Many problems in graph data mining, including all of the above, are defined on multiple networks. The central element to all of these problems is cross-network comparison, whether at the level of individual nodes or entities in the network or at the level of entire networks themselves. To perform this comparison meaningfully, we must describe the entities in each network expressively in terms of patterns that generalize across the networks. Moreover, because the networks in question are often very large, our techniques must be computationally efficient. In this thesis, we propose scalable unsupervised methods that embed nodes in vector space by mapping nodes with similar structural roles in their respective networks, even if they come from different networks, to similar parts of the embedding space. We perform network alignment by matching nodes across two or more networks based on the similarity of their embeddings, and refine this process by reinforcing the consistency of each node’s alignment with those of its neighbors. By characterizing the distribution of node embeddings in a graph, we develop graph-level feature vectors that are highly effective for graph classification. With principled sparsification and randomized approximation techniques, we make all our methods computationally efficient and able to scale to graphs with millions of nodes or edges. We demonstrate the effectiveness of structural node embeddings on industry-scale applications, and propose an extensive set of embedding evaluation techniques that lay the groundwork for further methodological development and application.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/162895/1/mheimann_1.pd

    Algorithms and Models for the Web Graph

    Get PDF

    Scalable Graph Algorithms using Practically Efficient Data Reductions

    Get PDF

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 274, ESA 2023, Complete Volum

    Algorithms for nonuniform networks

    Get PDF
    In this thesis, observations on structural properties of natural networks are taken as a starting point for developing efficient algorithms for natural instances of different graph problems. The key areas discussed are sampling, clustering, routing, and pattern mining for large, nonuniform graphs. The results include observations on structural effects together with algorithms that aim to reveal structural properties or exploit their presence in solving an interesting graph problem. Traditionally networks were modeled with uniform random graphs, assuming that each vertex was equally important and each edge equally likely to be present. Within the last decade, the approach has drastically changed due to the numerous observations on structural complexity in natural networks, many of which proved the uniform model to be inadequate for some contexts. This quickly lead to various models and measures that aim to characterize topological properties of different kinds of real-world networks also beyond the uniform networks. The goal of this thesis is to utilize such observations in algorithm design, in addition to empowering the process of network analysis. Knowing that a graph exhibits certain characteristics allows for more efficient storage, processing, analysis, and feature extraction. Our emphasis is on local methods that avoid resorting to information of the graph structure that is not relevant to the answer sought. For example, when seeking for the cluster of a single vertex, we compute it without using any global knowledge of the graph, iteratively examining the vicinity of the seed vertex. Similarly we propose methods for sampling and spanning-tree construction according to certain criteria on the outcome without requiring knowledge of the graph as a whole. Our motivation for concentrating on local methods is two-fold: one driving factor is the ever-increasing size of real-world problems, but an equally important fact is the nonuniformity present in many natural graph instances; properties that hold for the entire graph are often lost when only a small subgraph is examined.reviewe
    • …
    corecore