6 research outputs found

    Graphs, Matrices, and the GraphBLAS: Seven Good Reasons

    The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical complexity, (5) theoretical analysis, (6) serial performance, and (7) parallel performance. Implementing graph algorithms using matrix-based approaches provides a number of promising solutions to these challenges. The GraphBLAS standard (istc-bigdata.org/GraphBlas) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. The GraphBLAS mathematically defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the GraphBLAS and describes how the GraphBLAS can be used to address many of the challenges associated with analysis of graphs. Comment: 10 pages; International Conference on Computational Science workshop on the Applications of Matrix Computational Methods in the Analysis of Modern Dat
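
To make the idea of a matrix-based graph operation concrete, the following is a minimal sketch of breadth-first search written as repeated Boolean matrix-vector products. It does not use any GraphBLAS library; the function name `bfs_levels` and the dense list-of-lists adjacency layout are assumptions chosen for illustration only.

```python
def bfs_levels(adj, source):
    """BFS levels via repeated Boolean matrix-vector products.

    adj[i][j] is truthy when there is an edge i -> j (assumed dense layout).
    Returns a list where entry v is the BFS depth of v, or -1 if unreachable.
    """
    n = len(adj)
    levels = [-1] * n
    frontier = [False] * n
    frontier[source] = True
    depth = 0
    while any(frontier):
        # Record the depth of every vertex in the current frontier.
        for v in range(n):
            if frontier[v]:
                levels[v] = depth
        # next = A^T * frontier over the (OR, AND) semiring,
        # masked so that already-visited vertices are excluded.
        frontier = [
            levels[j] == -1 and any(frontier[i] and adj[i][j] for i in range(n))
            for j in range(n)
        ]
        depth += 1
    return levels


# Small directed example: 0 -> 1, 0 -> 2, 1 -> 3; vertex 4 is isolated.
A = [
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(bfs_levels(A, 0))  # [0, 1, 1, 2, -1]
```

A sparse-matrix implementation would replace the inner loops with a single masked matrix-vector product; the dense version here only shows the algebraic structure.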

    Graph Theoretic Lattice Mining Based on Formal Concept Analysis (FCA) Theory for Text Mining

    The growth of the semantic web has fueled the need to search for information based on an understanding of the searcher's intent, coupled with the contextual meaning of the keywords the searcher supplies. The common solution to enhance the searching process includes the deployment of formal concept analysis (FCA) theory to extract concepts from a set of texts with the use of a corresponding domain ontology. However, creating a domain ontology or cross-platform ontology is a tedious and time-consuming process that requires validation from domain experts. Therefore, this study proposed an alternative solution called Lattice Mining (LM) that utilizes FCA theory and graph theory. This is because the process of matching a query to related documents is similar to the process of graph matching if both the query and the documents are represented using graphs. This study adopted the idea of FCA in the determination of the concepts based on texts and deployed the lattice diagrams obtained from an FCA tool for further analysis using graph theory. The LM technique employed in this study utilized the adjacency matrices obtained from the lattice outputs and performed a distance measure technique to calculate the similarity between two graphs. The process was realized successively via the implementation of three algorithms called the Relatedness Algorithm (RA), the Adjacency Matrix Algorithm (AMA) and the Concept-Based Lattice Mining (CBLM) Algorithm. A similarity measure between FCA output lattices yielded promising results based on the ranking of the trace values from the matrices. Recognizing the potential of this method, future work includes refinement in the steps of the CBLM algorithm for a more efficient implementation of the process.
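
The abstract does not specify the distance measure, so the following is only an illustrative sketch of how two graphs can be compared through their adjacency matrices and a trace value. It assumes both graphs share the same vertex ordering and uses the Frobenius distance, whose square equals the trace of (A - B)^T (A - B). The function name and the toy query and document matrices are hypothetical and are not the RA, AMA, or CBLM algorithms.

```python
import math


def frobenius_distance(a, b):
    """Distance between two equally sized adjacency matrices.

    Computes sqrt(trace(D^T D)) with D = a - b (assumed measure, for illustration).
    """
    n = len(a)
    diff = [[a[i][j] - b[i][j] for j in range(n)] for i in range(n)]
    # Entry (i, i) of D^T D is the sum over k of D[k][i] squared.
    trace = sum(sum(diff[k][i] * diff[k][i] for k in range(n)) for i in range(n))
    return math.sqrt(trace)


# Hypothetical query graph and two document graphs on the same three vertices.
query = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
docs = {
    "d1": [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    "d2": [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
}

# Rank documents from most to least similar (smallest distance first).
ranked = sorted(docs, key=lambda name: frobenius_distance(query, docs[name]))
print(ranked)
```

A smaller distance means the document graph shares more edges with the query graph, which gives one simple way to rank candidate matches.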

    Mining Marked Nodes in Large Graphs

    With the rise of the Big Data era, network data is being generated at an unprecedented rate across a wide range of high-impact micro and macro areas of research, from protein interaction to social networks. The critical challenge is translating this large-scale network data into actionable information. A key task in the data translation is the analysis of network connectivity via marked nodes, the primary focus of our research. We have developed a framework for analyzing network connectivity via marked nodes in large-scale graphs, utilizing novel algorithms in three interrelated areas: (1) analysis of a single seed node via its ego-centric network (AttriPart algorithm); (2) pathway identification between two seed nodes (K-Simple Shortest Paths Multithreaded and Search Reduced (KSSPR) algorithm); and (3) tree detection, defining the interaction between three or more seed nodes (Shortest Path MST algorithm). In an effort to address both fundamental and applied research issues, we have developed the LocalForecasting algorithm to explore how network connectivity analysis can be applied to local community evolution and recommender systems. The goal is to apply the LocalForecasting algorithm to various domains, e.g., friend suggestions in social networks or future collaboration in co-authorship networks. This algorithm utilizes link prediction in combination with the AttriPart algorithm to predict future connections in local graph partitions. Results show that our proposed AttriPart algorithm finds up to 1.6x denser local partitions, while running approximately 43x faster than traditional local partitioning techniques (PageRank-Nibble). In addition, our LocalForecasting algorithm demonstrates a significant improvement in the number of nodes and edges correctly predicted over baseline methods. Furthermore, results for the KSSPR algorithm demonstrate a speed-up of up to 2.5x over the standard k-simple shortest paths algorithm. Dissertation/Thesis; Masters Thesis; Computer Science 201

    Algorithms and Software for the Analysis of Large Complex Networks

    The work presented intersects three main areas, namely graph algorithmics, network science and applied software engineering. Each computational method discussed relates to one of the main tasks of data analysis: to extract structural features from network data, such as methods for community detection; or to transform network data, such as methods to sparsify a network and reduce its size while keeping essential properties; or to realistically model networks through generative models.

    Local community detection based on small cliques

    Community detection aims to find dense subgraphs in a network. We consider the problem of finding a community locally around a seed node in both unweighted and weighted networks. This is a faster alternative to algorithms that detect communities covering the whole network when only a single community is actually required. Further, many overlapping community detection algorithms use local community detection algorithms as a basic building block. We provide a broad comparison of different existing strategies for greedily expanding a seed node into a community. For this, we conduct an extensive experimental evaluation on both synthetic benchmark graphs and real-world networks. We show that results on both synthetic and real-world networks can be significantly improved by starting from the largest clique in the neighborhood of the seed node. Further, our experiments indicate that algorithms using scores based on triangles outperform other algorithms in most cases. We provide theoretical descriptions as well as open source implementations of all algorithms used.
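
The clique-seeded greedy expansion described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the clique is found with a greedy heuristic rather than an exact maximum-clique search, conductance stands in for the scoring functions compared in the paper, and the names `greedy_clique`, `conductance`, and `local_community` are hypothetical.

```python
def greedy_clique(adj, seed):
    """Grow a clique containing seed greedily (heuristic, not guaranteed maximum)."""
    clique = {seed}
    # Try high-degree neighbours first; ties are broken by node label.
    for v in sorted(adj[seed], key=lambda u: (-len(adj[u]), u)):
        if clique <= adj[v]:  # v is adjacent to every current clique member
            clique.add(v)
    return clique


def conductance(adj, community):
    """Cut edges divided by the smaller of the two side volumes (assumed score)."""
    cut = sum(1 for u in community for v in adj[u] if v not in community)
    vol_in = sum(len(adj[u]) for u in community)
    vol_out = sum(len(adj[u]) for u in adj) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom > 0 else 1.0


def local_community(adj, seed):
    """Start from a clique around seed, then add neighbours while conductance drops."""
    community = greedy_clique(adj, seed)
    score = conductance(adj, community)
    while True:
        shell = {v for u in community for v in adj[u]} - community
        if not shell:
            break
        # Pick the shell node whose addition gives the lowest conductance.
        best = min(shell, key=lambda v: (conductance(adj, community | {v}), v))
        new_score = conductance(adj, community | {best})
        if new_score >= score:
            break  # no candidate improves the score
        community.add(best)
        score = new_score
    return community


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the bridge edge 2-3.
graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(sorted(local_community(graph, 0)))  # [0, 1, 2]
```

Seeding with a clique gives the expansion a dense starting set, so the first greedy steps are less likely to drift across a sparse cut than when starting from the single seed node.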