9 research outputs found

    Uncovering Hierarchical Structure in Social Networks using Isospectral Reductions

    We employ the recently developed theory of isospectral network reductions to analyze multi-mode social networks. This procedure allows us to uncover the hierarchical structure of the networks we consider, as well as the hierarchical structure of each mode of the network. Additionally, by performing a dynamical analysis of these networks, we are able to analyze the evolution of their structure, allowing us to find a number of other network features. We apply both of these approaches to the Southern Women Data Set, one of the most studied social networks, and demonstrate that these techniques provide new information that complements previous findings. Comment: 17 pages, 5 figures, 5 tables

    Efficient Community Search on Large Bipartite Graphs

    In many real-world applications, bipartite graphs are naturally used to model relationships between two types of entities. Community discovery over bipartite graphs is a fundamental problem and has attracted much attention recently. However, all existing studies overlook the weight (e.g., influence or importance) of vertices in forming the community, thus missing useful properties of the community. In this thesis, we propose a novel cohesive subgraph model named Pareto-optimal (α, β)-community, which is the first to consider both structure cohesiveness and weight of vertices on bipartite graphs. The proposed Pareto-optimal (α, β)-community model follows the concept of (α, β)-core by imposing degree constraints on each type of vertex, and integrates Pareto-optimality in modeling the weight information from the two different types of vertices. An online query algorithm is developed to retrieve Pareto-optimal (α, β)-communities with a time complexity of O(p · m), where p is the number of resulting communities and m is the number of edges in the bipartite graph G. To support efficient query processing over large graphs, we also develop index-based approaches. A complete index is proposed, and the query algorithm based on it achieves query processing time linear in the result size (i.e., the algorithm is optimal). Nevertheless, the index incurs prohibitively expensive space complexity. To strike a balance between query efficiency and space complexity, a space-efficient compact index is proposed. Computation-sharing strategies are devised to improve the efficiency of the index construction process. Extensive experiments on 9 real-world graphs validate both the effectiveness and the efficiency of our query processing algorithms and indexing techniques.
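
    The (α, β)-core that this model builds on can be illustrated with a small peeling routine. This is only a minimal sketch of the standard degree-constraint idea, assuming an unweighted edge list; the function name alpha_beta_core and the toy vertex labels are hypothetical, and it does not implement the thesis's Pareto-optimal community search or its indexes.

```python
from collections import defaultdict


def alpha_beta_core(edges, alpha, beta):
    """Return (upper, lower) vertex sets of the (alpha, beta)-core.

    edges: iterable of (u, v) pairs, u in the upper layer, v in the lower layer.
    In the result, every upper vertex keeps at least `alpha` neighbours and
    every lower vertex keeps at least `beta` neighbours.
    """
    upper_adj = defaultdict(set)
    lower_adj = defaultdict(set)
    for u, v in edges:
        upper_adj[u].add(v)
        lower_adj[v].add(u)
    upper_adj, lower_adj = dict(upper_adj), dict(lower_adj)

    # Seed the work list with every vertex that already violates its constraint.
    stack = [("U", u) for u, nbrs in upper_adj.items() if len(nbrs) < alpha]
    stack += [("L", v) for v, nbrs in lower_adj.items() if len(nbrs) < beta]
    queued = set(stack)

    # Peel: deleting a vertex lowers its neighbours' degrees, which may
    # push them below their own threshold in turn.
    while stack:
        side, x = stack.pop()
        if side == "U":
            for v in upper_adj.pop(x):
                lower_adj[v].discard(x)
                if len(lower_adj[v]) < beta and ("L", v) not in queued:
                    queued.add(("L", v))
                    stack.append(("L", v))
        else:
            for u in lower_adj.pop(x):
                upper_adj[u].discard(x)
                if len(upper_adj[u]) < alpha and ("U", u) not in queued:
                    queued.add(("U", u))
                    stack.append(("U", u))
    return set(upper_adj), set(lower_adj)


# Toy graph: a 2x2 biclique plus one pendant upper vertex u3.
toy_edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v1")]
upper, lower = alpha_beta_core(toy_edges, alpha=2, beta=2)
# upper == {"u1", "u2"}, lower == {"v1", "v2"}: u3 is peeled away.
```

    Each edge is touched a constant number of times, so the sketch runs in time linear in the number of edges, which is why core-style models are a common starting point for community search on large graphs.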

    Graph Data Processing and Analysis: From Algorithms to System Development

    There are many real-world application domains where data can be naturally modelled as graphs, such as social networks and computer networks. The amount of data generated and published is rapidly increasing with the explosion of information. Effective storage and querying of graph data has become a significant challenge, and graph databases are emerging to address it. When handling graph data, graph databases have unique advantages in modelling, capturing, and navigating complex relationships and in recursive path querying. In this thesis, we enhance graph databases from both system and algorithm perspectives. Firstly, we propose two systems, SQL2Cypher and FSPS, to improve the usability and efficiency of graph databases. SQL2Cypher automatically migrates data from a relational database to a graph database. This system also supports translating SQL queries into Cypher queries. FSPS is the first FPGA-based system for accelerating graph queries on massive graphs. FSPS has the following features: 1) a CPU-FPGA co-designed framework, 2) a fully pipelined FPGA execution, and 3) reduced data transfer from the FPGA's external memory. FSPS supports the two most fundamental types of graph queries, namely subgraph and path queries. Performance evaluation shows that FSPS outperforms the most popular graph database, Neo4j, by up to three orders of magnitude. All the draft demo videos can be found at https://www.youtube.com/watch?v=oSpHtJ8iVio and https://www.youtube.com/watch?v=eGaeBrVTJws. Secondly, graph databases (e.g., Neo4j and PatMat) do not widely support cohesive subgraph models. Many real-world relationships can be naturally represented as bipartite graphs, such as customer-product, user-item, and author-paper. Therefore, we investigate the bipartite hierarchy model and efficient algorithms for constructing it. The bipartite hierarchy is the first model to discover the hierarchical structure of bipartite graphs based on the concept of (alpha, beta)-core and graph connectivity. These algorithms can effectively identify the affected regions to limit computation scope and avoid re-building the bipartite hierarchy from scratch. Extensive experiments on 10 real-world graphs demonstrate the effectiveness of the proposed bipartite hierarchy and validate the efficiency of our hierarchy construction algorithms.

    Analysis of two-mode networks and multiplication of networks

    The topic of the presented dissertation is the analysis of two-mode networks. We approach the analysis from two aspects. On the one hand, we research the theoretical part: the methods (network multiplication, normalization of a network, and generalized two-mode cores). On the other hand, we use an example of bibliographic data to present the conversion of data into a set of compatible networks. From those we are able to obtain derived networks using network multiplication. The described approach can also be used for other types of data. We deal with network multiplication, especially the multiplication of large sparse networks. We answer the question of when the product of two sparse networks is also sparse. Using different semirings in network multiplication, we get new interpretations of a product of networks. Semirings that are promising for network analysis are presented in this dissertation. A new method for determining important subnetworks in two-mode networks, generalized two-mode cores, is presented. This method allows a direct analysis of two-mode networks, and with it we can determine subnetworks according to different properties of the links or vertices of a network. The dissertation also gives an overview of basic approaches for obtaining network data. We analyzed bibliographic networks to explore the impact of normalizations in the collaboration network and in other derived networks. Bibliographic data are especially interesting for network analysis because they give an insight into the development of a particular scientific field.
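
    Network multiplication over a semiring, as described above, can be sketched for sparse networks stored as nested dictionaries. This is an illustrative sketch only: the dict-of-dicts representation, the function names multiply_networks and transpose, and the toy works-by-authors data are assumptions, not the dissertation's own implementation.

```python
def multiply_networks(a, b, add=lambda x, y: x + y, mul=lambda x, y: x * y):
    """Multiply two sparse networks stored as {source: {target: weight}}.

    The product has a link (i, j) whenever some k has a link i -> k in `a`
    and a link k -> j in `b`. Its weight combines mul(a[i][k], b[k][j]) over
    all such k using the semiring addition `add`. Missing links play the
    role of the semiring zero and are never stored.
    """
    product = {}
    for i, row in a.items():
        for k, w_ik in row.items():
            for j, w_kj in b.get(k, {}).items():
                term = mul(w_ik, w_kj)
                out = product.setdefault(i, {})
                out[j] = add(out[j], term) if j in out else term
    return product


def transpose(a):
    """Reverse every link of a sparse network."""
    t = {}
    for i, row in a.items():
        for j, w in row.items():
            t.setdefault(j, {})[i] = w
    return t


# Hypothetical two-mode network: works (rows) by authors (columns).
works_authors = {
    "w1": {"a1": 1, "a2": 1},
    "w2": {"a2": 1, "a3": 1},
    "w3": {"a1": 1, "a2": 1},
}

# Ordinary (+, *) semiring: the product counts works shared by two authors.
coauthorship = multiply_networks(transpose(works_authors), works_authors)
# coauthorship["a1"]["a2"] == 2, and there is no link between a1 and a3.

# (min, +) semiring: the product gives the cheapest two-step path instead.
first = {"x": {"y": 2, "z": 5}}
second = {"y": {"t": 4}, "z": {"t": 0}}
cheapest = multiply_networks(first, second, add=min, mul=lambda p, q: p + q)
# cheapest == {"x": {"t": 5}}
```

    Swapping the add and mul arguments is what changes the interpretation of the product, while the sparse loop stays the same; whether the result is itself sparse depends on how many intermediate vertices connect each pair.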

    Analysis of two-mode networks and multiplication of networks

    The topic of the dissertation is the analysis of two-mode networks. We approach the analysis from two sides. On the one hand, we work out the theoretical side: the methods (network multiplication, network normalization, and generalized two-mode cores). On the other hand, we show, using bibliographic data as an example, how data can be converted into a collection of compatible networks, from which various derived networks can be obtained by multiplication. The described approach is applicable much more widely. Among the methods, we look more closely at network multiplication, with an emphasis on large sparse networks. We answer the question of when the product of two sparse networks is itself a sparse network. Introducing different semirings into network multiplication opens new perspectives on the usefulness of the network product. We therefore survey the semirings that are promising for use in network analysis. A new method for finding important subnetworks in two-mode networks, generalized two-mode cores, is also developed and analyzed. It enables a direct analysis of two-mode networks, in which we find subnetworks according to selected properties of the vertices in both vertex sets of the two-mode network. The dissertation also reviews the basic approaches to obtaining network data. We look in more detail at the analysis of bibliographic networks, where we investigate the effect of normalization in collaboration networks and other derived networks. Bibliographic data are especially interesting for analysis because they give insight into the development of a particular scientific field.