
    Graphulo Implementation of Server-Side Sparse Matrix Multiply in the Accumulo Database

    The Apache Accumulo database excels at distributed storage and indexing and is ideally suited for storing graph data. Many big data analytics compute on graph data and persist their results back to the database. These graph calculations are often best performed inside the database server. The GraphBLAS standard provides a compact and efficient basis for a wide range of graph applications through a small number of sparse matrix operations. In this article, we implement GraphBLAS sparse matrix multiplication server-side by leveraging Accumulo's native, high-performance iterators. We compare the mathematics and performance of inner and outer product implementations, and show how an outer product implementation achieves optimal performance near Accumulo's peak write rate. We offer our work as a core component of the Graphulo library, which will deliver matrix math primitives for graph analytics within Accumulo.
    Comment: To be presented at IEEE HPEC 2015: http://www.ieee-hpec.org
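The inner and outer product formulations compared in this abstract can be illustrated with a small in-memory sketch. It is not the Graphulo implementation and uses no Accumulo iterators; the dict-of-dicts layout and the function names `outer_product_multiply` and `inner_product_multiply` are assumptions chosen for illustration only.

```python
from collections import defaultdict


def outer_product_multiply(a_cols, b_rows):
    """Compute C = A B as a sum of outer products.

    a_cols maps k -> {i: A[i][k]} (column k of A, nonzeros only).
    b_rows maps k -> {j: B[k][j]} (row k of B, nonzeros only).
    Each shared index k contributes a partial product to C[i][j],
    and the partial products are summed.
    """
    c = defaultdict(float)
    for k in a_cols.keys() & b_rows.keys():
        for i, a in a_cols[k].items():
            for j, b in b_rows[k].items():
                c[(i, j)] += a * b
    return dict(c)


def inner_product_multiply(a_rows, b_cols):
    """Compute C = A B one output entry at a time.

    a_rows maps i -> {k: A[i][k]} (row i of A, nonzeros only).
    b_cols maps j -> {k: B[k][j]} (column j of B, nonzeros only).
    Every (row, column) pair is visited, even if it shares no index.
    """
    c = {}
    for i, row in a_rows.items():
        for j, col in b_cols.items():
            shared = row.keys() & col.keys()
            if shared:
                c[(i, j)] = sum(row[k] * col[k] for k in shared)
    return c


# A = [[1, 2], [0, 3]] and B = [[4, 0], [5, 6]], stored sparsely.
A_COLS = {0: {0: 1}, 1: {0: 2, 1: 3}}
B_ROWS = {0: {0: 4}, 1: {0: 5, 1: 6}}
A_ROWS = {0: {0: 1, 1: 2}, 1: {1: 3}}
B_COLS = {0: {0: 4, 1: 5}, 1: {1: 6}}
```

In this sketch the outer product version only touches index values `k` that appear in both operands and emits partial products that are summed afterwards, whereas the inner product version visits every row and column pair.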

    Cognitive modelling of language acquisition with complex networks

    Cognitive modelling is a well-established computational intelligence tool, which is very useful for studying cognitive phenomena, such as young children's first language acquisition. Specifically, linguistic modelling has recently benefited greatly from complex network theory by modelling large sets of empirical linguistic data as complex networks, thereby illuminating interesting new patterns and trends. In this chapter, we show how simple network analysis techniques can be applied to the study of language acquisition, and we argue that they reveal otherwise hidden information. We also note that a key network parameter, the ranked frequency distribution of the links, provides useful knowledge about the data, even though it had previously been neglected in this domain.

    Scalable RDF Data Compression using X10

    The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings. Such representations are not efficient for Semantic Web applications that perform computations over large volumes of information. A typical method for alleviating this problem is to use compression methods that produce more compact representations of the data. The use of dictionary encoding for this purpose is particularly prevalent in Semantic Web database systems. However, centralized implementations present performance bottlenecks, giving rise to the need for scalable, efficient distributed encoding schemes. In this paper, we describe an encoding implementation based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate performance on a cluster of up to 384 cores and datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-the-art MapReduce algorithm, we demonstrate a speedup of 2.6-7.4x and excellent scalability. These results illustrate the strong potential of the APGAS model for efficient implementation of dictionary encoding and contribute to the engineering of larger-scale Semantic Web applications.
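Dictionary encoding, as mentioned in this abstract, replaces each long term string with a compact integer ID. The sketch below is a single-process, centralized version meant only to show the idea; it is not the paper's APGAS/X10 implementation, and the function names and example URIs are hypothetical.

```python
def dictionary_encode(triples):
    """Replace every term in each triple with an integer ID.

    IDs are assigned in order of first appearance. Returns the encoded
    triples and the term -> ID dictionary needed to decode them.
    """
    term_to_id = {}
    encoded = []
    for triple in triples:
        ids = []
        for term in triple:
            if term not in term_to_id:
                term_to_id[term] = len(term_to_id)
            ids.append(term_to_id[term])
        encoded.append(tuple(ids))
    return encoded, term_to_id


def dictionary_decode(encoded, term_to_id):
    """Invert the encoding by looking each ID up in the reversed dictionary."""
    id_to_term = {i: term for term, i in term_to_id.items()}
    return [tuple(id_to_term[i] for i in triple) for triple in encoded]


# Hypothetical example triples (subject, predicate, object).
TRIPLES = [
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://example.org/knows", "http://example.org/alice"),
]
ENCODED, DICTIONARY = dictionary_encode(TRIPLES)
```

A single shared dictionary like `term_to_id` is the kind of centralized structure the abstract identifies as a bottleneck, which is what motivates a distributed scheme.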

    Proceedings of the 2nd Computer Science Student Workshop: Microsoft Istanbul, Turkey, April 9, 2011


    Research Collaboration Influence Analysis Using Dynamic Co-authorship and Citation Networks

    Collaborative research is increasing in terms of publications, skills, and formal interactions, making it a hotspot in both academia and the industrial sector. Understanding the factors and behavior of a dynamic collaboration network provides insights that help improve a researcher’s profile and a coordinator’s research productivity. Despite rapid developments in the research collaboration process with various outcomes, its validity is still difficult to address. Existing approaches have used bibliometric network analysis with different aspects to understand collaboration patterns that measure the quality of their corresponding relationships. We investigate an efficient method to outline the credibility of findings in publication-author relations. In this research, we propose a new collaboration method to analyze the structure of research articles using four types of graphs for discerning authors’ influence. We apply different combinations of network relationships and bibliometric analysis on the G-index parameter to disclose their interrelated differences. Our model is designed to find the dynamic indicators of co-authored collaboration that influence an author’s behavior in terms of a change in research area or interest. We investigate the dynamic relations in an academic field using metadata of openly available articles and collaborating international authors in interrelated areas and domains. Based on filtered evidence of relationship networks and their statistical results, the research shows an increase in productivity and better influence over time.
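The abstract refers to the G-index without defining it. Assuming the standard bibliometric definition (the largest g such that an author's g most-cited papers together have at least g squared citations), a minimal sketch looks as follows. The function name `g_index` and the sample citation counts are illustrative, and the paper may apply the measure differently.

```python
def g_index(citations):
    """Return the g-index for a list of per-paper citation counts.

    Papers are ranked by citations in descending order; g is the largest
    rank whose cumulative citation total is at least rank squared.
    In this variant g cannot exceed the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for one author's five papers.
SAMPLE_CITATIONS = [10, 5, 3, 1, 0]
SAMPLE_G = g_index(SAMPLE_CITATIONS)
```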