Graphulo Implementation of Server-Side Sparse Matrix Multiply in the Accumulo Database
The Apache Accumulo database excels at distributed storage and indexing and
is ideally suited for storing graph data. Many big data analytics compute on
graph data and persist their results back to the database. These graph
calculations are often best performed inside the database server. The GraphBLAS
standard provides a compact and efficient basis for a wide range of graph
applications through a small number of sparse matrix operations. In this
article, we implement GraphBLAS sparse matrix multiplication server-side by
leveraging Accumulo's native, high-performance iterators. We compare the
mathematics and performance of inner and outer product implementations, and
show how an outer product implementation achieves optimal performance near
Accumulo's peak write rate. We offer our work as a core component of the
Graphulo library that will deliver matrix math primitives for graph analytics
within Accumulo. Comment: To be presented at IEEE HPEC 2015: http://www.ieee-hpec.org
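The contrast between inner and outer product formulations that the abstract mentions can be illustrated with a minimal in-memory sketch. It does not use Accumulo or its iterators; the nested-dict layout and the function names `outer_product_multiply` and `inner_product_multiply` are assumptions made for illustration only, not part of Graphulo.

```python
from collections import defaultdict


def outer_product_multiply(a_t, b):
    """Compute C = A * B as a sum of outer products.

    a_t holds the transpose of A by row: {k: {i: A[i][k]}}.
    b holds B by row: {k: {j: B[k][j]}}.
    Rows of a_t and b that share the key k are matched, and every pair
    of their entries contributes one partial product to C.
    """
    c = defaultdict(float)
    for k in a_t.keys() & b.keys():
        for i, a_ik in a_t[k].items():
            for j, b_kj in b[k].items():
                c[(i, j)] += a_ik * b_kj  # partial products are summed
    return dict(c)


def inner_product_multiply(a, b_t):
    """Compute C = A * B one output entry at a time.

    a holds A by row: {i: {k: A[i][k]}}.
    b_t holds the transpose of B by row: {j: {k: B[k][j]}}.
    Every row of a is compared against every row of b_t.
    """
    c = {}
    for i, a_row in a.items():
        for j, b_col in b_t.items():
            shared = a_row.keys() & b_col.keys()
            if shared:
                c[(i, j)] = sum(a_row[k] * b_col[k] for k in shared)
    return c
```

In this sketch the outer product version makes a single pass over the matching rows of both inputs and emits partial products to be summed, whereas the inner product version rescans one input for every row of the other.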
Cognitive modelling of language acquisition with complex networks
Cognitive modelling is a well-established computational intelligence tool, which is very useful for studying cognitive phenomena, such as young children's first language acquisition. Specifically, linguistic modelling has recently benefited greatly from complex network theory by modelling large sets of empirical linguistic data as complex networks, thereby illuminating interesting new patterns and trends. In this chapter, we show how simple network analysis techniques can be applied to the study of language acquisition, and we argue that they reveal otherwise hidden information. We also note that a key network parameter, the ranked frequency distribution of the links, provides useful knowledge about the data, even though it had been previously neglected in this domain.
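As a rough illustration of a ranked frequency distribution of links, the sketch below counts how often each link occurs and sorts the links by that count. The abstract does not say how links are defined; treating each pair of adjacent words in an utterance as a directed link is an assumption made here, and the function name `ranked_link_frequencies` is hypothetical.

```python
from collections import Counter


def ranked_link_frequencies(utterances):
    """Return (link, count) pairs sorted from most to least frequent.

    Assumption: a link is a directed pair of adjacent words.
    """
    links = Counter()
    for utterance in utterances:
        words = utterance.lower().split()
        # zip pairs each word with its successor in the utterance
        links.update(zip(words, words[1:]))
    return links.most_common()
```

Plotting the counts against their rank would give the distribution itself.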
Scalable RDF Data Compression using X10
The Semantic Web comprises enormous volumes of semi-structured data elements.
For interoperability, these elements are represented by long strings. Such
representations are not efficient for the purposes of Semantic Web applications
that perform computations over large volumes of information. A typical method
for alleviating the impact of this problem is through the use of compression
methods that produce more compact representations of the data. The use of
dictionary encoding for this purpose is particularly prevalent in Semantic Web
database systems. However, centralized implementations present performance
bottlenecks, giving rise to the need for scalable, efficient distributed
encoding schemes. In this paper, we describe an encoding implementation based
on the asynchronous partitioned global address space (APGAS) parallel
programming model. We evaluate performance on a cluster of up to 384 cores and
datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-the-art
MapReduce algorithm, we demonstrate a speedup of 2.6-7.4x and excellent
scalability. These results illustrate the strong potential of the APGAS model
for efficient implementation of dictionary encoding and contribute to the
engineering of larger scale Semantic Web applications.
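Dictionary encoding, as referred to above, replaces each long term string with a compact integer identifier. The following is a minimal single-process sketch of that idea only; it is not the distributed APGAS/X10 implementation the abstract describes, and the function names are hypothetical.

```python
def dictionary_encode(triples):
    """Map every distinct term to an integer ID and rewrite the triples.

    Returns the encoded triples and the term-to-ID dictionary.
    IDs are assigned in order of first appearance (an arbitrary choice).
    """
    term_to_id = {}
    encoded = []
    for triple in triples:
        ids = []
        for term in triple:
            if term not in term_to_id:
                term_to_id[term] = len(term_to_id)
            ids.append(term_to_id[term])
        encoded.append(tuple(ids))
    return encoded, term_to_id


def dictionary_decode(encoded, term_to_id):
    """Invert the encoding to recover the original term strings."""
    id_to_term = {i: term for term, i in term_to_id.items()}
    return [tuple(id_to_term[i] for i in triple) for triple in encoded]
```

A centralized dictionary like `term_to_id` is the kind of single point of contention that the abstract identifies as a bottleneck; a distributed scheme would have to partition it across nodes.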
Research Collaboration Influence Analysis Using Dynamic Co-authorship and Citation Networks
Collaborative research is increasing in terms of publications, skills, and formal interactions, which makes it a focus of attention in both academia and the industrial sector. Understanding the factors and behavior of dynamic collaboration networks provides insights that help improve researchers' profiles and coordinators' research productivity. Despite rapid developments in the research collaboration process and its various outcomes, its validity is still difficult to assess. Existing approaches have used bibliometric network analysis from different perspectives to understand collaboration patterns and measure the quality of the corresponding relationships. We investigate an efficient method to assess the credibility of findings in publication-author relations. In this research, we propose a new collaboration method that analyzes the structure of research articles using four types of graphs to discern authors' influence. We apply different combinations of network relationships and bibliometric analysis on the G-index parameter to disclose their interrelated differences. Our model is designed to find the dynamic indicators of co-authored collaboration that have an influence on the author's behavior in terms of changes in research area or interest. We investigate the dynamic relations in an academic field using metadata of openly available articles and collaborating international authors in interrelated areas and domains. Based on filtered evidence from relationship networks and their statistical results, the research shows an increase in productivity and greater influence over time.
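For readers unfamiliar with the G-index mentioned above, the sketch below computes it under the standard definition: the largest g such that an author's g most-cited papers together have at least g squared citations. The abstract does not state which variant the authors use, so this is an assumption, and the function name `g_index` is hypothetical.

```python
def g_index(citations):
    """Return the g-index for a list of per-paper citation counts.

    g is capped at the number of papers; some variants pad the list
    with zero-citation papers, which this sketch does not do.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count  # citations of the top `rank` papers
        if total >= rank * rank:
            g = rank
    return g
```

For example, an author whose papers have 10, 5, 3, 1, and 0 citations has a g-index of 4, because the top four papers total 19 citations (at least 16) while all five still total 19 (fewer than 25).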