2 research outputs found

iTurboGraph: Scaling and Automating Incremental Graph Analytics

Author: Han Wook-Shin
Hong Kijae
Ko Seongyun
Lee Taesung
Lee Wonseok
Seo In
Seo Jiwon
Publication venue: ACM SIGMOD
Publication date: 23/06/2021

With the rise of streaming data for dynamic graphs, large-scale graph analytics faces a new requirement of incremental computation: the larger the graph, the higher the cost of updating the analytics results by re-execution. A dynamic graph consists of an initial graph G and graph mutation updates ∆G of edge insertions or deletions. Given a query Q, its results Q(G), and updates ∆G to G, incremental graph analytics computes updates ∆Q such that Q(G ∪ ∆G) = Q(G) ∪ ∆Q, where ∪ is a union operator. In this paper, we consider the problem of large-scale incremental neighbor-centric graph analytics (NGA). We address the limitations of previous systems: lack of usability due to the difficulty of programming incremental algorithms for NGA, and limited scalability and efficiency due to the overheads of maintaining intermediate results for graph traversals in NGA. First, we propose a domain-specific language, L_NGA, and develop its compiler for intuitive programming of NGA, automatic query incrementalization, and query optimizations. Second, we define Graph Streaming Algebra as a theoretical foundation for scalable processing of incremental NGA. We introduce the concept of Nested Graph Windows and model graph traversals as the generation of walk streams. Lastly, we present a system, iTurboGraph, which efficiently processes incremental NGA for large graphs. Comprehensive experiments show that it effectively avoids costly re-executions and efficiently updates the analytics results with reduced I/O and computation.
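The identity Q(G ∪ ∆G) = Q(G) ∪ ∆Q can be illustrated with a toy neighbor-centric query, triangle enumeration, under edge insertions only. The sketch below is an assumption-laden illustration and not the paper's language, algebra, or system: the function names `triangles` and `insert_edge` are hypothetical, the graph is undirected and held in memory, and deletions (which would also need a set of retracted results) are left out.

```python
from itertools import combinations


def triangles(adj):
    """Full re-execution Q(G): every triangle as a sorted vertex triple."""
    result = set()
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                result.add(tuple(sorted((u, v, w))))
    return result


def insert_edge(adj, u, v):
    """Apply one edge insertion to adj and return the delta it causes.

    Only the neighborhoods of u and v are inspected, so the cost depends
    on their degrees rather than on the size of the whole graph.
    """
    adj.setdefault(u, set())
    adj.setdefault(v, set())
    if u == v or v in adj[u]:
        return set()  # self-loop or duplicate edge: no new triangles
    # Each common neighbor w closes exactly one new triangle {u, v, w}.
    delta = {tuple(sorted((u, v, w))) for w in adj[u] & adj[v]}
    adj[u].add(v)
    adj[v].add(u)
    return delta


# Usage: build G, compute Q(G) once, then maintain it incrementally.
adj = {}
for a, b in [(1, 2), (2, 3), (1, 3), (3, 4)]:
    insert_edge(adj, a, b)
q = triangles(adj)                 # Q(G)
delta_q = insert_edge(adj, 2, 4)   # ∆Q for ∆G = {(2, 4)}
assert q | delta_q == triangles(adj)  # Q(G ∪ ∆G) = Q(G) ∪ ∆Q
```

The final assertion compares the incrementally maintained result against a full re-execution, which is the correctness condition stated in the abstract.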

Pohang University of Science and Technology (POSTECH)

G-CARE: A Framework for Performance Benchmarking of Cardinality Estimation Techniques for Subgraph Matching

Author: Bhowmick Sourav S
Han Wook-Shin
Hong Kijae
Kim Kyoungmin
Ko Seongyun
Park Yeonsu
Publication venue: ACM SIGMOD
Publication date: 14/06/2020

Despite the crucial role of cardinality estimation in query optimization, there has been no systematic and in-depth study of the existing cardinality estimation techniques for subgraph matching queries. In this paper, for the first time, we present a comprehensive study of the existing cardinality estimation techniques for subgraph matching queries, scaling far beyond the original experiments. We first introduce a novel framework called G-CARE that enables us to realize all existing techniques on top of it and that provides insights on their performance. By using G-CARE, we then reimplement representative cardinality estimation techniques for graph databases as well as relational databases. We next evaluate these techniques w.r.t. accuracy on RDF and non-RDF graphs from different domains, with subgraph matching queries of the various topologies considered so far. Surprisingly, our results reveal that all existing techniques have serious accuracy problems in various scenarios and datasets. Intriguingly, a simple sampling method based on an online aggregation technique designed for relational data consistently outperforms all existing techniques.

Crossref

Pohang University of Science and Technology (POSTECH)