4,732 research outputs found
gMark: Schema-Driven Generation of Graphs and Queries
Massive graph data sets are pervasive in contemporary application domains.
Hence, graph database systems are becoming increasingly important. In the
experimental study of these systems, it is vital that the research community
has shared solutions for the generation of database instances and query
workloads having predictable and controllable properties. In this paper, we
present the design and engineering principles of gMark, a domain- and query
language-independent graph instance and query workload generator. A core
contribution of gMark is its ability to target and control the diversity of
properties of both the generated instances and the generated workloads coupled
to these instances. Further novelties include support for regular path queries,
a fundamental graph query paradigm, and schema-driven selectivity estimation of
queries, a key feature in controlling workload chokepoints. We illustrate the
flexibility and practical usability of gMark by showcasing the framework's
capabilities in generating high quality graphs and workloads, and its ability
to encode user-defined schemas across a variety of application domains.Comment: Accepted in November 2016. URL:
http://ieeexplore.ieee.org/document/7762945/. in IEEE Transactions on
Knowledge and Data Engineering 201
- …