Distributed Processing of Generalized Graph-Pattern Queries in SPARQL 1.1

Authors: Gurajada Sairam; Theobald Martin
Publication date: 01/01/2016

We propose an efficient and scalable architecture for processing generalized graph-pattern queries as they are specified by the current W3C recommendation of the SPARQL 1.1 "Query Language" component. Specifically, the class of queries we consider consists of sets of SPARQL triple patterns with labeled property paths. From a relational perspective, this class resolves to conjunctive queries of relational joins with additional graph-reachability predicates. For the scalable, i.e., distributed, processing of queries of this kind over very large RDF collections, we develop a suitable partitioning and indexing scheme, which allows us to shard the RDF triples over an entire cluster of compute nodes and to process an incoming SPARQL query over all of the relevant graph partitions (and thus compute nodes) in parallel. Unlike most prior works in this field, we specifically aim at the unified optimization and distributed processing of queries consisting of both relational joins and graph-reachability predicates. All communication among the compute nodes is established via a proprietary, asynchronous communication protocol based on the Message Passing Interface.
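
To illustrate what "relational joins with additional graph-reachability predicates" can mean in practice, here is a minimal single-machine sketch in Python. It is not the architecture described in the abstract: the toy triples, the predicate names (`worksFor`, `locatedIn`, `partOf`) and the helper functions are all assumptions made up for this example, and nothing here is distributed or indexed the way the paper proposes.

```python
from collections import defaultdict, deque

# Toy RDF graph as (subject, predicate, object) triples; all names are hypothetical.
TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "globex"),
    ("acme", "locatedIn", "berlin"),
    ("globex", "locatedIn", "paris"),
    ("berlin", "partOf", "germany"),
    ("germany", "partOf", "europe"),
    ("paris", "partOf", "france"),
    ("france", "partOf", "europe"),
]


def index_by_predicate(triples):
    """Group (subject, object) pairs by predicate."""
    idx = defaultdict(list)
    for s, p, o in triples:
        idx[p].append((s, o))
    return idx


def reachable(edges, source, target):
    """True if target can be reached from source over one or more edges,
    which mirrors a property path of the form partOf+ ."""
    adj = defaultdict(list)
    for s, o in edges:
        adj[s].append(o)
    seen, queue = set(), deque(adj[source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            queue.extend(adj[node])
    return False


def people_in_region(triples, region):
    """Evaluate: ?p worksFor ?c . ?c locatedIn ?city . ?city partOf+ <region>"""
    idx = index_by_predicate(triples)
    located = defaultdict(list)
    for company, city in idx["locatedIn"]:
        located[company].append(city)
    results = []
    for person, company in idx["worksFor"]:  # relational join on ?c
        for city in located[company]:
            if reachable(idx["partOf"], city, region):  # reachability predicate
                results.append((person, city))
    return sorted(results)
```

Calling `people_in_region(TRIPLES, "germany")` keeps only the bindings whose city reaches `germany` through a chain of `partOf` edges; the join part and the reachability part are evaluated together, which is the combination the abstract singles out.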

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

MPG.PuRe

06472 Abstracts Collection - XQuery Implementation Paradigms

Authors: Boncz Peter A.; Grust Torsten; Keulen Maurice van; Siméon Jerome
Publication venue: Internationales Begegnungs- und Forschungszentrum fuer Informatik (IBFI)
Publication date: 01/01/2007

From 19.11.2006 to 22.11.2006, the Dagstuhl Seminar 06472 "XQuery Implementation Paradigms" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

CWI's Institutional Repository

Dagstuhl Research Online Publication Server

University of Twente Research Information

One size does not fit all : accelerating OLAP workloads with GPUs

Authors: Han Ruichen; Liu Zhuan; Lu Jiaheng; Wang Shan; Zhang Yansong; Zhang Yu
Publication date: 31/07/2020

GPUs have been considered one of the next-generation platforms for real-time query processing databases. In this paper we empirically demonstrate that representative GPU databases [e.g., OmniSci (Open Source Analytical Database & SQL Engine, 2019)] may be slower than representative in-memory databases [e.g., Hyper (Neumann and Leis, IEEE Data Eng Bull 37(1):3-11, 2014)] with typical OLAP workloads (with Star Schema Benchmark) even if the actual dataset size of each query can completely fit in GPU memory. Therefore, we argue that GPU database designs should not be one-size-fits-all; a general-purpose GPU database engine may not be well-suited for OLAP workloads without carefully designed GPU memory assignment and GPU computing locality. In order to achieve better performance for GPU OLAP, we need to re-organize OLAP operators and re-optimize the OLAP model. In particular, we propose the 3-layer OLAP model to match the heterogeneous computing platforms. The core idea is to maximize data and computing locality on the specified hardware. We design the vector grouping algorithm for the data-intensive workload, which proves to be best assigned to the CPU platform. We design the TOP-DOWN query plan tree strategy to guarantee the optimal operation in the final stage and to push the respective optimizations to the lower layers to make global optimization gains. With this strategy, we design the 3-stage processing model (OLAP acceleration engine) for the hybrid CPU-GPU platform, where the computing-intensive star-join stage is accelerated by GPU, and the data-intensive grouping & aggregation stage is accelerated by CPU. This design maximizes the locality of different workloads and simplifies the GPU acceleration implementation. Our experimental results show that with vector grouping and GPU accelerated star-join implementation, the OLAP acceleration engine runs 1.9x, 3.05x and 3.92x faster than Hyper, OmniSci GPU and OmniSci CPU in SSB evaluation with a dataset of SF = 100. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

XWeB: the XML Warehouse Benchmark

Authors: A. Schmidt; A. Simitsis; C. Kit; J. Darmont; J. Gray; K. Runapongsa; L. Afanasiev; L. Wyatt; P. O’Neil; R. Kimball; R. Torlone; S. Bressan; S. Rizzi; T. Böhme
Publication venue: Springer Science and Business Media LLC
Publication date: 17/09/2010

With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associated XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

arXiv.org e-Print Archive

Crossref

HAL

Accelerating Foreign-Key Joins using Asymmetric Memory Channels

Authors: Kersten M.L. (Martin); Manegold S. (Stefan); Pirk H. (Holger)
Publication date: 01/01/2011

Indexed Foreign-Key Joins expose a very asymmetric access pattern: the Foreign-Key Index is sequentially scanned whilst the Primary-Key table is the target of many quasi-random lookups, which are the dominant cost factor. To reduce the costs of the random lookups, the fact table can be (re-)partitioned at runtime to increase access locality on the dimension table, and thus limit the random memory access to inside the CPU's cache. However, this is very hard to optimize and the performance impact on recent architectures is limited because the partitioning costs consume most of the achievable join improvement. GPGPUs, on the other hand, have an architecture that is well suited for this operation: a relatively slow connection to the large system memory and a very fast connection to the smaller internal device memory. We show how to accelerate Foreign-Key Joins by executing the random table lookups on the GPU's VRAM while sequentially streaming the Foreign-Key Index through the PCI-E bus. We also experimentally study the memory access costs on GPU and CPU to provide estimations of the benefit of this technique.
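
The access pattern and the CPU-side partitioning idea mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: primary keys are taken to be dense integers `0..n-1` so the dimension table can be a plain list, the function names and the `partition_size` parameter are invented here, and Python will not reproduce the cache or GPU effects the paper measures.

```python
def fk_join_naive(fact_fks, dim_values):
    """Scan the foreign-key column sequentially; every lookup may land
    anywhere in the dimension table (the quasi-random access pattern)."""
    return [dim_values[fk] for fk in fact_fks]


def fk_join_partitioned(fact_fks, dim_values, partition_size):
    """Partition fact rows by key range first, so that all lookups made
    while processing one partition fall into one contiguous slice of the
    dimension table (hypothetically small enough to stay in cache)."""
    num_partitions = (len(dim_values) + partition_size - 1) // partition_size
    partitions = [[] for _ in range(num_partitions)]
    for row_id, fk in enumerate(fact_fks):
        partitions[fk // partition_size].append((row_id, fk))

    # Probe one partition at a time; row_id restores the original row order.
    result = [None] * len(fact_fks)
    for part in partitions:
        for row_id, fk in part:
            result[row_id] = dim_values[fk]
    return result
```

Both functions return the same join result; the partitioned variant pays an extra pass over the fact rows up front, which corresponds to the partitioning cost that, according to the abstract, consumes most of the achievable improvement on recent CPU architectures.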

CiteSeerX

CWI's Institutional Repository

International Migration, Integration and Social Cohesion online publications

A 2007 Model Curriculum For A Liberal Arts Degree In Computer Science

Authors: Kelemen Charles F.; Liberal Arts Computer Science Consortium
Publication venue: Transformative Works and Cultures
Publication date: 25/02/2007

Works

Information Exchange Between Humanitarian Organizations: Using the XML Schema IDML

Author: Huesemann Stefan
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/03/2002

This article explains challenges that arise when humanitarian organizations want to coordinate their development activities by means of distributed information systems. It focuses on information exchange based on the eXtensible Markup Language (XML) and relational databases. This piece discusses how to save hierarchical XML documents in relational databases. It introduces conversion rules to derive a relational database model from XML schemas. The rules are applied for the design of a database for the management of humanitarian development projects. The underlying schema for the database is the International Development Markup Language (IDML). This exchange standard for development-related activities is described. The article gives details on how a traditional relational database can import or export XML documents, i.e., how it can be XML-enabled.
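
As a rough illustration of storing a hierarchical XML document in relational tables, the Python sketch below maps a repeating child element to its own table with a foreign key to the parent. The element and attribute names (`project`, `title`, `location`, `country`) are invented for this example and are not taken from the IDML schema, and the mapping is one common choice rather than the conversion rules defined in the article.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document; the structure is an assumption, not real IDML.
XML_DOC = """
<projects>
  <project id="p1">
    <title>Water supply</title>
    <location country="KE">Nairobi</location>
    <location country="TZ">Arusha</location>
  </project>
  <project id="p2">
    <title>School building</title>
    <location country="UG">Kampala</location>
  </project>
</projects>
"""


def shred(xml_text):
    """Load the XML into an in-memory SQLite database: one table per
    repeating element, with child rows pointing back to their parent."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE project (id TEXT PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE location ("
        "project_id TEXT REFERENCES project(id), country TEXT, name TEXT)"
    )
    root = ET.fromstring(xml_text)
    for proj in root.findall("project"):
        conn.execute(
            "INSERT INTO project VALUES (?, ?)",
            (proj.get("id"), proj.findtext("title")),
        )
        for loc in proj.findall("location"):
            conn.execute(
                "INSERT INTO location VALUES (?, ?, ?)",
                (proj.get("id"), loc.get("country"), loc.text),
            )
    conn.commit()
    return conn
```

After `conn = shred(XML_DOC)`, the nesting of the original document can be recovered with an ordinary join, for example `SELECT p.title, l.name FROM project p JOIN location l ON l.project_id = p.id`.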

AIS Electronic Library (AISeL)

Parallel SQL Query Auto-Tuning on Multicore

Authors: Heneka Martin; Pankratius Victor
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2011

KITopen

A Survey on the Evolution of Stream Processing Systems

Authors: Carbone Paris; Fragkoulis Marios; Kalavri Vasiliki; Katsifodimos Asterios
Publication date: 03/08/2020

Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in the functional areas of out-of-order data management, state management, fault tolerance, high availability, load management, elasticity, and reconfiguration. We review noteworthy past research findings, outline the similarities and differences between early ('00-'10) and modern ('11-'18) streaming systems, and discuss recent trends and open problems. Comment: 34 pages, 15 figures, 5 tables

arXiv.org e-Print Archive