Partout: A Distributed Engine for Efficient RDF Processing

Authors: Galárraga Luis; Hose Katja; Schenkel Ralf
Publication date: 01/01/2012

The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications, in some cases already holding more than a trillion triples. Confronted with such huge amounts of data and their future growth, existing state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for efficient RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log, allocating the fragments to nodes in a cluster, and finding the optimal configuration. Partout can efficiently handle updates, and its query optimizer produces efficient query execution plans for ad-hoc SPARQL queries. Our experiments show the superiority of our approach to state-of-the-art approaches for partitioning and distributed SPARQL query processing.

arXiv.org e-Print Archive

CiteSeerX

VBN

MPG.PuRe

BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures

Authors: He Bingsheng; He Jiong; Zhang Shuhao; Zhou Amelie Chi
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/04/2019

We introduce BriskStream, an in-memory data stream processing system (DSPS) specifically designed for modern shared-memory multicore architectures. BriskStream's key contribution is an execution plan optimization paradigm, namely RLAS, which takes the relative location (i.e., NUMA distance) of each pair of producer-consumer operators into consideration. We propose a branch-and-bound-based approach with three heuristics to resolve the resulting nontrivial optimization problem. The experimental evaluations demonstrate that BriskStream yields much higher throughput and better scalability than existing DSPSs on multicore architectures when processing different types of workloads. Comment: To appear in SIGMOD'1

arXiv.org e-Print Archive

Crossref

ScholarBank@NUS

Valued Redundancy

Authors: Chen Shu-Wie; Korz Frederick; Leff Avraham; Pu Calton; Wha Jae M.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1989

Replicated objects increase distributed system performance and availability. An object is more valuable to the system if it contributes more to system performance (e.g., it is frequently accessed) and availability. Similarly, an object is less valuable if it is expensive to maintain (e.g., it is a large object). By replicating only the most valuable objects, we use redundancy to maximize system performance and availability at low cost. A simulation study of a distributed main-memory database shows substantial performance and availability gains with valued redundancy.

Columbia University Academic Commons