35 research outputs found

    UniBench: A Benchmark for Multi-Model Database Management Systems

    Unlike traditional database management systems, which are organized around a single data model, a multi-model database (MMDB) uses a single, integrated back-end to support multiple data models, such as document, graph, relational, and key-value. As more and more platforms are proposed to deal with multi-model data, it becomes crucial to establish a benchmark for evaluating the performance and usability of MMDBs. Previous benchmarks, however, are inadequate for this scenario because they lack comprehensive consideration of multiple data models. In this paper, we present a benchmark, called UniBench, with the goal of facilitating a holistic and rigorous evaluation of MMDBs. UniBench consists of a mixed data model, a synthetic multi-model data generator, and a set of core workloads. Specifically, the data model simulates an emerging application: Social Commerce, a Web-based application combining E-commerce and social media. The data generator produces data in diverse formats, including JSON, XML, key-value, tabular, and graph. The workloads comprise a set of multi-model queries and transactions, aiming to cover the essential aspects of multi-model data management. We implemented all workloads on ArangoDB and OrientDB to demonstrate the feasibility of our proposed benchmarking system, and we report the lessons learned from the evaluation of these two multi-model databases. The source code and data of this benchmark can be downloaded at http://udbms.cs.helsinki.fi/bench/.
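
    A UniBench-style multi-model query touches several data models at once. As a rough illustration only (not taken from the paper; all names and data below are hypothetical), the following sketch joins a graph model (who knows whom), a relational model (orders), and a document model (a JSON-like product catalog) to answer a social-commerce question: "what did my friends buy?"

    ```python
    # Hypothetical sketch of a multi-model query, using plain Python
    # dicts and lists as stand-ins for the three data models.

    # Graph model: who-knows-whom adjacency lists
    knows = {"alice": ["bob", "carol"], "bob": ["alice"]}

    # Relational model: an orders table as a list of rows
    orders = [
        {"customer": "bob", "product_id": 1},
        {"customer": "carol", "product_id": 2},
    ]

    # Document (JSON) model: a product catalog keyed by id
    products = {
        1: {"name": "camera", "price": 199.0},
        2: {"name": "tripod", "price": 49.0},
    }

    def friends_purchases(customer):
        """Products bought by a customer's friends: a graph traversal
        joined with a relational scan and a document lookup."""
        result = []
        for friend in knows.get(customer, []):          # graph hop
            for order in orders:                        # relational scan
                if order["customer"] == friend:
                    result.append(products[order["product_id"]]["name"])  # doc lookup
        return result

    print(friends_purchases("alice"))  # ['camera', 'tripod']
    ```

    In an actual MMDB such as ArangoDB, the three lookups would be expressed in a single query language over native graph, tabular, and document collections rather than stitched together in application code.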

    Setting the Direction for Big Data Benchmark Standards


    An Approach to Benchmarking Industrial Big Data Applications


    MUDD


    Algorithms and Architecture for Managing Evolving ETL Workflows

    ETL processes are responsible for extracting, transforming, and loading data from data sources into a data warehouse. Currently, managing ETL workflows presents some challenges. First, each ETL tool has its own model for specifying ETL processes. This makes it difficult to specify ETL processes that are beyond the capabilities of a chosen tool, or to switch between ETL tools without having to redesign the entire ETL workflow. Second, a change in the structure of a data source leads to ETL workflows that can no longer be executed and yield errors. Therefore, we propose a logical model for ETL processes that makes it feasible to (semi-)automatically repair ETL workflows. Our first approach is to specify ETL processes using Relational Algebra extended with update operations. This way, ETL processes can be automatically translated into SQL queries to be executed on any relational database management system. Later, we will consider expressing ETL tasks by means of the Extensible Markup Language (XML) and other programming languages. We also propose the Extended Evolving-ETL (E3TL) framework, in which we will develop algorithms for the (semi-)automatic repair of ETL workflows upon data source schema changes.
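
    The core idea of translating relational-algebra ETL specifications into SQL can be sketched as follows. This is a minimal illustration in the spirit of the abstract, not the paper's actual model: the operator names, tree representation, and generated SQL shape are all assumptions.

    ```python
    # Hypothetical sketch: an ETL step expressed as a small relational-algebra
    # tree, then translated into a SQL query string.

    def table(name):
        """Leaf node: a base table in the source database."""
        return {"op": "table", "name": name}

    def select(predicate, source):
        """Relational selection (row filter)."""
        return {"op": "select", "predicate": predicate, "source": source}

    def project(columns, source):
        """Relational projection (column subset)."""
        return {"op": "project", "columns": columns, "source": source}

    def to_sql(node):
        """Recursively translate an algebra tree into SQL."""
        if node["op"] == "table":
            return node["name"]
        if node["op"] == "select":
            return f"SELECT * FROM ({to_sql(node['source'])}) t WHERE {node['predicate']}"
        if node["op"] == "project":
            cols = ", ".join(node["columns"])
            return f"SELECT {cols} FROM ({to_sql(node['source'])}) t"
        raise ValueError(f"unknown operator: {node['op']}")

    # An ETL step: keep positive sales, then project the columns to load.
    etl_step = project(["id", "total"], select("total > 0", table("raw_sales")))
    print(to_sql(etl_step))
    ```

    Because the logical tree is tool-independent, the same specification could in principle be re-translated for a different target system, or rewritten by a repair algorithm when a source schema changes, which is the motivation the abstract gives for the E3TL framework.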

    Role of the TPC in the cloud age

    In recent years, the TPC Technology Conference on Performance Evaluation and Benchmarking (TPCTC) series has had significant influence in defining industry standards. The 11th TPC Technology Conference on Performance Evaluation and Benchmarking (TPCTC 2019) organized an industry panel on the "Role of the TPC in the Cloud Age". This paper summarizes the panel discussion.