On the Impact of Memory Allocation on High-Performance Query Processing

Authors: Durner Dominik, Leis Viktor, Neumann Thomas
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2019

Somewhat surprisingly, the behavior of analytical query engines is crucially affected by the dynamic memory allocator used. Memory allocators strongly influence performance, scalability, memory efficiency, and memory fairness to other processes. In this work, we provide the first comprehensive experimental analysis of the impact of memory allocation on high-performance query engines. We test five state-of-the-art dynamic memory allocators and discuss their strengths and weaknesses within our DBMS. The right allocator can increase the performance of TPC-DS (SF 100) by 2.7x on a 4-socket Intel Xeon server.

arXiv.org e-Print Archive

Crossref

ClusterMiner: High Performance for Data, Text and Web Mining

Authors: Baião Fernanda, Costa Myriam, Ebecken Nelson, Evsukoff Alexandre, Mattoso Marta, Terra Guilherme, Zaverucha Gerson
Publication venue: Universidade Federal do Estado do Rio de Janeiro (UNIRIO)
Publication date: 24/11/2008

Universidade Federal do Estado do Rio de Janeiro: Portal de Revistas da UNIRIO

Data Warehouse and Business Intelligence: Comparative Analysis of OLAP Tools

Author: Bhetwal Mahesh Kumar
Publication venue: ePublications at Regis University
Publication date: 15/11/2011

Data warehouse applications are designed primarily to provide business communities with accurate and consolidated information. The objective of data warehousing applications is not just to collect and report data but to analyze it, which requires technical and business expertise as well as tools. Achieving business intelligence requires selecting the proper tools. The most commonly used business intelligence (BI) technologies are Online Analytical Processing (OLAP) and reporting tools, which are used to analyze data, to make tactical decisions that improve the performance of the organization, and to provide quick access in response to end-user requests. This study reviews the data warehouse environment and architecture, business intelligence concepts, OLAP, and the related theory. In addition to the concepts of data warehousing and OLAP, this study presents a comparative analysis of commonly used OLAP tools in organizations.

ePublications at Regis University

Impliance: A Next Generation Information Management Appliance

Authors: Bhattacharjee Bishwaranjan, Ercegovac Vuk, Glider Joseph, Golding Richard, Lohman Guy, Markl Volker, Pirahesh Hamid, Rao Jun, Rees Robert, Reiss Frederick, Shekita Eugene, Swart Garret
Publication date: 22/12/2006

[...]ably successful in building a large market and adapting to the changes of the last three decades, its impact on the broader market of information management is surprisingly limited. If we were to design an information management system from scratch, based upon today's requirements and hardware capabilities, would it look anything like today's database systems?" In this paper, we introduce Impliance, a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information. We first summarize the trends that will shape information management for the foreseeable future. Those trends imply three major requirements for Impliance: (1) to be able to store, manage, and uniformly query all data, not just structured records; (2) to be able to scale out as the volume of this data grows; and (3) to be simple and robust in operation. We then describe four key ideas that are uniquely combined in Impliance to address these requirements, namely the ideas of: (a) integrating software and off-the-shelf hardware into a generic information appliance; (b) automatically discovering, organizing, and managing all data - unstructured as well as structured - in a uniform way; (c) achieving scale-out by exploiting simple, massive parallel processing, and (d) virtualizing compute and storage resources to unify, simplify, and streamline the management of Impliance. Impliance is an ambitious, long-term effort to define simpler, more robust, and more scalable information systems for tomorrow's enterprises.

Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/.) You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but, you must attribute the work to the author and CIDR 2007. 3rd Biennial Conference on Innovative Data Systems Research (CIDR) January 7-10, 2007, Asilomar, California, US

arXiv.org e-Print Archive

CiteSeerX

Enabling instant- and interval-based semantics in multidimensional data models: the T+MultiDim Model

Authors: Combi Carlo, Oliboni Barbara, Pozzi Giuseppe, Sabaini Alberto, Zimányi Esteban
Publication venue: Elsevier BV
Publication date: 01/01/2020

Time is a vital facet of every human activity. Data warehouses, which are huge repositories of historical information, must provide analysts with rich mechanisms for managing the temporal aspects of information. In this paper, we (i) propose T+MultiDim, a multidimensional conceptual data model enabling both instant- and interval-based semantics over temporal dimensions, and (ii) provide suitable OLAP (On-Line Analytical Processing) operators for querying temporal information. T+MultiDim allows one to design typical concepts of a data warehouse including temporal dimensions, and provides one with the new possibility of conceptually connecting different temporal dimensions for exploiting temporally aggregated data. The proposed approach allows one to specify and to evaluate powerful OLAP queries over information from data warehouses. In particular, we define a set of OLAP operators to deal with interval-based temporal data. Such operators allow the user to derive new measure values associated with different intervals/instants, according to different temporal semantics. Moreover, we propose and discuss through examples from the healthcare domain the SQL specification of all the temporal OLAP operators we define.

Catalogo dei prodotti della ricerca

Multidimensional Range Queries on Modern Hardware

Authors: Leser Ulf, Schäfer Patrick, Sprenger Stefan
Publication date: 14/05/2018

Range queries over multidimensional data are an important part of database workloads in many applications. Their execution may be accelerated by using multidimensional index structures (MDIS), such as kd-trees or R-trees. As for most index structures, the usefulness of this approach depends on the selectivity of the queries, and conventional wisdom held that a simple scan beats MDIS for queries accessing more than 15%-20% of a dataset. However, this wisdom is largely based on evaluations that are almost two decades old, performed on data held on disks, applying IO-optimized data structures, and using single-core systems. The question is whether this rule of thumb still holds when multidimensional range queries (MDRQ) are performed on modern architectures with large main memories holding all data, multi-core CPUs, and data-parallel instruction sets. In this paper, we study whether and how much modern hardware influences the performance ratio between index structures and scans for MDRQ. To this end, we conservatively adapted three popular MDIS, namely the R*-tree, the kd-tree, and the VA-file, to exploit features of modern servers and compared their performance to different flavors of parallel scans using multiple (synthetic and real-world) analytical workloads over multiple (synthetic and real-world) datasets of varying size, dimensionality, and skew. We find that all approaches benefit considerably from using main memory and parallelization, yet to varying degrees. Our evaluation indicates that, on current machines, scanning should be favored over parallel versions of classical MDIS even for very selective queries.

arXiv.org e-Print Archive

Crossref