Search CORE

2,852 research outputs found

STEPS Towards Cache-Resident Transaction Processing

Author: A AILAMAKI
S HARIZOPOULOS
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

Implications of non-volatile memory as primary storage for database management systems

Author: Armejach Sanosa Adrià
Cristal Kestelman Adrián
Ozturk Ozcan
Ul Mustafa Naveed
Unsal Osman Sabri
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/01/2017
Field of study

Traditional Database Management System (DBMS) software relies on hard disks for storing relational data. Hard disks are cheap, persistent, and offer huge storage capacities. However, data retrieval latency for hard disks is extremely high. To hide this latency, DRAM is used as an intermediate storage. DRAM is significantly faster than disk, but deployed in smaller capacities due to cost and power constraints, and without the necessary persistency feature that disks have. Non-Volatile Memory (NVM) is an emerging storage class technology which promises the best of both worlds. It can offer large storage capacities, due to better scaling and cost metrics than DRAM, and is non-volatile (persistent) like hard disks. At the same time, its data retrieval time is much lower than that of hard disks and it is also byte-addressable like DRAM. In this paper, we explore the implications of employing NVM as primary storage for DBMS. In other words, we investigate the modifications necessary to be applied on a traditional relational DBMS to take advantage of NVM features. As a case study, we have modified the storage engine (SE) of PostgreSQL enabling efficient use of NVM hardware. We detail the necessary changes and challenges such modifications entail and evaluate them using a comprehensive emulation platform. Results indicate that our modified SE reduces query execution time by up to 40% and 14.4% when compared to disk and NVM storage, with average reductions of 20.5% and 4.5%, respectively.The research leading to these results has received funding from the European Union’s 7th Framework Programme under grant agreement number 318633, the Ministry of Science and Technology of Spain under contract TIN2015-65316-P, and a HiPEAC collaboration grant awarded to Naveed Ul Mustafa.Peer ReviewedPostprint (author's final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

Scheduling Data-Intensive Tasks on Heterogeneous Many Cores

Author: Kotthaus Helena
Tözün Pinar
Publication venue
Publication date: 01/01/2019
Field of study

The IT University of Copenhagen's Repository

LogBase: A Scalable Log-structured Database System in the Cloud

Author: Agrawal Divyakant
Chen Gang
Ooi Beng Chin
Vo Hoang Tam
Wang Sheng
Publication venue
Publication date: 01/01/2012
Field of study

Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads observed in write-heavy environments and hence adversely affects the write throughput and recovery time in the system. In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage for removing the write bottleneck and supporting fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes for supporting efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning across multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery.Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

ScholarBank@NUS

The End of Slow Networks: It's Time for a Redesign

Author: Binnig Carsten
Crotty Andrew
Galakatos Alex
Kraska Tim
Zamanian Erfan
Publication venue
Publication date: 19/12/2015
Field of study

Next generation high-performance RDMA-capable networks will require a fundamental rethinking of the design and architecture of modern distributed DBMSs. These systems are commonly designed and optimized under the assumption that the network is the bottleneck: the network is slow and "thin", and thus needs to be avoided as much as possible. Yet this assumption no longer holds true. With InfiniBand FDR 4x, the bandwidth available to transfer data across network is in the same ballpark as the bandwidth of one memory channel, and it increases even further with the most recent EDR standard. Moreover, with the increasing advances of RDMA, the latency improves similarly fast. In this paper, we first argue that the "old" distributed database design is not capable of taking full advantage of the network. Second, we propose architectural redesigns for OLTP, OLAP and advanced analytical frameworks to take better advantage of the improved bandwidth, latency and RDMA capabilities. Finally, for each of the workload categories, we show that remarkable performance improvements can be achieved

arXiv.org e-Print Archive

TUbiblio

A NoC-based hybrid message-passing/shared-memory approach to CMP design

Author: Agarwal
Daemen
Forsell
Grecu
Karniadakis
Lorensen
Mario R. Casu
Massimo Ruo Roch
Maurizio Zamboni
Owens
Paulin
Radulescu
Sergio V. Tota
Snir
Tota
Publication venue: Elsevier
Publication date: 01/01/2011
Field of study

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Finding Associations and Computing Similarity via Biased Pair Sampling

Author: Campagna Andrea
Pagh Rasmus
Publication venue
Publication date: 01/01/2009
Field of study

This version is ***superseded*** by a full version that can be found at http://www.itu.dk/people/pagh/papers/mining-jour.pdf, which contains stronger theoretical results and fixes a mistake in the reporting of experiments. Abstract: Sampling-based methods have previously been proposed for the problem of finding interesting associations in data, even for low-support items. While these methods do not guarantee precise results, they can be vastly more efficient than approaches that rely on exact counting. However, for many similarity measures no such methods have been known. In this paper we show how a wide variety of measures can be supported by a simple biased sampling method. The method also extends to find high-confidence association rules. We demonstrate theoretically that our method is superior to exact methods when the threshold for "interesting similarity/confidence" is above the average pairwise similarity/confidence, and the average support is not too low. Our method is particularly good when transactions contain many items. We confirm in experiments on standard association mining benchmarks that this gives a significant speedup on real data sets (sometimes much larger than the theoretical guarantees). Reductions in computation time of over an order of magnitude, and significant savings in space, are observed.Comment: This is an extended version of a paper that appeared at the IEEE International Conference on Data Mining, 2009. The conference version is (c) 2009 IEE

arXiv.org e-Print Archive

CiteSeerX

The IT University of Copenhagen's Repository

Database architecture evolution: Mammals flourished long before dinosaurs became extinct

Author: Boncz P.A. (Peter)
Kersten M.L. (Martin)
Manegold S. (Stefan)
Publication venue: 'VLDB Endowment'
Publication date: 01/08/2009
Field of study

The holy grail for database architecture research is to find a solution that is Scalable & Speedy, to run on anything from small ARM processors up to globally distributed compute clusters, Stable & Secure, to service a broad user community, Small & Simple, to be comprehensible to a small team of programmers, Self-managing, to let it run out-of-the-box without hassle. In this paper, we provide a trip report on this quest, covering both past experiences, ongoing research on hardware-conscious algorithms, and novel ways towards self-management specifically focused on column store solutions

CWI's Institutional Repository