Search CORE

14 research outputs found

Deferred lightweight indexing for log-structured key-value stores

Author: Fong L
Iyengar A
Liu L
Palanisamy B
Tan W
Tang Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/07/2015
Field of study

The recent shift towards write-intensive workload on bigdata (e.g., financial trading, social user-generated data streams)has pushed the proliferation of log-structured key-value stores, represented by Google's BigTable [1], Apache HBase [2] andCassandra [3]. While providing key-based data access with aPut/Get interface, these key-value stores do not support value-based access methods, which significantly limits their applicability in modern web and database applications. In this paper, we present DELI, a DEferred Lightweight Indexing scheme on the log-structured key-value stores. To index intensively updated bigdata in real time, DELI aims at making the index maintenance as lightweight as possible. The key idea is to apply an append-only design for online index maintenance and to collect index garbage at carefully chosen time. DELI optimizes the performance of index garbage collection through tightly coupling its execution with a native routine process called compaction. The DELI'ssystem design is fault-tolerant and generic (to most key-valuestores), we implemented a prototype of DELI based on HBasewithout internal code modification. Our experiments show that the DELI offers significant performance advantage for the write-intensive index maintenance

CiteSeerX

Crossref

D-Scholarship@Pitt

Designee of a Scalable Database Management Systems (DBMS)

Author: Parvez Md. Masud
Publication venue: Journal of Image Processing and Artificial Intelligence (e-ISSN: 2581-3803)
Publication date: 18/08/2017
Field of study

Scalable database management systems (DBMS)-both for update intensive application workloads as well as decision support systems for descriptive and deep analytics-are a critical part of the cloud infrastructure and play an important role in ensuring the smooth transition of applications from the traditional enterprise infrastructures to next generation cloud infrastructures. Though scalable data management has been a vision for more than three decades and much research has focused on large scale data management in traditional enterprise setting, cloud computing brings its own set of novel challenges that must be addressed to ensure the success of data management solutions in the cloud environment. This tutorial presents an organized picture of the challenges faced by application developers and DBMS designers in developing and deploying internet scale applications. Our background study encompasses both classes of systems: (I) for supporting update heavy applications and (II) for ad-hoc analytics and decision support. We then focus on providing an in-depth analysis of systems for supporting update intensive web-applications and provide a survey of the state-of-the-art in this domain. We crystallize the design choices made by some successful systems large scale database management systems, analyze the application demands and access patterns, and enumerate the desiderata for a cloud-bound DBMS

MAT Journals

Recommended from our members

Easy Freshness with Pequod Cache Joins

Author: Kate Bryan
Kester Michael S
Kohler Eddie W
Mao Yandong
Morris Robert
Narula Neha
Publication venue: USENIX
Publication date: 21/09/2015
Field of study

Pequod is a distributed application-level key-value cache that supports declaratively defined, incrementally maintained, dynamic, partially-materialized views. These views, which we call cache joins, can simplify application development by shifting the burden of view maintenance onto the cache. Cache joins define relationships among key ranges; using cache joins, Pequod calculates views on demand, incrementally updates them as required, and in many cases improves performance by reducing client communication. To build Pequod, we had to design a view abstraction for volatile, relationless key-value caches and make it work across servers in a distributed system. Pequod performs as well as other inmemory key-value caches and, like those caches, outperforms databases with view support.Engineering and Applied Science

Harvard University - DASH

Write-Optimized Indexing for Log-Structured Key-Value Stores

Author: Fong Liana
Iyengar Arun
Liu Ling
Tan Wei
Tang Yuzhe
Publication venue: Georgia Institute of Technology
Publication date: 01/01/2014
Field of study

The recent shift towards write-intensive workload on big data (e.g., financial trading, social user-generated data streams) has pushed the proliferation of the log-structured key-value stores, represented by Google’s BigTable, HBase and Cassandra; these systems optimize write performance by adopting a log-structured merge design. While providing key-based access methods based on a Put/Get interface, these key-value stores do not support value-based access methods, which significantly limits their applicability in many web and Internet applications, such as real-time search for all tweets or blogs containing “government shutdown”. In this paper, we present HINDEX, a write-optimized indexing scheme on the log-structured key-value stores. To index intensively updated big data in real time, the index maintenance is made lightweight by a design tailored to the unique characteristic of the underlying log-structured key-value stores. Concretely, HINDEX performs append-only index updates, which avoids the reading of historic data versions, an expensive operation in the log-structure store. To fix the potentially obsolete index entries, HINDEX proposes an offline index repair process through tight coupling with the routine compactions. HINDEX’s system design is generic to the Put/Get interface; we implemented a prototype of HINDEX based on HBase without internal code modification. Our experiments show that the HINDEX offers significant performance advantage for the write-intensive index maintenance

Scholarly Materials And Research @ Georgia Tech

Transaction Chains: Achieving Serializability with Low Latency in Geo-distributed Storage Systems. In:

Author: Jinyang Li
Marcos K Aguilera
Russell Power
Siyuan Zhou
Yair Sovran
Yang Zhang
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/01/2013
Field of study

Abstract Currently, users of geo-distributed storage systems face a hard choice between having serializable transactions with high latency, or limited or no transactions with low latency. We show that it is possible to obtain both serializable transactions and low latency, under two conditions. First, transactions are known ahead of time, permitting an a priori static analysis of conflicts. Second, transactions are structured as transaction chains consisting of a sequence of hops, each hop modifying data at one server. To demonstrate this idea, we built Lynx, a geo-distributed storage system that offers transaction chains, secondary indexes, materialized join views, and geo-replication. Lynx uses static analysis to determine if each hop can execute separately while preserving serializability-if so, a client needs wait only for the first hop to complete, which occurs quickly. To evaluate Lynx, we built three applications: an auction service, a Twitter-like microblogging site and a social networking site. These applications successfully use chains to achieve low latency operation and good throughput

CiteSeerX

Materialisierte views in verteilten key-value stores

Author: Adler Jan
Publication venue: Technische Universität München
Publication date: 06/08/2020
Field of study

Distributed key-value stores have become the solution of choice for warehousing large volumes of data. However, their architecture is not suitable for real-time analytics. To achieve the required velocity, materialized views can be used to provide summarized data for fast access. The main challenge then, is the incremental, consistent maintenance of views at large scale. Thus, we introduce our View Maintenance System (VMS) to maintain SQL queries in a data-intensive real-time scenario.Verteilte key-value stores sind ein Typ moderner Datenbanken um große Mengen an Daten zu verarbeiten. Trotzdem erlaubt ihre Architektur keine analytischen Abfragen in Echtzeit. Materialisierte Views können diesen Nachteil ausgleichen, indem sie schnellen Zuriff auf Ergebnisse ermöglichen. Die Herausforderung ist dann, das inkrementelle und konsistente Aktualisieren der Views. Daher präsentieren wir unser View Maintenance System (VMS), das datenintensive SQL Abfragen in Echtzeit berechnet

MediaTUM

Design of efficient and elastic storage in the cloud

Author: VO HOANG TAM
Publication venue
Publication date: 14/08/2012
Field of study

Ph.DDOCTOR OF PHILOSOPH

ScholarBank@NUS

Large Scale Data Management for Enterprise Workloads

Author: Tapdiya Ashish
Publication venue: VANDERBILT
Publication date
Field of study