Storage and aggregation for fast analytics systems

Author: Amur Hrishikesh
Publication venue: Georgia Institute of Technology
Publication date: 13/01/2014

Computing in the last decade has been characterized by the rise of data-intensive scalable computing (DISC) systems. In particular, recent years have witnessed a rapid growth in the popularity of fast analytics systems. These systems exemplify a trend where queries that previously involved batch processing (e.g., running a MapReduce job) on a massive amount of data are increasingly expected to be answered in near real-time with low latency. This dissertation addresses the problem that existing designs for various components used in the software stack for DISC systems do not meet the requirements demanded by fast analytics applications. In this work, we focus specifically on two components:

1. Key-value storage: Recent work has focused primarily on supporting reads with high throughput and low latency. However, fast analytics applications require that new data entering the system (e.g., newly crawled web pages, currently trending topics) be quickly made available to queries and analysis codes. This means that along with supporting reads efficiently, these systems must also support writes with high throughput, which current systems fail to do. In the first part of this work, we solve this problem by proposing a new key-value storage system, called the WriteBuffer (WB) Tree, that provides up to 30× higher write performance and similar read performance compared to current high-performance systems.

2. GroupBy-Aggregate: Fast analytics systems require support for fast, incremental aggregation of data with low-latency access to results. Existing techniques are memory-inefficient and do not support incremental aggregation efficiently when aggregate data overflows to disk. In the second part of this dissertation, we propose a new data structure called the Compressed Buffer Tree (CBT) to implement memory-efficient in-memory aggregation. We also show how the WB Tree can be modified to support efficient disk-based aggregation.

Ph.D.

Scholarly Materials And Research @ Georgia Tech

Towards Optimal Power Management: Estimation of Performance Degradation due to DVFS on Modern Processors

Author: Amur Hrishikesh
Prvulovic Milos
Schwan Karsten
Publication venue: Georgia Institute of Technology
Publication date: 01/01/2010

The alarming growth of the power consumption of data centers, coupled with low average utilization of servers, suggests the use of power management strategies. Such actions, however, require an understanding of the effects of power management on the performance of data center applications running on managed platforms. The goal of our research is to accurately estimate the power savings and consequent performance degradation from DVFS, and thereby better guide the optimization of a performance/power metric of a platform. Towards that end, this paper presents precise performance and power models for DVFS strategies. Precise models are attained by better modeling the performance behavior of modern out-of-order processors, taking into account, for instance, the effects of cache miss overlapping. Models are validated using benchmarks from the SPEC CPU2006 suite, which show that the observed degradation always falls within the predicted bounds. Also, the upper-bound degradation estimates were up to 43% less than those due to a linear degradation model, which allows for the aggressive use of DVFS.

Scholarly Materials And Research @ Georgia Tech

CiteSeerX

Robust and Flexible Power-Proportional Storage (CMU-PDL-10-106)

Author: Gregory Ganger
Hrishikesh Amur
James Cipar
Karsten Schwan
Michael A Kozuch
Varun Gupta
Publication date: 30/06/2018

Power-proportional cluster-based storage is an important component of an overall cloud computing infrastructure. With it, substantial subsets of nodes in the storage cluster can be turned off to save power during periods of low utilization. Rabbit is a distributed file system that arranges its data layout to provide ideal power-proportionality down to a very low minimum number of powered-up nodes (enough to store a primary replica of available datasets). Rabbit addresses the node failure rates of large-scale clusters with data layouts that minimize the number of nodes that must be powered up if a primary fails. Rabbit also allows different datasets to use different subsets of nodes as a building block for interference avoidance when the infrastructure is shared by multiple tenants. Experiments with a Rabbit prototype demonstrate its power-proportionality, and simulation experiments demonstrate its properties at scale.

VM power metering

Author: Ada Gavrilovska
Bhavani Krishnan
BOHRA A.
CARTER J.
CHOU Y.
Hrishikesh Amur
Karsten Schwan
NATHUJI R.
STOESS J.
Publication venue: Association for Computing Machinery (ACM)

Crossref

Coordinated Optimization of Cooling and IT Power in Data Centers

Author: ASHRAE
Balkan
Bhavani Krishnan
Bhopte
Boucher
Chase
Chen
Crippen
Emad Samadiani
Greenberg
Heath
Hrishikesh Amur
Isci
Iyengar
Karsten Schwan
Kumar
Kumar
Kumar
Lewis
Mistree
Moore
Moore
Moore
Nathuji
Nathuji
Nathuji
Nathuji
Patel
Patel
Raghavendra
Rambo
Rambo
Rambo
Rolander
Samadiani
Samadiani
Samadiani
Samadiani
Schmidt
Schmidt
Schmit
Shah
Shah
Sharkey
Shrivastava
Tang
VanGilder
Yogendra Joshi
Zhu
Publication venue: ASME International

Crossref