2 research outputs found

    Elastic database systems

    No full text
    Thesis: Ph.D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. Cataloged from PDF version of thesis. Includes bibliographical references (pages 131-139).

    Distributed on-line transaction processing (OLTP) database management systems (DBMSs) are a critical part of the operation of large enterprises. These systems often serve time-varying workloads due to daily, weekly, or seasonal fluctuations in load, or because of rapid growth in demand due to a company's business success. In addition, many OLTP workloads are heavily skewed to "hot" tuples or ranges of tuples. For example, the majority of NYSE volume involves only 40 stocks. To manage such fluctuations, many companies currently provision database servers for peak demand. This approach is wasteful and not resilient to extreme skew or large workload spikes. To be both efficient and resilient, a distributed OLTP DBMS must be elastic; that is, it must be able to expand and contract its cluster of servers as demand fluctuates, and dynamically balance load as hot tuples vary over time.

    This thesis presents two elastic OLTP DBMSs, called E-Store and P-Store, which demonstrate the benefits of elasticity for distributed OLTP DBMSs on different types of workloads. E-Store automatically scales the database cluster in response to demand spikes, periodic events, and gradual changes in an application's workload, and it is particularly well suited for managing hot spots. In contrast to traditional single-tier hash and range partitioning strategies, E-Store manages hot spots through a two-tier data placement strategy: cold data is distributed in large chunks, while smaller ranges of hot tuples are assigned explicitly to individual nodes. P-Store is an elastic OLTP DBMS designed for the subset of OLTP applications in which load varies predictably. For these applications, P-Store performs better than reactive systems like E-Store because it uses predictive modeling to reconfigure the system in advance of predicted load changes.

    The experimental evaluation shows the efficacy of the two systems under variations in load across a cluster of machines. Compared to single-tier approaches, E-Store improves throughput by up to 130% while reducing latency by 80%. On a predictable workload, P-Store outperforms a purely reactive system by causing 72% fewer latency violations, and achieves performance comparable to static allocation for peak demand while using 50% fewer servers.

    by Rebecca Taft. Ph.D.
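    The two-tier placement strategy described in the abstract can be illustrated with a small sketch: a router consults an explicit per-key map for hot tuples first, and falls back to coarse chunk-based placement for cold data. The class name, chunk size, and round-robin chunk rule below are illustrative assumptions, not details of the E-Store implementation.

```python
# Minimal sketch of two-tier data placement: hot tuples get explicit
# per-tuple assignments, cold data falls back to large-chunk placement.
# All names and constants here are illustrative, not from E-Store itself.

CHUNK_SIZE = 100_000  # cold data is placed and moved in large blocks

class TwoTierRouter:
    def __init__(self, num_nodes: int):
        self.num_nodes = num_nodes
        self.hot_map: dict[int, int] = {}  # tuple key -> node id (hot tier)

    def assign_hot(self, key: int, node: int) -> None:
        """Pin a hot tuple (or a small hot range endpoint) to a node."""
        self.hot_map[key] = node

    def node_for(self, key: int) -> int:
        """Route a key: explicit hot mapping first, then chunk placement."""
        if key in self.hot_map:
            return self.hot_map[key]
        chunk = key // CHUNK_SIZE          # cold tier: large contiguous chunks
        return chunk % self.num_nodes      # spread chunks across the cluster

# Usage: rebalancing pins a hot key to a lightly loaded node.
router = TwoTierRouter(num_nodes=4)
router.assign_hot(key=42, node=3)
assert router.node_for(42) == 3                              # hot tuple
assert router.node_for(500_000) == (500_000 // CHUNK_SIZE) % 4  # cold chunk
```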

    Predictive modeling for management of database resources in the cloud

    No full text
    Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. Cataloged from PDF version of thesis. Includes bibliographical references (pages [68]-[70]).

    Public cloud providers who support a Database-as-a-Service offering must efficiently allocate computing resources to each of their customers in order to reduce the total number of servers needed without incurring SLA violations. For example, Microsoft serves more than one million database customers on its Azure SQL Database platform. In order to avoid unnecessary expense and stay competitive in the cloud market, Microsoft must pack database tenants onto servers as efficiently as possible.

    This thesis examines a dataset which contains anonymized customer resource usage statistics from Microsoft's Azure SQL Database service over a three-month period in late 2014. Using this data, this thesis contributes several new algorithms to efficiently pack database tenants onto servers by collocating tenants with compatible usage patterns. An experimental evaluation shows that the placement algorithms, specifically the Scalar Static algorithm and the Dynamic algorithm, are able to pack databases onto half of the machines used in production while incurring fewer SLA violations. The evaluation also shows that with two different cost models these algorithms can save 80% of operational costs compared to the algorithms used in production in late 2014.

    by Rebecca Taft. S.M.
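    The core packing idea, collocating tenants whose usage patterns are compatible, can be sketched with a simple greedy rule: place each tenant on the first server whose combined time-series load stays under capacity. The first-fit-decreasing heuristic, the normalized capacity, and the hourly usage vectors below are illustrative assumptions; this is not the Scalar Static or Dynamic algorithm from the thesis.

```python
# Minimal sketch of packing database tenants onto servers by collocating
# tenants with compatible usage patterns. Greedy rule and constants are
# illustrative, not the thesis's Scalar Static or Dynamic algorithms.
import numpy as np

CAPACITY = 1.0  # normalized per-server resource capacity at every time step

def pack_tenants(usage: dict[str, np.ndarray]) -> list[list[str]]:
    """Place each tenant on the first server whose combined time-series
    load never exceeds CAPACITY; open a new server when none fits."""
    servers: list[list[str]] = []
    loads: list[np.ndarray] = []
    # First-fit decreasing: consider the heaviest tenants (by peak usage) first.
    for tenant in sorted(usage, key=lambda t: usage[t].max(), reverse=True):
        demand = usage[tenant]
        for i, load in enumerate(loads):
            if np.all(load + demand <= CAPACITY):   # compatible usage patterns
                servers[i].append(tenant)
                loads[i] = load + demand
                break
        else:
            servers.append([tenant])
            loads.append(demand.copy())
    return servers

# Usage: two tenants with peaks at different hours share one server.
usage = {
    "day_tenant":   np.array([0.8, 0.8, 0.1, 0.1]),
    "night_tenant": np.array([0.1, 0.1, 0.8, 0.8]),
    "steady":       np.array([0.5, 0.5, 0.5, 0.5]),
}
print(pack_tenants(usage))  # e.g. [['day_tenant', 'night_tenant'], ['steady']]
```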