Scheduling for today’s computer systems: bridging theory and practice
Scheduling is a fundamental technique for improving performance in computer systems. From web servers
to routers to operating systems, how the bottleneck device is scheduled has an enormous impact on the performance of the system as a whole. Given the immense literature studying scheduling, it is easy to think that scheduling is already well understood. However, modern computer system designs have highlighted a number of disconnects between traditional analytic results and the needs of system designers.
In particular, the idealized policies, metrics, and models used by analytic researchers do not match the policies, metrics, and scenarios that appear in real systems.
The goal of this thesis is to take a step towards modernizing the theory of scheduling in order to provide
results that apply to today’s computer systems, and thus ease the burden on system designers. To accomplish
this goal, we provide new results that help to bridge each of the disconnects mentioned above. We will move beyond the study of idealized policies by introducing a new analytic framework where the focus is on scheduling heuristics and techniques rather than individual policies. Because we move beyond the study of individual policies, our results apply to the complex hybrid policies that are often used in practice. For example, our results enable designers to understand how policies that favor small job sizes are affected by the fact that real systems only have estimates of job sizes. In addition, we move beyond the study of mean response time
and provide results characterizing the distribution of response time and the fairness of scheduling policies.
These results allow us to understand how scheduling affects QoS guarantees and whether favoring small job sizes results in large job sizes being treated unfairly. Finally, we move beyond the simplified models traditionally used in scheduling research and provide results characterizing the effectiveness of scheduling in multiserver systems and when users are interactive. These results allow us to answer questions about how to design multiserver systems and how to choose a workload generator when evaluating new scheduling designs.
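The effect mentioned above, a small-job-favoring policy operating on size estimates rather than exact sizes, can be illustrated with a toy simulation. This is a hedged sketch, not the thesis's analytic framework: the job generator, the multiplicative lognormal error model, and the names `make_jobs`/`simulate_srpt` are all assumptions made for illustration.

```python
import random

def make_jobs(n, seed=0):
    # Poisson arrivals, high-variability sizes, and noisy size
    # estimates (lognormal multiplicative error -- an assumed model).
    rng = random.Random(seed)
    t, jobs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(0.4)                  # inter-arrival time
        size = max(rng.expovariate(1.0) ** 2, 0.05)
        est = size * rng.lognormvariate(0.0, 0.5)  # estimated size
        jobs.append({"arrival": t, "size": size, "est": est})
    return jobs

def simulate_srpt(jobs, size_key, dt=0.01):
    # Each quantum, serve the job with the smallest remaining size
    # according to `size_key`: "size" gives exact SRPT, "est" gives
    # SRPT driven only by (possibly wrong) size estimates.
    jobs = [dict(j, done=0.0) for j in jobs]
    pending = sorted(jobs, key=lambda j: j["arrival"])
    active, resp, now = [], [], 0.0
    while pending or active:
        while pending and pending[0]["arrival"] <= now:
            active.append(pending.pop(0))
        if active:
            j = min(active, key=lambda j: j[size_key] - j["done"])
            j["done"] += dt
            if j["done"] >= j["size"] - 1e-9:
                active.remove(j)
                resp.append(now + dt - j["arrival"])
        now += dt
    return sum(resp) / len(resp)       # mean response time

jobs = make_jobs(300)
exact = simulate_srpt(jobs, "size")    # perfect size information
noisy = simulate_srpt(jobs, "est")     # size estimates only
```

Since SRPT with exact sizes minimizes mean response time for any arrival sequence, the estimate-driven variant can only do as well or worse; the gap widens as the estimation error grows.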
Query Interactions in Database Systems
The typical workload in a database system consists of a mix of multiple queries of different types, running concurrently and
interacting with each other. The same query may have different performance in different mixes. Hence, optimizing performance
requires reasoning about query mixes and their interactions, rather than considering individual queries or query types. In this
dissertation, we demonstrate how queries affect each other when they are executing concurrently in different mixes. We show the
significant impact that query interactions can have on end-to-end workload performance. A major hurdle to understanding query interactions in
database systems is that there is a large spectrum of possible causes of interactions. For example, query interactions can happen
because of any of the resource-related, data-related or configuration-related dependencies that exist in the system. This
variation in underlying causes makes it very difficult to build robust analytical performance models that capture query
interactions. We present a new approach for modeling performance in the presence of interactions, based on conducting experiments to measure the effect of query interactions and fitting statistical models to the data collected in these experiments to capture the impact of query interactions. The experiments collect samples of the different possible query mixes, and measure the performance metrics of interest for the different queries in these sample mixes.
Statistical models such as simple regression and instance-based learning techniques are used to train models from these sample mixes. This approach requires no prior assumptions about the internal workings of the database system or the nature or cause of
the interactions, making it portable across systems. We demonstrate the potential of capturing, modeling, and exploiting query interactions by developing techniques to help with two database performance-related tasks: workload scheduling and estimating the
completion time of a workload. These are important workload management problems that database administrators have to deal with
routinely. We consider the problem of scheduling a workload of
report-generation queries. Our scheduling algorithms employ statistical performance models to schedule appropriate query mixes
for the given workload. Our experimental evaluation demonstrates that our interaction-aware scheduling algorithms outperform scheduling policies that are typically used in database systems. Estimating the completion time of a workload is an important problem, and the state of the art does not offer any systematic solution. Typically, database administrators rely on heuristics or observations of past behavior to solve this problem. We propose a more rigorous solution to this problem, based on a workload simulator that employs performance models to simulate the execution of the different mixes that make up a workload. This mix-based simulator provides a systematic tool that can help database administrators in estimating workload completion time. Our
experimental evaluation shows that our approach can estimate the workload completion times with a high degree of accuracy. Overall, this dissertation demonstrates that reasoning about query
interactions holds significant potential for realizing performance improvements in database systems. The techniques developed in this work can be viewed as initial steps in this interesting area of research, with considerable potential for future work.
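The instance-based modeling idea, predicting a query's performance in an unseen mix from measured sample mixes, can be sketched as a distance-weighted nearest-neighbour estimator. This is an assumed realisation for illustration only; the dissertation's actual models and features may differ, and `knn_predict` and the sample data below are hypothetical.

```python
import math

def knn_predict(samples, mix, k=3):
    # Instance-based (k-nearest-neighbour) estimate of a performance
    # metric for an unseen query mix.  `samples` holds (mix_vector,
    # metric) pairs, where mix_vector[i] counts concurrent queries of
    # type i.  The prediction is a distance-weighted average over the
    # k nearest sampled mixes.
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = sorted(samples, key=lambda s: dist(s[0], mix))[:k]
    weights = [1.0 / (dist(m, mix) + 1e-9) for m, _ in nearest]
    return sum(w * y for (_, y), w in zip(nearest, weights)) / sum(weights)

# Hypothetical training data: measured latency of a query type in four
# sampled two-type mixes, e.g. ([2, 0], 10.0) = two type-0 queries ran
# concurrently and the metric of interest was 10.0.
samples = [([2, 0], 10.0), ([0, 2], 4.0), ([1, 1], 8.0), ([3, 1], 15.0)]
pred = knn_predict(samples, [2, 1])   # estimate for an unsampled mix
```

Nothing here assumes knowledge of the database internals, which is what makes the experiment-driven approach portable across systems.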
Large deviations analysis of scheduling policies for a web server
With increasing demand and availability of bandwidth resources, there has been tremendous
growth in the scale and speed of web servers. In web servers, scheduling plays an important
role in resource allocation (for instance, bandwidth allocation, processor allocation, etc.). However, as the scale of a system increases, so does the number of activities/events
in the system (e.g., job arrivals), as a consequence of which the analysis of scheduling
becomes increasingly hard. In particular, the number of possible ways in which scheduling failure (e.g., queue overflow, excessively large delay, instability of a system) can occur grows, thus making it more difficult to understand the behavior of and develop
design rules for scheduling algorithms. However, a well-known observation from large deviations theory, that large-scale systems fail in a “most likely way,” can potentially be used
to simplify the design and analysis of scheduling. In this thesis, we study the implications
and applications of this effect on scheduling in a web server accessed by a large number of
sources.
We analyze the delay distribution of scheduling policies for web servers under a
many-sources large-deviations regime, which models web servers in large-scale systems well. Due to the difficulties brought on by considering a large number of sources, only a small
number of scheduling policies, such as First-Come-First-Serve (FCFS), Generalized-Processor-Sharing (GPS), and Priority Queueing, have been analyzed under the many-sources regime. In particular, in a single-queue, single-server setup, the delay characteristics of only FCFS, Shortest-Job-First (SJF), and Longest-Job-First (LJF) have been analyzed.
In this thesis, we study the Two-Dimensional-Queueing (2DQ) framework, a unifying queueing model that allows the identification of the “most likely way” in which
delay occurs, to analyze the delay of various unexplored scheduling policies. In conjunction
with the 2DQ framework, we develop a new “cycle based” technique for understanding the
large deviations tail probability of more complex policies.
Using the combination of the 2DQ framework and the cycle based analysis, we
first analyze two interesting scheduling policies: the Shortest-Remaining-Processing-Time (SRPT) policy (which is mean-delay optimal) and the Processor-Sharing (PS) policy (which is a
“fair” policy). We derive the asymptotic delay distributions (rate functions) of both policies
and study their behavior across job sizes. Next, we address three problems in implementing
the aforementioned scheduling policies: (i) end receivers may have bandwidth constraints
that are not taken into account by SRPT, (ii) the remaining processing time information might
not be available to the web-server, and (iii) most actual implementations are variants of
SRPT to reflect other implementation constraints and/or to jointly optimize other metrics
in addition to delay, e.g., jitter and fairness. To address these issues, we first develop finite-SRPT
that takes into account the bandwidth constraint at the end receiver, and show that the policy
shifts between SRPT and a PS-like policy depending on the bandwidth constraint. Second,
we study the Least-Attained-Service (LAS) policy, which is viewed as a good substitute for SRPT when the remaining job size is not available, and we analyze the penalty associated
with not using the remaining size information directly. Lastly, we analyze a class of
scheduling policies known as SMART that contains many variants of SRPT with different
fairness properties and show that all policies in the class have the same tail probability of
delay across job sizes in the many-sources regime. The results of this thesis facilitate the understanding of various scheduling policies under the many-sources regime and provide an analytical queueing framework that can be used to understand other scheduling policies.
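As a purely illustrative companion to this analysis, the contrast between FCFS and the size-oblivious LAS policy can be seen in a toy single-server simulation. The thesis analyses these policies via large-deviations rate functions, not simulation; the generator, parameters, and the name `run` below are assumptions for illustration.

```python
import random

def run(policy, n=300, seed=1, dt=0.01):
    # Tiny single-server simulation.  Each quantum, `policy` picks the
    # job to serve: "las" = least attained service (needs no size
    # information), "fcfs" = earliest arrival (served exclusively, this
    # behaves as ordinary FCFS).  Returns per-job delays.
    rng = random.Random(seed)
    t, jobs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(0.4)                    # Poisson arrivals
        size = max(rng.expovariate(1.0) ** 2, 0.05)  # high-variability sizes
        jobs.append({"arrival": t, "size": size, "done": 0.0})
    pending = sorted(jobs, key=lambda j: j["arrival"])
    active, delays, now = [], [], 0.0
    while pending or active:
        while pending and pending[0]["arrival"] <= now:
            active.append(pending.pop(0))
        if active:
            key = "done" if policy == "las" else "arrival"
            j = min(active, key=lambda j: j[key])
            j["done"] += dt
            if j["done"] >= j["size"] - 1e-9:
                active.remove(j)
                delays.append(now + dt - j["arrival"])
        now += dt
    return delays

fcfs_delays = run("fcfs")
las_delays = run("las")
```

With high-variability job sizes, LAS tends to yield lower mean delay than FCFS because short jobs overtake long ones without any size information being available, which is what makes LAS an attractive stand-in for SRPT.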
Performance Implications of Using Diverse Redundancy for Database Replication
Using diverse redundancy for database replication is the focus of this thesis. Traditionally, database replication solutions have been built on the fail-stop failure assumption, i.e., that crashes cause the majority of failures. However, recent findings refuted this common assumption, showing that many faults cause systematic non-crash failures. These findings demonstrate that the existing, non-diverse database replication solutions, which use the same database server products, are ineffective fault-tolerance mechanisms. At the same time, the findings motivated the use of diverse redundancy (where different database server products are used) as a promising way of improving dependability. It seems that a fault-tolerant server built with diverse database servers would deliver improvements in availability and failure rates compared with the individual database servers or their replicated, non-diverse configurations.
Besides the potential for improving dependability, one would like to evaluate the performance implications of using diverse redundancy in the context of database replication. This is the focal point of the research. The work performed to that end can be summarised as follows:
- We conducted a substantial performance evaluation of database replication using diverse redundancy. We compared its performance to that of various non-diverse configurations as well as non-replicated databases. The experiments revealed systematic differences in the behaviour of diverse servers. They point to the potential for performance improvement when diverse servers are used. Under particular workloads, diverse servers performed better than both non-diverse and non-replicated configurations.
- We devised a middleware-based database replication protocol, which provides dependability assurance and guarantees database consistency. It uses an eager, update-everywhere approach for replica control. Although we focus on the use of diverse database servers, the protocol can also be used with database servers from the same vendor. We provide the correctness criteria of the protocol. Different regimes of operation of the protocol are defined, which allow it to be dynamically optimised for either dependability or performance improvements. Additionally, it can be used in conjunction with high-performance replication solutions.
- We developed an experimental test harness for performance evaluation of different database replication solutions. It enabled us to evaluate the performance of the diverse database replication protocol, e.g. by comparing it against known replication solutions. We show that, as expected, the improved dependability exhibited by our replication protocol carries a performance overhead. Nevertheless, when optimised for performance improvement, our protocol performs well.
- In order to minimise the performance penalty introduced by the replication we propose a scheme whereby the database server processes are prioritised to deliver performance improvements in cases of low to modest resource utilisation by the database servers.
- We performed an uncertainty-explicit assessment of database server products. Using an integrated approach, where both performance and reliability are considered, we rank different database server products to aid selection of the components for the fault-tolerant server built out of diverse databases.
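The adjudication step at the heart of diverse redundancy, running the same statement on diverse replicas and comparing their answers, can be sketched as follows. This is a hypothetical interface, not the thesis's middleware protocol (which also handles replica control and consistency, omitted here):

```python
def diverse_execute(query, servers):
    # Run the same query on each (diverse) replica and adjudicate.
    # A mismatch between non-crashed replicas signals a systematic
    # non-crash failure -- exactly the failure mode that identical,
    # non-diverse replicas would mask.
    results = []
    for server in servers:
        try:
            results.append(("ok", server(query)))
        except Exception as exc:      # crash-style failure of one replica
            results.append(("fail", repr(exc)))
    answers = [r for status, r in results if status == "ok"]
    if not answers:
        return {"status": "unavailable", "results": results}
    if all(a == answers[0] for a in answers):
        return {"status": "agreed", "value": answers[0]}
    return {"status": "mismatch", "results": results}
```

A crash of one replica still yields an agreed answer from the survivors, while a wrong answer from one replica is detected rather than silently returned.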
Adaptive Asynchronous Control and Consistency in Distributed Data Exploration Systems
Advances in machine learning and streaming systems provide a backbone to transform vast arrays of raw data into valuable information. Leveraging distributed execution, analysis engines can process this information effectively within an iterative data exploration workflow to solve problems at unprecedented rates. However, with increased input dimensionality, a desire to simultaneously share and isolate information, as well as overlapping and dependent tasks, this process is becoming increasingly difficult to maintain. User interaction derails exploratory progress due to manual oversight of lower-level tasks such as tuning parameters, adjusting filters, and monitoring queries. We identify human-in-the-loop management of data generation and distributed analysis as an inhibiting problem precluding efficient online, iterative data exploration, which causes delays in knowledge discovery and decision making. The flexible and scalable systems implementing the exploration workflow require semi-autonomous methods integrated as architectural support to reduce human involvement. We thus argue that an abstraction layer providing adaptive asynchronous control and consistency management over a series of individual tasks coordinated to achieve a global objective can significantly improve data exploration effectiveness and efficiency. This thesis introduces methodologies which autonomously coordinate distributed execution at a lower level in order to synchronize multiple efforts as part of a common goal. We demonstrate the impact on data exploration through serverless simulation ensemble management and multi-model machine learning by showing improved performance and reduced resource utilization, enabling a more productive semi-autonomous exploration workflow. We focus on the specific areas of molecular dynamics and personalized healthcare; however, the contributions are applicable to a wide variety of domains.
Improving Preemptive Prioritization via Statistical Characterization of OLTP Locking
OLTP and transactional workloads are increasingly common in computer systems, ranging from e-commerce to warehousing to inventory management. It is valuable to provide priority scheduling in these systems, to reduce the response time for the most important clients, e.g. the "big spenders". Two-phase locking, commonly used in DBMSs, makes prioritization difficult, as transactions wait for locks held by others regardless of priority. Common lock scheduling solutions, including non-preemptive priority inheritance and preemptive abort, have performance drawbacks for TPC-C type workloads. The contributions of this paper are two-fold: (i) We provide a detailed statistical analysis of locking in TPC-C workloads with priorities under several common preemptive and non-preemptive lock prioritization policies. We determine why non-preemptive policies fail to sufficiently help high-priority transactions, and why preemptive policies excessively hurt low-priority transactions. (ii) We propose and implement a policy, POW, that provides all the benefits of preemptive prioritization without its penalties.
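The two baseline lock-scheduling policies the paper contrasts, non-preemptive waiting and preemptive abort, can be sketched with a toy lock manager. This is illustrative only and is not the POW policy itself; the class and its method names are hypothetical.

```python
class PriorityLock:
    # Toy exclusive-lock manager illustrating the trade-off the paper
    # studies.  mode="wait"  -> non-preemptive: a high-priority
    #   requester queues behind any holder (priority inversion).
    # mode="abort" -> preemptive: a high-priority requester aborts a
    #   lower-priority holder (helps the VIP, hurts the victim).
    def __init__(self, mode):
        self.mode = mode
        self.holder = None       # (txn_id, priority) or None
        self.aborted = []        # transactions preempted so far

    def acquire(self, txn, prio):
        if self.holder is None:
            self.holder = (txn, prio)
            return "granted"
        holder_txn, holder_prio = self.holder
        if self.mode == "abort" and prio > holder_prio:
            self.aborted.append(holder_txn)   # preempt: abort the holder
            self.holder = (txn, prio)
            return "granted-after-abort"
        return "blocked"                      # wait for the holder

    def release(self, txn):
        if self.holder and self.holder[0] == txn:
            self.holder = None
```

The statistical analysis in the paper explains why neither extreme is satisfactory: "wait" leaves high-priority transactions queued behind low-priority lock holders, while "abort" wastes the work of aborted low-priority transactions.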