839 research outputs found
A Framework for QoS-aware Execution of Workflows over the Cloud
The Cloud Computing paradigm is providing system architects with a new
powerful tool for building scalable applications. Clouds allow allocation of
resources on a "pay-as-you-go" model, so that additional resources can be
requested during peak loads and released after that. However, this flexibility
asks for appropriate dynamic reconfiguration strategies. In this paper we
describe SAVER (qoS-Aware workflows oVER the Cloud), a QoS-aware algorithm for
executing workflows involving Web Services hosted in a Cloud environment. SAVER
allows execution of arbitrary workflows subject to response time constraints.
SAVER uses a passive monitor to identify workload fluctuations based on the
observed system response time. The information collected by the monitor is used
by a planner component to identify the minimum number of instances of each Web
Service which should be allocated in order to satisfy the response time
constraint. SAVER uses a simple Queueing Network (QN) model to identify the
optimal resource allocation. Specifically, the QN model is used to identify
bottlenecks, and predict the system performance as Cloud resources are
allocated or released. The parameters used to evaluate the model are those
collected by the monitor, which means that SAVER does not require any
particular knowledge of the Web Services and workflows being executed. Our
approach has been validated through numerical simulations, whose results are
reported in this paper
Dependence-driven techniques in system design
Burstiness in workloads is often found in multi-tier architectures, storage systems, and communication networks. This feature is extremely important in system design because it can significantly degrade system performance and availability. This dissertation focuses on how to use knowledge of burstiness to develop new techniques and tools for performance prediction, scheduling, and resource allocation under bursty workload conditions.;For multi-tier enterprise systems, burstiness in the service times is catastrophic for performance. Via detailed experimentation, we identify the cause of performance degradation on the persistent bottleneck switch among various servers. This results in an unstable behavior that cannot be captured by existing capacity planning models. In this dissertation, beyond identifying the cause and effects of bottleneck switch in multi-tier systems, we also propose modifications to the classic TPC-W benchmark to emulate bursty arrivals in multi-tier systems.;This dissertation also demonstrates how burstiness can be used to improve system performance. Two dependence-driven scheduling policies, SWAP and ALoC, are developed. These general scheduling policies counteract burstiness in workloads and maintain high availability by delaying selected requests that contribute to burstiness. Extensive experiments show that both SWAP and ALoC achieve good estimates of service times based on the knowledge of burstiness in the service process. as a result, SWAP successfully approximates the shortest job first (SJF) scheduling without requiring a priori information of job service times. ALoC adaptively controls system load by infinitely delaying only a small fraction of the incoming requests.;The knowledge of burstiness can also be used to forecast the length of idle intervals in storage systems. In practice, background activities are scheduled during system idle times. The scheduling of background jobs is crucial in terms of the performance degradation of foreground jobs and the utilization of idle times. In this dissertation, new background scheduling schemes are designed to determine when and for how long idle times can be used for serving background jobs, without violating predefined performance targets of foreground jobs. Extensive trace-driven simulation results illustrate that the proposed schemes are effective and robust in a wide range of system conditions. Furthermore, if there is burstiness within idle times, then maintenance features like disk scrubbing and intra-disk data redundancy can be successfully scheduled as background activities during idle times
Run Time Approximation of Non-blocking Service Rates for Streaming Systems
Stream processing is a compute paradigm that promises safe and efficient
parallelism. Modern big-data problems are often well suited for stream
processing's throughput-oriented nature. Realization of efficient stream
processing requires monitoring and optimization of multiple communications
links. Most techniques to optimize these links use queueing network models or
network flow models, which require some idea of the actual execution rate of
each independent compute kernel within the system. What we want to know is how
fast can each kernel process data independent of other communicating kernels.
This is known as the "service rate" of the kernel within the queueing
literature. Current approaches to divining service rates are static. Modern
workloads, however, are often dynamic. Shared cloud systems also present
applications with highly dynamic execution environments (multiple users,
hardware migration, etc.). It is therefore desirable to continuously re-tune an
application during run time (online) in response to changing conditions. Our
approach enables online service rate monitoring under most conditions,
obviating the need for reliance on steady state predictions for what are
probably non-steady state phenomena. First, some of the difficulties associated
with online service rate determination are examined. Second, the algorithm to
approximate the online non-blocking service rate is described. Lastly, the
algorithm is implemented within the open source RaftLib framework for
validation using a simple microbenchmark as well as two full streaming
applications.Comment: technical repor
- …