Autonomic Provisioning with Self-Adaptive Neural Fuzzy Control for End-to-end Delay Guarantee
Autonomic server provisioning for performance assurance is a critical issue in data centers. It is important but challenging to guarantee an important performance metric, the percentile-based end-to-end delay of requests flowing through a virtualized multi-tier server cluster. This is mainly due to dynamically varying workloads and the lack of an accurate system performance model. In this paper, we propose a novel autonomic server allocation approach based on model-independent and self-adaptive neural fuzzy control. Existing model-independent fuzzy controllers utilize heuristic knowledge in the form of a rule base for performance assurance. Those controllers are designed manually on a trial-and-error basis and are often not effective in the face of highly dynamic workloads. We design the neural fuzzy controller as a hybrid of control-theoretic and machine learning techniques. It is capable of self-constructing its structure and adapting its parameters through fast online learning. Unlike other supervised machine learning techniques, it does not require off-line training. We further enhance the neural fuzzy controller to compensate for the effect of server switching delays. Extensive simulations demonstrate the effectiveness of our new approach in achieving the percentile-based end-to-end delay guarantees. Compared to a server allocation approach based on a rule-based fuzzy controller, the new approach delivers superior performance in the face of highly dynamic workloads. It is robust to workload variation, changes in the delay target, and server switching delays.
Self-Learning Cloud Controllers: Fuzzy Q-Learning for Knowledge Evolution
Cloud controllers aim at responding to application demands by automatically
scaling the compute resources at runtime to meet performance guarantees and
minimize resource costs. Existing cloud controllers often resort to scaling
strategies that are codified as a set of adaptation rules. However, for a cloud
provider, applications running on top of the cloud infrastructure are more or
less black-boxes, making it difficult at design time to define optimal or
pre-emptive adaptation rules. Thus, the burden of taking adaptation decisions
often is delegated to the cloud application. Yet, in most cases, application
developers in turn have limited knowledge of the cloud infrastructure. In this
paper, we propose learning adaptation rules during runtime. To this end, we
introduce FQL4KE, a self-learning fuzzy cloud controller. In particular, FQL4KE
learns and modifies fuzzy rules at runtime. The benefit is that for designing
cloud controllers, we do not have to rely solely on precise design-time
knowledge, which may be difficult to acquire. FQL4KE empowers users to specify
cloud controllers by simply adjusting weights representing priorities in system
goals instead of specifying complex adaptation rules. The applicability of
FQL4KE has been experimentally assessed as part of the cloud application
framework ElasticBench. The experimental results indicate that FQL4KE
outperforms our previously developed fuzzy controller without learning
mechanisms and the native Azure auto-scaling
Auto-scaling techniques for cloud-based Complex Event Processing
One key topic in cloud computing is elasticity, which is the ability of the cloud environment to adapt the resource assignment to the workload demand in a timely manner. According
to the cloud on-demand model, the infrastructure should be able to scale up and down in response to unpredictable workloads, in order to achieve both a guaranteed service level and cost efficiency.
This work addresses the cloud elasticity problem, with particular reference to the Complex
Event Processing (CEP) systems.
CEP systems are designed to process large volumes of event-driven data streams and
continuously provide results with a low latency and in real-time. CEP systems need to
adapt to changing query and event loads. Because of their high computational requirements
and varying loads, CEP systems are distributed systems that run on cloud infrastructures.
In this work we review cloud computing auto-scaling solutions and study their suitability
for the CEP model. We implement some solutions in a CEP prototype and evaluate
the experimental results
PERFUME: Power and performance guarantee with fuzzy MIMO control in virtualized servers
It is important but challenging to assure the performance of multi-tier Internet applications under the power consumption cap of virtualized server clusters, mainly due to the system complexity of shared infrastructure and the dynamic and bursty nature of workloads. This paper presents PERFUME, a system that simultaneously guarantees power and performance targets with flexible tradeoffs while assuring control accuracy and system stability. Based on the proposed fuzzy MIMO control technique, it accurately controls both the throughput and the percentile-based response time of multi-tier applications, owing to its novel fuzzy modeling that integrates the strengths of fuzzy logic, MIMO control and artificial neural networks. It is self-adaptive to highly dynamic and bursty workloads due to online learning of control model parameters using a computationally efficient weighted recursive least-squares method. We implement PERFUME in a testbed of virtualized blade servers hosting two multi-tier RUBiS applications. Experimental results demonstrate its control accuracy, system stability, flexibility in selecting trade-offs between conflicting targets, and robustness against highly dynamic variation and burstiness in workloads. It outperforms a representative utility-based approach in guaranteeing the system throughput, percentile-based response time and power budget in the face of highly dynamic and bursty workloads.
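As a rough illustration of the online parameter learning mentioned in the abstract above, the following minimal sketch shows one exponentially weighted recursive least-squares update. The scalar model y = theta * u, the forgetting factor value, and all function and variable names are assumptions made for illustration; they are not taken from PERFUME, whose control model is multi-input multi-output.

```python
def rls_update(theta, p, u, y, lam=0.95):
    """One exponentially weighted RLS step for the scalar model y = theta * u.

    theta: current parameter estimate (hypothetical scalar model)
    p:     current estimate covariance (large value = low confidence)
    u, y:  newly observed input and output
    lam:   forgetting factor in (0, 1]; smaller values discount old data faster
    """
    gain = p * u / (lam + u * p * u)        # how much to trust the new sample
    theta = theta + gain * (y - theta * u)  # correct by the prediction error
    p = (p - gain * u * p) / lam            # shrink, then re-inflate covariance
    return theta, p


if __name__ == "__main__":
    # Synthetic data generated by y = 2 * u (illustrative only).
    theta, p = 0.0, 1000.0
    for u in [1.0, 2.0, 3.0, 4.0]:
        theta, p = rls_update(theta, p, u, 2.0 * u)
    print(theta)
```

Each update costs a constant number of arithmetic operations, which is why recursive least squares is commonly chosen for online model adaptation.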
V-Cache: Towards Flexible Resource Provisioning for Multi-tier Applications in IaaS Clouds
Although the resource elasticity offered by Infrastructure-as-a-Service (IaaS) clouds opens up opportunities for elastic application performance, it also poses challenges to application management. Cluster applications, such as multi-tier websites, further complicate the management, requiring not only accurate capacity planning but also proper partitioning of the resources into a number of virtual machines. Instead of burdening cloud users with complex management, we move the task of determining the optimal resource configuration for cluster applications to cloud providers. We find that a structural reorganization of multi-tier websites, by adding a caching tier which runs on resources debited from the original resource budget, significantly boosts application performance and reduces resource usage. We propose V-Cache, a machine learning based approach to flexible provisioning of resources for multi-tier applications in clouds. V-Cache transparently places a caching proxy in front of the application. It uses a genetic algorithm to identify the incoming requests that benefit most from caching and dynamically resizes the cache space to accommodate these requests. We develop a reinforcement learning algorithm to optimally allocate the remaining capacity to other tiers. We have implemented V-Cache on a VMware-based cloud testbed. Experiment results with the RUBiS and WikiBench benchmarks show that V-Cache outperforms a representative capacity management scheme and a cloud-cache based resource provisioning approach by at least 15% in performance, and achieves at least 11% and 21% savings on CPU and memory resources, respectively.
Towards a novel biologically-inspired cloud elasticity framework
With the widespread use of the Internet, the popularity of web applications has
significantly increased. Such applications are subject to unpredictable workload
conditions that vary from time to time. For example, an e-commerce website may
face higher workloads than normal during festivals or promotional schemes. Such
applications are critical, and performance-related issues or service disruptions can
result in financial losses. Cloud computing with its attractive feature of dynamic
resource provisioning (elasticity) is a perfect match to host such applications.
The rapid growth in the usage of the cloud computing model, as well as the rise in
complexity of web applications, poses new challenges regarding the effective
monitoring and management of the underlying cloud computational resources.
This thesis investigates the state-of-the-art elastic methods including the models
and techniques for the dynamic management and provisioning of cloud resources
from a service provider perspective.
An elastic controller is responsible for determining the optimal number of cloud resources
required at a particular time to achieve the desired performance demands.
Researchers and practitioners have proposed many elastic controllers using versatile
techniques ranging from simple if-then-else based rules to sophisticated
optimisation, control theory and machine learning based methods. However,
despite an extensive range of existing elasticity research, the aim of implementing
an efficient scaling technique that satisfies the actual demands is still a challenge
to achieve. There exist many issues that have not received much attention from
a holistic point of view. Some of these issues include: 1) the lack of adaptability
and static scaling behaviour whilst considering completely fixed approaches; 2)
the burden of additional computational overhead, the inability to cope with the
sudden changes in the workload behaviour and the preference of adaptability
over reliability at runtime whilst considering the fully dynamic approaches; and 3)
the lack of considering uncertainty aspects while designing auto-scaling solutions.
This thesis seeks solutions to address these issues altogether using an integrated
approach. Moreover, this thesis aims at the provision of qualitative elasticity rules.
This thesis proposes a novel biologically-inspired switched feedback control
methodology to address the horizontal elasticity problem. The switched methodology
utilises multiple controllers simultaneously, whereas the selection of a
suitable controller is realised using an intelligent switching mechanism. Each
controller itself depicts a different elasticity policy that can be designed using the
principles of fixed gain feedback controller approach. The switching mechanism
is implemented using a fuzzy system that determines a suitable controller/policy
at runtime based on the current behaviour of the system. Furthermore,
to improve the possibility of bumpless transitions and to avoid the oscillatory
behaviour, which is a problem commonly associated with switching based control
methodologies, this thesis proposes an alternative soft switching approach. This
soft switching approach incorporates a biologically-inspired Basal Ganglia based
computational model of action selection.
In addition, this thesis formulates the problem of designing the membership functions
of the switching mechanism as a multi-objective optimisation problem. The
key purpose behind this formulation is to obtain the near optimal (or to fine tune)
parameter settings for the membership functions of the fuzzy control system in
the absence of domain experts' knowledge. This problem is addressed by using
two different techniques including the commonly used Genetic Algorithm and
an alternative less known economic approach called the Taguchi method. Lastly,
we identify seven different kinds of real workload patterns, each of which reflects
a different set of applications. Six real and one synthetic HTTP traces, one for
each pattern, are further identified and utilised to evaluate the performance of
the proposed methods against the state-of-the-art approaches
A control theoretical view of cloud elasticity: taxonomy, survey and challenges
The lucrative features of cloud computing, such as a pay-as-you-go pricing model and dynamic resource provisioning (elasticity), attract clients to host their applications in the cloud to save up-front capital expenditure and to reduce the operational cost of the system. However, the efficient management of hired computational resources is a challenging task. Over the last decade, researchers and practitioners have made use of various techniques to propose new methods to address cloud elasticity. Amongst many such techniques, control theory has emerged as one of the popular methods to implement elasticity. A plethora of research has been undertaken on cloud elasticity, including several review papers that summarise various aspects of elasticity. However, the scope of the existing review articles is broad and focused mostly on a high-level view of the overall research works rather than on the specific details of a particular implementation technique. Considering the importance, suitability and abundance of control theoretical approaches, this paper is a step towards a stand-alone review of control theoretic aspects of cloud elasticity. This paper provides a detailed taxonomy comprising relevant attributes that define two perspectives: control theory as an implementation technique, and cloud elasticity as a target application domain. We carry out an exhaustive review of the literature by classifying the existing elasticity solutions using the attributes of the control theoretic perspective. The summarized results are further presented by clustering them with respect to the type of control solution, thus helping in the comparison of related control solutions. Lastly, a discussion summarizing the pros and cons of each type of control solution is presented. This discussion is followed by a detailed description of various open research challenges in the field
Autonomic management of virtualized resources in cloud computing
The last five years have witnessed a rapid growth of cloud computing in business, governmental and educational IT deployment. The success of cloud services depends critically on the effective management of virtualized resources. A key requirement of cloud management is the ability to dynamically match resource allocations to actual demands. To this end, we aim to design and implement a cloud resource management mechanism that manages underlying complexity, automates resource provisioning and controls client-perceived quality of service (QoS) while still achieving resource efficiency.
The design of an automatic resource management centers on two questions: when to adjust resource allocations and how much to adjust. In a cloud, applications have different definitions on capacity and cloud dynamics makes it difficult to determine a static resource to performance relationship. In this dissertation, we have proposed a generic metric that measures application capacity, designed model-independent and adaptive approaches to manage resources and built a cloud management system scalable to a cluster of machines.
To understand web system capacity, we propose to use a metric of
productivity index (PI), which is defined as the ratio of yield to
cost, to measure the system processing capability online. PI is a generic concept that can be applied to different levels to monitor system progress in order to identify if more capacity is needed. We applied the concept of PI to the problem of overload prevention in multi-tier websites. The overload predictor built on the PI metric shows more accurate and responsive overload prevention compared to conventional approaches.
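The ratio definition above can be sketched in a few lines. This is a minimal illustration only: the choice of yield and cost measurements, the `drop_ratio` threshold, and the function names are all hypothetical, and the threshold rule is a stand-in rather than the overload predictor built in the dissertation.

```python
def productivity_index(yield_value, cost):
    """Productivity index: ratio of yield to cost.

    What counts as yield (for example completed requests) and cost (for
    example CPU time consumed) is an assumption left to the caller.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return yield_value / cost


def needs_more_capacity(pi_history, drop_ratio=0.8):
    """Hypothetical rule: flag when the latest PI falls below
    drop_ratio times the best PI observed in earlier intervals."""
    if len(pi_history) < 2:
        return False
    return pi_history[-1] < drop_ratio * max(pi_history[:-1])


if __name__ == "__main__":
    # Illustrative (yield, cost) samples for three monitoring intervals.
    samples = [(100, 50), (210, 100), (150, 100)]
    history = [productivity_index(y, c) for y, c in samples]
    print(history, needs_more_capacity(history))
```

Because the index is a plain ratio, the same helper can be applied at different levels (a single tier, a virtual machine, or a whole site) as long as yield and cost are measured over the same interval.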
To address the lack of an accurate server model, we propose a model-independent fuzzy control based approach for CPU allocation. For adaptive and stable control performance, we embed the controller with self-tuning output amplification and flexible rule selection. Finally, we build a QoS provisioning framework that supports multi-objective QoS control and service differentiation. Experiments on a virtual cluster with two service classes show the effectiveness of our approach in both performance and power control.
To address the problems of complex interplay between resources and process delays in fine-grained multi-resource allocation, we consider capacity management as a decision-making problem and employ reinforcement learning (RL) to optimize the process. The optimization depends on trial-and-error interactions with the cloud system. In order to improve the initial management performance, we propose a model-based RL algorithm. The neural network based environment model, which is learned from previous management history, generates simulated resource allocations for the RL agent. Experiment results on heterogeneous applications show that our approach makes efficient use of limited interactions and finds near-optimal resource configurations within 7 steps.
Finally, we present a distributed reinforcement learning approach to cluster-wide cloud resource management. We decompose the cluster-wide resource allocation problem into sub-problems concerning individual VM resource configurations. The cluster-wide allocation is optimized if individual VMs meet their SLA with a high resource utilization. For scalability, we develop an efficient reinforcement learning approach with continuous state space. For adaptability, we use VM low-level runtime statistics to accommodate workload dynamics. Prototyped in an iBalloon system, the distributed learning approach successfully manages 128 VMs on a 16-node closely correlated cluster
Monitoring and Optimization of ATLAS Tier 2 Center GoeGrid
The demand on computational and storage resources is growing along with the amount of information
that needs to be processed and preserved. In order to ease the provisioning of digital
services to the growing number of consumers, more and more distributed computing systems and
platforms are actively developed and employed. The building blocks of the distributed computing
infrastructure are single computing centers, such as the Worldwide LHC Computing Grid Tier
2 centre GoeGrid. The main motivation of this thesis was the optimization of GoeGrid performance
by efficient monitoring. The goal has been achieved by means of the GoeGrid monitoring
information analysis. The data analysis approach was based on the adaptive-network-based
fuzzy inference system (ANFIS) and machine learning algorithms such as the Linear Support Vector
Machine (SVM).
The main object of the research was the digital service, since the availability, reliability and serviceability
of the computing platform can be measured according to the constant and stable
provisioning of the services. Due to the widely used concept of the service oriented architecture
(SOA) for large computing facilities, knowing the service state in advance, as well as detecting
its failure quickly and accurately, allows proactive management of the computing
facility. Proactive management is considered a core component of computing
facility management automation concepts such as Autonomic Computing. Thus, timely, advance
and accurate identification of the provided service status can be considered a
contribution to computing facility management automation, which is directly related to the
provisioning of stable and reliable computing resources.
Based on the case studies performed using the GoeGrid monitoring data, it is reasonable to consider
the approaches as generalized methods for the accurate and fast identification and prediction of the
service status. Their simplicity and low consumption of computing resources allow
the methods to be considered within the scope of an Autonomic Computing component