Self-Learning Cloud Controllers: Fuzzy Q-Learning for Knowledge Evolution
Cloud controllers aim at responding to application demands by automatically
scaling the compute resources at runtime to meet performance guarantees and
minimize resource costs. Existing cloud controllers often resort to scaling
strategies that are codified as a set of adaptation rules. However, for a cloud
provider, applications running on top of the cloud infrastructure are more or
less black-boxes, making it difficult at design time to define optimal or
pre-emptive adaptation rules. Thus, the burden of taking adaptation decisions
often is delegated to the cloud application. Yet, in most cases, application
developers in turn have limited knowledge of the cloud infrastructure. In this
paper, we propose learning adaptation rules during runtime. To this end, we
introduce FQL4KE, a self-learning fuzzy cloud controller. In particular, FQL4KE
learns and modifies fuzzy rules at runtime. The benefit is that for designing
cloud controllers, we do not have to rely solely on precise design-time
knowledge, which may be difficult to acquire. FQL4KE empowers users to specify
cloud controllers by simply adjusting weights representing priorities in system
goals instead of specifying complex adaptation rules. The applicability of
FQL4KE has been experimentally assessed as part of the cloud application
framework ElasticBench. The experimental results indicate that FQL4KE
outperforms our previously developed fuzzy controller without learning
mechanisms and the native Azure auto-scaling.
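The abstract above does not spell out the learning rule, so the following is only a minimal sketch of generic fuzzy Q-learning, not the FQL4KE implementation. It assumes a single normalized input (`load` in [0, 1]), three fuzzy rules (low, medium, high), and three hypothetical scaling actions. All names, membership functions, and parameter values are illustrative assumptions.

```python
import random

# Hypothetical scaling actions: remove one VM, hold, add one VM.
ACTIONS = [-1, 0, 1]


def firing_strengths(load):
    """Degrees to which a normalized load in [0, 1] is low, medium, high.

    The three memberships are assumed shapes that always sum to 1.
    """
    low = max(0.0, 1.0 - 2.0 * load)
    high = max(0.0, 2.0 * load - 1.0)
    medium = 1.0 - low - high
    return [low, medium, high]


class FuzzyQLearner:
    """One row of q-values per fuzzy rule, one column per action."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = [[0.0] * len(ACTIONS) for _ in range(3)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, load):
        """Choose an action per rule (epsilon-greedy), then blend the
        choices by firing strength into one control output."""
        phi = firing_strengths(load)
        chosen = []
        for row in self.q:
            if self.rng.random() < self.epsilon:
                chosen.append(self.rng.randrange(len(ACTIONS)))
            else:
                chosen.append(max(range(len(ACTIONS)), key=row.__getitem__))
        action = sum(p * ACTIONS[a] for p, a in zip(phi, chosen))
        q_value = sum(p * self.q[i][a]
                      for i, (p, a) in enumerate(zip(phi, chosen)))
        return action, chosen, phi, q_value

    def update(self, phi, chosen, q_value, reward, next_load):
        """Temporal-difference update, credited to each rule in
        proportion to how strongly it fired."""
        next_phi = firing_strengths(next_load)
        next_value = sum(p * max(row) for p, row in zip(next_phi, self.q))
        td_error = reward + self.gamma * next_value - q_value
        for i, (p, a) in enumerate(zip(phi, chosen)):
            self.q[i][a] += self.alpha * td_error * p


# One interaction step with exploration switched off.
learner = FuzzyQLearner(epsilon=0.0)
action, chosen, phi, q_value = learner.act(1.0)
learner.update(phi, chosen, q_value, reward=1.0, next_load=1.0)
```

In a controller of this kind, the reward would plausibly combine the user-weighted goals mentioned in the abstract (for example performance against cost), but the exact reward definition is not given here and is left out of the sketch.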
Autonomic management of virtualized resources in cloud computing
The last five years have witnessed a rapid growth of cloud computing in business, governmental and educational IT deployment. The success of cloud services depends critically on the effective management of virtualized resources. A key requirement of cloud management is the ability to dynamically match resource allocations to actual demands. To this end, we aim to design and implement a cloud resource management mechanism that manages underlying complexity, automates resource provisioning and controls client-perceived quality of service (QoS) while still achieving resource efficiency.
The design of automatic resource management centers on two questions: when to adjust resource allocations and how much to adjust. In a cloud, applications define capacity differently, and cloud dynamics make it difficult to determine a static resource-to-performance relationship. In this dissertation, we have proposed a generic metric that measures application capacity, designed model-independent and adaptive approaches to manage resources and built a cloud management system scalable to a cluster of machines.
To understand web system capacity, we propose to use a metric of productivity index (PI), which is defined as the ratio of yield to cost, to measure the system processing capability online. PI is a generic concept that can be applied to different levels to monitor system progress in order to identify if more capacity is needed. We applied the concept of PI to the problem of overload prevention in multi-tier websites. The overload predictor built on the PI metric shows more accurate and responsive overload prevention compared to conventional approaches.
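As a rough illustration of the yield-to-cost ratio described above, the sketch below computes a PI series and raises a simple alarm when PI drops. The choice of yield (completed requests per second) and cost (CPU-seconds), the sample values, and the `capacity_alarm` rule with its `window` and `drop_ratio` parameters are assumptions made for this example. The dissertation's actual overload predictor is not described in this abstract.

```python
def productivity_index(yield_value, cost):
    """Return yield divided by cost; raises if cost is not positive."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return yield_value / cost


def capacity_alarm(pi_samples, window=3, drop_ratio=0.8):
    """Return True when the mean of the last `window` PI samples falls
    below `drop_ratio` times the best PI seen before that window.

    This thresholding rule is a hypothetical stand-in, not the
    predictor from the dissertation.
    """
    if len(pi_samples) <= window:
        return False  # not enough history to compare against
    recent = pi_samples[-window:]
    earlier_peak = max(pi_samples[:-window])
    return sum(recent) / window < drop_ratio * earlier_peak


# Hypothetical samples: (completed requests per second, CPU-seconds used).
samples = [(100, 10), (210, 20), (290, 30), (300, 50), (310, 60), (300, 60)]
pi_series = [productivity_index(y, c) for y, c in samples]
alarm = capacity_alarm(pi_series)
```

Here the yield keeps rising slowly while the cost rises faster, so PI falls even though throughput has not yet dropped, which is the kind of early signal a ratio metric can expose.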
To address the lack of an accurate server model, we propose a model-independent, fuzzy-control-based approach for CPU allocation. For adaptive and stable control performance, we embed the controller with self-tuning output amplification and flexible rule selection. Finally, we build a QoS provisioning framework that supports multi-objective QoS control and service differentiation. Experiments on a virtual cluster with two service classes show the effectiveness of our approach in both performance and power control.
To address the problems of complex interplay between resources and process delays in fine-grained multi-resource allocation, we consider capacity management as a decision-making problem and employ reinforcement learning (RL) to optimize the process. The optimization depends on trial-and-error interactions with the cloud system. In order to improve the initial management performance, we propose a model-based RL algorithm. The neural-network-based environment model, which is learned from previous management history, generates simulated resource allocations for the RL agent. Experimental results on heterogeneous applications show that our approach makes efficient use of limited interactions and finds near-optimal resource configurations within 7 steps.
Finally, we present a distributed reinforcement learning approach to cluster-wide cloud resource management. We decompose the cluster-wide resource allocation problem into sub-problems concerning individual VM resource configurations. The cluster-wide allocation is optimized if individual VMs meet their SLA with a high resource utilization. For scalability, we develop an efficient reinforcement learning approach with continuous state space. For adaptability, we use VM low-level runtime statistics to accommodate workload dynamics. Prototyped in an iBalloon system, the distributed learning approach successfully manages 128 VMs on a 16-node closely correlated cluster.
Working Notes from the 1992 AAAI Spring Symposium on Practical Approaches to Scheduling and Planning
The symposium presented issues involved in the development of scheduling systems that can deal with resource and time limitations. To qualify, a system must be implemented and tested to some degree on non-trivial problems (ideally, on real-world problems). However, a system need not be fully deployed to qualify. Systems that schedule actions in terms of metric time constraints typically represent and reason about an external numeric clock or calendar and can be contrasted with those systems that represent time purely symbolically. The following topics are discussed: integrating planning and scheduling; integrating symbolic goals and numerical utilities; managing uncertainty; incremental rescheduling; managing limited computation time; anytime scheduling and planning algorithms, systems; dependency analysis and schedule reuse; management of schedule and plan execution; and incorporation of discrete event techniques.
Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures
Cloud controllers support the operation and quality management of dynamic cloud architectures by automatically scaling the compute resources to meet performance guarantees and minimize resource costs. Existing cloud controllers often resort to scaling strategies that are codified as a set of architecture adaptation rules. However, for a cloud provider, deployed application architectures are black-boxes, making it difficult at design time to define optimal or pre-emptive adaptation rules. Thus, the burden of taking adaptation decisions often is delegated to the cloud application. We propose the dynamic learning of adaptation rules for deployed application architectures in the cloud. We introduce FQL4KE, a self-learning fuzzy controller that learns and modifies fuzzy rules at runtime. The benefit is that we do not have to rely solely on precise design-time knowledge, which may be difficult to acquire. FQL4KE empowers users to configure cloud controllers by simply adjusting weights representing priorities for architecture quality instead of defining complex rules. FQL4KE has been experimentally validated using the cloud application framework ElasticBench in Azure and OpenStack. The experimental results demonstrate that FQL4KE outperforms both a fuzzy controller without learning and the native Azure auto-scaling.
What's unusual in online disease outbreak news?
Background: Accurate and timely detection of public health events of
international concern is necessary to help support risk assessment and response
and save lives. Novel event-based methods that use the World Wide Web as a
signal source offer potential to extend health surveillance into areas where
traditional indicator networks are lacking. In this paper we address the issue
of systematically evaluating online health news to support automatic alerting
using daily disease-country counts text mined from real world data using
BioCaster. For 18 data sets produced by BioCaster, we compare 5 aberration
detection algorithms (EARS C2, C3, W2, F-statistic and EWMA) for performance
against expert moderated ProMED-mail postings. Results: We report sensitivity,
specificity, positive predictive value (PPV), negative predictive value (NPV),
mean alerts/100 days and F1, at 95% confidence interval (CI) for 287
ProMED-mail postings on 18 outbreaks across 14 countries over a 366 day period.
Results indicate that W2 had the best F1 with a slight benefit for day of week
effect over C2. In drill down analysis we indicate issues arising from the
granular choice of country-level modeling, sudden drops in reporting due to day
of week effects and reporting bias. Automatic alerting has been implemented in
BioCaster available from http://born.nii.ac.jp. Conclusions: Online health news
alerts have the potential to enhance manual analytical methods by increasing
throughput, timeliness and detection rates. Systematic evaluation of health
news aberrations is necessary to push forward our understanding of the complex
relationship between news report volumes and case numbers and to select the
best performing features and algorithms.
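The evaluation measures named in the abstract above follow standard definitions over daily alert decisions compared with a gold standard. The sketch below computes them from two boolean series. The function names and the five-day toy data are assumptions for illustration, not BioCaster code or results from the study.

```python
import math


def safe_div(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def evaluate_alerts(alerts, gold):
    """Compare daily alert flags with gold-standard outbreak flags.

    Both arguments are equal-length sequences of booleans, one per day.
    """
    if len(alerts) != len(gold):
        raise ValueError("alerts and gold must cover the same days")
    tp = sum(1 for a, g in zip(alerts, gold) if a and g)
    fp = sum(1 for a, g in zip(alerts, gold) if a and not g)
    fn = sum(1 for a, g in zip(alerts, gold) if not a and g)
    tn = sum(1 for a, g in zip(alerts, gold) if not a and not g)
    sensitivity = safe_div(tp, tp + fn)
    ppv = safe_div(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": safe_div(tn, tn + fp),
        "ppv": ppv,
        "npv": safe_div(tn, tn + fn),
        "f1": safe_div(2 * ppv * sensitivity, ppv + sensitivity),
        "alerts_per_100_days": safe_div(100.0 * (tp + fp), len(alerts)),
    }


# Toy five-day example (made-up flags, not data from the study).
alerts = [True, False, True, False, False]
gold = [True, False, False, True, False]
scores = evaluate_alerts(alerts, gold)
```

Because outbreak days are rare in a year-long series, specificity and NPV tend to look high for almost any detector, which is one reason a summary such as F1 is useful for ranking algorithms.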
Energy-Efficient On-Board Radio Resource Management for Satellite Communications via Neuromorphic Computing
The latest satellite communication (SatCom) missions are characterized by a
fully reconfigurable on-board software-defined payload, capable of adapting
radio resources to the temporal and spatial variations of the system traffic.
As pure optimization-based solutions have shown to be computationally tedious
and to lack flexibility, machine learning (ML)-based methods have emerged as
promising alternatives. We investigate the application of energy-efficient
brain-inspired ML models for on-board radio resource management. Apart from
software simulation, we report extensive experimental results leveraging the
recently released Intel Loihi 2 chip. To benchmark the performance of the
proposed model, we implement conventional convolutional neural networks (CNN)
on a Xilinx Versal VCK5000, and provide a detailed comparison of accuracy,
precision, recall, and energy efficiency for different traffic demands. Most
notably, for relevant workloads, spiking neural networks (SNNs) implemented on
Loihi 2 yield higher accuracy, while reducing power consumption by more than
100× as compared to the CNN-based reference platform. Our findings point
to the significant potential of neuromorphic computing and SNNs in supporting
on-board SatCom operations, paving the way for enhanced efficiency and
sustainability in future SatCom systems.
Comment: currently under review at IEEE Transactions on Machine Learning in Communications and Networking
The 1990 progress report and future plans
This document describes the progress and plans of the Artificial Intelligence Research Branch (RIA) at ARC in 1990. Activities span a range from basic scientific research to engineering development and to fielded NASA applications, particularly those applications that are enabled by basic research carried out at RIA. Work is conducted in-house and through collaborative partners in academia and industry. Our major focus is on a limited number of research themes with a dual commitment to technical excellence and proven applicability to NASA short-, medium-, and long-term problems. RIA acts as the Agency's lead organization for research aspects of artificial intelligence, working closely with a second research laboratory at JPL and AI applications groups at all NASA centers.