Monitoring Large-Scale Cloud Systems with Layered Gossip Protocols
Monitoring is an essential aspect of maintaining and developing computer
systems, and it increases in difficulty in proportion to the size of the system.
The need for robust monitoring tools has become more evident with the advent of
cloud computing. Infrastructure as a Service (IaaS) clouds allow end users to
deploy vast numbers of virtual machines as part of dynamic and transient
architectures. Current monitoring solutions, including many of those in the
open-source domain, rely on outdated concepts such as manual deployment and
configuration and centralised data collection, and they adapt poorly to
membership churn.
In this paper we propose the development of a cloud monitoring suite to
provide scalable and robust lookup, data collection and analysis services for
large-scale cloud systems. In lieu of centrally managed monitoring we propose a
multi-tier architecture using a layered gossip protocol to aggregate monitoring
information and facilitate lookup, information collection and the
identification of redundant capacity. This allows for a resource-aware data
collection and storage architecture that operates over the system being
monitored. This in turn enables monitoring to be done in-situ without the need
for significant additional infrastructure to facilitate monitoring services. We
evaluate this approach against alternative monitoring paradigms and demonstrate
how our solution is well adapted to usage in a cloud-computing context.
Comment: Extended Abstract for the ACM International Symposium on
High-Performance Parallel and Distributed Computing (HPDC 2013) Poster Trac
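The abstract above describes gossip-based aggregation of monitoring information but gives no protocol details. As a rough illustration only, the sketch below shows a single-layer push-pull gossip averaging loop, a standard technique and not necessarily the authors' layered protocol. The function name `gossip_average`, the round count, and the sample CPU readings are all assumptions made for this example.

```python
import random


def gossip_average(values, rounds=30, seed=0):
    """Push-pull gossip averaging (illustrative sketch, not the paper's protocol).

    Each node repeatedly picks a random peer and both adopt the mean of their
    two estimates. The sum of all estimates is preserved, so every estimate
    drifts towards the global mean without any central collector.
    """
    rng = random.Random(seed)
    estimates = list(values)
    n = len(estimates)
    if n < 2:
        return estimates  # nothing to exchange with
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # shift so the chosen peer is never node i itself
            avg = (estimates[i] + estimates[j]) / 2.0
            estimates[i] = estimates[j] = avg
    return estimates


if __name__ == "__main__":
    # Hypothetical per-VM CPU load readings.
    cpu_load = [0.10, 0.90, 0.40, 0.60, 0.25, 0.75]
    est = gossip_average(cpu_load)
    print("spread between node estimates:", max(est) - min(est))
```

A layered variant could, under the same assumptions, run this loop within groups of nodes and then again among one representative per group; that extension is not shown here.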
A Semantic Web of Know-How: Linked Data for Community-Centric Tasks
This paper proposes a novel framework for representing community know-how on
the Semantic Web. Procedural knowledge generated by web communities typically
takes the form of natural language instructions or videos and is largely
unstructured. The absence of semantic structure impedes the deployment of many
useful applications, in particular the ability to discover and integrate
know-how automatically. We discuss the characteristics of community know-how
and argue that existing knowledge representation frameworks fail to represent
it adequately. We present a novel framework for representing the semantic
structure of community know-how and demonstrate the feasibility of our approach
by providing a concrete implementation which includes a method for
automatically acquiring procedural knowledge for real-world tasks.
Comment: 6th International Workshop on Web Intelligence & Communities (WIC14),
Proceedings of the companion publication of the 23rd International Conference
on World Wide Web (WWW 2014
Autonomous Fault Detection in Self-Healing Systems using Restricted Boltzmann Machines
Autonomously detecting and recovering from faults is one approach for
reducing the operational complexity and costs associated with managing
computing environments. We present a novel methodology for autonomously
generating investigation leads that help identify system faults, and extend
our previous work in this area by leveraging Restricted Boltzmann Machines
(RBMs) and contrastive divergence learning to analyse changes in historical
feature data. This allows us to heuristically identify the root cause of a
fault, and we demonstrate an improvement to the state of the art by showing
that feature data can be predicted heuristically beyond a single instance to
include entire sequences of information.
Comment: Published and presented in the 11th IEEE International Conference and
Workshops on Engineering of Autonomic and Autonomous Systems (EASe 2014
Task Scheduling on the Cloud with Hard Constraints
Scheduling Bag-of-Tasks (BoT) applications on the cloud can be more
challenging than in grid and cluster environments. This is because a user may
have a budgetary constraint or a deadline for executing the BoT application in
order to keep the overall execution costs low. This paper investigates
task scheduling on the cloud, given two hard constraints: a user-defined
budget and a deadline. A heuristic algorithm is proposed and implemented to
satisfy the hard constraints for executing the BoT application in a
cost-effective manner. The proposed
algorithm is evaluated using four scenarios that are based on the trade-off
between performance and the cost of using different cloud resource types. The
experimental evaluation confirms the feasibility of the algorithm in satisfying
the constraints. The key observation is that multiple resource types can be a
better alternative to using a single type of resource.
Comment: Visionary Track of the IEEE 11th World Congress on Services (IEEE
SERVICES 2015
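The abstract above does not spell out its heuristic, so the following is only a minimal sketch of how a budget and a deadline can be treated as hard constraints when choosing a mix of resource types. It is a simple greedy search written for illustration, not the algorithm from the paper. The `ResourceType` fields, the per-started-hour billing model, and all numbers are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceType:
    """Hypothetical cloud resource type (names and rates are made up)."""
    name: str
    tasks_per_hour: float
    price_per_hour: float


def evaluate(plan, n_tasks):
    """Return (hours, cost) for a plan mapping ResourceType -> instance count.

    Assumes identical tasks and billing per started hour.
    """
    rate = sum(rt.tasks_per_hour * k for rt, k in plan.items())
    if rate <= 0:
        return math.inf, 0.0
    hours = n_tasks / rate
    hourly_price = sum(rt.price_per_hour * k for rt, k in plan.items())
    return hours, math.ceil(hours) * hourly_price


def greedy_plan(types, n_tasks, budget, deadline_hours, max_instances=100):
    """Add one instance at a time, always staying within the budget.

    Returns a plan that meets both hard constraints, or None if this greedy
    search cannot find one.
    """
    plan = {rt: 0 for rt in types}
    for _ in range(max_instances):
        hours, cost = evaluate(plan, n_tasks)
        if hours <= deadline_hours and cost <= budget:
            return plan
        candidates = []
        for rt in types:
            trial = dict(plan)
            trial[rt] += 1
            t_hours, t_cost = evaluate(trial, n_tasks)
            if t_cost <= budget:  # never step outside the budget
                candidates.append((t_hours, t_cost, rt.name, trial))
        if not candidates:
            return None
        # Prefer the fastest affordable option; break ties by lower cost.
        plan = min(candidates, key=lambda c: c[:3])[3]
    return None


if __name__ == "__main__":
    small = ResourceType("small", tasks_per_hour=10, price_per_hour=1.0)
    large = ResourceType("large", tasks_per_hour=25, price_per_hour=2.0)
    plan = greedy_plan([small, large], n_tasks=100, budget=10.0, deadline_hours=3.0)
    print({rt.name: k for rt, k in plan.items()}, evaluate(plan, 100))
```

Because every candidate is checked against the budget before it is accepted, and a plan is returned only once the deadline also holds, any returned plan satisfies both constraints under the stated billing assumption.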
Cloud Services Brokerage: A Survey and Research Roadmap
A Cloud Services Brokerage (CSB) acts as an intermediary between cloud
service providers (e.g., Amazon and Google) and cloud service end users,
providing a number of value-adding services. CSBs as a research topic are in
their infancy. The goal of this paper is to provide a concise survey of
existing CSB technologies in a variety of areas and highlight a roadmap, which
details five future opportunities for research.
Comment: Paper published in the 8th IEEE International Conference on Cloud
Computing (CLOUD 2015
Executing Bag of Distributed Tasks on Virtually Unlimited Cloud Resources
A Bag-of-Distributed-Tasks (BoDT) application is a collection of identical
and independent tasks, each of which requires a piece of input data located
around the world. Cloud computing therefore offers an effective way to
execute such an application: it not only consists of multiple geographically
distributed data centres but also allows a user to pay only for what she
actually uses. In this paper, executing BoDT on the Cloud using virtually
unlimited cloud resources is investigated. A heuristic algorithm is proposed
to find an execution plan that takes budget constraints into account. Compared
with other approaches, with the same given budget, our algorithm is able to
reduce the overall execution time by up to 50%
- …