research

Monitoring Large-Scale Cloud Systems with Layered Gossip Protocols

Abstract

Monitoring is an essential aspect of maintaining and developing computer systems that increases in difficulty proportional to the size of the system. The need for robust monitoring tools has become more evident with the advent of cloud computing. Infrastructure as a Service (IaaS) clouds allow end users to deploy vast numbers of virtual machines as part of dynamic and transient architectures. Current monitoring solutions, including many of those in the open-source domain rely on outdated concepts including manual deployment and configuration, centralised data collection and adapt poorly to membership churn. In this paper we propose the development of a cloud monitoring suite to provide scalable and robust lookup, data collection and analysis services for large-scale cloud systems. In lieu of centrally managed monitoring we propose a multi-tier architecture using a layered gossip protocol to aggregate monitoring information and facilitate lookup, information collection and the identification of redundant capacity. This allows for a resource aware data collection and storage architecture that operates over the system being monitored. This in turn enables monitoring to be done in-situ without the need for significant additional infrastructure to facilitate monitoring services. We evaluate this approach against alternative monitoring paradigms and demonstrate how our solution is well adapted to usage in a cloud-computing context.Comment: Extended Abstract for the ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2013) Poster Trac

    Similar works

    Full text

    thumbnail-image