Tablespoon - real-time system metric monitoring for Karamel

Abstract

System resource utilisation metrics are an important input to the decision-making process of a general-purpose auto-scaling solution in cloud computing. It is critical for a monitoring system to be lightweight in its own use of system resources. This work presents Tablespoon, a real-time monitoring system. It is built on a publish-subscribe architecture in which agents push events to subscribers. Tablespoon keeps its bandwidth usage low through agent-side filtering and an inter-group aggregation mechanism. Our solution ensures that requested events are received at most once by the subscriber, whilst minimising the loss of such events. The evaluation shows that, in a limited use case, the average latency between publisher and subscriber is around 500-700 ms. Furthermore, when tested on Amazon Web Services, the average CPU usage of the agents is about 0.5 percent, depending on the use case.
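
To make the idea of agent-side filtering concrete, the following is a minimal sketch and not Tablespoon's actual API. The names `Subscription`, `Agent`, `on_sample` and the simple "greater than threshold" filter are assumptions introduced purely for illustration. It shows an agent that pushes a sample only when it matches a subscription, so non-matching samples never leave the host.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subscription:
    """Hypothetical subscription: a metric name and a threshold filter."""
    metric: str
    threshold: float  # push only samples strictly above this value


class Agent:
    """Hypothetical agent that filters samples locally before pushing."""

    def __init__(self, push: Callable[[str, float], None]) -> None:
        self._push = push  # stands in for the network send to the subscriber
        self._subscriptions: List[Subscription] = []

    def subscribe(self, subscription: Subscription) -> None:
        self._subscriptions.append(subscription)

    def on_sample(self, metric: str, value: float) -> None:
        # Agent-side filtering: a sample that matches no subscription is
        # dropped here and is never sent over the network.
        for sub in self._subscriptions:
            if sub.metric == metric and value > sub.threshold:
                self._push(metric, value)


# Usage: collect pushed events in a list instead of sending them.
received = []
agent = Agent(push=lambda metric, value: received.append((metric, value)))
agent.subscribe(Subscription(metric="cpu", threshold=80.0))

for value in (12.0, 85.5, 40.0, 91.0):
    agent.on_sample("cpu", value)
agent.on_sample("mem", 99.0)  # no subscription for "mem", so it is dropped
```

In this sketch only two of the five samples would be pushed, which is the sense in which filtering at the agent can reduce bandwidth compared with forwarding every sample to the subscriber.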
