212 research outputs found

    Observing the clouds: a survey and taxonomy of cloud monitoring

    This research was supported by a Royal Society Industry Fellowship and an Amazon Web Services (AWS) grant. Date of Acceptance: 10/12/2014. Monitoring is an important aspect of designing and maintaining large-scale systems. Cloud computing presents a unique set of challenges to monitoring, including on-demand infrastructure, unprecedented scalability, rapid elasticity and performance uncertainty. There is a wide range of monitoring tools originating from cluster and high-performance computing, grid computing and enterprise computing, as well as a series of newer bespoke tools designed exclusively for cloud monitoring. These tools share a number of common elements and designs, which address the demands of cloud monitoring to various degrees. This paper performs an exhaustive survey of contemporary monitoring tools, from which we derive a taxonomy that examines how effectively existing tools and designs meet the challenges of cloud monitoring. We conclude by examining the socio-technical aspects of monitoring and investigating the engineering challenges and practices behind implementing monitoring strategies for cloud computing. Publisher PDF. Peer reviewed.

    A Generic Development and Deployment Framework for Cloud Computing and Distributed Applications

    Cloud computing has paved the way for the advance of IT-based on-demand services. This technology helps decrease operating costs, address scalability issues and relieve many other user and provider constraints. However, developing and deploying distributed applications in cloud environments is becoming an increasingly complex task. Cloud users must spend a lot of time preparing, installing and configuring their applications on clouds. In addition, once developed and deployed, applications can hardly be moved from one cloud to another because of the lack of interoperability between clouds. To address these problems, we present in this paper a novel development and deployment framework for cloud distributed applications/services. Our approach is based on abstraction and object-oriented programming techniques, allowing users to develop and deploy their services into cloud environments easily and rapidly. The approach also enables service migration and interoperability among clouds.

    CloudHealth: A Model-Driven Approach to Watch the Health of Cloud Services

    Cloud systems are large, complex systems in which services provided by different operators must coexist and eventually cooperate. In such a complex environment, controlling the health of both the whole environment and the individual services is extremely important in order to react promptly and effectively to misbehaviours, unexpected events, and failures. Although there are solutions to monitor cloud systems at different granularity levels, how to relate the many KPIs that can be collected about the health of the system and how health information can be properly reported to operators are open questions. This paper reports the early results we achieved in the challenge of monitoring the health of cloud systems. In particular, we present CloudHealth, a model-based health monitoring approach that can be used by operators to watch specific quality attributes. The CloudHealth Monitoring Model describes how to operationalize high-level monitoring goals by dividing them into subgoals, deriving metrics for the subgoals, and using probes to collect the metrics. We use the CloudHealth Monitoring Model to control the probes that must be deployed on the target system, the KPIs that are dynamically collected, and the visualization of the data in dashboards. Comment: 8 pages, 2 figures, 1 table.
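
The goal, subgoal, metric and probe decomposition described in this abstract can be pictured as a small tree. The following is only a minimal sketch under stated assumptions: every class, goal, metric and probe name is hypothetical and chosen for illustration, and it is not the CloudHealth implementation.

```python
from dataclasses import dataclass, field


# All names below are hypothetical; they only illustrate the
# goal -> subgoal -> metric -> probe decomposition from the abstract.
@dataclass(frozen=True)
class Metric:
    name: str
    probe: str  # identifier of the probe assumed to collect this metric


@dataclass
class Goal:
    name: str
    subgoals: list["Goal"] = field(default_factory=list)
    metrics: list[Metric] = field(default_factory=list)


def required_probes(goal: Goal) -> set[str]:
    """Return the probes needed to monitor a goal and all of its subgoals."""
    probes = {metric.probe for metric in goal.metrics}
    for subgoal in goal.subgoals:
        probes |= required_probes(subgoal)
    return probes


# A made-up high-level goal split into two subgoals with derived metrics.
reliability = Goal(
    "reliability",
    subgoals=[
        Goal("availability", metrics=[Metric("uptime_ratio", "http_ping_probe")]),
        Goal(
            "fault_tolerance",
            metrics=[
                Metric("restart_count", "process_probe"),
                Metric("error_rate", "http_ping_probe"),
            ],
        ),
    ],
)

print(sorted(required_probes(reliability)))
```

Walking the tree gives the set of probes to deploy for a chosen goal, which is one plausible way to read "control the probes that must be deployed on the target system".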

    A CIM framework for standard-based system monitoring using Nagios plug-ins

    The Common Information Model (CIM) is a widely accepted industry standard for modelling distributed system objects, as well as their behaviors and interactions, in order to carry out system management tasks. It is endorsed by the Distributed Management Task Force and appears to be the preferred manageability solution for dealing with the ever-increasing heterogeneity that characterizes today’s datacenters. However, a number of enterprise-class system management products, such as Nagios, are not compliant with this standard. Nagios is among the top open-source monitoring tools, backed by a large community of developers producing plug-ins to manage a variety of enterprise systems. As part of the endeavor to accelerate CIM adoption, an extension framework called Plugin Extension for CIM has been developed to expose Nagios and other third-party plug-ins through CIM, thus enhancing the capabilities of standard-based system management tools through the transparent use of the extensive variety of existing plug-ins. This paper describes the developed framework as well as its acceptance within the open-source manageability community. IV Workshop Arquitectura, Redes y Sistemas Operativos (WARSO). Red de Universidades con Carreras en Informática (RedUNCI).

    A Real Time Distributed Network Monitoring Platform (RTDNM)

    As computer networks increase in size and expand geographically, the need to monitor them becomes increasingly important.

    The new services in Nagios network bandwidth utility email notification and SMS alert in improving the network performance

    New services have been added to an existing Nagios system that previously lacked them. Bandwidth monitoring and a notification system are configured to alert network administrators when the bandwidth of an organization's network reaches a configured threshold. The system sends an email alert and an SMS notification to the network administrator, who can then take further action to maintain the Quality of Service (QoS) of the network. All logs of Nagios actions are saved in the Nagios log files. The analysis was based on the case study and problem statements. The Network Development Life Cycle (NDLC) was chosen as the methodology for implementing this system in the network. Nagios is installed on the Ubuntu 10 operating system along with Multi-Router Traffic Grapher (MRTG) and the Postfix mail server, both of which were configured to integrate with Nagios. On the client side, NSClient++ was installed to monitor the bandwidth and performance of Windows-based operating systems. The Nagios services have been improved by the addition of SMS and email notifications, which the existing services did not provide. With these services added to Nagios, network performance could improve further in the future.
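
The threshold-based alerting described in this abstract can be illustrated with a check in the style of a Nagios plug-in, which reports its state through the conventional return codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). This is a minimal sketch, not the authors' configuration: the function name, the Mbit/s unit and the threshold values are assumptions made for illustration, and the email and SMS delivery steps are left out.

```python
# Conventional Nagios plug-in return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_bandwidth(usage_mbps, warn_mbps, crit_mbps):
    """Map a bandwidth reading to a (return code, status line) pair.

    The thresholds are hypothetical; a real deployment would take them
    from the monitoring configuration.
    """
    if usage_mbps is None or usage_mbps < 0:
        return UNKNOWN, "BANDWIDTH UNKNOWN - no valid measurement"
    if usage_mbps >= crit_mbps:
        return CRITICAL, f"BANDWIDTH CRITICAL - {usage_mbps:.1f} Mbit/s"
    if usage_mbps >= warn_mbps:
        return WARNING, f"BANDWIDTH WARNING - {usage_mbps:.1f} Mbit/s"
    return OK, f"BANDWIDTH OK - {usage_mbps:.1f} Mbit/s"


# Example readings checked against assumed thresholds of 70 and 90 Mbit/s.
for reading in (42.0, 75.5, 95.0, None):
    code, message = check_bandwidth(reading, warn_mbps=70.0, crit_mbps=90.0)
    print(code, message)
```

In a setup like the one described, a non-OK state would be what triggers the notification commands for email and SMS.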

    Monitoring Large-Scale Cloud Systems with Layered Gossip Protocols

    Monitoring is an essential aspect of maintaining and developing computer systems, and its difficulty increases in proportion to the size of the system. The need for robust monitoring tools has become more evident with the advent of cloud computing. Infrastructure as a Service (IaaS) clouds allow end users to deploy vast numbers of virtual machines as part of dynamic and transient architectures. Current monitoring solutions, including many of those in the open-source domain, rely on outdated concepts, including manual deployment and configuration and centralised data collection, and adapt poorly to membership churn. In this paper we propose the development of a cloud monitoring suite to provide scalable and robust lookup, data collection and analysis services for large-scale cloud systems. In lieu of centrally managed monitoring, we propose a multi-tier architecture that uses a layered gossip protocol to aggregate monitoring information and facilitate lookup, information collection and the identification of redundant capacity. This allows for a resource-aware data collection and storage architecture that operates over the system being monitored. This in turn enables monitoring to be done in situ, without the need for significant additional infrastructure to support monitoring services. We evaluate this approach against alternative monitoring paradigms and demonstrate how our solution is well adapted to use in a cloud computing context. Comment: Extended Abstract for the ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2013) Poster Track.
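
To give a feel for how gossip can aggregate monitoring data without a central collector, the sketch below uses push-pull averaging, a standard gossip aggregation technique. It is a single-layer simplification, not the layered protocol the abstract proposes, and the node count, load values and function names are assumptions made for illustration.

```python
import random


def gossip_round(values, rng):
    """One round: every node averages its value with one random peer."""
    n = len(values)
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1  # shift so that the chosen peer is never node i itself
        average = (values[i] + values[j]) / 2.0
        values[i] = values[j] = average  # push-pull: both nodes adopt the average


def gossip_average(values, rounds, seed=0):
    """Run several rounds and return each node's local estimate of the mean."""
    rng = random.Random(seed)
    state = list(values)
    for _ in range(rounds):
        gossip_round(state, rng)
    return state


# Hypothetical per-node load readings; no node ever sees all of them at once.
loads = [10.0, 50.0, 30.0, 90.0]
estimates = gossip_average(loads, rounds=30)
print(estimates)
```

Each pairwise exchange preserves the sum of the values, so the local estimates drift toward the global mean while every node only ever talks to one peer at a time.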

    Monitoring in fog computing: state-of-the-art and research challenges

    Fog computing has rapidly become a widely accepted computing paradigm for mitigating the limitations of cloud-based infrastructure, such as scarce bandwidth, high latency, and security and privacy issues. Fog computing resources and applications vary dynamically at run-time; they are highly distributed and mobile, and they can appear and disappear at any time over the internet. Therefore, to ensure quality of service and quality of experience for end-users, it is necessary to adopt a comprehensive monitoring approach. However, the volatility and dynamism of fog resources make monitoring design complex and cumbersome. The aim of this article is therefore three-fold: 1) to analyse fog computing-based infrastructures and existing monitoring solutions; 2) to highlight the main requirements and challenges based on a taxonomy; 3) to identify open issues and potential future research directions. This work has been (partially) funded by H2020 EU/TW 5G-DIVE (Grant 859881) and H2020 5Growth (Grant 856709). It has also been funded by the Spanish State Research Agency (TRUE5G project, PID2019-108713RB-C52 / AEI / 10.13039/501100011033).