
    LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses

    System monitoring is an established tool to measure the utilization and health of HPC systems. Usually, system monitoring infrastructures make no connection to job information and do not utilize hardware performance monitoring (HPM) data. To increase the efficient use of HPC systems, automatic and continuous performance monitoring of jobs is an essential component. It can help to identify pathological cases, provides instant performance feedback to the users, offers initial data to judge the optimization potential of applications, and helps to build a statistical foundation about application-specific system usage. The LIKWID monitoring stack is a modular framework built on top of the LIKWID tools library. It aims to enable job-specific performance monitoring using HPM data, system metrics, and application-level data for small to medium-sized commodity clusters. Moreover, it is designed to integrate into existing monitoring infrastructures to speed up the change from pure system monitoring to job-aware monitoring. Comment: 4 pages, 4 figures. Accepted for HPCMASPA 2017, the Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications, held in conjunction with IEEE Cluster 2017, Honolulu, HI, September 5, 201

    Bringing Introspection Into the BlobSeer Data-Management System Using the MonALISA Distributed Monitoring Framework

    Held in conjunction with CISIS 2010 Conference. International audience. Introspection is the prerequisite of an autonomic behavior, the first step towards a performance improvement and a resource-usage optimization for large-scale distributed systems. In grid environments, the task of observing the application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific information for higher-level services. More specifically, in the context of data-intensive applications, a specific introspection layer is required in order to collect data about the usage of storage resources, about data access patterns, etc. This paper discusses the requirements for an introspection layer in a data-management system for large-scale distributed infrastructures. We focus on the case of BlobSeer, a large-scale distributed system for storing massive data. The paper explains why and how to enhance BlobSeer with introspective capabilities and proposes a three-layered architecture relying on the MonALISA monitoring framework. This approach has been evaluated on the Grid'5000 testbed, with experiments that prove the feasibility of generating relevant information related to the state and the behavior of the system

    Insight from a Containerized Kubernetes Workload Introspection

    Developments in virtual containers, especially in cloud infrastructure, have led to a diversification of the jobs that containers are used to support, particularly in the big data and machine learning spaces. The diversification has been powered by the adoption of orchestration systems that marshal fleets of containers to accomplish complex programming tasks. The additional components in the vertical technology stack, plus the continued horizontal scaling, have led to questions regarding how to forensically analyze complicated technology stacks. This paper proposes a solution through the use of introspection. An exploratory case study has been conducted on a bare-metal cloud that utilizes Kubernetes, the introspection tool Prometheus, and Apache Spark. The contribution of this research is two-fold. First, it provides empirical support that introspection tools can acquire forensically viable data from different levels of a technology stack. Second, it provides the groundwork for comparisons between different virtual container platforms

    Leveraging the Grid to Provide a Global Platform for Ubiquitous Computing Research

    The requirement for distributed systems support for Ubicomp has led to the development of numerous platforms, each addressing a subset of the overall requirements of ubiquitous systems. In contrast, many other scientific disciplines have embraced the vision of a global distributed computing platform, i.e. the Grid. We believe that the Grid has the potential to evolve into an ideal platform for building ubiquitous computing applications. In this paper we explore in detail the areas of synergy between Grid computing and ubiquitous computing and highlight a series of research challenges in this space

    Container and VM Visualization for Rapid Forensic Analysis

    Cloud-hosted software, such as virtual machines and containers, is notoriously difficult to access, observe, and inspect during ongoing security events. This research describes a new, out-of-band forensic tool for rapidly analyzing cloud-based software. The proposed tool renders two-dimensional visualizations of container contents and virtual machine disk images. The visualizations can be used to identify container / VM contents, pinpoint instances of embedded malware, and find modified code. The proposed new forensic tool is compared against other forensic tools in a double-blind experiment. The results confirm the utility of the proposed tool. Implications and future research directions are also described