LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses
System monitoring is an established tool to measure the utilization and
health of HPC systems. Usually system monitoring infrastructures make no
connection to job information and do not utilize hardware performance
monitoring (HPM) data. To increase the efficient use of HPC systems, automatic
and continuous performance monitoring of jobs is an essential component. It can
help to identify pathological cases, provide instant performance feedback to
the users, offer initial data to judge the optimization potential of
applications, and help to build a statistical foundation about
application-specific system usage. The LIKWID monitoring stack is a modular
framework built on top of the LIKWID tools library. It aims at enabling job-specific
performance monitoring using HPM data, system metrics and application-level
data for small to medium-sized commodity clusters. Moreover, it is designed to
integrate into existing monitoring infrastructures to speed up the change from
pure system monitoring to job-aware monitoring.
Comment: 4 pages, 4 figures. Accepted for HPCMASPA 2017, the Workshop on
Monitoring and Analysis for High Performance Computing Systems Plus
Applications, held in conjunction with IEEE Cluster 2017, Honolulu, HI,
September 5, 201
Bringing Introspection Into the BlobSeer Data-Management System Using the MonALISA Distributed Monitoring Framework
Held in conjunction with CISIS 2010 Conference. Introspection is the prerequisite of an autonomic behavior, the first step towards a performance improvement and a resource-usage optimization for large-scale distributed systems. In grid environments, the task of observing the application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific information for higher-level services. More specifically, in the context of data-intensive applications, a specific introspection layer is required in order to collect data about the usage of storage resources, about data access patterns, etc. This paper discusses the requirements for an introspection layer in a data-management system for large-scale distributed infrastructures. We focus on the case of BlobSeer, a large-scale distributed system for storing massive data. The paper explains why and how to enhance BlobSeer with introspective capabilities and proposes a three-layered architecture relying on the MonALISA monitoring framework. This approach has been evaluated on the Grid'5000 testbed, with experiments that prove the feasibility of generating relevant information related to the state and the behavior of the system.
Insight from a Containerized Kubernetes Workload Introspection
Developments in virtual containers, especially in the cloud infrastructure, have led to diversification of jobs that containers are being used to support, particularly in the big data and machine learning spaces. The diversification has been powered by the adoption of orchestration systems that marshal fleets of containers to accomplish complex programming tasks. The additional components in the vertical technology stack, plus the continued horizontal scaling, have led to questions regarding how to forensically analyze complicated technology stacks. This paper proposes a solution through the use of introspection. An exploratory case study has been conducted on a bare-metal cloud that utilizes Kubernetes, the introspection tool Prometheus, and Apache Spark. The contribution of this research is twofold. First, it provides empirical support that introspection tools can acquire forensically viable data from different levels of a technology stack. Second, it provides the groundwork for comparisons between different virtual container platforms.
Leveraging the Grid to Provide a Global Platform for Ubiquitous Computing Research
The requirement for distributed systems support for Ubicomp has led to the development of numerous platforms, each addressing a subset of the overall requirements of ubiquitous systems. In contrast, many other scientific disciplines have embraced the vision of a global distributed computing platform, i.e. the Grid. We believe that the Grid has the potential to evolve into an ideal platform for building ubiquitous computing applications. In this paper we explore in detail the areas of synergy between Grid computing and ubiquitous computing and highlight a series of research challenges in this space.
Container and VM Visualization for Rapid Forensic Analysis
Cloud-hosted software such as virtual machines and containers is notoriously difficult to access, observe, and inspect during ongoing security events. This research describes a new, out-of-band forensic tool for rapidly analyzing cloud-based software. The proposed tool renders two-dimensional visualizations of container contents and virtual machine disk images. The visualizations can be used to identify container / VM contents, pinpoint instances of embedded malware, and find modified code. The proposed new forensic tool is compared against other forensic tools in a double-blind experiment. The results confirm the utility of the proposed tool. Implications and future research directions are also described.