619 research outputs found

    WMTrace : a lightweight memory allocation tracker and analysis framework

    The diverging gap between processor and memory performance has been a well-discussed aspect of computer architecture literature for some years. The use of multi-core processor designs has, however, brought new problems to the design of memory architectures: increased core density without matched improvement in memory capacity is reducing the available memory per parallel process. Multiple cores accessing memory simultaneously degrade performance as a result of resource contention for memory channels and physical DIMMs. These issues combine to ensure that memory remains an ongoing challenge in the design of parallel algorithms that scale. In this paper we present WMTrace, a lightweight tool to trace and analyse memory allocation events in parallel applications. This tool is able to dynamically link to pre-existing application binaries, requiring no source code modification or recompilation. A post-execution analysis stage enables in-depth analysis of traces, allowing memory allocations to be analysed by time, size or function. The second half of this paper features a case study in which we apply WMTrace to five parallel scientific applications and benchmarks, demonstrating its effectiveness at recording high-water mark memory consumption as well as memory use per function over time. An in-depth analysis is provided for an unstructured mesh benchmark, which reveals significant memory allocation imbalance across its participating processes.
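
As a rough illustration of the kind of post-execution analysis the abstract describes (high-water mark consumption and per-function memory use), the sketch below replays a list of allocation events. It is not WMTrace's implementation or trace format: the `AllocEvent` record, its fields, and the `analyse` function are hypothetical names introduced here for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AllocEvent:
    """Hypothetical trace record; not the real WMTrace format."""
    time: float      # timestamp of the event
    kind: str        # "malloc" or "free"
    address: int     # address returned by malloc / passed to free
    size: int        # bytes requested (ignored for "free")
    function: str    # name of the calling function


def analyse(events):
    """Replay events in time order and return
    (high_water_bytes, high_water_time, peak_bytes_per_function)."""
    live = {}                          # address -> (size, function)
    current = 0                        # bytes currently allocated
    high_water, high_water_time = 0, None
    per_function = defaultdict(int)    # live bytes per function
    peak_per_function = defaultdict(int)

    for ev in sorted(events, key=lambda e: e.time):
        if ev.kind == "malloc":
            live[ev.address] = (ev.size, ev.function)
            current += ev.size
            per_function[ev.function] += ev.size
            peak_per_function[ev.function] = max(
                peak_per_function[ev.function], per_function[ev.function]
            )
        elif ev.kind == "free":
            # Frees of unknown addresses are ignored in this sketch.
            size, fn = live.pop(ev.address, (0, None))
            current -= size
            if fn is not None:
                per_function[fn] -= size
        if current > high_water:
            high_water, high_water_time = current, ev.time

    return high_water, high_water_time, dict(peak_per_function)


example_events = [
    AllocEvent(0.0, "malloc", 0x1000, 100, "build_mesh"),
    AllocEvent(1.0, "malloc", 0x2000, 50, "solve"),
    AllocEvent(2.0, "free", 0x1000, 0, "build_mesh"),
    AllocEvent(3.0, "malloc", 0x3000, 30, "build_mesh"),
]
result = analyse(example_events)
```

Running one such replay per process and comparing the resulting high-water marks is one simple way an allocation imbalance across processes could be made visible.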

    The SIOX architecture – coupling automatic monitoring and optimization of parallel I/O

    Performance analysis and optimization of high-performance I/O systems is a daunting task, mainly because of the overwhelmingly complex interplay of the hardware and software layers involved. The Scalable I/O for Extreme Performance (SIOX) project provides a versatile environment for monitoring I/O activities and learning from this information. The goal of SIOX is to automatically suggest and apply performance optimizations, and to assist in locating and diagnosing performance problems. In this paper, we present the current status of SIOX. Our modular architecture covers instrumentation of POSIX, MPI and other high-level I/O libraries; the monitoring data is recorded asynchronously into a global database, and recorded traces can be visualized. Furthermore, we offer a set of primitive plug-ins with additional features to demonstrate the flexibility of our architecture: a surveyor plug-in to keep track of the observed spatial access patterns; an fadvise plug-in for injecting hints to achieve read-ahead for strided access patterns; and an optimizer plug-in which monitors the performance achieved with different MPI-IO hints, automatically supplying the best known hint set when no hints were explicitly set. The presentation of the technical status is accompanied by a demonstration of some of these features on our 20-node cluster. In additional experiments, we analyze the overhead for concurrent access, for MPI-IO’s four levels of access, and for an instrumented climate application. While our prototype is not yet full-featured, it demonstrates the potential and feasibility of our approach.
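
The optimizer plug-in's behaviour, as summarised in the abstract, can be pictured with a minimal sketch: record the performance observed for each hint set, and fall back to the best known one when the caller supplies no hints. This is an assumption-laden illustration, not SIOX code; the `HintOptimizer` class, its methods, and the use of mean throughput as the ranking metric are all hypothetical. The hint name `romio_cb_write` is used only as an example of an MPI-IO hint key.

```python
from collections import defaultdict
from statistics import mean


class HintOptimizer:
    """Hypothetical stand-in for a hint-selecting optimizer plug-in."""

    def __init__(self):
        # Hint set (as a sorted tuple of items) -> observed throughputs.
        self._samples = defaultdict(list)

    def record(self, hints, throughput_mb_s):
        """Store one performance observation for a given hint set."""
        self._samples[tuple(sorted(hints.items()))].append(throughput_mb_s)

    def best_known(self):
        """Return the hint set with the highest mean throughput, or None."""
        if not self._samples:
            return None
        best = max(self._samples, key=lambda k: mean(self._samples[k]))
        return dict(best)

    def choose(self, user_hints=None):
        """Respect explicit hints; otherwise supply the best known set."""
        if user_hints is not None:
            return user_hints
        return self.best_known() or {}


opt = HintOptimizer()
opt.record({"romio_cb_write": "enable"}, 100.0)
opt.record({"romio_cb_write": "enable"}, 120.0)
opt.record({"romio_cb_write": "disable"}, 80.0)
chosen = opt.choose()
```

A real system would also need to key observations by access pattern and file system, since the best hint set for one workload is not necessarily the best for another.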

    Synapse: Synthetic Application Profiler and Emulator

    We introduce Synapse, motivated by the need to estimate and emulate workload execution characteristics on high-performance and distributed heterogeneous resources. Synapse has a platform-independent application profiler and the ability to emulate profiled workloads on a variety of heterogeneous resources. Synapse is used as a proxy application (or "representative application") for real workloads, with the added advantage that it can be tuned at arbitrary levels of granularity in ways that are simply not possible using real applications. Experiments show that automated profiling using Synapse represents application characteristics with high fidelity. Emulation using Synapse can reproduce the application behavior in the original runtime environment, and can reproduce its properties when used in different runtime environments.

    Monitoring data in R with the lumberjack package

    Monitoring data while it is processed and transformed can yield detailed insight into the dynamics of a (running) production system. The lumberjack package is a lightweight package allowing users to follow how an R object is transformed as it is manipulated by R code. The package abstracts all logging code from the user, who only needs to specify which objects are logged and what information should be logged. A few default loggers are included with the package, but the package is extensible through user-defined logger objects. Comment: Accepted for publication in the Journal of Statistical Software.

    iLeak: A Lightweight System for Detecting Inadvertent Information Leaks

    Data loss incidents, where data of a sensitive nature are exposed to the public, have become too frequent and have caused millions of dollars in damages to companies and other organizations. Repeatedly, information leaks occur over the Internet, and half of the time they are accidental, caused by user negligence, misconfiguration of software, or inadequate understanding of an application's functionality. This paper presents iLeak, a lightweight, modular system for detecting inadvertent information leaks. Unlike previous solutions, iLeak builds on components already present in modern computers. In particular, we employ system tracing facilities and data indexing services, and combine them in a novel way to detect data leaks. Our design consists of three components: uaudits are responsible for capturing the information that exits the system, while Inspectors use the indexing service to identify whether the transmitted data belong to files that contain potentially sensitive information. The Trail Gateway handles the communication and synchronization of uaudits and Inspectors. We implemented iLeak on Mac OS X using DTrace and the Spotlight indexing service. Finally, we show that iLeak is indeed lightweight, since it only incurs 4% overhead on protected applications.

    GekkoFS: A temporary distributed file system for HPC applications

    We present GekkoFS, a temporary, highly scalable burst buffer file system which has been specifically optimized for new access patterns of data-intensive High-Performance Computing (HPC) applications. The file system provides relaxed POSIX semantics, only offering features which are actually required by most (not all) applications. It is able to provide scalable I/O performance and reaches millions of metadata operations already for a small number of nodes, significantly outperforming the capabilities of general-purpose parallel file systems. The work has been funded by the German Research Foundation (DFG) through the ADA-FS project as part of the Priority Programme 1648. It is also supported by the Spanish Ministry of Science and Innovation (TIN2015–65316), the Generalitat de Catalunya (2014–SGR–1051), as well as the European Union’s Horizon 2020 Research and Innovation Programme (NEXTGenIO, 671951) and the European Commission’s BigStorage project (H2020-MSCA-ITN-2014-642963). This research was conducted using the supercomputer MOGON II and services offered by the Johannes Gutenberg University Mainz. Peer reviewed. Postprint (author's final draft).