
    LIKWID: Lightweight Performance Tools

    Exploiting the performance of today's microprocessors requires intimate knowledge of the microarchitecture as well as an awareness of the ever-growing complexity in thread and cache topology. LIKWID is a set of command line utilities that addresses four key problems: probing the thread and cache topology of a shared-memory node, enforcing thread-core affinity on a program, measuring performance counter metrics, and microbenchmarking for reliable upper performance bounds. Moreover, it includes an mpirun wrapper allowing for portable thread-core affinity in MPI and hybrid MPI/threaded applications. To demonstrate the capabilities of the tool set, we show the influence of thread affinity on performance using the well-known OpenMP STREAM triad benchmark, use hardware counter tools to study the performance of a stencil code, and finally show how to detect bandwidth problems on ccNUMA-based compute nodes. Comment: 12 pages

    KAPow: A System Identification Approach to Online Per-Module Power Estimation in FPGA Designs

    In a modern FPGA system-on-chip design, it is often insufficient to simply assess the total power consumption of the entire circuit by design-time estimation or runtime power rail measurement. Instead, to make better runtime decisions, it is desirable to understand the power consumed by each individual module in the system. In this work, we combine board-level power measurements with register-level activity counting to build an online model that produces a breakdown of power consumption within the design. Online model refinement avoids the need for a time-consuming characterisation stage and also allows the model to track long-term changes to operating conditions. Our flow is named KAPow, a (loose) acronym for ‘K’ounting Activity for Power estimation, which we show to be accurate, with per-module power estimates as close as ±5mW to true measurements, and to have low overheads. We also demonstrate an application example in which a per-module power breakdown can be used to determine an efficient mapping of tasks to modules and reduce system-wide power consumption by over 8%.
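
The idea of fitting per-module power from a single board-level measurement plus per-module activity counts can be illustrated with a small sketch. This is not the KAPow implementation: the abstract does not name the estimator, so the linear model (total power = sum of weight × activity per module, plus a static term) and the recursive least squares update are assumptions, and `OnlinePowerModel`, its methods, and all numbers are hypothetical.

```python
class OnlinePowerModel:
    """Assumed linear model: total_power = sum_i w[i] * activity[i] + static.

    Weights are refined one sample at a time with recursive least squares
    (an assumption; the abstract does not specify the update rule).
    """

    def __init__(self, n_modules, init_cov=1e6):
        self.n = n_modules + 1  # one extra weight for static power
        self.w = [0.0] * self.n
        # Large initial covariance means "no prior knowledge of the weights".
        self.P = [[init_cov if i == j else 0.0 for j in range(self.n)]
                  for i in range(self.n)]

    def update(self, activity, measured_power):
        """Refine the weights from one (activity counts, total power) sample."""
        x = list(activity) + [1.0]  # constant input for the static term
        Px = [sum(self.P[i][j] * x[j] for j in range(self.n))
              for i in range(self.n)]
        denom = 1.0 + sum(x[i] * Px[i] for i in range(self.n))
        gain = [v / denom for v in Px]
        error = measured_power - sum(self.w[i] * x[i] for i in range(self.n))
        self.w = [self.w[i] + gain[i] * error for i in range(self.n)]
        # P <- P - gain * (x^T P); P is symmetric, so x^T P equals Px.
        self.P = [[self.P[i][j] - gain[i] * Px[j] for j in range(self.n)]
                  for i in range(self.n)]

    def breakdown(self, activity):
        """Estimated dynamic power attributed to each module (static excluded)."""
        return [w * a for w, a in zip(self.w, activity)]


# Synthetic, noise-free data with made-up weights (watts per activity count).
true_weights = [0.002, 0.005, 0.001]
static_power = 0.3
model = OnlinePowerModel(n_modules=3)
for t in range(40):
    activity = [(t * 7) % 11, (t * 3) % 13, (t * 5) % 17]
    measured = sum(w * a for w, a in zip(true_weights, activity)) + static_power
    model.update(activity, measured)

print(model.breakdown([10, 4, 6]))
```

Because every update uses only the newest sample, a model of this shape needs no separate characterisation pass, which is the property the abstract attributes to online refinement. Tracking slow drift in operating conditions would additionally need a forgetting factor, which this sketch omits.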

    Alternate marking-based network telemetry for industrial WSNs

    For continuous, persistent and problem-free operation of Industrial Wireless Sensor Networks (IWSN), it is critical to have visibility into, and awareness of, what is happening on the network at any one time. Especially for use cases with strong needs for deterministic and real-time network services with latency and reliability guarantees, it is vital to monitor network devices continuously to guarantee their functioning, detect and isolate relevant problems, and verify that all system requirements are being met simultaneously. In this context, this article investigates a light-weight telemetry solution for IWSNs, which enables the collection of accurate and continuous flow-based telemetry information while adding no overhead on the monitored packets. The proposed monitoring solution adopts the recent Alternate Marking Performance Monitoring (AMPM) concept and mainly targets measuring end-to-end and hop-by-hop reliability and delay performance in critical application flows. In addition, the technical capabilities and characteristics of the proposed solution are evaluated via a real-life implementation and practical experiments, validating its suitability for IWSNs.
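
The general alternate-marking idea behind such loss measurements can be sketched as follows. This is an illustration of the concept only, not the article's implementation: it assumes the sender flips a one-bit mark on its packets at regular intervals, that packets arrive in order, and that no block is lost entirely. The function names `count_blocks` and `block_loss` and the example traffic are hypothetical.

```python
def count_blocks(marks):
    """Count packets in each run of identical mark bits seen at one node.

    Assumes in-order arrival and at least one surviving packet per block,
    so every change of the mark bit starts a new block.
    """
    counts = []
    previous = None
    for bit in marks:
        if bit != previous:
            counts.append(0)
            previous = bit
        counts[-1] += 1
    return counts


def block_loss(upstream_counts, downstream_counts):
    """Packets lost between two observation points, per marking block."""
    return [up - down for up, down in zip(upstream_counts, downstream_counts)]


# The sender alternates the mark every five packets (made-up traffic).
sent = [0] * 5 + [1] * 5 + [0] * 5
# Packets 2, 7 and 8 are dropped somewhere between the two nodes.
dropped = {2, 7, 8}
received = [bit for i, bit in enumerate(sent) if i not in dropped]

upstream = count_blocks(sent)
downstream = count_blocks(received)
print(block_loss(upstream, downstream))  # → [1, 2, 0]
```

Only per-block counters are compared between nodes, so the monitored packets carry nothing beyond the mark bit itself. Comparing counters at consecutive hops gives hop-by-hop loss, and comparing the first and last nodes gives end-to-end loss.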