70 research outputs found

Gestion de traces d'exécution pour le systèmes embarqués : contenu et stockage

Author: Marangozova-Martin Vania
Pagano Generoso
Publication venue: HAL CCSD
Publication date: 01/01/2013

This report examines tracing systems and categorizes their motivations and the functionality they provide. It aims to make explicit the link between tracing objectives and the types (content, format, and storage) of execution traces being handled. It identifies the needs for trace exploitation in the embedded-systems domain and presents our proposed solution within the SoC-TRACE project.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

07341 Abstracts Collection -- Code Instrumentation and Modeling for Parallel Performance Analysis

Author: Hoisie Adolfy
Miller Barton P.
Mohr Bernd
Publication venue: Dagstuhl Seminar Proceedings. 07341 - Code Instrumentation and Modeling for Parallel Performance Analysis
Publication date: 01/01/2007

From 20th to 24th August 2007, the Dagstuhl Seminar 07341 "Code Instrumentation and Modeling for Parallel Performance Analysis" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Dagstuhl Research Online Publication Server

Design Evaluation of a Performance Analysis Trace Repository

Author: Grunzke Richard
Hartmann Volker
Ilsche Thomas
Jejkal Thomas
Knüpfer Andreas
Nagel Wolfgang E.
Neumann Maximilian
Stotzka Rainer
Publication venue: Elsevier
Publication date: 29/08/2017

KITopen

Evaluation of Profiling Tools for the Acquisition of Time Independent Traces

Author: Desprez Frédéric
Markomanolis George
Suter Frédéric
Publication venue: HAL CCSD
Publication date: 08/07/2013

In previous work, we proposed a framework for the off-line simulation of MPI applications. Its main originality with regard to the literature is that it relies on time-independent execution traces, which offer an original way to estimate the performance of parallel applications. To acquire time-independent traces of the execution of MPI applications, we have to instrument them to log the necessary information. Many profiling tools exist that can instrument an application. In this report, we propose a scoring system that corresponds to our framework-specific requirements and use it to evaluate the best-known open-source profiling tools. Furthermore, we introduce an original tool called Minimal Instrumentation that was designed to fulfill the requirements of our framework.

HAL-ENS-LYON

HAL-IN2P3

INRIA a CCSD electronic archive server

Hal-Diderot

Tools for analyzing parallel I/O

Author: A Knüpfer
A Peters
AJ Peters
C Karbach
E Betke
IF Adams
JM Kunkel
P Carns
P Gomez-Sanchez
SA Wright
T Benson
T Ludwig
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2018

Parallel application I/O performance often does not meet user expectations. Additionally, slight access pattern modifications may lead to significant changes in performance due to complex interactions between hardware and software. These issues call for sophisticated tools to capture, analyze, understand, and tune application I/O. In this paper, we highlight advances in monitoring tools to help address these issues. We also describe best practices, identify issues in measurement and analysis, and provide practical approaches to translate parallel I/O analysis into actionable outcomes for users, facility operators, and researchers.

arXiv.org e-Print Archive

Central Archive at the University of Reading

Crossref

Juelich Shared Electronic Resources

Profiling MPI applications with mixed instrumentation

Author: Balladini Javier
Castillo Rodolfo del
Castro Silvia Mabel
Grosclaude Eduardo
Zanellato Claudio
Publication date: 30/07/2012

Our research project intends to build knowledge about HPC problems in order to help local researchers. To advise users in choosing parallel machines to run their applications, we want to establish a general methodology, requiring as little information as possible, to characterize parallel applications. To draw a profile of a closed, message-passing application, we look for convenient tools for inspecting the distribution of communication primitives. We show a feasible way to do black-box instrumentation of closed MPI applications. Presented at the X Workshop Procesamiento Distribuido y Paralelo (WPDP). Red de Universidades con Carreras en Informática (RedUNCI).

Servicio de Difusión de la Creación Intelectual

Concepts for In-memory Event Tracing: Runtime Event Reduction with Hierarchical Memory Buffers

Author: Wagner Michael
Publication date: 03/07/2015

This thesis contributes to the field of performance analysis in High Performance Computing with new concepts for in-memory event tracing. Event tracing records runtime events of an application and stores each with a precise time stamp and further relevant metrics. The high resolution and detailed information allow an in-depth analysis of the dynamic program behavior, interactions in parallel applications, and potential performance issues. For long-running and large-scale parallel applications, event-based tracing faces three challenges that remain unsolved: the number of resulting trace files limits scalability, the huge amounts of collected data overwhelm file systems and analysis capabilities, and the measurement bias, in particular due to intermediate memory buffer flushes, prevents a correct analysis. This thesis proposes concepts for an in-memory event tracing workflow. These concepts include new enhanced encoding techniques to increase memory efficiency and novel strategies for runtime event reduction to dynamically adapt trace size during runtime. An in-memory event tracing workflow based on these concepts meets all three challenges: First, it not only overcomes the scalability limitations due to the number of resulting trace files but eliminates the overhead of file system interaction altogether. Second, the enhanced encoding techniques and event reduction lead to remarkably smaller trace sizes. Finally, an in-memory event tracing workflow completely avoids intermediate memory buffer flushes, which minimizes measurement bias and allows a meaningful performance analysis. The concepts further include the Hierarchical Memory Buffer data structure, which incorporates a multi-dimensional, hierarchical ordering of events by common metrics, such as time stamp, calling context, event class, and function call duration. This hierarchical ordering allows low-overhead event encoding, event reduction, and event filtering, as well as new hierarchy-aided analysis requests.
An experimental evaluation based on real-life applications and a detailed case study underline the capabilities of the concepts presented in this thesis. The new enhanced encoding techniques reduce memory allocation during runtime by a factor of 3.3 to 7.2 without introducing any additional overhead. Furthermore, the combined concepts, including the enhanced encoding techniques, event reduction, and a new filter based on function duration within the Hierarchical Memory Buffer, reduce the resulting trace size by up to three orders of magnitude and keep an entire measurement within a single fixed-size memory buffer, while still providing a coarse but meaningful analysis of the application. This thesis includes a discussion of the state of the art and related work, a detailed presentation of the enhanced encoding techniques, the event reduction strategies, and the Hierarchical Memory Buffer data structure, and an extensive experimental evaluation of all concepts.

Technische Universität Dresden: Qucosa

Efficient Analysis Methodology for Huge Application Traces

Author: Dosimont Damien
Huard Guillaume
Marangozova-Martin Vania
Pagano Generoso
Vincent Jean-Marc
Publication venue: HAL CCSD
Publication date: 01/07/2014

The growing complexity of computer system hardware and software makes their behavior analysis a challenging task. In this context, tracing appears to be a promising solution as it provides relevant information about the system execution. However, trace analysis techniques and tools fail to give the analyst an efficient analysis flow, for several reasons. First, traces contain a huge volume of data that is difficult to store, load in memory, and work with. Second, the analysis flow is hindered by the various, often incompatible, result formats produced by different analysis techniques. Last, analysis frameworks lack an entry point for understanding the general behavior of the traced application. Indeed, traditional visualization techniques suffer from time and space scalability issues due to screen size, and are not able to represent the full trace. In this article, we present how to do an efficient analysis by using Shneiderman's mantra: "Overview first, zoom and filter, then details on demand". Our methodology is based on FrameSoC, a trace management infrastructure that provides solutions for trace storage, data access, and analysis flow, managing analysis results and tools. Ocelotl, a visualization tool, takes advantage of FrameSoC and shows a synthetic representation of a trace by using a time aggregation. This visualization solves scalability issues and provides an entry point for the analysis by showing phases and behavior disruptions, so that the analyst can then get more details by focusing on the interesting parts of the trace.

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

QoS-Driven Reconfigurable Parallel Computing for NoC-Based Clustered MPSoCs

Author: Angiolini Federico
Bagdia Akash
Carrabina Jordi
Castells-Rufas David
De Micheli Giovanni
Fernandez-Alonso Eduard
Joven Jaume
Strid Per
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/10/2013

Reconfigurable parallel computing is required to provide high-performance embedded computing, hide hardware complexity, boost software development, and manage multiple workloads when multiple applications are running simultaneously on emerging network-on-chip (NoC)-based multiprocessor systems-on-chip (MPSoC) platforms. In these types of systems, the overall system performance may be affected by congestion, and therefore parallel programming stacks must be assisted by quality-of-service (QoS) support to meet application requirements and to deal with application dynamism. In this paper, we present a hardware-software QoS-driven reconfigurable parallel computing framework, i.e., the NoC services, the runtime QoS middleware API, and our ocMPI library and its tracing support, which has been tailored for a distributed-shared memory ARM clustered NoC-based MPSoC platform. The experimental results show the efficiency of our software stack under a broad range of parallel kernels and benchmarks, in terms of low-latency interprocess communication and good application scalability, and, most importantly, they demonstrate the ability to enable runtime reconfiguration to manage workloads in message-passing parallel applications.

Infoscience - École polytechnique fédérale de Lausanne