7 research outputs found

    Critical path profiling of message passing and shared-memory programs


    A timed-automata approach for critical path detection in a soft real-time application

    In this paper, we report preliminary ideas from our project "Time Performance Improvement With Parallel Processing Systems" (TIPS). In the TIPS project, we plan to take advantage of multi-core platforms to improve performance by parallelizing a complex soft real-time application. To increase timing performance, optimizations must target the critical execution paths of an application, i.e. those paths that are both significantly time consuming and important from the user requirements' perspective. In this work, we present an approach for detecting critical paths in a target application.
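    The paper's approach is based on timed automata, which is not reproduced here. The sketch below is only a generic wall-clock instrumentation example, under assumed names (hot_path, cold_path, account are hypothetical), meant to make the notion of a time-consuming execution path concrete: each annotated region accumulates its elapsed time, and the summary shows where optimization effort on the critical path would pay off.

```c
/* Illustrative only: generic timing instrumentation, not the TIPS
 * timed-automata technique.  All region and function names are made up. */
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_REGIONS 16

static struct { const char *name; double seconds; long calls; } regions[MAX_REGIONS];
static int nregions;

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

/* Accumulate elapsed time under a region name. */
static void account(const char *name, double elapsed)
{
    for (int i = 0; i < nregions; ++i)
        if (strcmp(regions[i].name, name) == 0) {
            regions[i].seconds += elapsed;
            regions[i].calls++;
            return;
        }
    regions[nregions].name = name;
    regions[nregions].seconds = elapsed;
    regions[nregions].calls = 1;
    nregions++;
}

/* Hypothetical application code paths with different costs. */
static void hot_path(void)  { for (volatile long i = 0; i < 20000000; ++i) ; }
static void cold_path(void) { for (volatile long i = 0; i <  2000000; ++i) ; }

int main(void)
{
    for (int iter = 0; iter < 5; ++iter) {
        double t0 = now_sec();
        hot_path();
        account("hot_path", now_sec() - t0);

        t0 = now_sec();
        cold_path();
        account("cold_path", now_sec() - t0);
    }

    /* The region dominating total time is the candidate critical path. */
    for (int i = 0; i < nregions; ++i)
        printf("%-10s  %3ld calls  %8.3f s\n",
               regions[i].name, regions[i].calls, regions[i].seconds);
    return 0;
}
```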

    Detecting and Using Critical Paths at Runtime in Message Driven Parallel Programs


    HDOT — An approach towards productive programming of hybrid applications

    The bulk synchronous parallel (BSP) communication model can hinder performance increases. This is due to the complexity of handling load imbalances, reducing the serialisation imposed by blocking communication patterns, overlapping communication with computation and, finally, dealing with increasing memory overheads. The MPI specification provides advanced features, such as non-blocking calls or shared memory, to mitigate some of these factors. However, applying these features efficiently usually requires significant changes to the application structure. Task parallel programming models are being developed as a means of mitigating the abovementioned issues without requiring extensive changes to the application code. In this work, we present a methodology to develop hybrid applications based on tasks, called hierarchical domain over-decomposition with tasking (HDOT). This methodology overcomes most of the issues found in MPI-only and traditional hybrid MPI+OpenMP applications. By emphasising the reuse of data partition schemes from the process level and applying them at the task level, it enables a natural coexistence between MPI and shared-memory programming models. The proposed methodology shows promising results in terms of programmability and performance, measured on a set of applications. A sketch of the over-decomposition idea follows below.

    This work has been developed with the support of the European Union H2020 program through the INTERTWinE project (agreement number 671602); the Severo Ochoa Program awarded by the Spanish Government (SEV-2015-0493); the Generalitat de Catalunya (contract 2017-SGR-1414); and the Spanish Ministry of Science and Innovation (TIN2015-65316-P, Computación de Altas Prestaciones VII). The authors gratefully acknowledge Dr. Arnaud Mura, CNRS researcher at Institut PPRIME in France, for the numerical tool CREAMS. Finally, the manuscript has greatly benefited from the precise comments of the reviewers.

    Peer Reviewed. Postprint (author's final draft)
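    The sketch below is a minimal illustration of the over-decomposition idea in the abstract, not the HDOT implementation or its API: the same 1-D block partitioning used to split the domain across MPI ranks is reused inside each rank to split its subdomain into blocks, each updated by an OpenMP task, with non-blocking halo exchanges overlapped with the interior blocks. N_LOCAL, NBLOCKS, update_block and the trivial kernel are assumptions for illustration only.

```c
/* Hybrid MPI + OpenMP-task sketch of hierarchical over-decomposition.
 * Compile with e.g.:  mpicc -fopenmp hdot_sketch.c  (names are illustrative). */
#include <mpi.h>
#include <omp.h>
#include <stdlib.h>

#define N_LOCAL 4096      /* points owned by this rank (assumption) */
#define NBLOCKS 8         /* over-decomposition factor per rank     */

/* Placeholder kernel: update one block of the local subdomain. */
static void update_block(double *u, int first, int last)
{
    for (int i = first; i < last; ++i)
        u[i] = 0.5 * (u[i - 1] + u[i + 1]);   /* stand-in for real stencil work */
}

int main(int argc, char **argv)
{
    int provided, rank, size;
    /* SERIALIZED: only one thread at a time issues MPI calls below.
     * A real code would check that 'provided' meets the request. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *u = calloc(N_LOCAL + 2, sizeof(double));   /* +2 halo cells */

    for (int step = 0; step < 10; ++step) {
        /* Non-blocking halo exchange with neighbouring ranks. */
        MPI_Request req[4];
        int nreq = 0;
        if (rank > 0) {
            MPI_Irecv(&u[0], 1, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, &req[nreq++]);
            MPI_Isend(&u[1], 1, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, &req[nreq++]);
        }
        if (rank < size - 1) {
            MPI_Irecv(&u[N_LOCAL + 1], 1, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD, &req[nreq++]);
            MPI_Isend(&u[N_LOCAL], 1, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD, &req[nreq++]);
        }

        #pragma omp parallel
        #pragma omp single
        {
            /* Interior blocks do not touch the halos: spawn them as tasks
             * so they overlap with the pending communication. */
            int blk = N_LOCAL / NBLOCKS;
            for (int b = 1; b < NBLOCKS - 1; ++b) {
                int first = 1 + b * blk, last = first + blk;
                #pragma omp task firstprivate(first, last)
                update_block(u, first, last);
            }
            /* Boundary blocks need the halos: wait, then spawn them too. */
            MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
            #pragma omp task
            update_block(u, 1, 1 + blk);
            #pragma omp task
            update_block(u, 1 + (NBLOCKS - 1) * blk, N_LOCAL + 1);
        }   /* implicit barrier: all tasks finish before the next step */
    }

    free(u);
    MPI_Finalize();
    return 0;
}
```

    The point of the sketch is the reuse of one partitioning scheme at both levels: the rank-level decomposition determines the halo exchange, and the same block structure within a rank feeds the task scheduler, so communication and interior computation can overlap without restructuring the application.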