3 research outputs found

    Performance Estimation for Task Graphs Combining Sequential Path Profiling and Control Dependence Regions

    Get PDF
    The speed-up estimation of parallelized code is crucial to efficiently compare different parallelization techniques or task graph transformations. Unfortunately, most of the time, during the parallelization of a specification, the information that can be extracted by profiling the corresponding sequential code (e.g. the most executed paths) are not properly taken into account. In particular, correlating sequential path profiling with the corresponding parallelized code can help in the identification of code hot spots, opening new possibilities for automatic parallelization. For this reason, starting from a well-known profiling technique, the Efficient Path Profiling, we propose a methodology that estimates the speed-up of a parallelized specification, just using the corresponding hierarchical task graph representation and the information coming from the dynamic profiling of the initial sequential specification. Experimental results show that the proposed solution outperforms existing approaches

    Extending Path Profiling across Loop Backedges and Procedure Boundaries £

    No full text
    Since their introduction, path profiles have been used to guide the application of aggressive code optimizations and performing instruction scheduling. However, for optimization and scheduling, it is often desirable to obtain frequency counts of paths that extend across loop iterations and cross procedure boundaries. These longer paths, referred to as interesting paths in this paper, account for over 75 % of the flow in a subset of SPEC benchmarks. Although the frequency counts of interesting paths can be estimated from path profiles, the degree of imprecision of these estimates is very high. We extend Ball Larus (BL) paths to create slightly longer overlapping paths and develop an instrumentation algorithm to collect their frequencies. While these paths are slightly longer than BL paths, they enable very precise estimation of frequencies of potentially much longer interesting paths. Our experiments show that the average cost of collecting frequencies of overlapping paths is 86.8 % which is 4.2 times that of BL paths. However, while the average imprecision in estimated total flow of interesting paths derived from BL path frequencies ranges from-38 % to +138 %, the average imprecision in flow estimates derived from overlapping path frequencies ranges only from-4 % to +8%. Keywords- path profiles, overlapping path profiles, profile guided optimization, and instruction scheduling.
    corecore