Search CORE

4 research outputs found

Recommended from our members

Dynamic Trace Analysis with Zero-Suppressed BDDs

Author: Price Graham David
Publication venue: CU Scholar
Publication date: 01/01/2011
Field of study

Instruction level parallelism (ILP) limitations have forced processor manufacturers to develop multi-core platforms with the expectation that programs will be able to exploit thread level parallelism (TLP). Multi-core programming shifts the burden of locating additional performance away from computer hardware to the software developers, who often attempt high-level redesigns focused on exposing thread level parallelism, as well as explore aggressive optimizations for sequential codes. Precise dynamic analysis can provide useful guidance for program optimization efforts, including efforts to find and extract thread level parallelism. Unfortunately, finding regions of code amenable to further optimization efforts requires analyzing traces that can quickly grow in size. Analysis of large dynamic traces (e.g. one billion instructions or more) is often impractical for commodity hardware. An ideal representation for dynamic trace data would provide compression. However, decompressing large software traces, even if decompressed data is never permanently stored, would make many analysis impractical. A better solution would allow analysis of the compressed data, without a costly decompression step. Prior works have developed trace compressors that generate an analyzable representation, but often limit the precision or scope of analyses. Zero-suppressed binary decision diagram (ZDDs) exhibit many of the desired properties of an ideal trace representation. This thesis shows: (1) dynamic trace data may be represented by zero-suppressed binary decision diagrams (ZDDs); (2) ZDDs allow many analyses to scale; (3) encoding traces as ZDDs can be performed in a reasonable amount of time; and, (4) ZDD-based analyses, such as irrelevant instruction detection and potential coarse-grained thread level parallelism extraction, can reveal a number of performanc

CU Scholar Institutional Repository

Hybrid analysis of memory references and its application to automatic parallelization

Author: Rus Silvius Vasile
Publication venue
Publication date: 15/05/2009
Field of study

Executing sequential code in parallel on a multithreaded machine has been an elusive goal of the academic and industrial research communities for many years. It has recently become more important due to the widespread introduction of multicores in PCs. Automatic multithreading has not been achieved because classic, static compiler analysis was not powerful enough and program behavior was found to be, in many cases, input dependent. Speculative thread level parallelization was a welcome avenue for advancing parallelization coverage but its performance was not always optimal due to the sometimes unnecessary overhead of checking every dynamic memory reference. In this dissertation we introduce a novel analysis technique, Hybrid Analysis, which unifies static and dynamic memory reference techniques into a seamless compiler framework which extracts almost maximum available parallelism from scientific codes and incurs close to the minimum necessary run time overhead. We present how to extract maximum information from the quantities that could not be sufficiently analyzed through static compiler methods, and how to generate sufficient conditions which, when evaluated dynamically, can validate optimizations. Our techniques have been fully implemented in the Polaris compiler and resulted in whole program speedups on a large number of industry standard benchmark applications

Texas A&M Repository

Data Dependence Profiling for Speculative Optimizations

Author: C. Dulong
D. Callahan
M. Wolfe
R. Ghiya
R. Kennedy
R.L. Kennell
U. Banerjee
W. Pugh
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004
Field of study

Crossref