The design and evaluation of microprocessor architectures is a difficult and time-consuming task. Although small, hand-coded microbenchmarks can be used to accelerate architecture design space exploration, these programs are usually too simplistic to stress the entire architecture design, which is becoming increasingly complex. Larger and more complex real-world workloads should be employed to measure the performance of a given design or to evaluate the efficiency of various design alternatives. Nevertheless, these applications can take days or weeks if run to completion on a detailed architecture simulator. In the past, researchers have applied machine learning and statistical sampling methods to reduce the average number of instructions required for detailed simulation. Others have proposed statistical simulation and workload synthesis techniques, which produce programs that emulate the execution characteristics of the application from which they are derived but have a much shorter execution time than the original. However, existing methodologies are difficult to apply to multi-threaded programs and can result in simplifications that miss the complex interactions between multiple, concurrently running threads. This study focuses on developing new techniques for accurate multi-threaded workload synthesis, which can accelerate the design and optimization of multi-core architectures. We propose to construct statistical flow graphs that incorporate inter-thread synchronization and sharing characteristics to capture the behavior and interactions of each thread and to generate accurate workload characterizations. A walk of these graphs is used to generate a synthetic program that maintains these characteristics but has a reduced runtime.
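The graph-walk idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each thread's execution is available as a sequence of basic-block labels, it treats synchronization events (here the hypothetical labels "lock" and "unlock") as ordinary nodes, and the names `build_flow_graph` and `walk` are invented for illustration. The graph stores transition counts, and a weighted random walk produces a shorter sequence with similar transition statistics.

```python
import random
from collections import defaultdict


def build_flow_graph(trace):
    """Count transitions between consecutive blocks in one thread's trace."""
    counts = defaultdict(lambda: defaultdict(int))
    for src, dst in zip(trace, trace[1:]):
        counts[src][dst] += 1
    # Convert to plain dicts so missing nodes are not created on lookup.
    return {src: dict(dsts) for src, dsts in counts.items()}


def walk(graph, start, length, seed=0):
    """Weighted random walk; stops early at a node with no successors."""
    rng = random.Random(seed)  # seeded for reproducible synthetic traces
    node = start
    out = [node]
    while len(out) < length and node in graph:
        successors = list(graph[node])
        weights = [graph[node][s] for s in successors]
        node = rng.choices(successors, weights=weights, k=1)[0]
        out.append(node)
    return out


# Hypothetical trace of one thread: blocks A, B, C plus lock/unlock events.
trace = ["A", "B", "lock", "C", "unlock",
         "A", "B", "A", "B", "lock", "C", "unlock"]
graph = build_flow_graph(trace)
synthetic = walk(graph, start="A", length=6)
```

In this sketch the synthetic sequence is half the length of the original trace, yet every step follows an edge observed in the original. A fuller treatment would keep one graph per thread and annotate nodes with sharing and synchronization statistics, as the abstract describes.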