Microprocessor simulators are very popular in research and teaching environments. For example, functional simulators are often used to perform architectural studies, to fast-forward over uninteresting code, to generate program traces, and to warm up tables before switching to a more detailed but slower simulator. Unfortunately, most portable functional simulators are on the order of 100 times slower than native execution. This paper describes a set of novel techniques and optimizations to synthesize portable functional simulators that are only 6.6 times slower on average (16 times in the worst case) than native execution and 19 times faster than SimpleScalar’s sim-fast on the SPECcpu2000 programs. When simulating a memory hierarchy, the synthesized code is 2.6 times faster than the equivalent ATOM code. Our fully automated synthesis approach works without access to source/assembly code or debug information. It generates C code, integrates optional user-provided code, performs unwanted-code removal, preserves basic blocks, generates low-overhead profiles, employs a simple heuristic to determine potential jump targets, only compiles important instructions, and utilizes mixed-mode execution, i.e., it interleaves compiled and interpreted simulation to maximize performance
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.