In response to current technology scaling trends, architects are developing a new style of processor, known as spatial computers. A spatial computer is composed of hundreds or even thousands of simple, replicated processing elements (or PEs), frequently organized into a grid. Several current spatial computers, such as TRIPS, RAW, SmartMemories, nanoFabrics and WaveScalar, explicitly place a program’s instructions onto the grid. Designing instruction placement algorithms is an enormous challenge, as there are an exponential (in the size of the application) number of different mappings of instructions to PEs, and the choice of mapping greatly affects program performance. In this paper we develop an instruction placement performance model which can inform instruction placement. The model comprises three components, each of which captures a different aspect of spatial computing performance: inter-instruction operand latency, data cache coherence overhead, and contention for processing element resources. We evaluate the model on one spatial computer, WaveScalar, and find that predicted and actual performance correlate with a coefficient of −0.90. We demonstrate the model’s utility by using it to design a new placement algorithm, which outperforms our previous algorithms. Although developed in the context of WaveScalar, the model can serve as a foundation for tuning code, compiling software, and understanding the microarchitectural trade-offs of spatial computers in general
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.