Tradeoffs in Designing Accelerator Architectures for Visual Computing by Mahesri, Aqeel et al.
May 2008 UILU-ENG-08-2208
CRHC-08-04
TRADEOFFS IN DESIGNING 
ACCELERATOR ARCHITECTURES 
FOR VISUAL COMPUTING
Aqeel Mahesri, Daniel Johnson, Neal Crago, and Sanjay J. Patel
Coordinated Science Laboratory
1308 West Main Street, Urbana, IL 61801
University of Illinois at Urbana-Champaign
REPORT DOCUMENTATION PAGE Form Approved OMB NO. 0704-0188
Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, 
gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this 
collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson 
Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503.
1. AGENCY USE ONLY (Leave blank)
2. REPORT DATE
May 2008
3. REPORT TYPE AND DATES COVERED
4. TITLE AND SUBTITLE
Tradeoffs in Designing Accelerator Architectures for Visual Computing
5. FUNDING NUMBERS
Carnegie 1040271-147720
6. AUTHOR(S)
Aqeel Mahesri, Daniel Johnson, Neal Crago, and Sanjay J. Patel
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) 
Coordinated Science Lab 
University of Illinois 
1308 W. Main St.
Urbana, IL 61802
8. PERFORMING ORGANIZATION REPORT NUMBER
UILU-ENG-08-2208
CRHC-08-04
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)
10. SPONSORING/MONITORING AGENCY REPORT NUMBER
11. SUPPLEMENTARY NOTES
12a. DISTRIBUTION/AVAILABILITY STATEMENT
Approved for public release; distribution unlimited.
12b. DISTRIBUTION CODE
13. ABSTRACT (Maximum 200 words)
Visualization, interaction, and simulation (VIS) constitute a class of applications that is growing in importance. This class 
includes applications such as graphics rendering, video encoding, simulation, and computer vision. These applications are 
ideally suited for accelerators because of their parallelizability and demand for high throughput. We compile a benchmark 
suite, VISBench, to serve as a proxy for this application class.
We use VISBench to examine some important high-level decisions for an accelerator architecture. We propose a highly 
parallel base architecture. We examine the need for synchronization and data communication. We also examine GPU-style 
SIMD execution and find that a MIMD architecture is usually preferable.
Given these high-level choices, we use VISBench to explore the microarchitectural design space. We analyze area versus 
performance tradeoffs in designing individual cores and the memory hierarchy. We find that a design made of small, 
simple cores achieves much higher throughput than a general-purpose uniprocessor. Further, we find that a limited 
amount of support for ILP within each core aids overall performance. We find that fine-grained multithreading improves 
performance, but only up to a point. We find that word-level (SSE-style) SIMD provides a poor performance-to-area ratio.
14. SUBJECT TERMS
benchmarking, parallel architecture, graphics architecture, accelerators
15. NUMBER OF PAGES 
17
16. PRICE CODE
17. SECURITY CLASSIFICATION OF REPORT
UNCLASSIFIED
18. SECURITY CLASSIFICATION OF THIS PAGE
UNCLASSIFIED
19. SECURITY CLASSIFICATION OF ABSTRACT
UNCLASSIFIED
20. LIMITATION OF ABSTRACT
UL
NSN 7540-01-280-5500 Standard Form 298 (Rev. 2-89)
Prescribed by ANSI Std. 239-18 
298-102