Synthesis of heterogeneous distributed architectures for memory-intensive applications

Authors: Chao Huang, Niraj K. Jha, Srivaths Ravi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2003

Abstract: Memory-intensive applications present unique challenges to an ASIC designer in the choice of memory organization, memory size requirements, bandwidth, and access latencies. The high potential of single-chip distributed logic-memory architectures in addressing many of these issues has been recognized in general-purpose computing and, more recently, in ASIC design. However, such architectures will be adopted widely by designers only when general techniques and tools for efficient high-level synthesis (HLS) of multi-partitioned ASICs become available. The techniques presented in this paper are motivated by the fact that many memory-intensive applications exhibit irregular array data access patterns (for example, due to conditionals in loop nests). Synthesis should therefore be capable of determining a partitioned architecture in which array data and computations may have to be heterogeneously distributed to achieve the best performance speedup. Furthermore, the synthesis methodology should not be restricted by the nature of the array index functions (affine or otherwise) in a behavior. Our methodology therefore employs simulation to obtain information about the access patterns of array data references in a behavior, which the rest of our analysis uses. We use a combination of clustering and min-cut style partitioning techniques to partition the behavior into sub-behaviors while considering factors including data access locality, balanced workloads, and inter-partition communication. Finally, we employ an iterative improvement strategy to determine the best way of distributing array data into physical memory in each partition. Our experiments with several benchmark applications show that the proposed techniques can yield partitioned architectures that achieve a performance speedup over conventional HLS solutions, as well as a performance speedup over the best feasible homogeneous partitioning solution.
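The flow the abstract describes (profile array accesses by simulation, then partition computations and data to keep accesses local and workloads balanced) can be illustrated with a small sketch. This is not the authors' algorithm: the abstract gives no implementation detail, so the trace format, the function names `profile_accesses`, `cut_weight`, and `bipartition`, the two-way split, and the greedy single-move improvement loop are all assumptions chosen for illustration. The greedy loop stands in for the clustering and min-cut style techniques the paper mentions.

```python
from collections import Counter

def profile_accesses(trace):
    """Count accesses per (statement, array) pair in a simulated trace.

    Assumed trace format: (statement_id, array_name, index) tuples.
    """
    return Counter((stmt, array) for stmt, array, _index in trace)

def cut_weight(edges, side):
    """Sum the weights of edges whose endpoints sit in different partitions."""
    return sum(w for (u, v), w in edges.items() if side[u] != side[v])

def bipartition(nodes, edges, max_imbalance=1):
    """Greedy single-move improvement of a two-way partition.

    A move is kept only if it lowers the cut weight (inter-partition
    accesses) and keeps the partition sizes within max_imbalance.
    """
    nodes = sorted(nodes)
    # Alternating start so the two sides are roughly balanced.
    side = {n: i % 2 for i, n in enumerate(nodes)}
    improved = True
    while improved:
        improved = False
        best = cut_weight(edges, side)
        for n in nodes:
            side[n] ^= 1  # tentatively move n to the other partition
            sizes = Counter(side.values())
            balanced = abs(sizes[0] - sizes[1]) <= max_imbalance
            new = cut_weight(edges, side)
            if balanced and new < best:
                best, improved = new, True
            else:
                side[n] ^= 1  # undo the move
    return side

# Hypothetical trace: s1 mostly reads A; s2 and s3 mostly read B.
trace = (
    [("s1", "A", i) for i in range(3)]
    + [("s2", "B", i) for i in range(2)]
    + [("s2", "A", 0)]
    + [("s3", "B", i) for i in range(4)]
)
edges = profile_accesses(trace)
nodes = {n for pair in edges for n in pair}
side = bipartition(nodes, edges)
```

In this toy trace the partitioner groups `s1` with array `A` and groups `s2` and `s3` with array `B`, leaving only the single `s2`-to-`A` access to cross the partition boundary. A real synthesis flow would also weigh workload per statement and memory size per array, which this sketch ignores.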

CiteSeerX