2 research outputs found
Compilation for Delay Impact Minimization in VLIW Embedded Systems
Tomorrow’s embedded devices need to run high resolution multimedia as well as need to support multistandard wireless systems which require an enormous computational complexity with a very low energy consumption and very high performance constraints. In this context, the register file is one of the key sources of power consumption and performance bottleneck, and its inappropriate design and management can severely affect the performance of the system. In this paper, we present a new compilation approach to mitigate the performance implications of technology variation in the shared register file in upcoming embedded VLIW architectures with several processing units. The compilation approach is based on a redefined register assignment policy and a set of architectural modifications to this device. Experimental results show up to a 67% performance improvement with our technique
Memory hierarchy and data communication in heterogeneous reconfigurable SoCs
The miniaturization race in the hardware industry aiming at continuous increasing
of transistor density on a die does not bring respective application performance
improvements any more. One of the most promising alternatives is to
exploit a heterogeneous nature of common applications in hardware. Supported by
reconfigurable computation, which has already proved its efficiency in accelerating
data intensive applications, this concept promises a breakthrough in contemporary
technology development.
Memory organization in such heterogeneous reconfigurable architectures becomes
very critical. Two primary aspects introduce a sophisticated trade-off. On
the one hand, a memory subsystem should provide well organized distributed data
structure and guarantee the required data bandwidth. On the other hand, it should
hide the heterogeneous hardware structure from the end-user, in order to support
feasible high-level programmability of the system.
This thesis work explores the heterogeneous reconfigurable hardware architectures
and presents possible solutions to cope the problem of memory organization
and data structure. By the example of the MORPHEUS heterogeneous platform,
the discussion follows the complete design cycle, starting from decision making
and justification, until hardware realization. Particular emphasis is made on the
methods to support high system performance, meet application requirements, and
provide a user-friendly programmer interface.
As a result, the research introduces a complete heterogeneous platform enhanced
with a hierarchical memory organization, which copes with its task by
means of separating computation from communication, providing reconfigurable
engines with computation and configuration data, and unification of heterogeneous
computational devices using local storage buffers. It is distinguished from the
related solutions by distributed data-flow organization, specifically engineered
mechanisms to operate with data on local domains, particular communication infrastructure
based on Network-on-Chip, and thorough methods to prevent computation
and communication stalls. In addition, a novel advanced technique to accelerate
memory access was developed and implemented