15,649 research outputs found
Hierarchical clustered register file organization for VLIW processors
Technology projections indicate that wire delays will become one of the biggest constraints in future microprocessor designs. To avoid long wire delays and therefore long cycle times, processor cores must be partitioned into components so that most of the communication is done locally. In this paper, we propose a novel register file organization for VLIW cores that combines clustering with a hierarchical register file organization. Functional units are organized in clusters, each one with a local first level register file. The local register files are connected to a global second level register file, which provides access to memory. All intercluster communications are done through the second level register file. This paper also proposes MIRS-HC, a novel modulo scheduling technique that simultaneously performs instruction scheduling, cluster selection, inserts communication operations, performs register allocation and spill insertion for the proposed organization. The results show that although more cycles are required to execute applications, the execution time is reduced due to a shorter cycle time. In addition, the combination of clustering and hierarchy provides a larger design exploration space that trades-off performance and technology requirements.Peer ReviewedPostprint (published version
Recommended from our members
A survey of behavioral-level partitioning systems
Many approaches have been developed to partition a system's behavioral description before a structural implementation is synthesized. We highlight the foundations and motivations for behavioral partitioning. We survey behavioral partitioning approaches, discussing abstraction levels, goals, major steps, and key assumptions in each
Instruction replication for clustered microarchitectures
This work presents a new compilation technique that uses instruction replication in order to reduce the number of communications executed on a clustered microarchitecture. For such architectures, the need to communicate values between clusters can result in a significant performance loss. Inter-cluster communications can be reduced by selectively replicating an appropriate set of instructions. However, instruction replication must be done carefully since it may also degrade performance due to the increased contention it can place on processor resources. The proposed scheme is built on top of a previously proposed state-of-the-art modulo scheduling algorithm that effectively reduces communications. Results show that the number of communications can decrease using replication, which results in significant speed-ups. IPC is increased by 25% on average for a 4-cluster microarchitecture and by as mush as 70% for selected programs.Peer ReviewedPostprint (published version
5G Positioning and Mapping with Diffuse Multipath
5G mmWave communication is useful for positioning due to the geometric
connection between the propagation channel and the propagation environment.
Channel estimation methods can exploit the resulting sparsity to estimate
parameters(delay and angles) of each propagation path, which in turn can be
exploited for positioning and mapping. When paths exhibit significant spread in
either angle or delay, these methods breakdown or lead to significant biases.
We present a novel tensor-based method for channel estimation that allows
estimation of mmWave channel parameters in a non-parametric form. The method is
able to accurately estimate the channel, even in the absence of a specular
component. This in turn enables positioning and mapping using only diffuse
multipath. Simulation results are provided to demonstrate the efficacy of the
proposed approach
- …