Code transformations based on speculative SDC scheduling
Code motion and speculation are commonly exploited in the High-Level Synthesis of control-dominated applications to improve the performance of the synthesized designs. Selecting which transformations to apply is not a trivial task: their effects can spread indirectly across the whole design, potentially worsening the quality of the results.
In this paper we propose a code transformation flow, based on a new extension of the System of Difference Constraints (SDC) scheduling algorithm, which introduces a large number of transformations whose profitability is guaranteed by the SDC formulation. Experimental results show that the proposed technique reduces the execution time of control-dominated applications by 37% on average with respect to a commercial tool, without increasing area usage.
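For readers unfamiliar with SDC scheduling, the following is a minimal sketch of the basic idea only, not of the speculative extension proposed in the paper. It assumes every scheduling requirement can be written as a difference constraint of the form s[v] - s[u] >= w over integer start cycles, and it solves the system with Bellman-Ford-style relaxation rather than the linear-programming solver an HLS tool would typically use. The function name `solve_sdc` and the four-operation example are hypothetical.

```python
def solve_sdc(num_ops, constraints):
    """Find the earliest start cycles satisfying s[v] - s[u] >= w.

    constraints is a list of (u, v, w) tuples. Returns a list of start
    cycles, or None if the system is infeasible (positive-weight cycle).
    """
    start = [0] * num_ops  # every operation may start at cycle 0 at the earliest
    for _ in range(num_ops):
        changed = False
        for u, v, w in constraints:
            # Push v later if it would otherwise violate s[v] - s[u] >= w.
            if start[u] + w > start[v]:
                start[v] = start[u] + w
                changed = True
        if not changed:
            return start
    return None  # still changing after num_ops passes: no valid schedule


# Hypothetical data-flow graph: two 1-cycle loads feed a 2-cycle multiply,
# whose result feeds an add.
deps = [(0, 2, 1), (1, 2, 1), (2, 3, 2)]

# A maximum-latency requirement s[3] - s[0] <= 3 becomes s[0] - s[3] >= -3.
schedule = solve_sdc(4, deps + [(3, 0, -3)])

# Tightening the bound to 2 cycles makes the system infeasible.
too_tight = solve_sdc(4, deps + [(3, 0, -2)])
```

Because dependency and timing requirements share the same constraint form, a candidate transformation can in principle be expressed by adding or relaxing constraints and re-solving, which is the kind of property the abstract's profitability claim relies on.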
A Case for Reversible Coherence Protocol
We propose the first Reversible Coherence Protocol (RCP), a new protocol
designed from the ground up to enable invisible speculative loads. RCP takes a
bold approach: it includes speculative loads and merge/purge operations in
the interface between the processor and cache coherence, and allows them to
participate in the coherence protocol. This means that speculative loads,
ordinary loads/stores, and merge/purge operations can all affect the state of
a given cache line. RCP is the first coherence protocol that enables the
commit and squash of speculative loads among distributed cache components in a general memory
hierarchy. RCP incurs an average slowdown of 3.0%, 8.3%, and 7.4% on
SPEC2006, SPEC2017, and PARSEC, respectively, which is lower than the
26.5%, 12%, and 18.3% of InvisiSpec and the 3.2%, 9.4%, and 24.2% of
CleanupSpec. The coherence traffic overhead is 46% on average, compared to
40% and 27% for InvisiSpec and CleanupSpec, respectively. Even with this
higher traffic overhead, the performance overhead of RCP is lower than that
of InvisiSpec and comparable to that of CleanupSpec. This
reveals a key advantage of RCP: the coherence actions triggered by merge
and purge operations are not on the critical path of execution and can be
performed in the cache hierarchy concurrently with processor execution.
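
The abstract does not specify RCP's states or messages, so the following is only a toy illustration of the reversibility idea and not the protocol itself. It assumes a single cache line with two committed states ("I" and "S") plus a set of pending speculative loads. The class `ToyCacheLine` and its method names are hypothetical.

```python
class ToyCacheLine:
    """Toy model: speculative loads are tracked and later merged or purged."""

    def __init__(self):
        self.state = "I"      # committed state (toy: "I" invalid, "S" shared)
        self.pending = set()  # ids of speculative loads not yet resolved

    def speculative_load(self, load_id):
        # Record the speculative access without changing the committed state.
        self.pending.add(load_id)

    def merge(self, load_id):
        # The load committed: make its effect part of the committed state.
        if load_id in self.pending:
            self.pending.remove(load_id)
            self.state = "S"

    def purge(self, load_id):
        # The load was squashed: drop its record and leave no trace.
        self.pending.discard(load_id)


line = ToyCacheLine()
line.speculative_load(1)
line.purge(1)                 # squashed: the line looks untouched
after_purge = (line.state, set(line.pending))

line.speculative_load(2)
line.merge(2)                 # committed: the line becomes shared
after_merge = (line.state, set(line.pending))
```

In this sketch neither `merge` nor `purge` returns anything the issuing load has to wait for, which loosely mirrors the abstract's point that these operations can proceed off the critical path.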