70 research outputs found
Rapid Recovery of Program Execution Under Power Failures for Embedded Systems with NVM
After power is restored, restarting an interrupted program from its
initial state can have a negative impact, and some programs cannot be recovered at all.
To recover program execution rapidly after a power failure, embedded systems
with NVM back up the execution state at checkpoints to the NVM.
However, frequent checkpoints will shorten the lifetime of
the NVM and incur significant write overhead. This paper proposes setting
checkpoints at function calls to reduce writes to the NVM.
The evaluation results show that, for stack backup, the NVM backup size is
reduced by an average of 99.8% and 80.5% compared to the log-based method and
the step-based method, respectively. To improve on this further, we also propose
pseudo-function calls, which add backup points to reduce recovery costs, and
an exponential incremental call-based backup method, which reduces backup costs
inside loops. To keep the NVM from becoming cluttered and running out of space,
a method is proposed to clean out NVM contents that are no longer needed for restoration.
Building on these techniques, a recovery technique is
proposed, and a case study is used to analyze how to recover rapidly under
different power failures.
Comment: This paper has been accepted for publication in Microprocessors and
Microsystems on March 15, 202
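The abstract does not spell out how the exponential incremental call-based backup works. The sketch below is one possible reading, offered only as an illustration: inside a loop, a backup is taken when the call count reaches 1, 2, 4, 8, and so on, so the number of NVM writes grows logarithmically with the number of iterations. The function names and the power-of-two schedule are assumptions, not the paper's method.

```python
def should_back_up(call_count):
    """Return True when call_count is a power of two (1, 2, 4, 8, ...).

    Assumption: this power-of-two schedule is a hypothetical reading of
    "exponential incremental" backup; the abstract gives no exact rule.
    """
    return call_count > 0 and (call_count & (call_count - 1)) == 0


def run_loop_with_backups(iterations):
    """Simulate a loop that makes one function call per iteration.

    Returns the call counts at which a simulated NVM backup is taken.
    """
    backup_points = []
    for call_count in range(1, iterations + 1):
        # The loop body's function call would run here.
        if should_back_up(call_count):
            # Stand-in for writing the execution state to NVM.
            backup_points.append(call_count)
    return backup_points


if __name__ == "__main__":
    print(run_loop_with_backups(100))
```

Under this schedule a 100-iteration loop triggers 7 backups rather than 100. The trade-off is that a failure late in the loop may have to re-execute up to half of the iterations completed so far.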
Loop distribution and fusion with timing and code size optimization for embedded DSPs
Abstract. Loop distribution and loop fusion are two effective loop transformation techniques for optimizing program execution in DSP applications. In this paper, we propose a new technique combining loop distribution with direct loop fusion, which improves timing performance without jeopardizing code size. We first develop loop distribution theorems that state the legality conditions of loop distribution for multi-level nested loops. We show that if the sum of the edge weights of a dependence cycle satisfies a certain condition, then the statements involved in the dependence cycle can be distributed; otherwise, they must be placed in the same loop after loop distribution. Then, we propose the technique of maximum loop distribution with direct loop fusion. The experimental results show that, compared to the original loops, the execution time of the loops transformed by our technique is reduced by 21.0% on average and the code size is reduced by 7.0% on average.
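As a generic illustration of loop distribution (not the paper's theorems or its edge-weight condition), the sketch below splits a loop with two statements into two loops. The only dependence runs from S1 to S2, so there is no dependence cycle and the split preserves the result; reading the pair in the other direction shows loop fusion. The function names and the example statements are hypothetical.

```python
def original_loop(b):
    """One loop containing two statements, S1 and S2."""
    n = len(b)
    a = [0] * n
    c = [0] * n
    for i in range(1, n):
        a[i] = b[i] + 1        # S1
        c[i] = a[i - 1] * 2    # S2 reads what S1 wrote one iteration earlier
    return a, c


def distributed_loops(b):
    """The same computation after loop distribution.

    Legal here because the dependence goes only from S1 to S2:
    running all of S1 first still gives S2 the values it expects.
    """
    n = len(b)
    a = [0] * n
    c = [0] * n
    for i in range(1, n):
        a[i] = b[i] + 1        # S1 in its own loop
    for i in range(1, n):
        c[i] = a[i - 1] * 2    # S2 in its own loop
    return a, c


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5]
    print(original_loop(data) == distributed_loops(data))
```

If S1 also read a value produced by S2 in an earlier iteration, the two statements would form a dependence cycle, which is the case the abstract says is governed by a condition on the cycle's edge weights.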
Game and Balance Multicast Architecture Algorithms for Sensor Grid
We propose a scheme to attain shorter multicast delay and higher efficiency in the data transfer of a sensor grid. Within one cluster, our scheme seeks the central node and calculates the space and data weight vectors. We then look for a new vector formed as a linear combination of the two. Using the condition that the new vector has an equal correlation coefficient with each of the old vectors, we find the point of game and balance between the space and data factors, build a linear equation in two unknowns, solve for the linear parameters, and generate a least-weight path tree. We handle the issue quantitatively rather than qualitatively. Based on this idea, we consider both the space and data factors, build the mathematical model, set up the game-and-balance relationship, and finally solve for the linear indexes, with which we improve the transmission efficiency of the sensor grid. Extensive simulation results indicate that our scheme attains a lower average multicast delay and uses fewer links than other well-known existing schemes.
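The equal-correlation condition can be worked through numerically under some assumptions the abstract does not confirm: the correlation coefficient is Pearson's, the combined vector is w = alpha * space + (1 - alpha) * data, and neither input vector is constant. Setting corr(w, space) = corr(w, data) then reduces to alpha * std(space) = (1 - alpha) * std(data), so alpha = std(data) / (std(space) + std(data)). The names below are hypothetical.

```python
import math


def std(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def corr(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (std(x) * std(y))


def balanced_combination(space, data):
    """Combine two weight vectors so the result correlates equally with both.

    Assumption: weights are alpha and 1 - alpha; the paper's actual
    parameterization is not given in the abstract.
    """
    s_std, d_std = std(space), std(data)
    alpha = d_std / (s_std + d_std)
    combined = [alpha * s + (1 - alpha) * d for s, d in zip(space, data)]
    return alpha, combined


if __name__ == "__main__":
    space = [1.0, 2.0, 3.0, 4.0]
    data = [10.0, 30.0, 20.0, 60.0]
    alpha, w = balanced_combination(space, data)
    print(alpha, corr(w, space), corr(w, data))
```

The vector with the smaller spread receives the larger weight, which is what equalizes the two correlations. How the combined weights feed into the least-weight path tree is not described in the abstract and is left out here.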
Variable Partitioning and Scheduling of Multiple Memory Architectures for DSP
Multiple memory module architectures offer higher memory access bandwidth and thus higher performance. Two key problems in achieving high performance on this kind of architecture are variable partitioning and scheduling. However, little research has been done on these problems. In this paper, we present a new graph model for the variable partitioning problem, the Variable Independence Graph (VIG), which provides more precise information for variable partitioning than previous graph models. We also present a scheduling algorithm that takes advantage of multiple memory modules, Rotation Scheduling with Variable Re-partition (RSVR). It is a new scheduling technique based on retiming and software pipelining, and it may re-partition the variables during the scheduling process if necessary. The experimental results show that the algorithm improves schedule length by 44.8% on average. Another major contribution of this paper is an algorithm for design space exploration on multiple memory architectures. It produces more feasible solutions for a given set of schedule length requirements, and our solutions use fewer functional units than those of the Interference Graph model.
Scheduling and Partitioning for Multiple Loop Nests
This paper presents the multiple loop partition scheduling technique, which combines loop partitioning and prefetching. It exploits data locality better than loop fusion and better than traditional loop partitioning, which focuses only on a single nested loop. Moreover, multiple loop partition scheduling balances computation and memory loading, so that long memory latency can be hidden effectively. The experiments show that multiple loop partition scheduling achieves significant improvement over existing methods.
- …