1,661 research outputs found
Power and memory optimization techniques in embedded systems design
Embedded systems incur tight constraints on power consumption and memory (which impacts size) in addition to other constraints such as weight and cost. This dissertation addresses two key factors in embedded system design, namely minimization of power consumption and memory requirement. The first part of this dissertation considers the problem of optimizing power consumption (peak power as well as average power) in high-level synthesis (HLS). The second part deals with memory usage optimization mainly targeting a restricted class of computations expressed as loops accessing large data arrays that arises in scientific computing such as the coupled cluster and configuration interaction methods in quantum chemistry. First, a mixed-integer linear programming (MILP) formulation is presented for the scheduling problem in HLS using multiple supply-voltages in order to optimize peak power as well as average power and energy consumptions. For large designs, the MILP formulation may not be suitable; therefore, a two-phase iterative linear programming formulation and a power-resource-saving heuristic are presented to solve this problem. In addition, a new heuristic that uses an adaptation of the well-known force-directed scheduling heuristic is presented for the same problem. Next, this work considers the problem of module selection simultaneously with scheduling for minimizing peak and average power consumption. Then, the problem of power consumption (peak and average) in synchronous sequential designs is addressed. A solution integrating basic retiming and multiple-voltage scheduling (MVS) is proposed and evaluated. A two-stage algorithm namely power-oriented retiming followed by a MVS technique for peak and/or average power optimization is presented. Memory optimization is addressed next. Dynamic memory usage optimization during the evaluation of a special class of interdependent large data arrays is considered. Finally, this dissertation develops a novel integer-linear programming (ILP) formulation for static memory optimization using the well-known fusion technique by encoding of legality rules for loop fusion of a special class of loops using logical constraints over binary decision variables and a highly effective approximation of memory usage
Mix & Latch: An Optimization Flow for High-Performance Designs with Single-Clock Mixed-Polarity Latches and Flip-Flops
Flip-flops are the most used sequential elements in synchronous circuits, but designs based on latches can operate at higher frequencies and occupy less area. Techniques to increase the maximum operating frequency of flip-flop based designs, such as time-borrowing, rely on tight hold constraints that are difficult to satisfy using traditional back-end optimization techniques. We propose Mix & Latch , a methodology to increase the operating frequency of synchronous digital circuits using a single clock tree and a mixed distribution of positive- and negative-edge-triggered flops, and positive- and negative-level-sensitive latches. An efficient mathematical model is proposed to optimize the type and location of the sequential elements of the circuit. We ensure that the initial registers are not moved from their initial location, although they may change type, thus allowing the use of equivalence checking and static timing analysis to verify formally the correctness of the transformation. The technique is validated using a 28nm CMOS FDSOI technology, obtaining 1.33X post-layout average operating frequency improvement on a broad set of benchmarks over a standard commercial design flow. Additionally, the circuit area was also reduced by more than 1.19X on average for the same benchmarks, although the overall area reduction is not a goal of the optimization algorithm. To the best of our knowledge, this is the first work that proposes combining mixed-polarity flip-flops and latches to improve the circuit performance
A general model for performance optimization of sequential systems
Retiming, c-slow retiming and recycling are different transformations for the performance optimization of sequential circuits. For retiming and c-slow retiming, different models that provide exact solutions have already been proposed. An exact model for recycling was yet unknown. This paper presents a general formulation that covers the combination of the three schemes for performance optimization. It provides an exact model based on integer linear programming that resorts to the structural theory of marked graphs. A set of experiments has been designed to show the benefits in performance obtained by combining retiming and recycling. The results also show the applicability of the method in large circuits.Peer ReviewedPostprint (published version
An FSM Re-Engineering Approach to Sequential Circuit Synthesis by State Splitting
We propose Finite State Machine (FSM) re-engineering, a
performance enhancement framework for FSM synthesis and
optimization. It starts with the traditional FSM synthesis procedure,
then proceeds to re-construct a functionally equivalent
but topologically different FSM based on the optimization
objective, and concludes with another round of FSM synthesis
on the re-constructed FSM. This approach explores a larger
solution space that consists of a set of FSMs functionally
equivalent to the original one, making it possible to obtain
better solutions than in the original FSM. Guided by the result
from the #2;rst round of synthesis, the solution space exploration
process can be rapid and cost-ef#2;cient.
We apply this framework to FSM state encoding for power
minimization and area minimization. The FSM is #2;rst minimized
and encoded using existing state encoding algorithms.
Then we develop both a heuristic algorithm and a genetic
algorithm to re-construct the FSM. Finally, the FSM is reencoded
by the same encoding algorithms. To demonstrate
the effectiveness of this framework, we conduct experiments
on MCNC91 sequential circuit benchmarks. The circuits are
read in and synthesized in SIS environment. After FSM
re-engineering are performed, we measure the power, area
and delay in the newly synthesized circuits. In the powerdriven
synthesis, we observe an average 5.5% of total power
reduction with 1.3% area increase and 1.3% delay increase.
This results are in general better than other low power state
encoding techniques on comparable cases. In the area-driven
synthesis, we observe an average 2.7% area reduction, 1.8%
delay reduction, and 0.4% power increase. Finally, we use
integer linear programming to obtain the optimal low power
state encoding for benchmarks of small size. We #2;nd that the
optimal solutions in the re- engineered FSMs are 1% to 8%
better than the optimal solutions in the original FSMs in terms
of power minimization
Optimizing the flash-RAM energy trade-off in deeply embedded systems
Deeply embedded systems often have the tightest constraints on energy
consumption, requiring that they consume tiny amounts of current and run on
batteries for years. However, they typically execute code directly from flash,
instead of the more energy efficient RAM. We implement a novel compiler
optimization that exploits the relative efficiency of RAM by statically moving
carefully selected basic blocks from flash to RAM. Our technique uses integer
linear programming, with an energy cost model to select a good set of basic
blocks to place into RAM, without impacting stack or data storage.
We evaluate our optimization on a common ARM microcontroller and succeed in
reducing the average power consumption by up to 41% and reducing energy
consumption by up to 22%, while increasing execution time. A case study is
presented, where an application executes code then sleeps for a period of time.
For this example we show that our optimization could allow the application to
run on battery for up to 32% longer. We also show that for this scenario the
total application energy can be reduced, even if the optimization increases the
execution time of the code
Energy-Efficient Digital Circuit Design using Threshold Logic Gates
abstract: Improving energy efficiency has always been the prime objective of the custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing performance have been proposed. However, as the field of design automation has matured over the last few decades, there have been no new automated design techniques, that can provide considerable improvements in circuit power, leakage and area. Although emerging nano-devices are expected to replace the existing MOSFET devices, they are far from being as mature as semiconductor devices and their full potential and promises are many years away from being practical.
The research described in this dissertation consists of four main parts. First is a new circuit architecture of a differential threshold logic flipflop called PNAND. The PNAND gate is an edge-triggered multi-input sequential cell whose next state function is a threshold function of its inputs. Second a new approach, called hybridization, that replaces flipflops and parts of their logic cones with PNAND cells is described. The resulting \hybrid circuit, which consists of conventional logic cells and PNANDs, is shown to have significantly less power consumption, smaller area, less standby power and less power variation.
Third, a new architecture of a field programmable array, called field programmable threshold logic array (FPTLA), in which the standard lookup table (LUT) is replaced by a PNAND is described. The FPTLA is shown to have as much as 50% lower energy-delay product compared to conventional FPGA using well known FPGA modeling tool called VPR.
Fourth, a novel clock skewing technique that makes use of the completion detection feature of the differential mode flipflops is described. This clock skewing method improves the area and power of the ASIC circuits by increasing slack on timing paths. An additional advantage of this method is the elimination of hold time violation on given short paths.
Several circuit design methodologies such as retiming and asynchronous circuit design can use the proposed threshold logic gate effectively. Therefore, the use of threshold logic flipflops in conventional design methodologies opens new avenues of research towards more energy-efficient circuits.Dissertation/ThesisDoctoral Dissertation Computer Science 201
- …