Containing the Nanometer “Pandora-Box”: Cross-Layer Design Techniques for Variation Aware Low Power Systems
The demand for richer multimedia services, multifunctional portable devices and high data rates can only be envisioned thanks to improvements in semiconductor technology. Unfortunately, sub-90 nm process nodes open the nanometer Pandora-box, exposing the barriers of technology scaling: parameter variations, which threaten the correct operation of circuits, and increased energy consumption, which limits the operational lifetime of today’s systems. Reconciling the contradictory design requirements of low power and system robustness is one of the most challenging design problems of today. The design effort is further complicated by the heterogeneous types of designs (logic, memory, mixed-signal) that are included in today’s complex systems and are characterized by different design requirements. This paper presents an overview of techniques at various levels of design abstraction that lead to low-power, variation-aware logic, memory and mixed-signal circuits and can potentially assist in meeting the strict power budgets and yield/quality requirements of future systems.
Implementing Energy Parsimonious Circuits through Inexact Designs
Inexact circuits, that is, circuits in which the accuracy of the output can be traded for cost (energy, delay and/or area) savings, have been receiving increasing attention of late due to invariable inaccuracies in nanometer-scale circuits and a concomitant growing desire for ultra-low-energy embedded systems. Most previous approaches to realizing inexact circuits relied on scaling circuit-level operational parameters (such as supply voltage) to achieve the cost and accuracy tradeoffs, and suffered from significant implementation overheads that drastically reduced the gains. In this thesis, two novel architecture-level approaches called Probabilistic Pruning and Probabilistic Logic Minimization are proposed to realize inexact circuits with zero overhead. Extensive simulations on various architectures of datapath elements and a prototype chip fabrication demonstrate that normalized gains of 2X-9.5X in Energy-Delay-Area product can be obtained for relative errors as low as 10^-6% to 1% compared to corresponding conventional correct designs.
Spectral-energy efficiency trade-off of relay-aided cellular networks
Wireless communication networks are traditionally designed to operate at high spectral efficiency with less emphasis on power consumption, as it is assumed that an endless power supply is available through the power grid to which the cells are connected. As new generations of mobile networks exhibit decreasing gains in spectral efficiency, the mobile industry is forced to consider energy reform policies in order to sustain its own economic growth and that of the other industries relying on it. Consequently, the energy efficiency of conventional direct transmission cellular networks is being examined while alternative green network architectures are also explored. The relay-aided cellular network is being considered as one of the potential network architectures for energy-efficient transmission. However, relaying transmission incurs multiplexing loss due to its multi-hop protocol. This, in turn, reduces network spectral efficiency. Furthermore, interference is also expected to increase with the deployment of Relay Stations (RSs) in the network. This thesis examines the power consumption of the conventional direct transmission cellular network and contributes to the development of the relay-aided cellular network.
Firstly, the power consumption of the direct transmission cellular network is investigated. While most work considered transmitter side strategies, the impact of the receiver on the Base Station (BS) total power consumption is investigated here. Both the zero-forcing and minimum mean square error weight optimisation approaches are considered for both the conventional linear and successive interference cancellation receivers. The power consumption model which includes both the radio frequency transmit power and circuit power is described. The influence of the receiver interference cancellation techniques, the number of transceiver antennas, circuit power consumption and inter-cell interference on the BS total power consumption is investigated.
Secondly, the spectral-energy efficiency trade-off in the relay-aided cellular network is investigated. The signal forwarding and interference forwarding relaying paradigms are considered with the direct transmission cellular network taken as the baseline. This investigation serves to understand the dynamics in the performance trade-off. To select a suitable balance point in the trade-off, the economic efficiency metric is proposed whereby the spectral-energy efficiency pair which maximises the economic profitability is found. Thus, the economic efficiency metric can be utilised as an alternative means to optimise the relay-aided cellular network while taking into account the inherent spectral-energy efficiency trade-off.
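The spectral-energy efficiency trade-off described above can be illustrated with a minimal sketch. It assumes a single link with Shannon capacity, a linear power model (transmit power divided by amplifier efficiency, plus a fixed circuit power), and a half-duplex multi-hop relay whose hops all see the same SNR. The function names, parameters and default values are illustrative assumptions and are not taken from the thesis; the economic efficiency metric itself is not modelled.

```python
import math

def spectral_efficiency(snr, hops=1):
    """Shannon spectral efficiency in bit/s/Hz.

    Assumption: a half-duplex relay with `hops` equal-SNR hops uses each
    hop for 1/hops of the time, which models the multiplexing loss.
    """
    return math.log2(1.0 + snr) / hops

def energy_efficiency(se, bandwidth_hz, tx_power_w, circuit_power_w,
                      pa_efficiency=0.35):
    """Bits delivered per joule: data rate divided by total consumed power.

    Assumption: total power = tx_power / pa_efficiency + circuit_power.
    """
    total_power_w = tx_power_w / pa_efficiency + circuit_power_w
    return se * bandwidth_hz / total_power_w
```

Sweeping `tx_power_w` (and hence `snr`) with these two functions traces out one trade-off curve for the direct link (`hops=1`) and one for the relay link (`hops=2`); the relay curve starts from half the spectral efficiency at equal SNR but may reach a given rate with less transmit power if its per-hop SNR is higher.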
Finally, the method of mitigating interference in the relay-aided cellular network is demonstrated by means of the proposed relay cooperation scheme. In the proposed scheme, both joint RS decoding and independent RS decoding approaches are considered during the broadcast phase while joint relay transmission is employed in the relay phase. Two user selection schemes requiring global Channel State Information (CSI) are considered. The partial semi-orthogonal user selection method with reduced CSI requirement is then proposed. As the cooperative cost limits the practicality of cooperative schemes, the cost incurred at the cooperative links between the RSs is investigated for varying degrees of RS cooperation. The performance of the relay cooperation scheme with different relay frequency reuse patterns is considered as well.
In a nutshell, the research presented in this thesis reveals the impact of the receiver on the BS total power consumption in direct transmission cellular networks. The relay-aided cellular network is then presented as an alternative architecture for energy-efficient transmission. The economic efficiency metric is proposed to maximise the economic profitability of the relay network while taking into account the existing spectral-energy efficiency trade-off. To mitigate the interference from the RSs, the relay cooperation scheme for advanced relay-aided cellular networks is proposed.
Modeling and synthesis of approximate digital circuits
Energy minimization has become an ever more important concern in the design of very large scale integrated circuits (VLSI). In recent years, approximate computing, which is based on the idea of trading off computational accuracy for improved energy efficiency, has attracted significant attention. Applications that are both compute-intensive and error-tolerant are most suitable to adopt approximation strategies. This includes digital signal processing, data mining, machine learning or search algorithms. Such approximations can be achieved at several design levels, ranging from software, algorithm and architecture, down to logic or transistor levels. This dissertation investigates two research threads for the derivation of approximate digital circuits at the logic level: 1) modeling and synthesis of fundamental arithmetic building blocks; 2) automated techniques for synthesizing arbitrary approximate logic circuits under general error specifications. The first thread investigates elementary arithmetic blocks, such as adders and multipliers, which are at the core of all data processing and often consume most of the energy in a circuit. An optimal strategy is developed to reduce energy consumption in timing-starved adders under voltage over-scaling. This allows a formal demonstration that, under quadratic error measures prevalent in signal processing applications, an adder design strategy that separates the most significant bits (MSBs) from the least significant bits (LSBs) is optimal. An optimal conditional bounding (CB) logic is further proposed for the LSBs, which selectively compensates for the occurrence of errors in the MSB part. There is a rich design space of optimal adders defined by different CB solutions. The other thread considers the problem of approximate logic synthesis (ALS) in two-level form.
ALS is concerned with formally synthesizing a minimum-cost approximate Boolean function, whose behavior deviates from a specified exact Boolean function in a well-constrained manner. It is established that the ALS problem unconstrained by the frequency of errors is isomorphic to a Boolean relation (BR) minimization problem, and hence can be efficiently solved by existing BR minimizers. An efficient heuristic is further developed which iteratively refines the magnitude-constrained solution to arrive at a two-level representation also satisfying error frequency constraints. To extend the two-level solution into an approach for multi-level approximate logic synthesis (MALS), Boolean network simplifications allowed by external don't cares (EXDCs) are used. The key contribution is in finding non-trivial EXDCs that can maximally approach the external BR and, when applied to the Boolean network, solve the MALS problem constrained by magnitude only. The algorithm then ensures compliance with error frequency constraints by recovering the correct outputs on the sought number of error-producing inputs while aiming to minimize the network cost increase. Experiments have demonstrated the effectiveness of the proposed techniques in deriving approximate circuits. The approximate adders can save up to 60% energy compared to exact adders for a reasonable accuracy. When used in larger systems implementing image-processing algorithms, energy savings of 40% are possible. The logic synthesis approaches generally can produce approximate Boolean functions or networks with complexity reductions ranging from 30% to 50% under small error constraints.
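The idea of treating MSBs and LSBs differently, and of judging the result with a quadratic error measure, can be illustrated with a minimal behavioural sketch. Note that this is not the conditional bounding logic proposed in the dissertation: as a stand-in it uses a much simpler split adder (exact addition on the upper bits, bitwise OR on the lowest `k` bits, no carry between the two parts). The function names and the parameters `width` and `k` are assumptions made for illustration.

```python
def split_adder(a, b, k=3):
    """Approximate addition of two non-negative integers.

    The upper bits are added exactly; the lowest k bits are approximated
    by a bitwise OR, and no carry is passed from the low to the high part.
    """
    mask_low = (1 << k) - 1
    low = (a | b) & mask_low          # cheap, carry-free LSB part
    high = ((a >> k) + (b >> k)) << k  # exact MSB part
    return high | low

def mean_squared_error(width=8, k=3):
    """Exhaustive mean squared error over all pairs of width-bit operands."""
    n = 1 << width
    total = 0
    for a in range(n):
        for b in range(n):
            err = split_adder(a, b, k) - (a + b)
            total += err * err
    return total / (n * n)
```

In this stand-in the error equals the bitwise AND of the two low parts, so its magnitude stays below `2**k`; comparing `mean_squared_error` for several values of `k` shows how the quadratic error grows as more bits are handed to the approximate part.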
Robust and reliable decision-making systems and algorithms
We investigate robustness and reliability in decision-making systems and algorithms based on the tradeoff between cost and performance. We propose two abstract frameworks to investigate robustness and reliability concerns, which critically impact the design and analysis of systems and algorithms based on unreliable components.
We consider robustness in online systems and algorithms under the framework of online optimization subject to adversarial perturbations. The framework of online optimization models a rich class of problems from information theory, machine learning, game theory, optimization, and signal processing. This is a repeated game framework where, on each round, a player selects an action from a decision set using a randomized strategy, and then Nature reveals a loss function for this action, for which the player incurs a loss. Through a worst case adversary framework to model the perturbations, we introduce a randomized algorithm that is provably robust even against such adversarial attacks. In particular, we show that this algorithm is Hannan-consistent with respect to a rich class of randomized strategies under mild regularity conditions.
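The repeated game described above can be sketched with the standard exponential-weights forecaster, a well-known randomized strategy for a finite action set. This is a textbook baseline rather than the algorithm introduced in the dissertation, and it does not model adversarial perturbations of the losses; the function name, the learning rate `eta` and the assumption that losses lie in [0, 1] are illustrative.

```python
import math
import random

def exponential_weights(loss_rounds, eta, rng=None):
    """Play a repeated game over a finite action set.

    loss_rounds: list of per-round loss vectors (entries assumed in [0, 1]).
    Returns (cumulative loss of the player,
             cumulative loss of the best fixed action in hindsight).
    """
    rng = rng or random.Random(0)
    n_actions = len(loss_rounds[0])
    cum_losses = [0.0] * n_actions
    player_loss = 0.0
    for losses in loss_rounds:
        # Sample an action with probability proportional to
        # exp(-eta * cumulative loss); subtract the minimum for stability.
        base = min(cum_losses)
        weights = [math.exp(-eta * (c - base)) for c in cum_losses]
        action = rng.choices(range(n_actions), weights=weights)[0]
        player_loss += losses[action]
        # Nature reveals the full loss vector only after the action is chosen.
        cum_losses = [c + l for c, l in zip(cum_losses, losses)]
    return player_loss, min(cum_losses)
```

The difference between the two returned values is the regret; a strategy is Hannan-consistent when this regret grows sublinearly in the number of rounds, so that the average regret per round vanishes.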
We next focus on reliability of decision-making systems and algorithms based on the problem of fusing several unreliable computational units that perform the same task under cost and fidelity constraints. In particular, we model the relationship between the fidelity of the outcome and the cost of computing it as an additive perturbation. We analyze performance of repetition-based strategies that distribute cost across several unreliable units and fuse their outcomes. When the cost is a convex function of fidelity, the optimal repetition-based strategy in terms of minimizing total incurred cost while achieving a target mean-square error performance may fuse several computational units. For concave and linear costs, a single more reliable unit incurs lower cost compared to fusion of several lower cost and less reliable units while achieving the same mean-square error (MSE) performance. We show how our results give insight into problems from theoretical neuroscience, circuits, and crowdsourcing.
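The convex-versus-concave distinction above can be made concrete with a small worked sketch. It assumes that each unit returns the true value plus independent zero-mean noise, that outputs are fused by plain averaging (so `n` units may each be `n` times noisier than the target), and that fidelity is measured as precision, the inverse of the noise variance. These modelling choices and all names are assumptions for illustration, not the dissertation's exact formulation.

```python
def fused_mse(unit_variance, n_units):
    """MSE of the average of n independent estimates with equal variance."""
    return unit_variance / n_units

def total_cost(cost_fn, target_mse, n_units):
    """Total cost of n identical units whose average meets target_mse.

    Each unit needs precision (1 / variance) of 1 / (n * target_mse).
    cost_fn maps a unit's precision to the cost of running that unit.
    """
    unit_precision = 1.0 / (n_units * target_mse)
    return n_units * cost_fn(unit_precision)

def best_unit_count(cost_fn, target_mse, max_units=50):
    """Number of fused units (1..max_units) that minimises total cost."""
    return min(range(1, max_units + 1),
               key=lambda n: total_cost(cost_fn, target_mse, n))

# Hypothetical cost functions of precision p.
def convex_cost(p):
    return p ** 2 + 1.0   # quadratic in fidelity plus a fixed per-unit overhead

def concave_cost(p):
    return p ** 0.5

def linear_cost(p):
    return 2.0 * p + 1.0  # linear in fidelity plus a fixed per-unit overhead
```

Under these assumptions, `best_unit_count(convex_cost, 0.1)` selects several fused units, whereas the concave and linear costs are minimised by a single, more reliable unit.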
We finally study an application of a partial information extension of the cost-fidelity framework of this dissertation to a stochastic gradient descent problem, where the underlying cost-fidelity function is assumed to be unknown. We present a generic framework for trading off fidelity and cost in computing stochastic gradients when the costs of acquiring stochastic gradients of different quality are not known a priori. We consider a mini-batch oracle that distributes a limited query budget over a number of stochastic gradients and aggregates them to estimate the true gradient. Since the optimal mini-batch size depends on the unknown cost-fidelity function, we propose an algorithm, EE-Grad, that sequentially explores the performance of mini-batch oracles and exploits the accumulated knowledge to estimate the one achieving the best performance in terms of cost efficiency. We provide performance guarantees for EE-Grad with respect to the optimal mini-batch oracle, and illustrate these results in the case of strongly convex objectives.
Timing-Error Tolerance Techniques for Low-Power DSP: Filters and Transforms
Low-power Digital Signal Processing (DSP) circuits are critical to commercial System-on-Chip design for battery powered devices. Dynamic Voltage Scaling (DVS) of digital circuits can reclaim worst-case supply voltage margins for delay variation, reducing power consumption. However, removing static margins without compromising robustness is tremendously challenging, especially in an era of escalating reliability concerns due to continued process scaling. The Razor DVS scheme addresses these concerns by ensuring robustness using explicit timing-error detection and correction circuits. Nonetheless, the design of low-complexity and low-power error correction is often challenging. In this thesis, the Razor framework is applied to fixed-precision DSP filters and transforms. The inherent error tolerance of many DSP algorithms is exploited to achieve very low-overhead error correction. Novel error correction schemes for DSP datapaths are proposed, with very low-overhead circuit realisations. Two new approximate error correction approaches are proposed. The first is based on an adapted sum-of-products form that prevents errors in intermediate results reaching the output, while the second approach forces errors to occur only in less significant bits of each result by shaping the critical path distribution. A third approach is described that achieves exact error correction using time borrowing techniques on critical paths. Unlike previously published approaches, all three proposed approaches are suitable for high clock frequency implementations, as demonstrated with fully placed and routed FIR, FFT and DCT implementations in 90nm and 32nm CMOS. Design issues and theoretical modelling are presented for each approach, along with SPICE simulation results demonstrating power savings of 21–29%. Finally, the design of a baseband transmitter in 32nm CMOS for the Spectrally Efficient FDM (SEFDM) system is presented. SEFDM systems offer bandwidth savings compared to Orthogonal FDM (OFDM), at the cost of increased complexity and power consumption, which is quantified with the first VLSI architecture.
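Why reclaiming supply voltage margin through DVS reduces power can be seen from the usual first-order model of CMOS dynamic power, which is proportional to switched capacitance, the square of the supply voltage, and clock frequency. The sketch below is only that textbook model: it ignores leakage and the overhead of error detection and correction, and its function names and parameters are illustrative assumptions rather than the thesis's simulation setup.

```python
def dynamic_power(c_eff_farads, vdd_volts, freq_hz, activity=1.0):
    """First-order CMOS dynamic power: activity * C * Vdd^2 * f (watts)."""
    return activity * c_eff_farads * vdd_volts ** 2 * freq_hz

def saving_from_voltage_scaling(v_nominal, v_scaled):
    """Fractional dynamic-power saving at a fixed clock frequency."""
    return 1.0 - (v_scaled / v_nominal) ** 2
```

Because of the quadratic dependence on supply voltage, removing a 10% voltage margin at fixed frequency corresponds to a saving of roughly 19% in dynamic power under this model.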
Optimization Techniques for Minimizing Energy Consumption in Approximate Circuits
This work presents different global and local optimization techniques for designing "approximate" circuits that decrease energy consumption, one of the most important criteria in present-day circuit design. The concept of "approximate" circuits, which trade off energy consumption against output quality and thus add a new dimension to the design space, is radically different from the conventional design principle in which all circuits operate correctly all the time. However, efficient and intelligent designs have to be realized to tap its full potential. These techniques, which have not been explored to date, are based on a rigorous mathematical model and aim to improve the output quality of a given circuit while keeping the energy consumption to a minimum. They use the value of information and the architecture of the circuit to maximize efficiency. They have been applied to digital signal processing circuits to realize energy savings of up to 2X relative to conventional designs.
Harnessing noise to enhance robustness vs. efficiency trade-off in machine learning
While deep nets have achieved human-comparable accuracy in various classification tasks, they fall short significantly in terms of the robustness and cost metrics. For example, tiny engineered corruptions in deep net inputs can reduce their accuracy to zero. Furthermore, deep nets also require millions of trainable parameters, resulting in significant training and inference costs. These robustness and cost challenges are well recognized today. In response, there has been a plethora of works focusing on improving either the accuracy vs. robustness trade-off, or the accuracy vs. cost trade-off. However, simultaneous consideration of accuracy, robustness, and cost metrics is largely absent today, in part because far fewer works have explored the robustness vs. cost trade-off. This dissertation aims to fill this gap by focusing explicitly on the robustness vs. cost trade-off in the presence of data noise, as well as hardware noise. Specifically, we explore how to harness the noise in order to enhance this trade-off. We characterize and improve robustness vs. cost trade-offs across diverse problem settings, ranging from beyond-CMOS hardware implementations of machine learning (ML) classifiers to efficient training of deep nets that are robust to multiple types of corruptions in their inputs. This dissertation can be roughly divided into two parts, one focusing on hardware noise and the other on data noise.
In the first part, we start by focusing on harnessing noise in spintronic hardware implementations, where the logic gates become error prone when operated at lower switching energy/delay. We propose techniques to shape the resulting hardware noise distribution and to efficiently compensate for it at the system-level output. As a result, we observe a 1000x improvement in tolerance to gate-level switching error rates, while keeping the area/energy overhead of compensation circuits as low as 15%. These robustness enhancements further enable a 3x reduction in iso-throughput energy consumption of a binary ML classifier employed for EEG-based seizure detection. Building on this work, we propose spintronic channel networks, which exploit the exponential decay of spin current to efficiently realize multi-bit dot product computation. We employ error-prone nanomagnets as efficient stochastic slicers biased by spin currents proportional to the likelihood of the classification decision. We achieve 112x-to-22.5x and 14x-to-2.5x higher energy efficiency over conventional spin-based and 20 nm CMOS designs, respectively, when realizing 10-to-100-dimensional binary classifiers. Furthermore, we also consider the impact of hardware noise originating from process variations and readout circuits in in-memory computing implementations employing non-volatile resistive crossbar arrays. Based on our analysis, we identify design configurations achieving the highest signal-to-noise ratio (SNR), and further estimate how such robustness trades off with the array energy consumption.
In the second part, we switch gears to improve the robustness vs. cost trade-off for deep nets in the presence of data noise. Specifically, we focus on the impact of adversarial perturbations in the deep net inputs. We propose and validate hypotheses about the orientations of dominant subspaces of adversarial perturbations. We demonstrate how changes in the curvature of the decision boundary of the deep nets affect the orientations of the adversarial perturbations. Based on these insights, we demonstrate how shaped noise can be introduced as a feature to enhance the robustness vs. cost trade-off in deep nets. Specifically, we propose shaped noise augmented processing (SNAP), a method to efficiently train deep nets that are robust to multiple types of adversarial perturbations simultaneously. SNAP prepends a deep net with a shaped noise augmentation layer whose distribution is learned along with the network parameters using any established robust training framework. Based on extensive comparisons with nine state-of-the-art (SOTA) robust training frameworks, we show that SNAP achieves the best robustness vs. training cost trade-off. In particular, it enables a 4x reduction in the training cost compared to the SOTA approach published just last year. Furthermore, thanks to the computational simplicity of SNAP, it is the first technique of its kind that is scalable to large datasets, such as ImageNet.