55 research outputs found

    Transmission gate based dual rail logic for differential power analysis resistant circuits

    Get PDF
    Cryptographic devices with hardware implementation of the algorithms are increasingly being used in various applications. As a consequence, there is an increased need for security against the attacks on the cryptographic system. Among various attack techniques, side channel attacks pose a significant threat to the hardware implementation. Power analysis attacks are a type of side channel attack where the power leakage from the underlying hardware is used to eavesdrop on the hardware operation. Wave pipelined differential and dynamic logic (WDDL) has been found to be an effective countermeasure to power analysis. This thesis studies the use of transmission gate based WDDL implementation for the differential and dynamic logic. Although WDDL is an effective defense against power analysis, the number of gates needed for the design of a secure implementation is double the number of gates used for non-secure operations. In this thesis we propose transmission gate based structures for implementation of wave pipelined dynamic and differential logic to minimize the overhead of this defense against power analysis attacks. A transmission gate WDDL design methodology is presented, and the design and analysis of a secure multiplier is given. The adder structures are compared in terms of security effectiveness and silicon area overhead for three cases: unsecured logic implementation, standard gate WDDL, and transmission gate WDDL. In simulation, the transmission gate WDDL design is seen to have similar power consumption results compared to the standard gate WDDL; however, the transmission gate based circuit uses 10-50% fewer gates compared to the static WDDL

    Minimization Of Power Dissipation In Digital Circuits Using Pipelining And A Study Of Clock Gating Technique

    Get PDF
    Power dissipation is one of the major design issues of digital circuits. The power dissipated by a circuit affects its speed and performance. Multiplier is one of the most commonly used circuits in the digital devices. There are various types of multipliers available depending upon the application in which they are used. In the present thesis report, the importance of power dissipation in today\u27s digital technology is discussed and the various types and sources of power dissipation have been elaborated. Different types of multipliers have been designed which vary in their structure and amount of power dissipation. The concept of pipelining is explained and the reduction in the power dissipation of the multipliers after pipelining is experimentally determined. Clock gating is a very important technique used in the design of digital circuits to reduce power dissipation. Various types of clock gating techniques have been presented as a case study. The technology used in the simulation of these circuits is 0.35µm CMOS and the simulator used is SPECTRE S

    Energy efficient hybrid computing systems using spin devices

    Get PDF
    Emerging spin-devices like magnetic tunnel junctions (MTJ\u27s), spin-valves and domain wall magnets (DWM) have opened new avenues for spin-based logic design. This work explored potential computing applications which can exploit such devices for higher energy-efficiency and performance. The proposed applications involve hybrid design schemes, where charge-based devices supplement the spin-devices, to gain large benefits at the system level. As an example, lateral spin valves (LSV) involve switching of nanomagnets using spin-polarized current injection through a metallic channel such as Cu. Such spin-torque based devices possess several interesting properties that can be exploited for ultra-low power computation. Analog characteristic of spin current facilitate non-Boolean computation like majority evaluation that can be used to model a neuron. The magneto-metallic neurons can operate at ultra-low terminal voltage of ∼20mV, thereby resulting in small computation power. Moreover, since nano-magnets inherently act as memory elements, these devices can facilitate integration of logic and memory in interesting ways. The spin based neurons can be integrated with CMOS and other emerging devices leading to different classes of neuromorphic/non-Von-Neumann architectures. The spin-based designs involve `mixed-mode\u27 processing and hence can provide very compact and ultra-low energy solutions for complex computation blocks, both digital as well as analog. Such low-power, hybrid designs can be suitable for various data processing applications like cognitive computing, associative memory, and currentmode on-chip global interconnects. Simulation results for these applications based on device-circuit co-simulation framework predict more than ∼100x improvement in computation energy as compared to state of the art CMOS design, for optimal spin-device parameters

    Baseband analog front-end and digital back-end for reconfigurable multi-standard terminals

    Get PDF
    Multimedia applications are driving wireless network operators to add high-speed data services such as Edge (E-GPRS), WCDMA (UMTS) and WLAN (IEEE 802.11a,b,g) to the existing GSM network. This creates the need for multi-mode cellular handsets that support a wide range of communication standards, each with a different RF frequency, signal bandwidth, modulation scheme etc. This in turn generates several design challenges for the analog and digital building blocks of the physical layer. In addition to the above-mentioned protocols, mobile devices often include Bluetooth, GPS, FM-radio and TV services that can work concurrently with data and voice communication. Multi-mode, multi-band, and multi-standard mobile terminals must satisfy all these different requirements. Sharing and/or switching transceiver building blocks in these handsets is mandatory in order to extend battery life and/or reduce cost. Only adaptive circuits that are able to reconfigure themselves within the handover time can meet the design requirements of a single receiver or transmitter covering all the different standards while ensuring seamless inter-interoperability. This paper presents analog and digital base-band circuits that are able to support GSM (with Edge), WCDMA (UMTS), WLAN and Bluetooth using reconfigurable building blocks. The blocks can trade off power consumption for performance on the fly, depending on the standard to be supported and the required QoS (Quality of Service) leve

    Power efficient resilient microarchitectures for PVT variability mitigation

    Get PDF
    Nowadays, the high power density and the process, voltage, and temperature variations became the most critical issues that limit the performance of the digital integrated circuits because of the continuous scaling of the fabrication technology. Dynamic voltage and frequency scaling technique is used to reduce the power consumption while different time relaxation techniques and error recovery microarchitectures are used to tolerate the process, voltage, and temperature variations. These techniques reduce the throughput by scaling down the frequency or flushing and restarting the errant pipeline. This thesis presents a novel resilient microarchitecture which is called ERSUT-based resilient microarchitecture to tolerate the induced delays generated by the voltage scaling or the process, voltage, and temperature variations. The resilient microarchitecture detects and recovers the induced errors without flushing the pipeline and without scaling down the operating frequency. An ERSUT-based resilient 16 × 16 bit MAC unit, implemented using Global Foundries 65 nm technology and ARM standard cells library, is introduced as a case study with 18.26% area overhead and up to 1.5x speedup. At the typical conditions, the maximum frequency of the conventional MAC unit is about 375 MHz while the resilient MAC unit operates correctly at a frequency up to 565 MHz. In case of variations, the resilient MAC unit tolerates induced delays up to 50% of the clock period while keeping its throughput equal to the conventional MAC unit’s maximum throughput. At 375 MHz, the resilient MAC unit is able to scale down the supply voltage from 1.2 V to 1.0 V saving about 29% of the power consumed by the conventional MAC unit. A double-edge-triggered microarchitecture is also introduced to reduce the power consumption extremely by reducing the frequency of the clock tree to the half while preserving the same maximum throughput. This microarchitecture is applied to different ISCAS’89 benchmark circuits in addition to the 16x16 bit MAC unit and the average power reduction of all these circuits is 63.58% while the average area overhead is 31.02%. All these circuits are designed using Global Foundries 65nm technology and ARM standard cells library. Towards the full automation of the ERSUT-based resilient microarchitecture, an ERSUT-based algorithm is introduced in C++ to accelerate the design process of the ERSUT-based microarchitecture. The developed algorithm reduces the design-time efforts dramatically and allows the ERSUT-based microarchitecture to be adopted by larger industrial designs. Depending on the ERSUT-based algorithm, a validation study about applying the ERSUT-based microarchitecture on the MAC unit and different ISCAS’89 benchmark circuits with different complexity weights is introduced. This study shows that 72% of these circuits tolerates more than 14% of their clock periods and 54.5% of these circuits tolerates more than 20% while 27% of these circuits tolerates more than 30%. Consequently, the validation study proves that the ERSUT-based resilient microarchitecture is a valid applicable solution for different circuits with different complexity weights

    Synthesis Of Self-resetting Stage Logic Pipelines

    Get PDF
    As designers began to pack multi-million transistors onto a single chip, their reliance on a global clocking signal to orchestrate the operations of the chip has started to face almost insurmountable difficulties. As a result, designers started to explore clockless circuits to avoid the global clocking problem. Recently, self-resetting circuits implemented in dynamic logic families have been proposed as viable clockless alternatives. While these circuits can produce excellent performances, they display serious limitations in terms of area cost and power consumption. A middle-of-the-road alternative, which can provide a good performance and avoid the limitations seen in dynamic self-resetting circuits, would be to implement self-resetting behavior in static circuits. This alternative has been introduced recently as Self-Resetting Stage Logic and used to propose three types of clockless pipelines. Experimental studies show that these pipelines have the potential to produce high throughputs with a minimum area overhead if a suitable synthesis methodology is available. This thesis proposes a novel synthesis methodology to design and verify clockless pipelines implemented in SRSL by taking advantage of the maturity of current CAD tools. This methodology formulates the synthesis problem as a combinatorial analytical problem for which a run-time efficient exact solution is difficult to derive. Consequently, a two-phase algorithm is proposed to synthesize these pipelines from gate netlists subject to user-specified constraints. The first phase is a heuristic based on the as-soon-as-possible scheduling strategy in which each gate of the netlist is assigned to a single pipeline stage without violating the period constraint of each pipeline stage. On the other hand, the second phase consists of a heuristic, based on the Kernighan-Lin partitioning strategy, to minimize the number of nets crossing each pair of adjacent pipeline stages. The objective of this optimization is to reduce the number of latches separating pipeline stages since these latches tend to occupy large areas. Experiments conducted on a prototype of the synthesis algorithm reveal that these self-resetting stage logic pipelines can easily reach throughputs higher than 1 GHz. Furthermore, these experiments reveal that the area overhead needed to implement the self-resetting circuitry of these pipelines can be easily amortized over the area of the logic embedded in the pipeline stages. In the overall, the synthesis methods developed for SRSL produce low area overhead pipelines for wide and deep gate netlists while it tends to produce high throughput pipelines for wide and shallow gate netlists. This shows that these pipelines are mostly suitable for coarse-grain datapaths

    Low power digital signal processing

    Get PDF

    A wave pipeline-based WCDMA multipath searcher for a high speed operation

    Get PDF
    The multiplexing technique of the Wideband-Code Division Multiple Access (WCDMA) is widely applied in the third generation (3G) of cellular systems. The WCDMA uses scrambling codes to differentiate the mobile terminals. In a channel, multipaths may occur when the transmitted signal is reflected from objects in the receiver's environment, so that multiple copies of the signal arrive at the antenna at different moments. Thus, a wideband signal may suffer frequency selective fading due to the multipath propagations. A Rake receiver is often used to combine the energies received on different paths, and a multipath searcher is needed to identify the multipath components and their associated delays. Correlating some shifted versions of the scrambling code with an incoming signal results in energy peaks at the multipath locations, when the locally generated scrambling sequence is aligned with the scrambling sequence of the incoming signal. A path acquisition in such a process requires a speed of millions of Multiply-Accumulate (MAC) cycles per second. The performances of the multipath searcher are mainly determined by the resolution and the acquisition time, which are often limited by the operation speed of the hardware resources. This thesis presents the design of a multipath searcher with a high resolution and a short acquisition time. The design consists of two aspects. The first aspect is of the searching algorithm. It is based on a double-dwell algorithm and a verification stage is introduced to lower the rate of false alarms. The second aspect in the design is the circuit of the searcher. This circuit is expected to operate at the chip rate of 3.84 MHz and the search period is chosen to be equal to the time interval of 5 slots, which requires a high operation speed of the computation units employed in the circuit. Moreover, in order to reduce the circuit complexity, only one Complex Multiplier-Accumulator (CMAC), instead of several ones in many existing searcher circuits, is employed to perform all the computation tasks without extending the search period, which make the computation time in the circuit more critical. Aiming at this challenge of the high speed requirement, a structure of the CMAC cell is designed with the technique of the wave pipeline, which permits the signal propagation through the circuit stages without constraints of clocks. For a good use of this technique, the circuit blocks are made to have equalized delay, by means of pass transistor logic cells, and by keeping such a delay short, the total computation time of the CMAC can be made within the required time limit of the searching. A complete circuit of the CMAC has been developed. It has two versions, with the Normal Process Complementary Pass Logic (NPCPL) and the Complementary Pass-Logic Transmission-Gates (CPL-TG), respectively. The structures of the arithmetic units have been chosen carefully so that the fan-in/fan-out constraints of the NPCPL and the CPL-TG logics are taken into consideration. The results of the simulation with a 0.18 om models have shown that this wave pipelined CMAC can process four inputs of 8 bits at a rate of 830 Mb/s. In order to evaluate the effectiveness of the searching algorithm, a Matlab simulation of the searcher circuit has been conducted. It has been observed that the proposed multipath searcher can lead to low probabilities of misdetection and false alarm for the test cases recommended by the 3 rd Generation Partnership Project (3GPP) standard. A test chip of the CMAC circuit has been fabricated in a CMOS 0.18 om technology process. The circuit is currently under test

    REAL-TIME ADAPTIVE PULSE COMPRESSION ON RECONFIGURABLE, SYSTEM-ON-CHIP (SOC) PLATFORMS

    Get PDF
    New radar applications need to perform complex algorithms and process a large quantity of data to generate useful information for the users. This situation has motivated the search for better processing solutions that include low-power high-performance processors, efficient algorithms, and high-speed interfaces. In this work, hardware implementation of adaptive pulse compression algorithms for real-time transceiver optimization is presented, and is based on a System-on-Chip architecture for reconfigurable hardware devices. This study also evaluates the performance of dedicated coprocessors as hardware accelerator units to speed up and improve the computation of computing-intensive tasks such matrix multiplication and matrix inversion, which are essential units to solve the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through high-performance buses, to perform floating-point operations, control the processing blocks, and communicate with an external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band testbed together with a low-cost channel emulator for different types of waveforms

    Gallium arsenide bit-serial integrated circuits

    Get PDF
    corecore