453 research outputs found

    Graphene-based current mode logic circuits: a simulation study for an emerging technology

    Get PDF
    In this paper, the usage of graphene transistors is introduced to be a suitable solution for extending low power designs. Static and current mode logic (CML) styles on both nanoscale graphene and silicon FINFET technologies are compared. Results show that power in CML styles approximately are independent of frequency and the graphene-based CML (G-CML) designs are more power-efficient as the frequency and complexity increase. Compared to silicon-based CML (Si-CML) standard cells, there is 94% reduction in power consumption for G-CML counterparts. Furthermore, a G-CML 4-bit adder respectively offers 8.9 and 1.7 times less power and delay than the Si-CML adder

    Low-power emerging memristive designs towards secure hardware systems for applications in internet of things

    Get PDF
    Emerging memristive devices offer enormous advantages for applications such as non-volatile memories and in-memory computing (IMC), but there is a rising interest in using memristive technologies for security applications in the era of internet of things (IoT). In this review article, for achieving secure hardware systems in IoT, low-power design techniques based on emerging memristive technology for hardware security primitives/systems are presented. By reviewing the state-of-the-art in three highlighted memristive application areas, i.e. memristive non-volatile memory, memristive reconfigurable logic computing and memristive artificial intelligent computing, their application-level impacts on the novel implementations of secret key generation, crypto functions and machine learning attacks are explored, respectively. For the low-power security applications in IoT, it is essential to understand how to best realize cryptographic circuitry using memristive circuitries, and to assess the implications of memristive crypto implementations on security and to develop novel computing paradigms that will enhance their security. This review article aims to help researchers to explore security solutions, to analyze new possible threats and to develop corresponding protections for the secure hardware systems based on low-cost memristive circuit designs

    A Structured Design Methodology for High Performance VLSI Arrays

    Get PDF
    abstract: The geometric growth in the integrated circuit technology due to transistor scaling also with system-on-chip design strategy, the complexity of the integrated circuit has increased manifold. Short time to market with high reliability and performance is one of the most competitive challenges. Both custom and ASIC design methodologies have evolved over the time to cope with this but the high manual labor in custom and statistic design in ASIC are still causes of concern. This work proposes a new circuit design strategy that focuses mostly on arrayed structures like TLB, RF, Cache, IPCAM etc. that reduces the manual effort to a great extent and also makes the design regular, repetitive still achieving high performance. The method proposes making the complete design custom schematic but using the standard cells. This requires adding some custom cells to the already exhaustive library to optimize the design for performance. Once schematic is finalized, the designer places these standard cells in a spreadsheet, placing closely the cells in the critical paths. A Perl script then generates Cadence Encounter compatible placement file. The design is then routed in Encounter. Since designer is the best judge of the circuit architecture, placement by the designer will allow achieve most optimal design. Several designs like IPCAM, issue logic, TLB, RF and Cache designs were carried out and the performance were compared against the fully custom and ASIC flow. The TLB, RF and Cache were the part of the HEMES microprocessor.Dissertation/ThesisPh.D. Electrical Engineering 201

    Low energy digital circuits in advanced nanometer technologies

    Get PDF
    The demand for portable devices and the continuing trend towards the Internet ofThings (IoT) have made of energy consumption one of the main concerns in the industry and researchers. The most efficient way of reducing the energy consump-tion of digital circuits is decreasing the supply voltage (Vdd) since the dynamicenergy quadratically depends onVdd. Several works have shown that an optimumsupply voltage exists that minimizes the energy consumption of digital circuits. This optimum supply voltage is usually around 200 mV and 400 mV dependingon the circuit and technology used. To obtain these low supply voltages, on-chipdc-dc converters with high efficiency are needed.This thesis focuses on the study of subthreshold digital systems in advancednanometer technologies. These systems usually can be divided into a Power Man-agement Unit (PMU) and a digital circuit operating at the subthreshold regime.In particular, while considering the PMU, one of the key circuits is the dc-dcconverter. This block converts the voltage from the power source (battery, supercapacitor or wireless power transfer link) to a voltage between 200 mV and 400mV in order to power the digital circuit. In this thesis, we developed two chargerecycling techniques in order to improve the efficiency of switched capacitors dc-dcconverters. The first one is based on a technique used in adiabatic circuits calledstepwise charging. This technique was used in circuits and applications wherethe switching consumption of a big capacitance is very important. We analyzedthe possibility of using this technique in switched capacitor dc-dc converters withintegrated capacitors. We showed through measurements that a 29% reductionin the gate drive losses can be obtained with this technique. The second one isa simplification of stepwise charging which can be applied in some architecturesof switched capacitors dc-dc converters. We also fabricated and tested a dc-dcconverter with this technique and obtained a 25% energy reduction in the drivingof the switches that implement the converter.Furthermore, we studied the digital circuit working in the subthreshold regime,in particular, operating at the minimum energy point. We studied different modelsfor circuits working in these conditions and improved them by considering thedifferences between the NMOS and PMOS transistors. We obtained an optimumNMOS/PMOS leakage current imbalance that minimizes the total leakage energy per operation. This optimum depends on the architecture of the digital circuitand the input data. However, we also showed that important energy reductionscan be obtained by operating at a mean optimum imbalance. We proposed two techniques to achieve the optimum imbalance. We used aFully Depleted Silicon on Insulator (FD-SOI) 28 nm technology for most of the simulations, but we also show that these techniques can be applied in traditionalbulk CMOS technologies. The first one consists in using the back plane voltage of the transistors (or bulk voltage in traditional CMOS) to adjust independently theleakage current of the NMOS and PMOS transistor to work under the optimum NMOS/PMOS leakage current imbalance. We called this approach the OptimumBack Plane Biasing (OBB). A second technique consists of using the length of the transistors to adjust this leakage current imbalance. In the subthreshold regimeand in advanced nanometer technologies a moderate increase in the length has little impact in the output capacitance of the gates and thus in the dynamic energy.We called this approach an Asymmetric Length Biasing (ALB). Finally, we use these techniques in some basic circuits such as adders. We show that around 50% energy reduction can be obtained, in a wide range of frequency while working near the minimum energy point and using these techniques. The main contributions of this thesis are: • Analysis of the stepwise charging technique in small capacitances. •Implementation of stepwise charging technique as a charge recycling tech-nique for efficiency improvement in switched capacitor dc-dc converters. • Development of a charge sharing technique for efficiency improvement inswitched capacitor dc-dc converters. • Analysis of minimum operating voltage of digital circuits due to intrinsicnoise and the impact of technology scaling in this minimum. • Improvement in the modeling of the minimum energy point while considering NMOS and PMOS transistors difference. • Demonstration of the existence of an optimum leakage current imbalance be-tween the NMOS and PMOS transistors that minimizes energy consumptionin the subthreshold regiion. • Development of a back plane (bulk) voltage strategy for working in this optimum.• Development of a sizing strategy for working in the aforementioned optimum. • Analysis of the impact of architecture and input data on the optimum im-balance. The thesis is based on the publications [1–8]. During the Ph.D. program, other publications were generated [9–16] that are partially related with the thesis butwere not included in it.La constante demanda de dispositivos portables y los avances hacia la Internet de las Cosas han hecho del consumo de energía uno de los mayores desafíos y preocupación en la industria y la academia. La forma más eficiente de reducir el consumo de energía de los circuitos digitales es reduciendo su voltaje de alimentación ya que la energía dinámica depende de manera cuadrática con dicho voltaje. Varios trabajos demostraron que existe un voltaje de alimentación óptimo, que minimiza la energía consumida para realizar cierta operación en un circuito digital, llamado punto de mínima energía. Este óptimo voltaje se encuentra usualmente entre 200 mV y 400 mV dependiendo del circuito y de la tecnología utilizada. Para obtener estos voltajes de alimentación de la fuente de energía, se necesitan conversores dc-dc integrados con alta eficiencia. Esta tesis se concentra en el estudio de sistemas digitales trabajando en la región sub umbral diseñados en tecnologías nanométricas avanzadas (28 nm). Estos sistemas se pueden dividir usualmente en dos bloques, uno llamado bloque de manejo de potencia, y el segundo, el circuito digital operando en la region sub umbral. En particular, en lo que corresponde al bloque de manejo de potencia, el circuito más crítico es en general el conversor dc-dc. Este circuito convierte el voltaje de una batería (o super capacitor o enlace de transferencia inalámbrica de energía o unidad de cosechado de energía) en un voltaje entre 200 mV y 400 mV para alimentar el circuito digital en su voltaje óptimo. En esta tesis desarrollamos dos técnicas que, mediante el reciclado de carga, mejoran la eficiencia de los conversores dc-dc a capacitores conmutados. La primera es basada en una técnica utilizada en circuitos adiabáticos que se llama carga gradual o a pasos. Esta técnica se ha utilizado en circuitos y aplicaciones en donde el consumo por la carga y descarga de una capacidad grande es dominante. Nosotros analizamos la posibilidad de utilizar esta técnica en conversores dc-dc a capacitores conmutados con capacitores integrados. Se demostró a través de medidas que se puede reducir en un 29% el consumo debido al encendido y apagado de las llaves que implementan el conversor dc-dc. La segunda técnica, es una simplificación de la primera, la cual puede ser aplicada en ciertas arquitecturas de conversores dc-dc a capacitores conmutados. También se fabricó y midió un conversor con esta técnica y se obtuvo una reducción del 25% en la energía consumida por el manejo de las llaves del conversor. Por otro lado, estudiamos los circuitos digitales operando en la región sub umbral y en particular cerca del punto de mínima energía. Estudiamos diferentes modelos para circuitos operando en estas condiciones y los mejoramos considerando las diferencias entre los transistores NMOS y PMOS. Mediante este modelo demostramos que existe un óptimo en la relación entre las corrientes de fuga de ambos transistores que minimiza la energía de fuga consumida por operación. Este óptimo depende de la arquitectura del circuito digital y ademas de los datos de entrada del circuito. Sin embargo, demostramos que se puede reducir el consumo de manera considerable al operar en un óptimo promedio. Propusimos dos técnicas para alcanzar la relación óptima. Utilizamos una tecnología FD-SOI de 28nm para la mayoría de las simulaciones, pero también mostramos que estas técnicas pueden ser utilizadas en tecnologías bulk convencionales. La primer técnica, consiste en utilizar el voltaje de la puerta trasera (o sustrato en CMOS convencional) para ajustar de manera independiente las corrientes del NMOS y PMOS para que el circuito trabaje en el óptimo de la relación de corrientes. Esta técnica la llamamos polarización de voltaje de puerta trasera óptimo. La segunda técnica, consiste en utilizar los largos de los transistores para ajustar las corrientes de fugas de cada transistor y obtener la relación óptima. Trabajando en la región sub umbral y en tecnologías avanzadas, incrementar moderadamente el largo del transistor tiene poco impacto en la energía dinámica y es por eso que se puede utilizar. Finalmente, utilizamos estas técnicas en circuitos básicos como sumadores y mostramos que se puede obtener una reducción de la energía consumida de aproximadamente 50%, en un amplio rango de frecuencias, mientras estos circuitos trabajan cerca del punto de energía mínima. Las principales contribuciones de la tesis son: • Análisis de la técnica de carga gradual o a pasos en capacidades pequeñas. • Implementación de la técnica de carga gradual para la mejora de eficiencia de conversores dc-dc a capacitores conmutados. • Simplificación de la técnica de carga gradual para mejora de la eficiencia en algunas arquitecturas de conversores dc-dc de capacitores conmutados. • Análisis del mínimo voltaje de operación en circuitos digitales debido al ruido intrínseco del dispositivo y el impacto del escalado de las tecnologías en el mismo. • Mejoras en el modelado del punto de energía mínima de operación de un circuito digital en el cual se consideran las diferencias entre el transistor PMOS y NMOS. • Demostración de la existencia de un óptimo en la relación entre las corrientes de fuga entre el NMOS y PMOS que minimiza la energía de fugas consumida en la región sub umbral. • Desarrollo de una estrategia de polarización del voltaje de puerta trasera para que el circuito digital trabaje en el óptimo antes mencionado. • Desarrollo de una estrategia para el dimensionado de los transistores que componen las compuertas digitales que permite al circuito digital operar en el óptimo antes mencionado. • Análisis del impacto de la arquitectura del circuito y de los datos de entrada del mismo en el óptimo antes mencionado

    Digital and analog TFET circuits: Design and benchmark

    Get PDF
    In this work, we investigate by means of simulations the performance of basic digital, analog, and mixed-signal circuits employing tunnel-FETs (TFETs). The analysis reviews and complements our previous papers on these topics. By considering the same devices for all the analysis, we are able to draw consistent conclusions for a wide variety of circuits. A virtual complementary TFET technology consisting of III-V heterojunction nanowires is considered. Technology Computer Aided Design (TCAD) models are calibrated against the results of advanced full-quantum simulation tools and then used to generate look-up-tables suited for circuit simulations. The virtual complementary TFET technology is benchmarked against predictive technology models (PTM) of complementary silicon FinFETs for the 10 nm node over a wide range of supply voltages (VDD) in the sub-threshold voltage domain considering the same footprint between the vertical TFETs and the lateral FinFETs and the same static power. In spite of the asymmetry between p- and n-type transistors, the results show clear advantages of TFET technology over FinFET for VDDlower than 0.4 V. Moreover, we highlight how differences in the I-V characteristics of FinFETs and TFETs suggest to adapt the circuit topologies used to implement basic digital and analog blocks with respect to the most common CMOS solutions

    Digital and analog TFET circuits: Design and benchmark

    Get PDF
    In this work, we investigate by means of simulations the performance of basic digital, analog, and mixed-signal circuits employing tunnel-FETs (TFETs). The analysis reviews and complements our previous papers on these topics. By considering the same devices for all the analysis, we are able to draw consistent conclusions for a wide variety of circuits. A virtual complementary TFET technology consisting of III-V heterojunction nanowires is considered. Technology Computer Aided Design (TCAD) models are calibrated against the results of advanced full-quantum simulation tools and then used to generate look-up-tables suited for circuit simulations. The virtual complementary TFET technology is benchmarked against predictive technology models (PTM) of complementary silicon FinFETs for the 10 nm node over a wide range of supply voltages (VDD) in the sub-threshold voltage domain considering the same footprint between the vertical TFETs and the lateral FinFETs and the same static power. In spite of the asymmetry between p- and n-type transistors, the results show clear advantages of TFET technology over FinFET for VDDlower than 0.4 V. Moreover, we highlight how differences in the I-V characteristics of FinFETs and TFETs suggest to adapt the circuit topologies used to implement basic digital and analog blocks with respect to the most common CMOS solutions

    Doctor of Philosophy

    Get PDF
    dissertationAdvancements in process technology and circuit techniques have enabled the creation of small chemical microsystems for use in a wide variety of biomedical and sensing applications. For applications requiring a small microsystem, many components can be integrated onto a single chip. This dissertation presents many low-power circuits, digital and analog, integrated onto a single chip called the Utah Microcontroller. To guide the design decisions for each of these components, two specific microsystems have been selected as target applications: a Smart Intravaginal Ring (S-IVR) and an NO releasing catheter. Both of these applications share the challenging requirements of integrating a large variety of low-power mixed-signal circuitry onto a single chip. These applications represent the requirements of a broad variety of small low-power sensing systems. In the course of the development of the Utah Microcontroller, several unique and significant contributions were made. A central component of the Utah Microcontroller is the WIMS Microprocessor, which incorporates a low-power feature called a scratchpad memory. For the first time, an analysis of scaling trends projected that scratchpad memories will continue to save power for the foreseeable future. This conclusion was bolstered by measured data from a fabricated microcontroller. In a 32 nm version of the WIMS Microprocessor, the scratchpad memory is projected to save ~10-30% of memory access energy depending upon the characteristics of the embedded program. Close examination of application requirements informed the design of an analog-to-digital converter, and a unique single-opamp buffered charge scaling DAC was developed to minimize power consumption. The opamp was designed to simultaneously meet the varied demands of many chip components to maximize circuit reuse. Each of these components are functional, have been integrated, fabricated, and tested. This dissertation successfully demonstrates that the needs of emerging small low-power microsystems can be met in advanced process nodes with the incorporation of low-power circuit techniques and design choices driven by application requirements

    ENERGY-EFFICIENT AND SECURE HARDWARE FOR INTERNET OF THINGS (IoT) DEVICES

    Get PDF
    Internet of Things (IoT) is a network of devices that are connected through the Internet to exchange the data for intelligent applications. Though IoT devices provide several advantages to improve the quality of life, they also present challenges related to security. The security issues related to IoT devices include leakage of information through Differential Power Analysis (DPA) based side channel attacks, authentication, piracy, etc. DPA is a type of side-channel attack where the attacker monitors the power consumption of the device to guess the secret key stored in it. There are several countermeasures to overcome DPA attacks. However, most of the existing countermeasures consume high power which makes them not suitable to implement in power constraint devices. IoT devices are battery operated, hence it is important to investigate the methods to design energy-efficient and secure IoT devices not susceptible to DPA attacks. In this research, we have explored the usefulness of a novel computing platform called adiabatic logic, low-leakage FinFET devices and Magnetic Tunnel Junction (MTJ) Logic-in-Memory (LiM) architecture to design energy-efficient and DPA secure hardware. Further, we have also explored the usefulness of adiabatic logic in the design of energy-efficient and reliable Physically Unclonable Function (PUF) circuits to overcome the authentication and piracy issues in IoT devices. Adiabatic logic is a low-power circuit design technique to design energy-efficient hardware. Adiabatic logic has reduced dynamic switching energy loss due to the recycling of charge to the power clock. As the first contribution of this dissertation, we have proposed a novel DPA-resistant adiabatic logic family called Energy-Efficient Secure Positive Feedback Adiabatic Logic (EE-SPFAL). EE-SPFAL based circuits are energy-efficient compared to the conventional CMOS based design because of recycling the charge after every clock cycle. Further, EE-SPFAL based circuits consume uniform power irrespective of input data transition which makes them resilience against DPA attacks. Scaling of CMOS transistors have served the industry for more than 50 years in providing integrated circuits that are denser, and cheaper along with its high performance, and low power. However, scaling of the transistors leads to increase in leakage current. Increase in leakage current reduces the energy-efficiency of the computing circuits,and increases their vulnerability to DPA attack. Hence, it is important to investigate the crypto circuits in low leakage devices such as FinFET to make them energy-efficient and DPA resistant. In this dissertation, we have proposed a novel FinFET based Secure Adiabatic Logic (FinSAL) family. FinSAL based designs utilize the low-leakage FinFET device along with adiabatic logic principles to improve energy-efficiency along with its resistance against DPA attack. Recently, Magnetic Tunnel Junction (MTJ)/CMOS based Logic-in-Memory (LiM) circuits have been explored to design low-power non-volatile hardware. Some of the advantages of MTJ device include non-volatility, near-zero leakage power, high integration density and easy compatibility with CMOS devices. However, the differences in power consumption between the switching of MTJ devices increase the vulnerability of Differential Power Analysis (DPA) based side-channel attack. Further, the MTJ/CMOS hybrid logic circuits which require frequent switching of MTJs are not very energy-efficient due to the significant energy required to switch the MTJ devices. In the third contribution of this dissertation, we have investigated a novel approach of building cryptographic hardware in MTJ/CMOS circuits using Look-Up Table (LUT) based method where the data stored in MTJs are constant during the entire encryption/decryption operation. Currently, high supply voltage is required in both writing and sensing operations of hybrid MTJ/CMOS based LiM circuits which consumes a considerable amount of energy. In order to meet the power budget in low-power devices, it is important to investigate the novel design techniques to design ultra-low-power MTJ/CMOS circuits. In the fourth contribution of this dissertation, we have proposed a novel energy-efficient Secure MTJ/CMOS Logic (SMCL) family. The proposed SMCL logic family consumes uniform power irrespective of data transition in MTJ and more energy-efficient compared to the state-of-art MTJ/ CMOS designs by using charge sharing technique. The other important contribution of this dissertation is the design of reliable Physical Unclonable Function (PUF). Physically Unclonable Function (PUF) are circuits which are used to generate secret keys to avoid the piracy and device authentication problems. However, existing PUFs consume high power and they suffer from the problem of generating unreliable bits. This dissertation have addressed this issue in PUFs by designing a novel adiabatic logic based PUF. The time ramp voltages in adiabatic PUF is utilized to improve the reliability of the PUF along with its energy-efficiency. Reliability of the adiabatic logic based PUF proposed in this dissertation is tested through simulation based temperature variations and supply voltage variations

    Energy-efficient embedded machine learning algorithms for smart sensing systems

    Get PDF
    Embedded autonomous electronic systems are required in numerous application domains such as Internet of Things (IoT), wearable devices, and biomedical systems. Embedded electronic systems usually host sensors, and each sensor hosts multiple input channels (e.g., tactile, vision), tightly coupled to the electronic computing unit (ECU). The ECU extracts information by often employing sophisticated methods, e.g., Machine Learning. However, embedding Machine Learning algorithms poses essential challenges in terms of hardware resources and energy consumption because of: 1) the high amount of data to be processed; 2) computationally demanding methods. Leveraging on the trade-off between quality requirements versus computational complexity and time latency could reduce the system complexity without affecting the performance. The objectives of the thesis are to develop: 1) energy-efficient arithmetic circuits outperforming state of the art solutions for embedded machine learning algorithms, 2) an energy-efficient embedded electronic system for the \u201celectronic-skin\u201d (e-skin) application. As such, this thesis exploits two main approaches: Approximate Computing: In recent years, the approximate computing paradigm became a significant major field of research since it is able to enhance the energy efficiency and performance of digital systems. \u201cApproximate Computing\u201d(AC) turned out to be a practical approach to trade accuracy for better power, latency, and size . AC targets error-resilient applications and offers promising benefits by conserving some resources. Usually, approximate results are acceptable for many applications, e.g., tactile data processing,image processing , and data mining ; thus, it is highly recommended to take advantage of energy reduction with minimal variation in performance . In our work, we developed two approximate multipliers: 1) the first one is called \u201cMETA\u201d multiplier and is based on the Error Tolerant Adder (ETA), 2) the second one is called \u201cApproximate Baugh-Wooley(BW)\u201d multiplier where the approximations are implemented in the generation of the partial products. We showed that the proposed approximate arithmetic circuits could achieve a relevant reduction in power consumption and time delay around 80.4% and 24%, respectively, with respect to the exact BW multiplier. Next, to prove the feasibility of AC in real world applications, we explored the approximate multipliers on a case study as the e-skin application. The e-skin application is defined as multiple sensing components, including 1) structural materials, 2) signal processing, 3) data acquisition, and 4) data processing. Particularly, processing the originated data from the e-skin into low or high-level information is the main problem to be addressed by the embedded electronic system. Many studies have shown that Machine Learning is a promising approach in processing tactile data when classifying input touch modalities. In our work, we proposed a methodology for evaluating the behavior of the system when introducing approximate arithmetic circuits in the main stages (i.e., signal and data processing stages) of the system. Based on the proposed methodology, we first implemented the approximate multipliers on the low-pass Finite Impulse Response (FIR) filter in the signal processing stage of the application. We noticed that the FIR filter based on (Approx-BW) outperforms state of the art solutions, while respecting the tradeoff between accuracy and power consumption, with an SNR degradation of 1.39dB. Second, we implemented approximate adders and multipliers respectively into the Coordinate Rotational Digital Computer (CORDIC) and the Singular Value Decomposition (SVD) circuits; since CORDIC and SVD take a significant part of the computationally expensive Machine Learning algorithms employed in tactile data processing. We showed benefits of up to 21% and 19% in power reduction at the cost of less than 5% accuracy loss for CORDIC and SVD circuits when scaling the number of approximated bits. 2) Parallel Computing Platforms (PCP): Exploiting parallel architectures for near-threshold computing based on multi-core clusters is a promising approach to improve the performance of smart sensing systems. In our work, we exploited a novel computing platform embedding a Parallel Ultra Low Power processor (PULP), called \u201cMr. Wolf,\u201d for the implementation of Machine Learning (ML) algorithms for touch modalities classification. First, we tested the ML algorithms at the software level; for RGB images as a case study and tactile dataset, we achieved accuracy respectively equal to 97% and 83.5%. After validating the effectiveness of the ML algorithm at the software level, we performed the on-board classification of two touch modalities, demonstrating the promising use of Mr. Wolf for smart sensing systems. Moreover, we proposed a memory management strategy for storing the needed amount of trained tensors (i.e., 50 trained tensors for each class) in the on-chip memory. We evaluated the execution cycles for Mr. Wolf using a single core, 2 cores, and 3 cores, taking advantage of the benefits of the parallelization. We presented a comparison with the popular low power ARM Cortex-M4F microcontroller employed, usually for battery-operated devices. We showed that the ML algorithm on the proposed platform runs 3.7 times faster than ARM Cortex M4F (STM32F40), consuming only 28 mW. The proposed platform achieves 15 7 better energy efficiency than the classification done on the STM32F40, consuming 81mJ per classification and 150 pJ per operation

    Approximate 32-Bit Floating-Point Unit Design with 53% Power-Area Product Reduction

    Get PDF
    The floating-point unit is one of the most common building block in any computing system and is used for a huge number of applications. By combining two state-of-the-art techniques of imprecise hardware, namely Gate-Level Pruning and Inexact Speculative Adder, and by introducing a novel Inexact Speculative Multiplier architecture, three different approximate FPUs and one reference IEEE-754 compliant FPU have been integrated in a 65 nm CMOS process within a low-power multi-core processor. Silicon measurements show up to 27% power, 36% area and 53%power-area product savings compared to the IEEE-754 single-precision FPU. Accuracy loss has been evaluated with a high-dynamic-range image tone-mapping algorithm, resulting in small but non-visible errors with image PSNR value of 90 dB
    • …
    corecore