146 research outputs found
Techniques for Aging, Soft Errors and Temperature to Increase the Reliability of Embedded On-Chip Systems
This thesis investigates the challenge of providing an abstracted yet sufficiently accurate reliability estimation for embedded on-chip systems. In addition, it proposes new techniques to protect the register files within processors against aging effects and soft errors, thereby increasing their reliability. It also introduces a novel thermal measurement setup that clearly captures infrared images of modern multi-core processors.
Evaluation of Features Extraction and Classification Techniques for Offline Handwritten Tifinagh Recognition
This paper presents a review of different feature extraction and classification methods for offline handwritten Amazigh character (Tifinagh) recognition. The feature extraction methods are discussed in terms of statistical, structural, global-transformation, and moment-based approaches. Although a number of techniques are available for feature extraction and classification, the choice of technique largely determines the recognition accuracy. A series of experiments was performed on the AMHCD database to evaluate the effectiveness of different feature extraction techniques combined with Hidden Markov Model, neural network, and Support Vector Machine classifiers. The statistical techniques give encouraging results.
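As a concrete illustration of the statistical family of features surveyed above, the sketch below computes zoning-based pixel densities. The zone grid size and the toy glyph are illustrative assumptions, not details taken from the paper.

```python
def zoning_features(image, zones=(2, 2)):
    """Split a binary image into a grid of zones and return the
    foreground-pixel density of each zone as a feature vector."""
    rows, cols = len(image), len(image[0])
    zr, zc = zones
    h, w = rows // zr, cols // zc
    features = []
    for i in range(zr):
        for j in range(zc):
            block = [image[r][c]
                     for r in range(i * h, (i + 1) * h)
                     for c in range(j * w, (j + 1) * w)]
            features.append(sum(block) / len(block))
    return features

# Example: a 4x4 glyph with ink only in the top-left quadrant.
img = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(zoning_features(img))  # -> [1.0, 0.0, 0.0, 0.0]
```

The resulting low-dimensional vector is what a classifier such as an SVM or HMM would consume.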
Design automation of approximate circuits with runtime reconfigurable accuracy
Leveraging the inherent error tolerance of a vast and rapidly growing number of application domains, approximate computing arises as a design alternative that improves the efficiency of our computing systems by trading accuracy for energy savings. However, the required computational accuracy is not fixed: controlling the applied level of approximation dynamically at runtime is key to effectively optimizing energy while still containing and bounding the induced errors. In this paper, we propose and implement an automatic, circuit-independent design framework that generates approximate circuits with dynamically reconfigurable accuracy at runtime. The generated circuits feature varying accuracy levels, also supporting fully accurate execution. Extensive experimental evaluation, using an industry-strength flow and circuits, demonstrates that our generated approximate circuits reduce energy by up to 41% under a 2% error bound and by 17.5% on average under a pessimistic scenario that assumes a full-accuracy requirement for 33% of the runtime. To further demonstrate the efficiency of our framework, we considered two state-of-the-art technology libraries: a conventional 7nm FinFET and an emerging technology that boosts performance at the high cost of increased dynamic power.
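To make the idea of runtime-reconfigurable accuracy concrete, here is a minimal software sketch of one classic approximation knob, operand truncation, where the number of dropped low-order bits is selectable at call time. This is a generic illustration of accuracy/effort trading, not the circuit transformations the framework actually applies.

```python
def approx_add(a, b, dropped_bits=0):
    """Add two unsigned integers while ignoring the lowest
    `dropped_bits` bits of each operand (truncation-based
    approximation). dropped_bits=0 gives the exact sum, so the
    same 'circuit' also supports fully accurate execution."""
    mask = ~((1 << dropped_bits) - 1)
    return (a & mask) + (b & mask)

print(approx_add(13, 7, dropped_bits=0))  # exact mode -> 20
print(approx_add(13, 7, dropped_bits=2))  # approximate mode -> 16
```

In hardware, the analogous mechanism gates off the low-order adder stages, saving switching energy while bounding the error to the dropped bit positions.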
Brain-Inspired Hyperdimensional Computing: How Thermal-Friendly for Edge Computing?
Brain-inspired hyperdimensional computing (HDC) is an emerging machine learning (ML) method. It is based on large vectors of binary or bipolar symbols and a few simple mathematical operations. The promise of HDC is a highly efficient implementation for embedded systems like wearables. While fast implementations have been presented, other constraints have not been considered for edge computing. In this work, we aim to answer how thermal-friendly HDC is for edge computing. Devices like smartwatches, smart glasses, or even mobile systems have a restrictive cooling budget due to their limited volume. Although HDC operations are simple, the vectors are large, resulting in a high number of CPU operations and thus a heavy load on the entire system, potentially causing temperature violations. In this work, the impact of HDC on the chip's temperature is investigated for the first time. We measure the temperature and power consumption of a commercial embedded system and compare HDC with a conventional CNN. We reveal that HDC causes up to 6.8°C higher temperatures and leads to up to 47% more CPU throttling. Even when both HDC and the CNN aim for the same throughput (i.e., perform a similar number of classifications per second), HDC still causes higher on-chip temperatures due to its larger power consumption.
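The "large vectors, simple operations" character of HDC that drives its CPU load can be seen in a minimal sketch of bipolar hypervector classification: class prototypes are built by elementwise majority (bundling), and queries are classified by cosine-like similarity. The dimensionality and the random vectors are illustrative; real deployments typically use ~10,000 dimensions, which is exactly why the operation count is so high.

```python
import random

random.seed(0)
D = 1000  # hypervector dimensionality (illustrative; often ~10k in practice)

def rand_hv():
    """Random bipolar hypervector of -1/+1 symbols."""
    return [random.choice((-1, 1)) for _ in range(D)]

def bundle(vectors):
    """Elementwise majority (sign of the sum) combines samples
    into a single class prototype."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*vectors)]

def similarity(u, v):
    """Normalized dot product: 1.0 = identical, ~0.0 = unrelated."""
    return sum(x * y for x, y in zip(u, v)) / D

a, b, c = rand_hv(), rand_hv(), rand_hv()
proto = bundle([a, b])          # prototype of the class containing a and b
print(similarity(proto, a) > similarity(proto, c))  # member is closer -> True
```

Every classification touches all D elements of every prototype, so even though each operation is a trivial multiply-accumulate, the aggregate CPU load (and hence heat) grows with D and the number of classes.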
Energy Optimization in NCFET-based Processors
Energy consumption is a key optimization goal for all modern processors. Negative Capacitance Field-Effect Transistors (NCFETs) are a leading emerging technology that promises outstanding performance in addition to better energy efficiency. Thickness of the additional ferroelectric layer, frequency, and voltage are the key parameters in NCFET technology that impact the power and frequency of processors. However, their joint impact on energy optimization has not been investigated yet. In this work, we are the first to demonstrate that conventional (i.e., NCFET-unaware) dynamic voltage/frequency scaling (DVFS) techniques to minimize energy are sub-optimal when applied to NCFET-based processors. We further demonstrate that state-of-the-art NCFET-aware voltage scaling for power minimization is also sub-optimal when it comes to energy. This work provides the first NCFET-aware DVFS technique that optimizes the processor's energy through optimal runtime frequency/voltage selection. In NCFETs, energy-optimal frequency and voltage are dependent on the workload and technology parameters. Our NCFET-aware DVFS technique considers these effects to perform optimal voltage/frequency selection at runtime depending on workload characteristics. Results show up to 90% energy savings compared to conventional DVFS techniques. Compared to state-of-the-art NCFET-aware power management, our technique provides up to 72% energy savings along with 3.7x higher performance.
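The core DVFS trade-off the abstract describes (running faster costs more dynamic power but spends less time paying leakage) can be sketched as a table-driven operating-point selection. Note this uses a conventional CMOS energy model with assumed numbers; it omits the ferroelectric-thickness effects that make NCFET-aware selection differ from this baseline.

```python
C_EFF = 1e-9  # effective switched capacitance in farads (assumed value)

def energy(point, cycles):
    """Total energy to run `cycles` at operating point (f_hz, volt, leak_w)."""
    f, v, leak = point
    t = cycles / f               # execution time in seconds
    p_dyn = C_EFF * v * v * f    # dynamic power ~ C * V^2 * f
    return (p_dyn + leak) * t

def pick_energy_optimal(op_points, cycles):
    """Choose the operating point minimizing total energy for the workload."""
    return min(op_points, key=lambda p: energy(p, cycles))

points = [(1.0e9, 1.0, 0.5),   # 1 GHz @ 1.0 V, 0.5 W leakage
          (2.0e9, 1.2, 0.5)]   # 2 GHz @ 1.2 V, 0.5 W leakage
print(pick_energy_optimal(points, 1e9))  # -> (1000000000.0, 1.0, 0.5)
```

With these numbers the 1 GHz point wins (1.5 J vs. ~1.69 J); raise the leakage term and the balance tips toward the faster point, which is exactly the workload- and technology-dependence the paper's NCFET-aware technique exploits at runtime.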
Compact and High-Performance TCAM Based on Scaled Double-Gate FeFETs
Ternary content addressable memory (TCAM), widely used in network routers and high-associativity caches, is gaining popularity in machine learning and data-analytic applications. Ferroelectric FETs (FeFETs) are a promising candidate for implementing TCAM owing to their high ON/OFF ratio, non-volatility, and CMOS compatibility. However, conventional single-gate FeFETs (SG-FeFETs) suffer from relatively high write voltage, low endurance, and potential read disturbance, and face scaling challenges. Recently, a double-gate FeFET (DG-FeFET) has been proposed that outperforms SG-FeFETs in many aspects. This paper investigates TCAM design challenges specific to DG-FeFETs and introduces a novel 1.5T1Fe TCAM design based on DG-FeFETs. A 2-step search with early termination is employed to reduce the cell area and improve energy efficiency. A shared driver design is proposed to reduce the peripherals area. Detailed analysis and SPICE simulation show that the 1.5T1Fe DG-TCAM leads to superior search speed and energy efficiency. The 1.5T1Fe TCAM design can also be built with SG-FeFETs, achieving search latency and energy improvements compared with 2FeFET TCAM.
Comment: Accepted by Design Automation Conference (DAC) 202
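A TCAM's ternary match and the 2-step early-termination search can be modeled in a few lines of software. The split point and the 8-bit entries below are illustrative assumptions; the real design partitions the cell array in hardware.

```python
def tcam_match(stored, query):
    """Ternary match: each stored cell is '0', '1', or 'X' (don't care).
    The entry matches if every non-X cell equals the query bit."""
    return all(s in ('X', q) for s, q in zip(stored, query))

def two_step_search(table, query, split=4):
    """Step 1 matches only the first `split` bits of every entry;
    step 2 finishes only the survivors -- entries eliminated early
    never activate their remaining match circuitry."""
    survivors = [e for e in table if tcam_match(e[:split], query[:split])]
    return [e for e in survivors if tcam_match(e[split:], query[split:])]

table = ["10XX1010", "11XX0000", "0XXXXXXX"]
print(two_step_search(table, "10111010"))  # -> ['10XX1010']
```

In hardware the benefit comes from the early termination: most entries mismatch within the first few bits, so the bulk of each match line never needs to be evaluated, saving search energy.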
Unlocking efficiency in BNNs: global by local thresholding for analog-based HW accelerators
For accelerating Binarized Neural Networks (BNNs), analog computing-based crossbar accelerators, utilizing XNOR gates and additional interface circuits, have been proposed. Such accelerators demand a large number of analog-to-digital converters (ADCs) and registers, resulting in expensive designs. To increase the inference efficiency, the state of the art divides the interface circuit into an Analog Path (AP), utilizing (cheap) analog comparators, and a Digital Path (DP), utilizing (expensive) ADCs and registers. During BNN execution, one of the two paths is selectively triggered. Ideally, since inference via the AP is more efficient, it should be triggered as often as possible. However, we reveal that, unless the number of weights is very small, the AP is rarely triggered. To overcome this, we propose a novel BNN inference scheme, called Local Thresholding Approximation (LTA), which approximates the global thresholdings in BNNs with local thresholdings. This enables the use of the AP throughout most of the execution, which significantly increases the interface circuit efficiency. In our evaluations with two BNN architectures, using LTA reduces the area by 42x and 54x, the energy by 2.7x and 4.2x, and the latency by 3.8x and 1.15x, compared to state-of-the-art crossbar-based BNN accelerators.
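The global-versus-local thresholding idea can be sketched in software on a binarized dot product. The combination rule below (thresholding each weight chunk against its share of the global threshold, then taking a majority vote) is one illustrative way to localize the decision; the paper's exact LTA rule may differ.

```python
def xnor_popcount(w, x):
    """Binarized dot product: count positions where weight equals input."""
    return sum(1 for wi, xi in zip(w, x) if wi == xi)

def global_threshold(w, x, T):
    """Standard BNN activation: one threshold over the full popcount.
    In hardware this needs the full digital path (ADC + register)."""
    return 1 if xnor_popcount(w, x) >= T else 0

def local_threshold(w, x, T, chunks=4):
    """Approximate the global decision with per-chunk thresholds and a
    majority vote, so each chunk only needs a cheap analog comparator."""
    n = len(w) // chunks
    votes = sum(global_threshold(w[i * n:(i + 1) * n],
                                 x[i * n:(i + 1) * n],
                                 T / chunks)
                for i in range(chunks))
    return 1 if votes > chunks // 2 else 0

w = [1, 0, 1, 1, 0, 0, 1, 0]
print(global_threshold(w, w, T=4))  # perfect match -> 1
print(local_threshold(w, w, T=4))   # local approximation agrees -> 1
```

The approximation trades exactness on borderline inputs for the ability to resolve every chunk with a comparator instead of an ADC, which is where the reported area and energy savings come from.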