Hardware-Aware Static Optimization of Hyperdimensional Computations
Binary spatter code (BSC)-based hyperdimensional computing (HDC) is a highly
error-resilient approximate computational paradigm suited for error-prone,
emerging hardware platforms. In BSC HDC, the basic datatype is a hypervector, a
typically large binary vector, where the size of the hypervector has a
significant impact on the fidelity and resource usage of the computation.
Typically, the hypervector size is dynamically tuned to deliver the desired
accuracy; this process is time-consuming and often produces hypervector sizes
that lack accuracy guarantees and produce poor results when reused for very
similar workloads. We present Heim, a hardware-aware static analysis and
optimization framework for BSC HD computations. Heim analytically derives the
minimum hypervector size that meets the target accuracy requirement, thereby
minimizing resource usage. Heim guarantees that the optimized computation
converges to the user-provided accuracy target in expectation, even in the
presence of hardware error. Heim deploys a novel static analysis procedure that unifies
theoretical results from the neuroscience community to systematically optimize
HD computations.
We evaluate Heim against dynamic tuning-based optimization on 25 benchmark
data structures. Given a 99% accuracy requirement, Heim-optimized computations
achieve a 99.2%-100.0% median accuracy, up to 49.5% higher than dynamic
tuning-based optimization, while achieving 1.15x-7.14x reductions in
hypervector size compared to HD computations that achieve comparable query
accuracy and finding parametrizations 30.0x-100167.4x faster than dynamic
tuning-based approaches. We also use Heim to systematically evaluate the
performance benefits of using analog CAMs and multiple-bit-per-cell ReRAM over
conventional hardware, while maintaining iso-accuracy -- for both emerging
technologies, we find usages where the emerging hardware imparts significant
benefits.
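The size-accuracy trade-off the abstract describes can be illustrated with a minimal binary spatter code sketch (an illustrative toy, not Heim's analysis): items are bundled with a bit-wise majority, membership is queried by Hamming distance, and a larger hypervector dimension makes the query more reliable.

```python
import numpy as np

rng = np.random.default_rng(0)

def bundle(items):
    # Bit-wise majority of binary hypervectors (odd counts avoid ties).
    return (2 * items.sum(axis=0) >= len(items)).astype(np.uint8)

def membership_accuracy(dim, n_items, trials=200):
    # How often a bundled item sits closer (in Hamming distance) to the
    # bundle than a random distractor does.
    hits = 0
    for _ in range(trials):
        items = rng.integers(0, 2, size=(n_items, dim), dtype=np.uint8)
        distractor = rng.integers(0, 2, size=dim, dtype=np.uint8)
        memory = bundle(items)
        d_member = np.count_nonzero(items[0] ^ memory)
        d_distractor = np.count_nonzero(distractor ^ memory)
        hits += d_member < d_distractor
    return hits / trials

low = membership_accuracy(dim=64, n_items=15)
high = membership_accuracy(dim=1024, n_items=15)
```

With 15 bundled items, accuracy at dimension 64 falls visibly short of accuracy at dimension 1024, which mirrors why hypervector size must be tuned (dynamically, or statically as Heim does) to hit an accuracy target.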
Large-scale memristive associative memories
Associative memories, in contrast to conventional address-based memories, are inherently fault-tolerant and allow retrieval of data based on partial search information. This paper considers the possibility of implementing large-scale associative memories through memristive devices jointly with CMOS circuitry. An advantage of a memristive associative memory is that the memory elements are located physically above the CMOS layer, which yields more die area for the processing elements realized in CMOS. This allows for high-capacity memories even while using an older CMOS technology, as the capacity of the memory depends more on the feature size of the memristive crossbar than on that of the CMOS components. In this paper, we propose memristive implementations and present simulations and error analysis of the autoassociative content-addressable memory, the Willshaw memory, and the sparse distributed memory. Furthermore, we present a CMOS cell that can be used to implement the proposed memory architectures.
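The Willshaw memory mentioned above can be sketched in a few lines (an algorithmic toy only; the paper's memristive/CMOS circuit realization is not modeled here): pairs of sparse binary patterns are stored by OR-ing in their outer product, and recall thresholds the matrix-vector product against the cue's activity.

```python
import numpy as np

class WillshawMemory:
    # Minimal Willshaw-style binary heteroassociative memory.
    def __init__(self, n_in, n_out):
        self.W = np.zeros((n_out, n_in), dtype=np.uint8)

    def store(self, x, y):
        # Clipped Hebbian learning: OR in the outer product of the pair.
        self.W |= np.outer(y, x)

    def recall(self, x):
        # An output unit fires if every active input of the cue connects to it.
        return (self.W @ x >= x.sum()).astype(np.uint8)

rng = np.random.default_rng(1)

def sparse(n, k):
    # Random binary pattern with exactly k active units.
    v = np.zeros(n, dtype=np.uint8)
    v[rng.choice(n, size=k, replace=False)] = 1
    return v

mem = WillshawMemory(256, 256)
pairs = [(sparse(256, 8), sparse(256, 8)) for _ in range(20)]
for x, y in pairs:
    mem.store(x, y)
```

At this low load (20 sparse pairs in a 256x256 matrix) recall is exact; as the matrix fills with ones, spurious output bits appear, which is the fault-tolerance-versus-capacity behavior the paper's error analysis quantifies.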
QubitHD: A Stochastic Acceleration Method for HD Computing-Based Machine Learning
Machine Learning algorithms based on Brain-inspired Hyperdimensional (HD)
computing imitate cognition by exploiting statistical properties of
high-dimensional vector spaces. HD computing is a promising approach for
achieving high energy-efficiency in machine learning tasks such as
classification, semi-supervised learning, and clustering. A weakness of
existing HD computing-based ML algorithms is that they must be binarized to
achieve very high energy-efficiency, yet binarized models reach lower
classification accuracies. To resolve this trade-off between
energy-efficiency and classification accuracy, we propose the QubitHD
algorithm. It stochastically binarizes HD-based algorithms, while maintaining
comparable classification accuracies to their non-binarized counterparts. The
FPGA implementation of QubitHD provides a 65% improvement in
energy-efficiency and a 95% improvement in training time compared with
state-of-the-art HD-based ML algorithms. It also outperforms
state-of-the-art low-cost classifiers (like Binarized Neural Networks) in terms
of speed and energy-efficiency by an order of magnitude during training and
inference.
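The core idea of stochastic binarization can be sketched as follows (a generic stochastic-rounding sketch, not the exact QubitHD quantization scheme): each real-valued component is mapped to +1 or -1 with a probability chosen so the binarized value equals the original in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_binarize(v):
    # Map each component of v to {-1, +1} such that the expectation of the
    # output equals the (clipped) input. Illustrative sketch only.
    p = (np.clip(v, -1.0, 1.0) + 1.0) / 2.0   # P(output = +1)
    return np.where(rng.random(v.shape) < p, 1.0, -1.0)

v = rng.uniform(-1, 1, 10_000)
samples = np.stack([stochastic_binarize(v) for _ in range(500)])
# Averaging many stochastic binarizations recovers the original vector,
# which is why accuracy can track the non-binarized model.
mean_abs_err = np.abs(samples.mean(axis=0) - v).mean()
```

Deterministic binarization (taking the sign) discards all magnitude information; the stochastic variant preserves it in expectation, which is the intuition behind keeping classification accuracy close to the non-binarized model.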
Hardware-Based Authentication for the Internet of Things
Entity authentication is one of the most fundamental problems in computer security. Implementation of any authentication protocol requires the solution of several sub-problems regarding secret sharing, key generation, key storage, and key verification. With the advent of the Internet of Things (IoT), authentication becomes a pivotal concern in the security of IoT systems. Interconnected components of IoT devices normally contain sensors, actuators, relays, and processing and control equipment designed with limited budgets for power, cost, and area. As a result, incorporating security protocols in such resource-constrained IoT components can be rather challenging. To address this issue, in this dissertation we design and develop hardware-oriented lightweight protocols for the authentication of users, devices, and data. These protocols utilize physical properties of memory components, computing units, and hardware clocks on the IoT device.
Recent work on device authentication using physically unclonable functions has rendered the problem of entity authentication and verification based on hardware properties tractable. Our studies reveal that non-linear characteristics of resistive memories can be useful in solving several problems regarding authentication. Therefore, in this dissertation, we first explore the ideas of secret sharing using threshold circuits and non-volatile memory components. Inspired by the concepts of visual cryptography, we identify the promise of resistive-memory-based circuits in lightweight secret sharing and multi-user authentication. Furthermore, the additive and monotonic properties of non-volatile memory components can be useful in addressing the challenges of key storage. Overall, in the first part of this dissertation, we present our research on the design of low-cost, non-crypto-based user authentication schemes using the physical properties of a resistive-memory-based system.
In the second part of the dissertation, we demonstrate that in computational units, emerging voltage over-scaling (VOS)-based computing leaves a process-variation-dependent error signature in the approximate results. Current research on VOS focuses on reducing these errors to provide acceptable results from the computation point of view. Interestingly, with extreme VOS, these errors can also reveal significant information about the underlying physical system and the random variations therein. As a result, these errors can be methodically profiled to extract information about the process variation in a computational unit. Therefore, in this dissertation, we also employ error-profiling techniques along with basic key-based authentication schemes to create lightweight device authentication protocols.
Finally, intrinsic properties of hardware clocks can provide novel ways of device fingerprinting and authentication. The clock signatures can be used for real-time authentication of electromagnetic signals where some temporal properties of the signal are known. In the last part of this dissertation, we elaborate on our studies of data authentication using hardware clocks. As an example, we propose a GPS signature authentication and spoofing detection technique using physical properties such as the frequency skew and drift of hardware clocks in GPS receivers.
An efficient logic fault diagnosis framework based on effect-cause approach
Fault diagnosis plays an important role in improving the circuit design process and the
manufacturing yield. With the increasing number of gates in modern circuits, determining
the source of failure in a defective circuit is becoming more and more challenging.
In this research, we present an efficient effect-cause diagnosis framework for
combinational VLSI circuits. The framework consists of three stages to obtain an accurate
and reasonably precise diagnosis. First, an improved critical path tracing algorithm is
proposed to identify an initial suspect list by backtracing from faulty primary outputs
toward primary inputs. Compared to the traditional critical path tracing approach, our
algorithm is faster and exact. Second, a novel probabilistic ranking model is applied to
rank the suspects so that the most suspicious one will be ranked at or near the top. Several
fast filtering methods are used to prune unrelated suspects. Finally, to refine the diagnosis,
fault simulation is performed on the top suspect nets using several common fault models.
The difference between the observed faulty behavior and the simulated behavior is used to rank each suspect. Experimental results on ISCAS85 benchmark circuits show that this
diagnosis approach is efficient both in terms of memory space and CPU time and the
diagnosis results are accurate and reasonably precise.
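The effect-cause idea, comparing observed faulty behavior against simulated fault behavior and ranking suspects by how well they explain it, can be sketched on a tiny example (the two-gate circuit, net names, and single stuck-at fault model below are hypothetical, not taken from the dissertation):

```python
from itertools import product

def simulate(inputs, fault=None):
    # Evaluate a toy two-gate circuit, d = OR(AND(a, b), e), optionally
    # forcing one net to a stuck-at value (fault = (net_name, value)).
    nets = {}
    def drive(name, value):
        if fault is not None and fault[0] == name:
            value = fault[1]          # the stuck-at fault overrides the net
        nets[name] = value
    for name in ('a', 'b', 'e'):      # primary inputs
        drive(name, inputs[name])
    drive('c', nets['a'] & nets['b']) # internal net
    drive('d', nets['c'] | nets['e']) # primary output
    return nets['d']

# Observed responses of a "defective" chip: here, net 'c' is stuck-at-1.
patterns = [dict(zip('abe', bits)) for bits in product((0, 1), repeat=3)]
observed = [simulate(p, fault=('c', 1)) for p in patterns]

# Rank each candidate stuck-at fault by how many patterns it explains.
candidates = [(net, value) for net in 'abce' for value in (0, 1)]
scores = {f: sum(simulate(p, fault=f) == out
                 for p, out in zip(patterns, observed))
          for f in candidates}
```

The true fault ('c', 1) explains all eight patterns, but so does the equivalent fault ('e', 1); this kind of ambiguity among top-ranked suspects is precisely what the framework's later ranking and fault-simulation stages must refine.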
WHYPE: A Scale-Out Architecture with Wireless Over-the-Air Majority for Scalable In-memory Hyperdimensional Computing
Hyperdimensional computing (HDC) is an emerging computing paradigm that
represents, manipulates, and communicates data using long random vectors known
as hypervectors. Among different hardware platforms capable of executing HDC
algorithms, in-memory computing (IMC) has shown promise as it is very efficient
in performing matrix-vector multiplications, which are common in the HDC
algebra. Although HDC architectures based on IMC already exist, how to scale
them remains a key challenge due to the collective communication patterns that
these architectures require, which traditional chip-scale networks were not
designed for. To cope with this difficulty, we propose a scale-out HDC
architecture called WHYPE, which uses wireless in-package communication
technology to interconnect a large number of physically distributed IMC cores
that either encode hypervectors or perform multiple similarity searches in
parallel. In this context, the key enabler of WHYPE is the opportunistic use of
the wireless network as a medium for over-the-air computation. WHYPE implements
an optimized source coding that allows receivers to calculate the bit-wise
majority of multiple hypervectors (a useful operation in HDC) being transmitted
concurrently over the wireless channel. By doing so, we achieve a joint
broadcast distribution and computation with a performance and efficiency
unattainable with wired interconnects, which in turn enables massive
parallelization of the architecture. Through evaluations at the on-chip network
and complete architecture levels, we demonstrate that WHYPE can bundle and
distribute hypervectors faster and more efficiently than a hypothetical wired
implementation, and that it scales well to tens of receivers. We show that the
average error rate of the majority computation is low, such that it has
negligible impact on the accuracy of HDC classification tasks. (Accepted at
the IEEE Journal on Emerging and Selected Topics in Circuits and Systems, JETCAS.)
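The over-the-air majority can be sketched with an idealized channel model (illustrative only; WHYPE's source coding, modulation, and channel model are simplified away here): each transmitter sends its bit as a +1/-1 symbol, the wireless channel physically sums the symbols, and the receiver thresholds the noisy superposition at zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def over_the_air_majority(hypervectors, noise_std=0.5):
    # Idealized over-the-air computation: the channel adds the transmitted
    # symbols, and thresholding the sum at zero yields the bit-wise majority.
    symbols = 2.0 * hypervectors - 1.0              # map {0,1} -> {-1,+1}
    superposed = symbols.sum(axis=0)                # the channel sums symbols
    received = superposed + rng.normal(0.0, noise_std, superposed.shape)
    return (received > 0).astype(np.uint8)

# Five transmitters broadcasting 10,000-bit hypervectors concurrently.
hvs = rng.integers(0, 2, size=(5, 10_000), dtype=np.uint8)
exact = (2 * hvs.sum(axis=0) > hvs.shape[0]).astype(np.uint8)
approx = over_the_air_majority(hvs)
error_rate = np.mean(exact != approx)
```

At moderate noise the majority is recovered with a small bit error rate, consistent with the abstract's observation that the residual error has negligible impact on HDC classification, since HDC is robust to bit flips by construction.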
Threat Analysis, Countermeasures and Design Strategies for Secure Computation in Nanometer CMOS Regime
Advancements in CMOS technologies have led to the era of the Internet of Things (IoT), where devices can communicate with each other in addition to performing computation. As more and more sensitive data is processed by embedded devices, the trend toward lightweight and efficient cryptographic primitives has gained significant momentum. Achieving perfect security in silicon is extremely difficult, as traditional cryptographic implementations are vulnerable to various active and passive attacks. There is also a threat in the form of hardware Trojans inserted into the supply chain by untrusted third-party manufacturers for economic incentives. Beyond these threats, some embedded security applications, such as random number generators (RNGs), suffer from the impacts of process variations and noise in nanometer CMOS. Despite these disadvantages, the random and unique nature of process variations can be exploited to generate unique identifiers and can be of tremendous use in embedded security.
In this dissertation, we explore techniques for precise fault-injection in cryptographic hardware based on voltage/temperature manipulation and hardware Trojan insertion. We demonstrate the effectiveness of these techniques by mounting fault attacks on state-of-the-art ciphers. Physically Unclonable Functions (PUFs) are novel cryptographic primitives for extracting secret keys from complex manufacturing variations in integrated circuits (ICs). We explore the vulnerabilities of some of the popular strong PUF architectures to modeling attacks using Machine Learning (ML) algorithms. The attacks use silicon data from a test chip manufactured in IBM 32nm silicon-on-insulator (SOI) technology. Attack results demonstrate that the majority of strong PUF architectures can be predicted to very high accuracies using limited training data. We also explore the techniques to exploit unreliable data from strong PUF architectures and effectively use them to improve the prediction accuracies of modeling attacks. Motivated by the vulnerabilities of existing PUF architectures, we present a novel modeling attack resistant PUF architecture based on non-linear computing elements. Post-silicon validation results are used to demonstrate the effectiveness of the non-linear PUF architecture against modeling and fault-injection attacks. Apart from the techniques to improve the security of PUF circuits, we also present novel solutions to improve the performance of PUF circuits from the perspectives of IC fabrication and system/protocol design. Finally, we present a statistical benchmark suite to evaluate PUFs in conceptualization phase and also to enable fine-grained security assessments for varying PUF parameters. Data compressibility analyses for validating the statistical benchmark suite are also presented
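The modeling-attack result can be illustrated with the textbook additive-delay abstraction of an arbiter PUF (the linear model, parameters, and the simple perceptron learner below are illustrative, not the dissertation's silicon setup): because the response is a linear threshold function of a parity-transformed challenge, a linear learner trained on observed challenge-response pairs predicts unseen responses accurately.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stages, n_train, n_test = 32, 4000, 1000

# Idealized arbiter PUF: response = sign of a linear function of the
# challenge's parity feature vector (hypothetical delay weights).
w_true = rng.normal(size=n_stages + 1)

def features(challenges):
    # Standard parity transform: phi_i = prod_{j >= i} (1 - 2*c_j), plus bias.
    signs = 1 - 2 * challenges
    phi = np.cumprod(signs[:, ::-1], axis=1)[:, ::-1]
    return np.hstack([phi, np.ones((len(challenges), 1))])

def respond(challenges):
    return (features(challenges) @ w_true > 0).astype(int)

# "Attack": fit a linear model to observed challenge-response pairs.
C_train = rng.integers(0, 2, (n_train, n_stages))
r_train = respond(C_train)
X = features(C_train)
w = np.zeros(n_stages + 1)
for _ in range(20):                      # a few perceptron epochs
    for x, r in zip(X, r_train):
        pred = int(x @ w > 0)
        w += (r - pred) * x              # update only on mistakes

C_test = rng.integers(0, 2, (n_test, n_stages))
accuracy = np.mean((features(C_test) @ w > 0).astype(int) == respond(C_test))
```

A few thousand training pairs suffice to clone this idealized PUF, which is the vulnerability that motivates the dissertation's non-linear, modeling-attack-resistant PUF architecture.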
Memristive Circuits for LDPC Decoding
We present design principles for implementing decoders for low-density parity check codes in CMOL-type memristive circuits. The programmable nonvolatile connectivity enabled by the nanowire arrays in such circuits is used to map the parity check matrix of an LDPC code in the decoder, while decoding operations are realized by a cellular CMOS circuit structure. We perform detailed performance analysis and circuit simulations of example decoders, and estimate how CMOL and memristor characteristics such as the memristor OFF/ON resistance ratio, nanowire resistance, and the total capacitance of the nanowire array affect decoder specification and performance. We also analyze how variation in circuit characteristics and persistent device defects affect the decoders.
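As a reference point for what such a decoder computes, here is a minimal Gallager-style bit-flipping decoder in software (algorithm sketch only; the paper's CMOL/memristive circuit realization is not modeled, and a small Hamming parity-check matrix stands in for a real LDPC code):

```python
import numpy as np

def bit_flip_decode(H, received, max_iters=50):
    # Gallager bit-flipping decoding: repeatedly flip the bits that
    # participate in the most unsatisfied parity checks.
    word = received.copy()
    for _ in range(max_iters):
        syndrome = H @ word % 2
        if not syndrome.any():
            return word                     # all parity checks satisfied
        unsat = syndrome @ H                # unsatisfied-check count per bit
        worst = unsat.max()
        word = word ^ (unsat == worst)      # flip the most-suspect bits
    return word

# (7,4) Hamming parity-check matrix as a toy stand-in for an LDPC code.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]], dtype=np.uint8)
codeword = np.zeros(7, dtype=np.uint8)      # the all-zero codeword
noisy = codeword.copy()
noisy[2] ^= 1                               # inject a single bit error
decoded = bit_flip_decode(H, noisy)
```

In a CMOL-style realization, the `H @ word` products correspond to currents summed through the programmed nanowire crossbar, while the thresholding and flip logic map onto the cellular CMOS layer.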