200 research outputs found

    Hardware-Aware Static Optimization of Hyperdimensional Computations

    Binary spatter code (BSC)-based hyperdimensional computing (HDC) is a highly error-resilient approximate computational paradigm suited for error-prone, emerging hardware platforms. In BSC HDC, the basic datatype is a hypervector, a typically large binary vector, whose size has a significant impact on the fidelity and resource usage of the computation. Typically, the hypervector size is dynamically tuned to deliver the desired accuracy; this process is time-consuming and often produces hypervector sizes that lack accuracy guarantees and perform poorly when reused for very similar workloads. We present Heim, a hardware-aware static analysis and optimization framework for BSC HD computations. Heim analytically derives the minimum hypervector size that meets the target accuracy requirement, thereby minimizing resource usage. Heim guarantees that the optimized computation converges to the user-provided accuracy target in expectation, even in the presence of hardware error. Heim deploys a novel static analysis procedure that unifies theoretical results from the neuroscience community to systematically optimize HD computations. We evaluate Heim against dynamic tuning-based optimization on 25 benchmark data structures. Given a 99% accuracy requirement, Heim-optimized computations achieve 99.2%-100.0% median accuracy, up to 49.5% higher than dynamic tuning-based optimization, while achieving 1.15x-7.14x reductions in hypervector size compared to HD computations that achieve comparable query accuracy, and finding parametrizations 30.0x-100167.4x faster than dynamic tuning-based approaches. We also use Heim to systematically evaluate the performance benefits of analog CAMs and multiple-bit-per-cell ReRAM over conventional hardware while maintaining iso-accuracy; for both emerging technologies, we find usages where the emerging hardware imparts significant benefits.
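
    As a rough illustration of the kind of static dimension analysis described above (not Heim's actual procedure), the sketch below bounds the smallest hypervector dimension at which unrelated random hypervectors remain distinguishable with a target accuracy, using a Hoeffding tail bound plus a union bound; the decision threshold and distractor count used here are assumptions.

```python
# Illustrative sketch (not Heim's analysis): estimate the smallest hypervector
# dimension N for which no unrelated random binary hypervector is likely to
# fall below a Hamming-distance decision threshold, via Hoeffding + union bound.
import math

def min_dimension(target_accuracy: float, num_distractors: int,
                  threshold: float = 0.4) -> int:
    """Smallest N such that the probability of any of `num_distractors`
    random hypervectors looking closer than `threshold` (expected normalized
    distance is 0.5) stays within 1 - target_accuracy."""
    allowed_failure = 1.0 - target_accuracy
    gap = 0.5 - threshold                       # distance from the mean 0.5
    # Hoeffding: P(dist < threshold) <= exp(-2 * N * gap^2) per distractor;
    # union bound over all distractors, then solve for N.
    n = math.log(num_distractors / allowed_failure) / (2.0 * gap * gap)
    return math.ceil(n)

if __name__ == "__main__":
    # e.g. a 99% accuracy target with 1000 stored items
    print(min_dimension(0.99, 1000))
```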

    Large-scale memristive associative memories

    Associative memories, in contrast to conventional address-based memories, are inherently fault-tolerant and allow retrieval of data based on partial search information. This paper considers the possibility of implementing large-scale associative memories through memristive devices jointly with CMOS circuitry. An advantage of a memristive associative memory is that the memory elements are located physically above the CMOS layer, which yields more die area for the processing elements realized in CMOS. This allows for high-capacity memories even when using an older CMOS technology, as the capacity of the memory depends more on the feature size of the memristive crossbar than on that of the CMOS components. In this paper, we propose memristive implementations of, and present simulations and error analysis for, the autoassociative content-addressable memory, the Willshaw memory, and the sparse distributed memory. Furthermore, we present a CMOS cell that can be used to implement the proposed memory architectures.
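
    As a purely software-level illustration of the retrieval principle such circuits implement, the sketch below realizes a small Willshaw-style binary associative memory with clipped outer-product storage and threshold recall; the dimensions and the threshold rule are illustrative assumptions, not the paper's circuit parameters.

```python
# Minimal software sketch of a Willshaw-style binary associative memory,
# the storage/retrieval principle the paper maps onto a memristive crossbar.
import numpy as np

class WillshawMemory:
    def __init__(self, n_in: int, n_out: int):
        self.W = np.zeros((n_out, n_in), dtype=np.uint8)  # binary weight matrix

    def store(self, x: np.ndarray, y: np.ndarray) -> None:
        """Clip-accumulate the outer product of a sparse binary pair (x, y)."""
        self.W |= np.outer(y, x).astype(np.uint8)

    def recall(self, x: np.ndarray) -> np.ndarray:
        """Retrieve y from a (possibly partial) cue x by thresholding the
        column sums at the number of active cue bits."""
        sums = self.W.astype(np.int32) @ x
        return (sums >= x.sum()).astype(np.uint8)

if __name__ == "__main__":
    mem = WillshawMemory(n_in=16, n_out=16)
    x = np.zeros(16, dtype=np.uint8); x[[1, 5, 9]] = 1
    y = np.zeros(16, dtype=np.uint8); y[[2, 7]] = 1
    mem.store(x, y)
    print(np.array_equal(mem.recall(x), y))   # True for this single stored pair
```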

    QubitHD: A Stochastic Acceleration Method for HD Computing-Based Machine Learning

    Machine learning algorithms based on brain-inspired hyperdimensional (HD) computing imitate cognition by exploiting statistical properties of high-dimensional vector spaces. HD computing is a promising solution for achieving high energy efficiency in different machine learning tasks, such as classification, semi-supervised learning, and clustering. A weakness of existing HD computing-based ML algorithms is that they must be binarized to achieve very high energy efficiency, yet binarized models reach lower classification accuracies. To address this trade-off between energy efficiency and classification accuracy, we propose the QubitHD algorithm. It stochastically binarizes HD-based algorithms while maintaining classification accuracies comparable to their non-binarized counterparts. The FPGA implementation of QubitHD provides a 65% improvement in energy efficiency and a 95% improvement in training time compared to state-of-the-art HD-based ML algorithms. It also outperforms state-of-the-art low-cost classifiers (such as Binarized Neural Networks) in terms of speed and energy efficiency by an order of magnitude during training and inference.
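
    The sketch below illustrates the general idea of stochastic binarization of a real-valued HD vector: each component is mapped to +1 or -1 with a probability tied to its magnitude, so the binary vector is unbiased in expectation. The exact QubitHD quantization rule may differ; the scaling used here is an assumption.

```python
# Sketch of stochastic binarization of a real-valued HD class vector.
# This shows the general idea only, not the exact QubitHD rule.
import numpy as np

def stochastic_binarize(v: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # Scale components into [-1, 1], then use (x + 1) / 2 as the probability
    # of choosing +1 (a "hard sigmoid" style rule).
    scale = float(np.max(np.abs(v))) or 1.0
    p_plus = (np.clip(v / scale, -1.0, 1.0) + 1.0) / 2.0
    return np.where(rng.random(v.shape) < p_plus, 1, -1).astype(np.int8)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    v = rng.normal(size=10_000)
    b = stochastic_binarize(v, rng)
    # The binarized vector stays positively correlated with the original.
    print(float(np.dot(np.sign(v), b)) / v.size)
```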

    Hardware-Based Authentication for the Internet of Things

    Entity authentication is one of the most fundamental problems in computer security. Implementation of any authentication protocol requires the solution of several sub-problems, such as those regarding secret sharing, key generation, key storage, and key verification. With the advent of the Internet of Things (IoT), authentication becomes a pivotal concern in the security of IoT systems. Interconnected components of IoT devices normally contain sensors, actuators, relays, and processing and control equipment that are designed with limited budgets for power, cost, and area. As a result, incorporating security protocols in such resource-constrained IoT components can be rather challenging. To address this issue, in this dissertation we design and develop hardware-oriented lightweight protocols for the authentication of users, devices, and data. These protocols utilize physical properties of memory components, computing units, and hardware clocks on the IoT device. Recent work on device authentication using physically unclonable functions can render the problem of entity authentication and verification based on hardware properties tractable. Our studies reveal that non-linear characteristics of resistive memories can be useful in solving several problems regarding authentication. Therefore, in this dissertation, we first explore the ideas of secret sharing using threshold circuits and non-volatile memory components. Inspired by the concepts of visual cryptography, we identify the promise of resistive memory-based circuits in lightweight secret sharing and multi-user authentication. Furthermore, the additive and monotonic properties of non-volatile memory components can be useful in addressing the challenges of key storage. Overall, in the first part of this dissertation, we present our research on the design of low-cost, non-cryptographic user authentication schemes using the physical properties of a resistive memory-based system. In the second part of the dissertation, we demonstrate that in computational units, emerging voltage over-scaling (VOS)-based computing leaves a process-variation-dependent error signature in the approximate results. Current research in VOS focuses on reducing these errors to provide acceptable results from the computation point of view. Interestingly, with extreme VOS, these errors can also reveal significant information about the underlying physical system and the random variations therein. As a result, these errors can be methodically profiled to extract information about the process variation in a computational unit. Therefore, in this dissertation, we also employ error-profiling techniques along with basic key-based authentication schemes to create lightweight device authentication protocols. Finally, intrinsic properties of hardware clocks can provide novel ways of device fingerprinting and authentication. The clock signatures can be used for real-time authentication of electromagnetic signals where some temporal properties of the signal are known. In the last part of this dissertation, we elaborate on our studies of data authentication using hardware clocks. As an example, we propose a GPS signature authentication and spoofing detection technique using physical properties such as the frequency skew and drift of hardware clocks in GPS receivers.
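
    The last part above relies on per-device clock behavior. As a rough, hypothetical illustration of that idea (not the dissertation's actual protocol), the sketch below estimates a device's clock skew from paired timestamps with a least-squares fit; the sampling interval, noise level, and skew value are assumptions.

```python
# Illustrative sketch: estimate a device's clock skew from
# (reference_time, device_time) pairs with a least-squares fit. A stable
# per-device skew can then serve as one ingredient of a fingerprint.
import numpy as np

def estimate_skew_ppm(ref_times: np.ndarray, dev_times: np.ndarray) -> float:
    """Fit dev_time ~= (1 + skew) * ref_time + offset; return skew in ppm."""
    slope, _offset = np.polyfit(ref_times, dev_times, deg=1)
    return (slope - 1.0) * 1e6

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ref = np.linspace(0.0, 3600.0, 200)            # one hour of samples (s)
    true_skew = 12e-6                              # 12 ppm fast clock (assumed)
    dev = (1.0 + true_skew) * ref + 0.05 + rng.normal(0, 1e-4, ref.size)
    print(round(estimate_skew_ppm(ref, dev), 1))   # ~12.0
```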

    An efficient logic fault diagnosis framework based on effect-cause approach

    Fault diagnosis plays an important role in improving the circuit design process and the manufacturing yield. With the increasing number of gates in modern circuits, determining the source of failure in a defective circuit is becoming more and more challenging. In this research, we present an efficient effect-cause diagnosis framework for combinational VLSI circuits. The framework consists of three stages to obtain an accurate and reasonably precise diagnosis. First, an improved critical path tracing algorithm is proposed to identify an initial suspect list by backtracing from faulty primary outputs toward primary inputs. Compared to the traditional critical path tracing approach, our algorithm is faster and exact. Second, a novel probabilistic ranking model is applied to rank the suspects so that the most suspicious one is ranked at or near the top. Several fast filtering methods are used to prune unrelated suspects. Finally, to refine the diagnosis, fault simulation is performed on the top suspect nets using several common fault models. The difference between the observed faulty behavior and the simulated behavior is used to rank each suspect. Experimental results on ISCAS85 benchmark circuits show that this diagnosis approach is efficient in both memory usage and CPU time, and that the diagnosis results are accurate and reasonably precise.
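
    As a simplified illustration of where the first stage starts (not the paper's exact algorithm), the sketch below collects the structural fan-in cone of each failing primary output as an initial suspect list; real critical path tracing additionally checks which paths actually sensitize the observed failure. The netlist format here is an assumption.

```python
# Toy sketch of the effect-cause starting point: gather the fan-in cone of
# every failing primary output as structural suspect candidates.
def fanin_cone_suspects(netlist: dict[str, list[str]],
                        failing_outputs: list[str]) -> set[str]:
    """`netlist` maps each net to the list of nets driving its gate
    (primary inputs map to an empty list)."""
    suspects: set[str] = set()
    stack = list(failing_outputs)
    while stack:
        net = stack.pop()
        if net in suspects:
            continue
        suspects.add(net)
        stack.extend(netlist.get(net, []))
    return suspects

if __name__ == "__main__":
    # c = AND(a, b); d = OR(c, e); primary inputs: a, b, e
    netlist = {"c": ["a", "b"], "d": ["c", "e"], "a": [], "b": [], "e": []}
    print(sorted(fanin_cone_suspects(netlist, ["d"])))  # ['a', 'b', 'c', 'd', 'e']
```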

    WHYPE: A Scale-Out Architecture with Wireless Over-the-Air Majority for Scalable In-memory Hyperdimensional Computing

    Hyperdimensional computing (HDC) is an emerging computing paradigm that represents, manipulates, and communicates data using long random vectors known as hypervectors. Among the different hardware platforms capable of executing HDC algorithms, in-memory computing (IMC) has shown promise because it is very efficient at performing matrix-vector multiplications, which are common in the HDC algebra. Although HDC architectures based on IMC already exist, how to scale them remains a key challenge due to the collective communication patterns that these architectures require and that traditional chip-scale networks were not designed for. To cope with this difficulty, we propose a scale-out HDC architecture called WHYPE, which uses wireless in-package communication technology to interconnect a large number of physically distributed IMC cores that either encode hypervectors or perform multiple similarity searches in parallel. In this context, the key enabler of WHYPE is the opportunistic use of the wireless network as a medium for over-the-air computation. WHYPE implements an optimized source coding that allows receivers to calculate the bit-wise majority of multiple hypervectors (a useful operation in HDC) being transmitted concurrently over the wireless channel. By doing so, we achieve a joint broadcast distribution and computation with a performance and efficiency unattainable with wired interconnects, which in turn enables massive parallelization of the architecture. Through evaluations at the on-chip network and complete architecture levels, we demonstrate that WHYPE can bundle and distribute hypervectors faster and more efficiently than a hypothetical wired implementation, and that it scales well to tens of receivers. We show that the average error rate of the majority computation is low, such that it has a negligible impact on the accuracy of HDC classification tasks.
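
    A plain software reference for the bit-wise majority (bundling) operation that WHYPE computes over the air may help fix its semantics; the tie-breaking rule below is one common convention and an assumption, not necessarily the one WHYPE uses.

```python
# Bit-wise majority ("bundling") of binary hypervectors, computed
# conventionally in software as a reference for the over-the-air version.
import numpy as np

def bitwise_majority(hvs: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """hvs: (num_vectors, dimension) array of 0/1 hypervectors."""
    counts = hvs.sum(axis=0)
    out = (counts * 2 > hvs.shape[0]).astype(np.uint8)   # strict majority of 1s
    ties = counts * 2 == hvs.shape[0]                     # possible when even
    out[ties] = rng.integers(0, 2, size=int(ties.sum()), dtype=np.uint8)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    hvs = rng.integers(0, 2, size=(5, 10_000), dtype=np.uint8)
    print(bitwise_majority(hvs, rng)[:16])
```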

    Memristive Circuits for LDPC Decoding

    We present design principles for implementing decoders for low-density parity check codes in CMOL-type memristive circuits. The programmable nonvolatile connectivity enabled by the nanowire arrays in such circuits is used to map the parity check matrix of an LDPC code in the decoder, while decoding operations are realized by a cellular CMOS circuit structure. We perform detailed performance analysis and circuit simulations of example decoders, and estimate how CMOL and memristor characteristics such as the memristor OFF/ON resistance ratio, nanowire resistance, and the total capacitance of the nanowire array affect decoder specification and performance. We also analyze how variation in circuit characteristics and persistent device defects affect the decoders.
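
    For reference, the sketch below shows a minimal hard-decision bit-flipping decoder driven by a parity-check matrix H, the kind of decoding operation the proposed circuits carry out; the toy matrix and the flipping rule are illustrative assumptions, and the paper's decoders are realized in CMOL memristive/CMOS circuitry rather than as a software loop.

```python
# Minimal hard-decision bit-flipping LDPC decoder over GF(2), as a software
# reference for the decoding operation. The toy H below is for illustration.
import numpy as np

def bit_flip_decode(H: np.ndarray, r: np.ndarray, max_iters: int = 20) -> np.ndarray:
    x = r.copy()
    for _ in range(max_iters):
        syndrome = (H @ x) % 2
        if not syndrome.any():
            break                                    # all parity checks satisfied
        # Flip the bit(s) involved in the largest number of failing checks.
        failures = H.T @ syndrome
        x[failures == failures.max()] ^= 1
    return x

if __name__ == "__main__":
    H = np.array([[1, 1, 0, 1, 0, 0],
                  [0, 1, 1, 0, 1, 0],
                  [1, 0, 1, 0, 0, 1]], dtype=np.uint8)
    received = np.zeros(6, dtype=np.uint8)
    received[1] ^= 1                                 # single bit error
    print(bit_flip_decode(H, received))              # recovers the all-zero word
```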