287 research outputs found

    A Survey on Integrated Circuit Trojans

    Get PDF
    Traditionally, computer security has been associated with the software security, or the information-data security. Surprisingly, the hardware on which the software executes or the information stored-processed-transmitted has been assumed to be a trusted base of security. The main building blocks of any electronic device are Integrated circuits (ICs) which form the fabric of a computer system. Lately, the use of ICs has expanded from handheld calculators and personal computers (PCs) to smartphones, servers, and Internet-of-Things (IoT) devices. However, this significant growth in the IC market created intense competition among IC vendors, leading to new trends in IC manufacturing. System-on-chip (SoC) design based on intellectual property (IP), a globally spread supply chain of production and distribution of ICs are the foremost of these trends. The emerging trends have resulted in many security and trust weaknesses and vulnerabilities, in computer systems. This includes Hardware Trojans attacks, side-channel attacks, Reverse-engineering, IP piracy, IC counterfeiting, micro probing, physical tampering, and acquisition of private or valuable assets by debugging and testing. IC security and trust vulnerabilities may cause loss of private information, modified/altered functions, which may cause a great economical hazard and big damage to society. Thus, it is crucial to examine the security and trust threats existing in the IC lifecycle and build defense mechanisms against IC Trojan threats. In this article, we examine the IC supply chain and define the possible IC Trojan threats for the parties involved. Then we survey the latest progress of research in the area of countermeasures against the IC Trojan attacks and discuss the challenges and expectations in this area. Keywords: IC supply chain, IC security, IP privacy, hardware trojans, IC trojans DOI: 10.7176/CEIS/12-2-01 Publication date: April 30th 202

    FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture

    Full text link
    Neural Network (NN) accelerators with emerging ReRAM (resistive random access memory) technologies have been investigated as one of the promising solutions to address the \textit{memory wall} challenge, due to the unique capability of \textit{processing-in-memory} within ReRAM-crossbar-based processing elements (PEs). However, the high efficiency and high density advantages of ReRAM have not been fully utilized due to the huge communication demands among PEs and the overhead of peripheral circuits. In this paper, we propose a full system stack solution, composed of a reconfigurable architecture design, Field Programmable Synapse Array (FPSA) and its software system including neural synthesizer, temporal-to-spatial mapper, and placement & routing. We highly leverage the software system to make the hardware design compact and efficient. To satisfy the high-performance communication demand, we optimize it with a reconfigurable routing architecture and the placement & routing tool. To improve the computational density, we greatly simplify the PE circuit with the spiking schema and then adopt neural synthesizer to enable the high density computation-resources to support different kinds of NN operations. In addition, we provide spiking memory blocks (SMBs) and configurable logic blocks (CLBs) in hardware and leverage the temporal-to-spatial mapper to utilize them to balance the storage and computation requirements of NN. Owing to the end-to-end software system, we can efficiently deploy existing deep neural networks to FPSA. Evaluations show that, compared to one of state-of-the-art ReRAM-based NN accelerators, PRIME, the computational density of FPSA improves by 31x; for representative NNs, its inference performance can achieve up to 1000x speedup.Comment: Accepted by ASPLOS 201

    Challenges and Opportunities in Near-Threshold DNN Accelerators around Timing Errors

    Get PDF
    AI evolution is accelerating and Deep Neural Network (DNN) inference accelerators are at the forefront of ad hoc architectures that are evolving to support the immense throughput required for AI computation. However, much more energy efficient design paradigms are inevitable to realize the complete potential of AI evolution and curtail energy consumption. The Near-Threshold Computing (NTC) design paradigm can serve as the best candidate for providing the required energy efficiency. However, NTC operation is plagued with ample performance and reliability concerns arising from the timing errors. In this paper, we dive deep into DNN architecture to uncover some unique challenges and opportunities for operation in the NTC paradigm. By performing rigorous simulations in TPU systolic array, we reveal the severity of timing errors and its impact on inference accuracy at NTC. We analyze various attributes—such as data–delay relationship, delay disparity within arithmetic units, utilization pattern, hardware homogeneity, workload characteristics—and uncover unique localized and global techniques to deal with the timing errors in NTC

    Side-channel security of superscalar CPUs: Evaluating the Impact of Micro-architectural Features

    Get PDF
    Side-channel attacks are performed on increasingly complex targets, starting to threaten superscalar CPUs supporting a complete operating system. The difficulty of both assessing the vulnerability of a device to them, and validating the effectiveness of countermeasures is increasing as a consequence. In this work we prove that assessing the side-channel vulnerability of a software implementation running on a CPU should take into account the microarchitectural features of the CPU itself. We characterize the impact of microarchitectural features and prove the effectiveness of such an approach attacking a dual-core superscalar CPU

    Understanding Timing Error Characteristics From Overclocked Systolic Multiply–Accumulate Arrays in FPGAs

    Get PDF
    Artificial Intelligence (AI) hardware accelerators have seen tremendous developments in recent years due to the rapid growth of AI in multiple fields. Many such accelerators comprise a Systolic Multiply–Accumulate Array (SMA) as its computational brain. In this paper, we investigate the faulty output characterization of an SMA in a real silicon FPGA board. Experiments were run on a single Zybo Z7-20 board to control for process variation at nominal voltage and in small batches to control for temperature. The FPGA is rated up to 800 MHz in the data sheet due to the max frequency of the PLL, but the design is written using Verilog for the FPGA and C++ for the processor and synthesized with a chosen constraint of a 125 MHz clock. We then operate the system at a frequency range of 125 MHz to 450 MHz for the FPGA and the nominal 667 MHz for the processor core to produce timing errors in the FPGA without affecting the processor. Our extensive experimental platform with a hardware–software ecosystem provides a methodological pathway that reveals fascinating characteristics of SMA behavior under an overclocked environment. While one may intuitively expect that timing errors resulting from overclocked hardware may produce a wide variation in output values, our post-silicon evaluation reveals a lack of variation in erroneous output values. We found an intriguing pattern where error output values are stable for a given input across a range of operating frequencies far exceeding the rated frequency of the FPGA

    Continuous Integration for Fast SoC Algorithm Development

    Get PDF
    Digital systems have become advanced, hard to design and optimize due to ever-growing technology. Integrated Circuits (ICs) have become more complicated due to complex computations in latest technologies. Communication systems such as mobile networks have evolved and become a part of our daily lives with the advancement in technology over the years. Hence, need of efficient, reusable and automated processes for System-on-a-Chip (SoC) development has been increased. Purpose of this thesis is to study and evaluate currently used SoC development processes and presents guidelines on how these processes can be streamlined. The thesis starts by evaluating currently used SoC development flows and their advantages and disadvantages. One important aspect is to identify step which cause duplication of work and unnecessary idle times in SoC development teams. A study is conducted and input from SoC development experts is taken in order to optimize SoC flows and use of Continuous Integration (CI) system. An algorithm model is implemented that can be used in multiple stages of SoC development at adequate complexity and is “easy enough” to be used for a person not mastering the topic. The thesis outcome is proposal for CI system in SoC development for accelerating the speed and reliability of implementing algorithms to RTL code and finally into product. CI system tool is also implemented to automate and test the model design so that it also remains up to date
    • …
    corecore