21 research outputs found

    EE-ACML: Energy-Efficient Adiabatic CMOS/MTJ Logic for CPA-Resistant IoT Devices

    Get PDF
    Internet of Things (IoT) devices have strict energy constraints as they often operate on a battery supply. The cryptographic operations within IoT devices consume substantial energy and are vulnerable to a class of hardware attacks known as side-channel attacks. To reduce the energy consumption and defend against side-channel attacks, we propose combining adiabatic logic and Magnetic Tunnel Junctions to form our novel Energy Efficient-Adiabatic CMOS/MTJ Logic (EE-ACML). EE-ACML is shown to be both low energy and secure when compared to existing CMOS/MTJ architectures. EE-ACML reduces dynamic energy consumption with adiabatic logic, while MTJs reduce the leakage power of a circuit. To show practical functionality and energy savings, we designed one round of PRESENT-80 with the proposed EE-ACML integrated with an adiabatic clock generator. The proposed EE-ACML-based PRESENT-80 showed energy savings of 67.24% at 25 MHz and 86.5% at 100 MHz when compared with a previously proposed CMOS/MTJ circuit. Furthermore, we performed a CPA attack on our proposed design, and the key was kept secret

    Fast 3D Integrated Circuit Placement Methodology using Merging Technique

    Get PDF
    In the recent years the advancement in the field of microelectronics integrated circuit (IC) design technologies proved to be a boon for design and development of various advanced systems in-terms of its reduction in form factor, low power, high speed and with increased capacity to incorporate more designs. These systems provide phenomenal advantage for armoured fighting vehicle (AFV) design to develop miniaturised low power, high performance sub-systems. One such emerging high-end technology to be used to develop systems with high capabilities for AFVs is discussed in this paper. Three dimensional IC design is one of the emerging field used to develop high density heterogeneous systems in a reduced form factor. A novel grouping based partitioning and merge based placement (GPMP) methodology for 3D ICs to reduce through silicon vias (TSVs) count and placement time is proposed. Unlike state-of-the-art techniques, the proposed methodology does not suffer from initial overlap of cells during intra-layer placement which reduces the placement time. Connectivity based grouping and partitioning ensures less number of TSVs and merge based placement further reduces intra layer wire-length. The proposed GPMP methodology has been extensively against the IBMPLACE database and performance has been compared with the latest techniques resulting in 12 per cent improvement in wire-length, 13 per cent reduction in TSV and 1.1x improvement in placement time

    Extending the performance of hybrid NoCs beyond the limitations of network heterogeneity

    Get PDF
    To meet the performance and scalability demands of the fast-paced technological growth towards exascale and Big-Data processing with the performance bottleneck of conventional metal based interconnects (wireline), alternative interconnect fabrics such as inhomogeneous three-dimensional integrated Network-on-Chip (3D NoC) and hybrid wired-wireless Network-on-Chip (WiNoC) have emanated as a cost-effective solution for emerging System-on-Chip (SoC) design. However, these interconnects trade-off optimized performance for cost by restricting the number of area and power hungry 3D routers and wireless nodes. Moreover, the non-uniform distributed traffic in chip multiprocessor (CMP) demands an on-chip communication infrastructure which can avoid congestion under high traffic conditions while possessing minimal pipeline delay at low-load conditions. To this end, in this paper, we propose a low-latency adaptive router with a low-complexity single-cycle bypassing mechanism to alleviate the performance degradation due to the slow 2D routers in such emerging hybrid NoCs. The proposed router transmits a flit using dimension-ordered routing (DoR) in the bypass datapath at low-loads. When the output port required for intra-dimension bypassing is not available, the packet is routed adaptively to avoid congestion. The router also has a simplified virtual channel allocation (VA) scheme that yields a non-speculative low-latency pipeline. By combining the low-complexity bypassing technique with adaptive routing, the proposed router is able balance the traffic in hybrid NoCs to achieve low-latency communication under various traffic loads. Simulation shows that, the proposed router can reduce applications’ execution time by an average of 16.9% compared to low-latency routers such as SWIFT. By reducing the latency between 2D routers (or wired nodes) and 3D routers (or wireless nodes) the proposed router can improve performance efficiency in terms of average packet delay by an average of 45% (or 50%) in 3D NoCs (or WiNoCs)

    ALL-MASK: A Reconfigurable Logic Locking Method for Multicore Architecture with Sequential-Instruction-Oriented Key

    Full text link
    Intellectual property (IP) piracy has become a non-negligible problem as the integrated circuit (IC) production supply chain is becoming increasingly globalized and separated that enables attacks by potentially untrusted attackers. Logic locking is a widely adopted method to lock the circuit module with a key and prevent hackers from cracking it. The key is the critical aspect of logic locking, but the existing works have overlooked three possible challenges of the key: safety of key storage, easy key-attempt from interface and key-related overheads, bringing the further challenges of low error rate and small state space. In this work, the key is dynamically generated by utilizing the huge space of a CPU core, and the unlocking is performed implicitly through the interconnection inside the chip. A novel low-cost logic reconfigurable gate is together proposed with ferroelectric FET (FeFET) to mitigate the reverse engineering and removal attack. Compared to the common logic locking methods, our proposed approach is 19,945 times more time consuming to traverse all the possible combinations in only 9-bit-key condition. Furthermore, our technique let key length increases this complexity exponentially and ensure the logic obfuscation effect.Comment: 15 pages, 17 figure

    Improving spiking neural network performance with auxiliary learning

    Get PDF
    The use of back propagation through the time learning rule enabled the supervised training of deep spiking neural networks to process temporal neuromorphic data. However, their performance is still below non-spiking neural networks. Previous work pointed out that one of the main causes is the limited number of neuromorphic data currently available, which are also difficult to generate. With the goal of overcoming this problem, we explore the usage of auxiliary learning as a means of helping spiking neural networks to identify more general features. Tests are performed on neuromorphic DVS-CIFAR10 and DVS128-Gesture datasets. The results indicate that training with auxiliary learning tasks improves their accuracy, albeit slightly. Different scenarios, including manual and automatic combination losses using implicit differentiation, are explored to analyze the usage of auxiliary tasks

    Analyzing the divide between FPGA academic and commercial results

    Get PDF
    The pinnacle of success for academic work is often achieved by having impact on commercial products. In order to have a successful transfer bridge, academic evaluation flows need to provide representative results of similar quality to commercial flows. A majority of publications in FPGA research use the same set of known academic CAD tools and benchmarks to evaluate new architecture and tool ideas. However, it is not clear whether the claims in academic publications based on these tools and benchmarks translate to real benefits in commercial products. In this work we compare the latest Xilinx commercial tools and products with these well-known academic tools to identify the gap in the major figures of merit. Our results show that there is a significant 2.2X gap in speed-performance for similar process technology. We have also identified the area-efficiency and runtime divide between commercial and academic tools to be 5% and 2.2X, respectively. We show that it is possible to improve portions of the academic flow such as ABC logic optimization to match the quality of commercial tools at the expense of additional runtime. Our results also show that depth reduction, which is often used as the main figure of merit for logic optimization papers does not translate to post-routing timing improvements. We finally discuss the differences between academic and commercial benchmark designs. We explain the main differences and trends that may influence the topic choice and conclusions of academic research. This work emphasizes how difficult it is to identify the relevant FPGA academic work that can provide meaningful benefits for commercial products
    corecore