12 research outputs found

    Extending the performance of hybrid NoCs beyond the limitations of network heterogeneity

    Get PDF
    To meet the performance and scalability demands of the fast-paced technological growth towards exascale and Big-Data processing with the performance bottleneck of conventional metal based interconnects (wireline), alternative interconnect fabrics such as inhomogeneous three-dimensional integrated Network-on-Chip (3D NoC) and hybrid wired-wireless Network-on-Chip (WiNoC) have emanated as a cost-effective solution for emerging System-on-Chip (SoC) design. However, these interconnects trade-off optimized performance for cost by restricting the number of area and power hungry 3D routers and wireless nodes. Moreover, the non-uniform distributed traffic in chip multiprocessor (CMP) demands an on-chip communication infrastructure which can avoid congestion under high traffic conditions while possessing minimal pipeline delay at low-load conditions. To this end, in this paper, we propose a low-latency adaptive router with a low-complexity single-cycle bypassing mechanism to alleviate the performance degradation due to the slow 2D routers in such emerging hybrid NoCs. The proposed router transmits a flit using dimension-ordered routing (DoR) in the bypass datapath at low-loads. When the output port required for intra-dimension bypassing is not available, the packet is routed adaptively to avoid congestion. The router also has a simplified virtual channel allocation (VA) scheme that yields a non-speculative low-latency pipeline. By combining the low-complexity bypassing technique with adaptive routing, the proposed router is able balance the traffic in hybrid NoCs to achieve low-latency communication under various traffic loads. Simulation shows that, the proposed router can reduce applications’ execution time by an average of 16.9% compared to low-latency routers such as SWIFT. By reducing the latency between 2D routers (or wired nodes) and 3D routers (or wireless nodes) the proposed router can improve performance efficiency in terms of average packet delay by an average of 45% (or 50%) in 3D NoCs (or WiNoCs)

    A Bandwidth Control Arbitration for SoC Interconnections Performing Applications With Task Dependencies

    Get PDF
    Current System-on-Chips (SoCs) execute applications with task dependency that compete for shared resources such as buses, memories, and accelerators. In such a structure, the arbitration policy becomes a critical part of the system to guarantee access and bandwidth suitable for the competing applications. Some strategies proposed in the literature to cope with these issues are Round-Robin, Weighted Round-Robin, Lottery, Time Division Access Multiplexing (TDMA), and combinations. However, a fine-grained bandwidth control arbitration policy is missing from the literature. We propose an innovative arbitration policy based on opportunistic access and a supervised utilization of the bus in terms of transmitted flits (transmission units) that settle the access and fine-grained control. In our proposal, every competing element has a budget. Opportunistic access grants the bus to request even if the component has spent all its flits. Supervised debt accounts a record for every transmitted flit when it has no flits to spend. Our proposal applies to interconnection systems such as buses, switches, and routers. The presented approach achieves deadlock-free behavior even with task dependency applications in the scenarios analyzed through cycle-accurate simulation models. The synergy between opportunistic and supervised debt techniques outperforms Lottery, TDMA, and Weighted Round-Robin in terms of bandwidth control in the experimental studies performed

    Time- and Amplitude-Controlled Power Noise Generator against SPA Attacks for FPGA-Based IoT Devices

    Get PDF
    Power noise generation for masking power traces is a powerful countermeasure against Simple Power Analysis (SPA), and it has also been used against Differential Power Analysis (DPA) or Correlation Power Analysis (CPA) in the case of cryptographic circuits. This technique makes use of power consumption generators as basic modules, which are usually based on ring oscillators when implemented on FPGAs. These modules can be used to generate power noise and to also extract digital signatures through the power side channel for Intellectual Property (IP) protection purposes. In this paper, a new power consumption generator, named Xored High Consuming Module (XHCM), is proposed. XHCM improves, when compared to others proposals in the literature, the amount of current consumption per LUT when implemented on FPGAs. Experimental results show that these modules can achieve current increments in the range from 2.4 mA (with only 16 LUTs on Artix-7 devices with a power consumption density of 0.75 mW/LUT when using a single HCM) to 11.1 mA (with 67 LUTs when using 8 XHCMs, with a power consumption density of 0.83 mW/LUT). Moreover, a version controlled by Pulse-Width Modulation (PWM) has been developed, named PWM-XHCM, which is, as XHCM, suitable for power watermarking. In order to build countermeasures against SPA attacks, a multi-level XHCM (ML-XHCM) is also presented, which is capable of generating different power consumption levels with minimal area overhead (27 six-input LUTS for generating 16 different amplitude levels on Artix-7 devices). Finally, a randomized version, named RML-XHCM, has also been developed using two True Random Number Generators (TRNGs) to generate current consumption peaks with random amplitudes at random times. RML-XHCM requires less than 150 LUTs on Artix-7 devices. Taking into account these characteristics, two main contributions have been carried out in this article: first, XHCM and PWM-XHCM provide an efficient power consumption generator for extracting digital signatures through the power side channel, and on the other hand, ML-XHCM and RML-XHCM are powerful tools for the protection of processing units against SPA attacks in IoT devices implemented on FPGAs.Junta de AndaluciaEuropean Commission B-TIC-588-UGR2

    Global Adaptation Controlled by an Interactive Consistency Protocol

    Get PDF
    Static schedules for systems can lead to an inefficient usage of the resources, because the system’s behavior cannot be adapted at runtime. To improve the runtime system performance in current time-triggered Multi-Processor System on Chip (MPSoC), a dynamic reaction to events is performed locally on the cores. The effects of this optimization can be increased by coordinating the changes globally. To perform such global changes, a consistent view on the system state is needed, on which to base the adaptation decisions. This paper proposes such an interactive consistency protocol with low impact on the system w.r.t. latency and overhead. We show that an energy optimizing adaptation controlled by the protocol can enable a system to save up to 43% compared to a system without adaptation

    Function Implementation in a Multi-Gate Junctionless FET Structure

    Get PDF
    Title from PDF of title page, viewed September 18, 2023Dissertation advisor: Mostafizur RahmanVitaIncludes bibliographical references (pages 95-117)Dissertation (Ph.D.)--Department of Computer Science and Electrical Engineering, Department of Physics and Astronomy. University of Missouri--Kansas City, 2023This dissertation explores designing and implementing a multi-gate junctionless field-effect transistor (JLFET) structure and its potential applications beyond conventional devices. The JLFET is a promising alternative to conventional transistors due to its simplified fabrication process and improved electrical characteristics. However, previous research has focused primarily on the device's performance at the individual transistor level, neglecting its potential for implementing complex functions. This dissertation fills this research gap by investigating the function implementation capabilities of the JLFET structure and proposing novel circuit designs based on this technology. The first part of this dissertation presents a comprehensive review of the existing literature on JLFETs, including their fabrication techniques, operating principles, and performance metrics. It highlights the advantages of JLFETs over traditional metal-oxide-semiconductor field-effect transistors (MOSFETs) and discusses the challenges associated with their implementation. Additionally, the review explores the limitations of conventional transistor technologies, emphasizing the need for exploring alternative device architectures. Building upon the theoretical foundation, the dissertation presents a detailed analysis of the multi-gate JLFET structure and its potential for realizing advanced functions. The study explores the impact of different design parameters, such as channel length, gate oxide thickness, and doping profiles, on the device performance. It investigates the trade-offs between power consumption, speed, and noise immunity, and proposes design guidelines for optimizing the function implementation capabilities of the JLFET. To demonstrate the practical applicability of the JLFET structure, this dissertation introduces several novel circuit designs based on this technology. These designs leverage the unique characteristics of the JLFET, such as its steep subthreshold slope and improved on/off current ratio, to implement complex functions efficiently. The proposed circuits include arithmetic units, memory cells, and digital logic gates. Detailed simulations and analyses are conducted to evaluate their performance, power consumption, and scalability. Furthermore, this dissertation explores the potential of the JLFET structure for emerging technologies, such as neuromorphic computing and bioelectronics. It investigates how the JLFET can be employed to realize energy-efficient and biocompatible devices for applications in artificial intelligence and biomedical engineering. The study investigates the compatibility of the JLFET with various materials and substrates, as well as its integration with other functional components. In conclusion, this dissertation contributes to the field of nanoelectronics by providing a comprehensive investigation into the function implementation capabilities of the multi-gate JLFET structure. It highlights the potential of this device beyond its individual transistor performance and proposes novel circuit designs based on this technology. The findings of this research pave the way for the development of advanced electronic systems that are more energy-efficient, faster, and compatible with emerging applications in diverse fields.Introduction -- Literature review -- Crosstalk principle -- Experiment of crosstalk -- Device architecture -- Simulation & results -- Conclusio

    Doctor of Philosophy

    Get PDF
    dissertationOver the last decade, cyber-physical systems (CPSs) have seen significant applications in many safety-critical areas, such as autonomous automotive systems, automatic pilot avionics, wireless sensor networks, etc. A Cps uses networked embedded computers to monitor and control physical processes. The motivating example for this dissertation is the use of fault- tolerant routing protocol for a Network-on-Chip (NoC) architecture that connects electronic control units (Ecus) to regulate sensors and actuators in a vehicle. With a network allowing Ecus to communicate with each other, it is possible for them to share processing power to improve performance. In addition, networked Ecus enable flexible mapping to physical processes (e.g., sensors, actuators), which increases resilience to Ecu failures by reassigning physical processes to spare Ecus. For the on-chip routing protocol, the ability to tolerate network faults is important for hardware reconfiguration to maintain the normal operation of a system. Adding a fault-tolerance feature in a routing protocol, however, increases its design complexity, making it prone to many functional problems. Formal verification techniques are therefore needed to verify its correctness. This dissertation proposes a link-fault-tolerant, multiflit wormhole routing algorithm, and its formal modeling and verification using two different methodologies. An improvement upon the previously published fault-tolerant routing algorithm, a link-fault routing algorithm is proposed to relax the unrealistic node-fault assumptions of these algorithms, while avoiding deadlock conservatively by appropriately dropping network packets. This routing algorithm, together with its routing architecture, is then modeled in a process-algebra language LNT, and compositional verification techniques are used to verify its key functional properties. As a comparison, it is modeled using channel-level VHDL which is compiled to labeled Petri-nets (LPNs). Algorithms for a partial order reduction method on LPNs are given. An optimal result is obtained from heuristics that trace back on LPNs to find causally related enabled predecessor transitions. Key observations are made from the comparison between these two verification methodologies

    A VOICE PRIORITY QUEUE (VPQ) SCHEDULER FOR VOIP OVER WLANs

    Get PDF
    The Voice over Internet Protocol (VoIP) application has observed the fastest growth in the world of telecommunication. The Wireless Local Area Network (WLAN) is the most assuring of technologies among the wireless networks, which has facilitated high-rate voice services at low cost and good flexibility. In a voice conversation, each client works as a sender and as a receiver depending on the direction of traffic flow over the network. A VoIP application requires a higher throughput, less packet loss and a higher fairness index over the network. The packets of VoIP streaming may experience drops because of the competition among the different kinds of traffic flow over the network. A VoIP application is also sensitive to delay and requires the voice packets to arrive on time from the sender to the receiver side without any delay over WLANs. The scheduling system model for VoIP traffic is still an unresolved problem. A new traffic scheduler is necessary to offer higher throughput and a higher fairness index for a VoIP application. The objectives of this thesis are to propose a new scheduler and algorithms that support the VoIP application and to evaluate, validate and verify the newly proposed scheduler and algorithms with the existing scheduling algorithms over WLANs through simulation and experimental environment. We proposed a new Voice Priority Queue (VPQ) scheduling system model and algorithms to solve scheduling issues. VPQ system model is implemented in three stages. The first stage of the model is to ensure efficiency by producing a higher throughput and fairness for VoIP packets. The second stage will be designed for bursty Virtual-VoIP Flow (Virtual-VF) while the third stage is a Switch Movement (SM) technique. Furthermore, we compared the VPQ scheduler with other well known schedulers and algorithms. We observed in our simulation and experimental environment that the VPQ provides better results for the VoIP over WLANs

    Internet of Things. Information Processing in an Increasingly Connected World

    Get PDF
    This open access book constitutes the refereed post-conference proceedings of the First IFIP International Cross-Domain Conference on Internet of Things, IFIPIoT 2018, held at the 24th IFIP World Computer Congress, WCC 2018, in Poznan, Poland, in September 2018. The 12 full papers presented were carefully reviewed and selected from 24 submissions. Also included in this volume are 4 WCC 2018 plenary contributions, an invited talk and a position paper from the IFIP domain committee on IoT. The papers cover a wide range of topics from a technology to a business perspective and include among others hardware, software and management aspects, process innovation, privacy, power consumption, architecture, applications

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)“ project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction raises to have a better discernment of the domain at hand, their representation gets increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for their members and distinguished guests to openly discuss novel perspectives and topics of interests for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications
    corecore