10 research outputs found

    Channel routing for integrated optics

    Get PDF
    pre-printIncreasing scope and applications of integrated optics necessitates the development of automated techniques for physical design of optical systems. This paper presents an automated, planar channel routing technique for integrated optical waveguides. Integrated optics is a planar technology and lacks the inherent signal restoration capabilities of static-CMOS. Therefore, signal loss minimization-as a function of waveguide crossings and bends-is the primary objective of this technique. This is in contrast to track and wire-length minimization of traditional VLSI routing. Our optical channel router guarantees minimal waveguide crossings by drawing upon sorting-based techniques for waveguide routing. To further improve our solutions in terms of signal loss, we extend the router to reduce the number of bends produced during routing. Finally, we implement the optical channel routing technique and describe the experimental results, comparing the costs of routing solutions with respect to waveguide crossings, bends, and channel height

    Neural network computing using on-chip accelerators

    Get PDF
    The use of neural networks, machine learning, or artificial intelligence, in its broadest and most controversial sense, has been a tumultuous journey involving three distinct hype cycles and a history dating back to the 1960s. Resurgent, enthusiastic interest in machine learning and its applications bolsters the case for machine learning as a fundamental computational kernel. Furthermore, researchers have demonstrated that machine learning can be utilized as an auxiliary component of applications to enhance or enable new types of computation such as approximate computing or automatic parallelization. In our view, machine learning becomes not the underlying application, but a ubiquitous component of applications. This view necessitates a different approach towards the deployment of machine learning computation that spans not only hardware design of accelerator architectures, but also user and supervisor software to enable the safe, simultaneous use of machine learning accelerator resources. In this dissertation, we propose a multi-transaction model of neural network computation to meet the needs of future machine learning applications. We demonstrate that this model, encompassing a decoupled backend accelerator for inference and learning from hardware and software for managing neural network transactions can be achieved with low overhead and integrated with a modern RISC-V microprocessor. Our extensions span user and supervisor software and data structures and, coupled with our hardware, enable multiple transactions from different address spaces to execute simultaneously, yet safely. Together, our system demonstrates the utility of a multi-transaction model to increase energy efficiency improvements and improve overall accelerator throughput for machine learning applications

    Robust and Traffic Aware Medium Access Control Mechanisms for Energy-Efficient mm-Wave Wireless Network-on-Chip Architectures

    Get PDF
    To cater to the performance/watt needs, processors with multiple processing cores on the same chip have become the de-facto design choice. In such multicore systems, Network-on-Chip (NoC) serves as a communication infrastructure for data transfer among the cores on the chip. However, conventional metallic interconnect based NoCs are constrained by their long multi-hop latencies and high power consumption, limiting the performance gain in these systems. Among, different alternatives, due to the CMOS compatibility and energy-efficiency, low-latency wireless interconnect operating in the millimeter wave (mm-wave) band is nearer term solution to this multi-hop communication problem. This has led to the recent exploration of millimeter-wave (mm-wave) wireless technologies in wireless NoC architectures (WiNoC). To realize the mm-wave wireless interconnect in a WiNoC, a wireless interface (WI) equipped with on-chip antenna and transceiver circuit operating at 60GHz frequency range is integrated to the ports of some NoC switches. The WIs are also equipped with a medium access control (MAC) mechanism that ensures a collision free and energy-efficient communication among the WIs located at different parts on the chip. However, due to shrinking feature size and complex integration in CMOS technology, high-density chips like multicore systems are prone to manufacturing defects and dynamic faults during chip operation. Such failures can result in permanently broken wireless links or cause the MAC to malfunction in a WiNoC. Consequently, the energy-efficient communication through the wireless medium will be compromised. Furthermore, the energy efficiency in the wireless channel access is also dependent on the traffic pattern of the applications running on the multicore systems. Due to the bursty and self-similar nature of the NoC traffic patterns, the traffic demand of the WIs can vary both spatially and temporally. Ineffective management of such traffic variation of the WIs, limits the performance and energy benefits of the novel mm-wave interconnect technology. Hence, to utilize the full potential of the novel mm-wave interconnect technology in WiNoCs, design of a simple, fair, robust, and efficient MAC is of paramount importance. The main goal of this dissertation is to propose the design principles for robust and traffic-aware MAC mechanisms to provide high bandwidth, low latency, and energy-efficient data communication in mm-wave WiNoCs. The proposed solution has two parts. In the first part, we propose the cross-layer design methodology of robust WiNoC architecture that can minimize the effect of permanent failure of the wireless links and recover from transient failures caused by single event upsets (SEU). Then, in the second part, we present a traffic-aware MAC mechanism that can adjust the transmission slots of the WIs based on the traffic demand of the WIs. The proposed MAC is also robust against the failure of the wireless access mechanism. Finally, as future research directions, this idea of traffic awareness is extended throughout the whole NoC by enabling adaptiveness in both wired and wireless interconnection fabric

    FDMA Enabled Phase-based Wireless Network-on-Chip using Graphene-based THz-band Antennas

    Get PDF
    The future growth in System-on-chip design is moving in the direction of multicore systems. Design of efficient interconnects between cores are crucial for improving the performance of a multicore processor. Such trends are seen due to the benefits the multicore systems provide in terms of power reduction and scalability. Network-on-chips (NoC) are viewed as an emerging solution in the design of interconnects in multicore systems. However, Traditional Network-on-chip architectures are no longer able to satisfy the performance requirements due to long distance communication over multi-hop wireline paths. Multi-hop communication leads to higher energy consumption, increase in latency and reduction in bandwidth. Research in recent years has explored emerging technologies such as 3D integration, photonic and radio frequency based Network-on-chips. The use of wireless interconnects using mm-wave antennas are able to alleviate the performance issues in a wireline interconnect system. However, to satisfy the increasing demand for higher bandwidth and lower energy consumption, Wireless Network-on-Chip enabled with high speed direct links operating in THz band between distant cores is desired. Recent research has brought to light highly efficient graphene-based antennas operating in THz band. These antennas can provide high data rate and are found to consume less power with low area overheads. In this thesis, an innovative approach using novel devices based on graphene structures is proposed to provide a high-performance on-chip interconnection. This novel approach combines the regular NoC structure with the proposed wireless infrastructure to exploit the performance benefits. An architecture with wireless interfaces on every core is explored in this work. Simultaneous multiple communications in a network can be achieved by adopting Frequency Division Multiple access (FDMA). However, in a system where all cores are equipped with a wireless interface, FDMA requires more number of frequency bands. This becomes difficult to achieve as the system scales and the number of cores increase. Therefore, a FDMA protocol along with a 4-phased repetitive multi-band architecture is envisioned in this work. The phase-based protocol allows multiple wireless links to be active at a time, the phase-based protocol along with the FDMA protocol provides a reliable data transfer between cores with lesser number of frequency bands. In this thesis, an architecture with a combination of FDMA and phase-based protocol using point-to-point graphene-based wireless links is proposed. The proposed architecture is also extended for a multichip system. With cycle accurate system-level simulations, it is shown that the proposed architecture provides huge gains in performance and energy-efficiency in data transfer both in NoC based multicore and multichip systems

    Overcoming the Challenges for Multichip Integration: A Wireless Interconnect Approach

    Get PDF
    The physical limitations in the area, power density, and yield restrict the scalability of the single-chip multicore system to a relatively small number of cores. Instead of having a large chip, aggregating multiple smaller chips can overcome these physical limitations. Combining multiple dies can be done either by stacking vertically or by placing side-by-side on the same substrate within a single package. However, in order to be widely accepted, both multichip integration techniques need to overcome significant challenges. In the horizontally integrated multichip system, traditional inter-chip I/O does not scale well with technology scaling due to limitations of the pitch. Moreover, to transfer data between cores or memory components from one chip to another, state-of-the-art inter-chip communication over wireline channels require data signals to travel from internal nets to the peripheral I/O ports and then get routed over the inter-chip channels to the I/O port of the destination chip. Following this, the data is finally routed from the I/O to internal nets of the target chip over a wireline interconnect fabric. This multi-hop communication increases energy consumption while decreasing data bandwidth in a multichip system. On the other hand, in vertically integrated multichip system, the high power density resulting from the placement of computational components on top of each other aggravates the thermal issues of the chip leading to degraded performance and reduced reliability. Liquid cooling through microfluidic channels can provide cooling capabilities required for effective management of chip temperatures in vertical integration. However, to reduce the mechanical stresses and at the same time, to ensure temperature uniformity and adequate cooling competencies, the height and width of the microchannels need to be increased. This limits the area available to route Through-Silicon-Vias (TSVs) across the cooling layers and make the co-existence and co-design of TSVs and microchannels extreamly challenging. Research in recent years has demonstrated that on-chip and off-chip wireless interconnects are capable of establishing radio communications within as well as between multiple chips. The primary goal of this dissertation is to propose design principals targeting both horizontally and vertically integrated multichip system to provide high bandwidth, low latency, and energy efficient data communication by utilizing mm-wave wireless interconnects. The proposed solution has two parts: the first part proposes design methodology of a seamless hybrid wired and wireless interconnection network for the horizontally integrated multichip system to enable direct chip-to-chip communication between internal cores. Whereas the second part proposes a Wireless Network-on-Chip (WiNoC) architecture for the vertically integrated multichip system to realize data communication across interlayer microfluidic coolers eliminating the need to place and route signal TSVs through the cooling layers. The integration of wireless interconnect will significantly reduce the complexity of the co-design of TSV based interconnects and microchannel based interlayer cooling. Finally, this dissertation presents a combined trade-off evaluation of such wireless integration system in both horizontal and vertical sense and provides future directions for the design of the multichip system

    MOCAST 2021

    Get PDF
    The 10th International Conference on Modern Circuit and System Technologies on Electronics and Communications (MOCAST 2021) will take place in Thessaloniki, Greece, from July 5th to July 7th, 2021. The MOCAST technical program includes all aspects of circuit and system technologies, from modeling to design, verification, implementation, and application. This Special Issue presents extended versions of top-ranking papers in the conference. The topics of MOCAST include:Analog/RF and mixed signal circuits;Digital circuits and systems design;Nonlinear circuits and systems;Device and circuit modeling;High-performance embedded systems;Systems and applications;Sensors and systems;Machine learning and AI applications;Communication; Network systems;Power management;Imagers, MEMS, medical, and displays;Radiation front ends (nuclear and space application);Education in circuits, systems, and communications

    Recent Advances in Embedded Computing, Intelligence and Applications

    Get PDF
    The latest proliferation of Internet of Things deployments and edge computing combined with artificial intelligence has led to new exciting application scenarios, where embedded digital devices are essential enablers. Moreover, new powerful and efficient devices are appearing to cope with workloads formerly reserved for the cloud, such as deep learning. These devices allow processing close to where data are generated, avoiding bottlenecks due to communication limitations. The efficient integration of hardware, software and artificial intelligence capabilities deployed in real sensing contexts empowers the edge intelligence paradigm, which will ultimately contribute to the fostering of the offloading processing functionalities to the edge. In this Special Issue, researchers have contributed nine peer-reviewed papers covering a wide range of topics in the area of edge intelligence. Among them are hardware-accelerated implementations of deep neural networks, IoT platforms for extreme edge computing, neuro-evolvable and neuromorphic machine learning, and embedded recommender systems

    Single Electron Devices and Circuit Architectures: Modeling Techniques, Dynamic Characteristics, and Reliability Analysis

    Get PDF
    The Single Electron (SE) technology is an important approach to enabling further feature size reduction and circuit performance improvement. However, new methods are required for device modeling, circuit behavior description, and reliability analysis with this technology due to its unique operation mechanism. In this thesis, a new macro-model of SE turnstile is developed to describe its physical characteristics for large-scale circuit simulation and design. Based on this model, several novel circuit architectures are proposed and implemented to further demonstrate the advantages of SE technique. The dynamic behavior of SE circuits, which is different from their CMOS counterpart, is also investigated using a statistical method. With the unreliable feature of SE devices in mind, a fast and recursive algorithm is developed to evaluate the reliability of SE logic circuits in a more efficient and effective manner

    Dependable Embedded Systems

    Get PDF
    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level such circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solution can be proposed to ef-fectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many core systems

    Ensembles of Pruned Deep Neural Networks for Accurate and Privacy Preservation in IoT Applications

    Get PDF
    The emergence of the AIoT (Artificial Intelligence of Things) represents the powerful convergence of Artificial Intelligence (AI) with the expansive realm of the Internet of Things (IoT). By integrating AI algorithms with the vast network of interconnected IoT devices, we open new doors for intelligent decision-making and edge data analysis, transforming various domains from healthcare and transportation to agriculture and smart cities. However, this integration raises pivotal questions: How can we ensure deep learning models are aptly compressed and quantised to operate seamlessly on devices constrained by computational resources, without compromising accuracy? How can these models be effectively tailored to cope with the challenges of statistical heterogeneity and the uneven distribution of class labels inherent in IoT applications? Furthermore, in an age where data is a currency, how do we uphold the sanctity of privacy for the sensitive data that IoT devices incessantly generate while also ensuring the unhampered deployment of these advanced deep learning models? Addressing these intricate challenges forms the crux of this thesis, with its contributions delineated as follows: Ensyth: A novel approach designed to synthesise pruned ensembles of deep learning models, which not only makes optimal use of limited IoT resources but also ensures a notable boost in predictability. Experimental evidence gathered from CIFAR-10, CIFAR-5, and MNIST-FASHION datasets solidify its merit, especially given its capacity to achieve high predictability. MicroNets: Venturing into the realms of efficiency, this is a multi-phase pruning pipeline that fuses the principles of weight pruning, channel pruning. Its objective is clear: foster efficient deep ensemble learning, specially crafted for IoT devices. Benchmark tests conducted on CIFAR-10 and CIFAR-100 datasets demonstrate its prowess, highlighting a compression ratio of nearly 92%, with these pruned ensembles surpassing the accuracy metrics set by conventional models. FedNets: Recognising the challenges of statistical heterogeneity in federated learning and the ever-growing concerns of data privacy, this innovative federated learning framework is introduced. It facilitates edge devices in their collaborative quest to train ensembles of pruned deep neural networks. More than just training, it ensures data privacy remains uncompromised. Evaluations conducted on the Federated CIFAR-100 dataset offer a testament to its efficacy. In this thesis, substantial contributions have been made to the AIoT application domain. Ensyth, MicroNets, and FedNets collaboratively tackle the challenges of efficiency, accuracy, statistical heterogeneity arising from distributed class labels, and privacy concerns inherent in deploying AI applications on IoT devices. The experimental results underscore the effectiveness of these approaches, paving the way for their practical implementation in real-world scenarios. By offering an integrated solution that satisfies multiple key requirements simultaneously, this research brings us closer to the realisation of effective and privacy-preserved AIoT systems
    corecore