3 research outputs found

    Overcoming the Challenges for Multichip Integration: A Wireless Interconnect Approach

    Physical limitations in area, power density, and yield restrict the scalability of single-chip multicore systems to a relatively small number of cores. Instead of building one large chip, aggregating multiple smaller chips can overcome these physical limitations. Multiple dies can be combined either by stacking them vertically or by placing them side by side on the same substrate within a single package. However, to be widely accepted, both multichip integration techniques need to overcome significant challenges. In a horizontally integrated multichip system, traditional inter-chip I/O does not scale well with technology scaling due to limitations of the pitch. Moreover, to transfer data between cores or memory components on different chips, state-of-the-art inter-chip communication over wireline channels requires data signals to travel from internal nets to the peripheral I/O ports and then be routed over the inter-chip channels to the I/O port of the destination chip. The data is then routed from the I/O port to the internal nets of the target chip over a wireline interconnect fabric. This multi-hop communication increases energy consumption while decreasing data bandwidth in a multichip system. On the other hand, in a vertically integrated multichip system, the high power density resulting from placing computational components on top of each other aggravates the thermal issues of the chip, leading to degraded performance and reduced reliability. Liquid cooling through microfluidic channels can provide the cooling capability required for effective management of chip temperatures in vertical integration. However, to reduce mechanical stresses while ensuring temperature uniformity and adequate cooling capacity, the height and width of the microchannels need to be increased. 
This limits the area available to route Through-Silicon Vias (TSVs) across the cooling layers and makes the co-existence and co-design of TSVs and microchannels extremely challenging. Research in recent years has demonstrated that on-chip and off-chip wireless interconnects are capable of establishing radio communications within as well as between multiple chips. The primary goal of this dissertation is to propose design principles targeting both horizontally and vertically integrated multichip systems to provide high-bandwidth, low-latency, and energy-efficient data communication by utilizing mm-wave wireless interconnects. The proposed solution has two parts. The first part proposes a design methodology for a seamless hybrid wired and wireless interconnection network for the horizontally integrated multichip system, enabling direct chip-to-chip communication between internal cores. The second part proposes a Wireless Network-on-Chip (WiNoC) architecture for the vertically integrated multichip system to realize data communication across interlayer microfluidic coolers, eliminating the need to place and route signal TSVs through the cooling layers. The integration of wireless interconnects will significantly reduce the complexity of the co-design of TSV-based interconnects and microchannel-based interlayer cooling. Finally, this dissertation presents a combined trade-off evaluation of such wireless integration in both the horizontal and vertical cases and provides future directions for the design of multichip systems.

    System Level Analysis And Design For Wireless Inter-Chip Interconnection Communication Systems By Applying Advanced Wireless Communication Technologies

    As the development of high-speed integrated circuits has progressed, 60 GHz silicon technology has been introduced to enable much faster computer systems and their corresponding applications. However, when signals propagate at 60 GHz or higher frequencies on a Printed Circuit Board (PCB), the crosstalk among signal buses and devices, the trace losses, and the parasitic capacitance and inductance introduced between high-density traces become significant, and may be severe enough that inter-chip communications cannot meet computer system signal specifications. High-speed circuit signal integrity researchers in both the electronics industry and academia have explored various methodologies to resolve these high-frequency issues. Moreover, Intel is introducing Ultra Path Interconnect (UPI) for multi-core server systems, which demands a data rate of more than 2.44 Tbps between two CPUs and 1.5 Tbps for PCIe channel operation. Recently, the concept of wireless inter/intra-chip interconnection (WIIC) technology was introduced [19, 23] to solve high-frequency signal integrity issues. This dissertation mainly focuses on the inter-chip case while still using the WIIC designation for generality. Various WIIC technologies have been presented in the literature, focusing on Ultra Wide-Band (UWB), propagation channels, modulations, antennas, power control, and interference. However, little research has focused on system-level design, which includes the lowest two layers of the communication protocol in a WIIC system, namely the physical and data link layers. Also, the previously published literature has rarely reached data rates of 100 Gbps or higher, and none of the prior research has obtained a spectrum utilization ratio of 4 bit/s/Hz or greater. 
In addition, existing research has not fully taken advantage of advanced and mature wireless communication technologies such as Orthogonal Frequency Division Multiplexing (OFDM), high-order modulation, and Multiple-Input/Multiple-Output (MIMO) systems for increasing data rates and improving reliability, although the use of UWB [29], conventional FDMA or TDMA [39], and binary modulations including Binary Phase Shift Keying (BPSK) [22], On-Off Keying (OOK) [31], and Amplitude Shift Keying (ASK) [35] has been studied in previous research. In this dissertation, a complete WIIC system and a representative WIIC channel model have been developed by taking full advantage of advanced wireless communication techniques. First, this research has analyzed the potential of applying higher-order modulation, error correction, OFDM, and channel coding in the WIIC setting. MIMO, interleaving, and scrambling are also analyzed but are not included in the current version of the proposed WIIC system; future research could incorporate them to determine their potential benefits. Second, the performance of the proposed WIIC system has been analyzed with the aim of reaching a 100 Gbps data rate. Third, a 60 GHz WIIC channel based on metamaterial Electromagnetic Band Gap (EBG) absorbers has been designed and analyzed using the numerical electromagnetics solver HFSS®, and this EBG is integrated into the representative WIIC channel. Moreover, the impulse response of the WIIC channel is numerically extracted and used for system validation and testing. Furthermore, the system has been simulated with both the WIIC channel and the wired PCB channel. 
It has been found that the Bit Error Rate (BER) performance of the proposed WIIC channel is close to that of an AWGN channel with FEC, and much better than that of an AWGN channel without FEC. This means that the designed WIIC system and channel work properly within the frequency band centered at 60 GHz, while the wired PCB channel is almost cut off at 15 GHz or higher for the cases investigated. With only five or six PCB layers, the WIIC system can theoretically provide a 384 Gbps data rate with 12 GHz of bandwidth, while the wired PCB counterpart needs more than 20 layers in order to avoid severe SI problems and to properly lay out the Tbps channels. The current version of the WIIC system is able to provide a 24 Gbps data rate with 12 GHz of bandwidth using OFDM and QPSK.
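
The data-rate figures quoted in the abstract above can be sanity-checked with simple arithmetic. The sketch below is not taken from the dissertation. It assumes an idealized link in which each hertz of bandwidth carries one symbol per second, with no cyclic-prefix, guard-band, or coding overhead, and the function names are hypothetical.

```python
def ideal_rate_bps(bandwidth_hz: float, bits_per_symbol: int) -> float:
    """Idealized rate: one symbol per second per hertz (assumption, no overheads)."""
    return bandwidth_hz * bits_per_symbol


def spectral_efficiency(data_rate_bps: float, bandwidth_hz: float) -> float:
    """Spectral efficiency in bit/s/Hz."""
    return data_rate_bps / bandwidth_hz


BANDWIDTH_HZ = 12e9   # 12 GHz, as quoted in the abstract
QPSK_BITS = 2         # QPSK carries 2 bits per symbol

# Current system: OFDM with QPSK over 12 GHz.
current_rate = ideal_rate_bps(BANDWIDTH_HZ, QPSK_BITS)
print(current_rate / 1e9)                                # 24.0 (Gbps)
print(spectral_efficiency(current_rate, BANDWIDTH_HZ))   # 2.0 (bit/s/Hz)

# Theoretical figure quoted in the abstract: 384 Gbps over the same 12 GHz.
theoretical_eff = spectral_efficiency(384e9, BANDWIDTH_HZ)
print(theoretical_eff)                                   # 32.0 (bit/s/Hz)
```

Under these assumptions, QPSK over 12 GHz reproduces the quoted 24 Gbps. The theoretical 384 Gbps figure corresponds to 32 bit/s/Hz, which a single QPSK stream cannot reach. The abstract does not state which combination of techniques (for example, higher-order modulation or multiple spatial streams) is assumed for that figure.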

    Wireless Chip-Scale Communications for Neural Network Accelerators

    Wireless on-chip communications have been proposed as a complement to conventional Network-on-Chip (NoC) paradigms in manycore processors. In massively parallel architectures, the fast broadcast and reconfigurability capabilities of the wireless plane open the door to new scalable and adaptive architectures with significant impact on a plethora of fields. This thesis aims to explore such impact in the all-pervasive field of AI accelerators, designing and evaluating new accelerators augmented with wireless on-chip communication. The last decade has witnessed explosive growth in the use of Deep Neural Networks (DNNs) in fields such as computer vision, natural language processing, medicine, and economics. Their accuracy across so many relevant and different applications shows the enormous potential of this disruptive technology. However, this unprecedented performance is closely tied to the fact that new designs contain much deeper and larger sets of layers, forcing them to manage millions (and in some cases even billions) of parameters. This comes at a high computational and communication cost at the processor level, which has prompted the development of new hardware aimed at handling such large computing expense more efficiently, the so-called Deep Neural Network accelerators. This work explores the potential of enhancing the performance of these accelerators by introducing Wireless Networks-on-Chip in their design, a novel interconnect paradigm proposed by the research community to overcome some of the communication challenges that manycore systems face. Specifically, both on-chip and off-chip wireless interconnect implementations have been studied and evaluated. 
In the off-chip case, a theoretical improvement of 13X in the runtime has been achieved, but at the expense of some area and power overheads.