We present a vertical integration approach for designing silicon photonic networks for communication in manycore systems. Using a top-down approach, we project the photonic device requirements for a 64-tile system designed in 22 nm technology.
Introduction
To sustain the historic performance improvement in VLSI systems while remaining within the power envelope, the trend has moved towards designing multiple cores on a single die. The core count per die is steadily increasing [2, 7, 8, 11] (hundreds of cores per die are expected in the near future), in turn increasing the bandwidth demands on on-chip and off-chip communication networks. The projected enhancements in current electrical solutions (both on-chip [5] and off-chip) may not be able to satisfy these increasing bandwidth requirements, as they would be limited by bandwidth density and power dissipation. It is therefore necessary to explore alternative interconnect technologies that offer high energy efficiency and bandwidth density. Silicon photonic interconnects could be a viable solution to this bandwidth problem because of their energy-efficient modulation and detection and the high bandwidth density enabled by dense wavelength-division multiplexing (DWDM). However, these photonic systems need to be jointly designed at the device, circuit, and system levels to maximize the potential benefits of silicon photonics technology.
Monolithic integration of photonic devices in CMOS technology
Most of the methodologies used so far for designing photonic devices rely on specialized processes and materials, such as SiN [1] or silicon-on-insulator (SoI) with thick buried oxide [4], which are not compatible with existing commercial manufacturing flows for processors. To minimize the modulation/detection energy costs and area while providing high bandwidth density, and to ensure a good return on the massive investments made to set up standardized process flows, the photonic devices need to be monolithically integrated. There have been some efforts to monolithically integrate photonic devices using the commercial bulk CMOS process [6, 9] and the SoI process with thin buried oxide [3]. We adopt a top-down approach and attempt to project the photonic device specifications needed to design opto-electrical networks for manycore systems using commercial process flows.
Photonic network design
We adopt a power-constrained approach and investigate the network design space for a 64-tile system operating at 2.5 GHz and designed in 22 nm technology. Each tile can have one or more cores. Figure 1 shows a u-shaped layout of the waveguides for implementing an example global crossbar network topology for off-chip communication in the 64-tile system. The physical layout drives the photonic device design specifications, such as optical losses and thermal tuning sensitivity. For example, in the global crossbar network, where waveguide loss and through loss are dominant, for a total cores-to-memory bandwidth of 80 Tbps and an optical laser power budget of 4 W, the u-shaped layout can tolerate a waveguide loss of 2 dB/cm (see Figure 2) if we can achieve a through loss of 0.0001 dB/ring [12]. Similarly, a through loss of more than 0.05 dB/ring can be tolerated if we could reduce the waveguide loss to close to 0 dB/cm. If we scale the cores-to-memory bandwidth requirement from 40 Tbps to 80 Tbps to 160 Tbps, the waveguide loss requirement tightens from 2.8 dB/cm to 2 dB/cm to 1.4 dB/cm. The waveguide loss and through loss specifications can similarly be projected for other physical layouts, such as ring matrix (1 dB/cm waveguide loss, 0.01 dB/ring through loss) and serpentine (0.8 dB/cm waveguide loss, 0.05 dB/ring through loss). For a desired bandwidth, a change in photonic device losses affects the total area of the photonic devices. Figure 3 shows the contour plot of the percent area of the photonic devices for the 64-tile system with 80 Tbps cores-to-memory bandwidth. The physical layout also imposes a limit on the thermal sensitivities of the photonic devices. For the same 64-tile example with 80 Tbps cores-to-memory bandwidth, the u-shaped layout spends 0.32 W of power on thermal tuning, assuming a very optimistic thermal tuning circuit cost of 20 µW/ring.
The ring matrix and serpentine layouts require 0.64 W and 0.32 W of thermal tuning power, respectively, indicating the need for rings with lower thermal sensitivity in the ring matrix layout. The thermal tuning cost also scales with bandwidth. These optical loss and thermal sensitivity specifications can similarly be determined for other logical topologies and on-chip networks. Thus, while designing silicon photonic communication networks (on-chip and off-chip), we need to be cognizant of the tight interaction between the choice of network topology, physical layout, circuit design, and device design, and should jointly optimize across the various design levels.
Figure 1: Global crossbar network design for a 64-tile system implemented using a u-shaped physical network.
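As a rough illustration of the arithmetic behind these projections, the Python sketch below does two things. It computes the ring count implied by a thermal tuning budget at the quoted 20 µW/ring, and it evaluates a simple first-order loss budget in which the laser power is split evenly across wavelengths and only waveguide loss and ring through loss are counted. This is not the model used to produce the figures quoted above. The function names, the per-wavelength data rate, the receiver sensitivity, the number of rings passed, and the waveguide length are all hypothetical placeholders; only the 4 W laser budget, the 20 µW/ring tuning cost, the tuning powers, and the 0.0001 dB/ring through loss come from the text.

```python
import math


def implied_ring_count(tuning_power_uw: int, cost_per_ring_uw: int = 20) -> int:
    """Rings implied by a thermal tuning budget, assuming every ring costs
    the same fixed tuning power (20 uW/ring is the value quoted in the text).
    Integer microwatts are used to avoid floating-point rounding."""
    return tuning_power_uw // cost_per_ring_uw


def loss_budget_db(laser_power_w: float, n_wavelengths: int,
                   rx_sensitivity_w: float) -> float:
    """Optical loss (dB) each wavelength may suffer, assuming the laser
    power is divided evenly among all wavelengths (an assumption)."""
    per_wavelength_w = laser_power_w / n_wavelengths
    return 10.0 * math.log10(per_wavelength_w / rx_sensitivity_w)


def max_waveguide_loss_db_per_cm(budget_db: float,
                                 through_loss_db_per_ring: float,
                                 rings_passed: int,
                                 length_cm: float) -> float:
    """Waveguide loss (dB/cm) that remains tolerable after the ring
    through loss along the path has been subtracted from the budget."""
    remaining_db = budget_db - through_loss_db_per_ring * rings_passed
    return max(remaining_db, 0.0) / length_cm


if __name__ == "__main__":
    # Thermal tuning powers quoted in the text, expressed in microwatts.
    for layout, power_uw in [("u-shaped", 320_000),
                             ("ring matrix", 640_000),
                             ("serpentine", 320_000)]:
        print(f"{layout}: {implied_ring_count(power_uw)} rings implied")

    # Hypothetical parameters, NOT taken from the text.
    GBPS_PER_WAVELENGTH = 10   # assumed per-wavelength data rate
    RX_SENSITIVITY_W = 10e-6   # assumed receiver sensitivity
    RINGS_PASSED = 1000        # assumed rings on the worst-case path
    LENGTH_CM = 4.0            # assumed worst-case waveguide length

    for tbps in (40, 80, 160):
        n_wavelengths = tbps * 1000 // GBPS_PER_WAVELENGTH
        budget = loss_budget_db(4.0, n_wavelengths, RX_SENSITIVITY_W)
        wg = max_waveguide_loss_db_per_cm(budget, 0.0001,
                                          RINGS_PASSED, LENGTH_CM)
        print(f"{tbps} Tbps: budget {budget:.2f} dB, "
              f"waveguide loss <= {wg:.2f} dB/cm")
```

At 20 µW/ring, the quoted 0.32 W and 0.64 W budgets correspond to 16,000 and 32,000 rings. In the loss model, doubling the number of wavelengths at a fixed laser budget cuts the per-wavelength loss budget by about 3 dB, which is the qualitative reason the tolerable waveguide loss tightens as bandwidth grows. The absolute values printed by this sketch depend entirely on the assumed parameters and should not be compared with the projections in the text.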
