11 research outputs found

    Towards Compelling Cases for the Viability of Silicon-Nanophotonic Technology in Future Many-core Systems

    Get PDF
    Many crossbenchmarking results reported in the open literature raise optimistic expectations on the use of optical networks-on-chip (ONoCs) for high-performance and low-power on-chip communications in future Manycore Systems. However, these works ultimately fail to make a compelling case for the viability of silicon-nanophotonic technology for two fundamental reasons: (1)Lack of aggressive electrical baselines (ENoCs). (2) Inaccuracy in physical- and architecture-layer analysis of the ONoC. This thesis aims at providing the guidelines and minimum requirements so that nanophotonic emerging technology may become of practical relevance. The key enabler for this study is a cross-layer design methodology of the optical transport medium, ranging from the consideration of the predictability gap between ONoC logic schemes and their physical implementations, up to architecture-level design issues such as the network interface and its co-design requirements with the memory hierarchy. In order to increase the practical relevance of the study, we consider a consolidated electrical NoC counterpart with an optimized architecture from a performance and power viewpoint. The quality metrics of this latter are derived from synthesis and place&route on an industrial 40nm low-power technology library. Building on this methodology, we are able to provide a realistic energy efficiency comparison between ONoC and ENoC both at the level of the system interconnect and of the system as a whole, pointing out the sensitivity of the results to the maturity of the underlying silicon nanophotonic technology, and at the same time paving the way towards compelling cases for the viability of such technology in next generation many-cores systems

    Thermal Aware Design Method for VCSEL-Based On-Chip Optical Interconnect

    Full text link
    Optical Network-on-Chip (ONoC) is an emerging technology considered as one of the key solutions for future generation on-chip interconnects. However, silicon photonic devices in ONoC are highly sensitive to temperature variation, which leads to a lower efficiency of Vertical-Cavity Surface-Emitting Lasers (VCSELs), a resonant wavelength shift of Microring Resonators (MR), and results in a lower Signal to Noise Ratio (SNR). In this paper, we propose a methodology enabling thermal-aware design for optical interconnects relying on CMOS-compatible VCSEL. Thermal simulations allow designing ONoC interfaces with low gradient temperature and analytical models allow evaluating the SNR.Comment: IEEE International Conference on Design Automation and Test in Europe (DATE 2015), Mar 2015, Grenoble, France. 201

    Layout aware router design and optimization for Wavelength-Routed Optical NoCs

    Get PDF
    Optical Networks-on-Chip are a promising solution for high-performance multi-core integration with better latency and bandwidth than traditional Electrical NoCs. Wavelength-routed ONoCs offer yet additional performance guarantees. However, WRONoC design presents new EDA challenges which have not yet been fully addressed. So far, most topology analysis is abstract, i.e., overlooks layout concerns, while for layout the tools available perform P&R but no topology optimization. Thus, a need arises for a novel optimization method combining both aspects of WRONoC design. In this thesis such a method is proposed and compared to the state of the art design procedure. Results available so far show a remarkable 50% reduction in maximum insertion loss with this new approach

    Security of Electrical, Optical and Wireless On-Chip Interconnects: A Survey

    Full text link
    The advancement of manufacturing technologies has enabled the integration of more intellectual property (IP) cores on the same system-on-chip (SoC). Scalable and high throughput on-chip communication architecture has become a vital component in today's SoCs. Diverse technologies such as electrical, wireless, optical, and hybrid are available for on-chip communication with different architectures supporting them. Security of the on-chip communication is crucial because exploiting any vulnerability would be a goldmine for an attacker. In this survey, we provide a comprehensive review of threat models, attacks, and countermeasures over diverse on-chip communication technologies as well as sophisticated architectures.Comment: 41 pages, 24 figures, 4 table

    High-Performance and Wavelength-Reused Optical Network on Chip (ONoC) Architectures and Communication Schemes for Manycore Processor

    Get PDF
    Optical Network on Chip (ONoC) is an emerging chip-scale optical interconnection technology to realize the high-performance and power-efficient inter-core communication for many-core processors. By utilizing the silicon photonic interconnects to transmit data packets with optical signals, it can achieve ultra low communication delay, high bandwidth capacity, and low power dissipation. With the benefits of Wavelength Division Multiplexing (WDM), multiple optical signals can simultaneously be transmitted in the same optical interconnect through different wavelengths. Thus, the WDM-based ONoC is becoming a hot research topic recently. However, the maximal number of available wavelengths is restricted for the reliable and power-efficient optical communication in ONoC. Hence, with a limited number of wavelengths, the design of high-performance and power-efficient ONoC architecture is an important and challenging problem. In this thesis, the design methodology of wavelength-reused ONoC architecture is explored. With the wavelength reuse scheme in optical routing paths, high-performance and power-efficient communication is realized for many-core processors only using a small number of available wavelengths. Three wavelength-reused ONoC architectures and communication schemes are proposed to fulfil different communication requirements, i.e., network scalability, multicast communication, and dark silicon. Firstly, WRH-ONoC, a wavelength-reused hierarchical Optical Network on Chip architecture, is proposed to achieve high network scalability, namely obtaining low communication delay and high throughput capacity for hundreds of thousands of cores by reusing the limited number of available wavelengths with the modest hardware cost and energy overhead. WRH-ONoC combines the advantages of non-blocking communication in each lambda-router and wavelength reuse in all lambda-routers through the hierarchical networking. Both theoretical analysis and simulation results indicate that WRH-ONoC can achieve prominent improvement on the communication performance and scalability (e.g., 46.0% of reduction on the zero-load packet delay and 72.7% of improvement on the network throughput for 400 cores with small hardware cost and energy overhead) in comparison with existing schemes. Secondly, DWRMR, a dynamical wavelength-reused multicast scheme based on the optical multicast ring, is proposed for widely existing multicast communications in many-core processors. In DWRMR, an optical multicast ring is dynamically constructed for each multicast group and the multicast packets are transmitted in a single-send-multi-receive manner requiring only one wavelength. All the cores in the same multicast group can reuse the established multicast ring through an optical token arbitration scheme for the interactive multicast communications, thereby avoiding the frequent construction of multicast routing paths dedicatedly for each core. Simulation results indicate that DWRMR can reduce more than 50% of end-to-end packet delay with slight hardware cost, or require only half number of wavelengths to achieve the same performance compared with existing schemes. Thirdly, Dark-ONoC, a dynamically configurable ONoC architecture, is proposed for the many-core processor with dark silicon. Dark silicon is an inevitable phenomenon that only a small number of cores can be activated simultaneously while the other cores must stay in dark state (power-gated) due to the restricted power budget. Dark-ONoC periodically allocates non-blocking optical routing paths only between the active cores with as less wavelengths as possible. Thus, it can obtain high-performance communication and low power consumption at the same time. Extensive simulations are conducted with the dark silicon patterns from both synthetic distribution and real data traces. The simulation results indicate that the number of wavelengths is reduced by around 15% and the overall power consumption is reduced by 23.4% compared to existing schemes. Finally, this thesis concludes several important principles on the design of wavelength-reused ONoC architecture, and summarizes some perspective issues for the future research

    Design Space Exploration and Resource Management of Multi/Many-Core Systems

    Get PDF
    The increasing demand of processing a higher number of applications and related data on computing platforms has resulted in reliance on multi-/many-core chips as they facilitate parallel processing. However, there is a desire for these platforms to be energy-efficient and reliable, and they need to perform secure computations for the interest of the whole community. This book provides perspectives on the aforementioned aspects from leading researchers in terms of state-of-the-art contributions and upcoming trends

    Monolithic electronic-photonic integration in state-of-the-art CMOS processes

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Cataloged from student submitted PDF version of thesis.Includes bibliographical references (p. 388-407).As silicon CMOS transistors have scaled, increasing the density and energy efficiency of computation on a single chip, the off-chip communication link to memory has emerged as the major bottleneck within modern processors. Photonic devices promise to break this bottleneck with superior bandwidth-density and energy-efficiency. Initial work by many research groups to adapt photonic device designs to a silicon-based material platform demonstrated suitable independent performance for such links. However, electronic-photonic integration attempts to date have been limited by the high cost and complexity associated with modifying CMOS platforms suitable for modern high-performance computing applications. In this work, we instead utilize existing state-of-the-art electronic CMOS processes to fabricate integrated photonics by: modifying designs to match the existing process; preparing a design-rule compliant layout within industry-standard CAD tools; and locally-removing the handle silicon substrate in the photonic region through post-processing. This effort has resulted in the fabrication of seven test chips from two major foundries in 28, 45, 65 and 90 nm CMOS processes. Of these efforts, a single die fabricated through a widely available 45nm SOI-CMOS mask-share foundry with integrated waveguides with 3.7 dB/cm propagation loss alongside unmodified electronics with less than 5 ps inverter stage delay serves as a proof-of-concept for this approach. Demonstrated photonic devices include high-extinction carrier-injection modulators, 8-channel wavelength division multiplexing filter banks and low-efficiency silicon germanium photodetectors. Simultaneous electronic-photonic functionality is verified by recording a 600 Mb/s eye diagram from a resonant modulator driven by integrated digital circuits. Initial work towards photonic device integration within the peripheral CMOS flow of a memory process that has resulted in polysilicon waveguide propagation losses of 6.4 dB/cm will also be presented.by Jason S. Orcutt.Ph.D

    On the simulation and design of manycore CMPs

    Get PDF
    The progression of Moore’s Law has resulted in both embedded and performance computing systems which use an ever increasing number of processing cores integrated in a single chip. Commercial systems are now available which provide hundreds of cores, and academics have proposed architectures for up to 1024 cores. Embedded multicores are increasingly popular as it is easier to guarantee hard-realtime constraints using individual cores dedicated for tasks, than to use traditional time-multiplexed processing. However, finding the optimal hardware configuration to meet these requirements at minimum cost requires extensive trial and error approaches to investigate the design space. This thesis tackles the problems encountered in the design of these large scale multicore systems by first addressing the problem of fast, detailed micro-architectural simulation. Initially addressing embedded systems, this work exploits the lack of hardware cache-coherence support in many deeply embedded systems to increase the available parallelism in the simulation. Then, through partitioning the NoC and using packet counting and cycle skipping reduces the amount of computation required to accurately model the NoC interconnect. In combination, this enables simulation speeds significantly higher than the state of the art, while maintaining less error, when compared to real hardware, than any similar simulator. Simulation speeds reach up to 370MIPS (Million (target) Instructions Per Second), or 110MHz, which is better than typical FPGA prototypes, and approaching final ASIC production speeds. This is achieved while maintaining an error of only 2.1%, significantly lower than other similar simulators. The thesis continues by scaling the simulator past large embedded systems up to 64-1024 core processors, adding support for coherent architectures using the same packet counting techniques along with low overhead context switching to enable the simulation of such large systems with stricter synchronisation requirements. The new interconnect model was partitioned to enable parallel simulation to further improve simulation speeds in a manner which did not sacrifice any accuracy. These innovations were leveraged to investigate significant novel energy saving optimisations to the coherency protocol, processor ISA, and processor micro-architecture. By introducing a new instruction, with the name wait-on-address, the energy spent during spin-wait style synchronisation events can be significantly reduced. This functions by putting the core into a low-power idle state while the cache line of the indicated address is monitored for coherency action. Upon an update or invalidation (or traditional timer or external interrupts) the core will resume execution, but the active energy of running the core pipeline and repeatedly accessing the data and instruction caches is effectively reduced to static idle power. The thesis also shows that existing combined software-hardware schemes to track data regions which do not require coherency can adequately address the directory-associativity problem, and introduces a new coherency sharer encoding which reduces the energy consumed by sharer invalidations when sharers are grouped closely together, such as would be the case with a system running many tasks with a small degree of parallelism in each. The research concludes by using the extremely fast simulation speeds developed to produce a large set of training data, collecting various runtime and energy statistics for a wide range of embedded applications on a huge diverse range of potential MPSoC designs. This data was used to train a series of machine learning based models which were then evaluated on their capacity to predict performance characteristics of unseen workload combinations across the explored MPSoC design space, using only two sample simulations, with promising results from some of the machine learning techniques. The models were then used to produce a ranking of predicted performance across the design space, and on average Random Forest was able to predict the best design within 89% of the runtime performance of the actual best tested design, and better than 93% of the alternative design space. When predicting for a weighted metric of energy, delay and area, Random Forest on average produced results within 93% of the optimum result. In summary this thesis improves upon the state of the art for cycle accurate multicore simulation, introduces novel energy saving changes the the ISA and microarchitecture of future multicore processors, and demonstrates the viability of machine learning techniques to significantly accelerate the design space exploration required to bring a new manycore design to market

    Contrasting wavelength-routed optical NoC topologies for power-efficient 3D-stacked multicore processors using physical-layer analysis

    No full text
    Optical networks-on-chip (ONoCs) are currently still in the concept stage, and would benefit from explorative studies capable of bridging the gap between abstract analysis frameworks and the constraints and challenges posed by the physical layer. This paper aims to go beyond the traditional comparison of wavelength-routed ONoC topologies based only on their abstract properties, and for the first time assesses their physical implementation efficiency in an homogeneous experimental setting of practical relevance. As a result, the paper can demonstrate the significant and different deviation of topology layouts from their logic schemes under the effect of placement constraints on the target system. This becomes then the preliminary step for the accurate characterization of technology-specific metrics such as the insertion loss critical path, and to derive the ultimate impact on power efficiency and feasibility of each design
    corecore