Photonic Neural Networks and Optics-informed Deep Learning Fundamentals
The recent explosive compute growth, mainly fueled by the boost of AI and
DNNs, is currently instigating the demand for a novel computing paradigm that
can overcome the insurmountable barriers imposed by conventional electronic
computing architectures. PNNs implemented on silicon integration platforms
stand out as a promising candidate to endow NN hardware, offering the potential
for energy-efficient and ultra-fast computations through the utilization of the
unique primitives of photonics, i.e., energy efficiency, THz bandwidth and
low latency. Thus far, several demonstrations have revealed the huge potential
of PNNs in performing both linear and non-linear NN operations at unparalleled
speed and energy consumption metrics. Transforming this potential into a
tangible reality for DL applications requires, however, a deep understanding of
the basic PNN principles, requirements and challenges across all constituent
architectural, technological and training aspects. In this tutorial, we first
review the principles of DNNs along with their fundamental building
blocks, also analyzing the key mathematical operations needed for their
computation in photonic hardware. Then, we investigate, through an intuitive
mathematical analysis, the interdependence of bit precision and energy
efficiency in analog photonic circuitry, discussing the opportunities and
challenges of PNNs. Next, a performance overview of PNN architectures,
weight technologies and activation functions is presented, summarizing their
impact on speed, scalability and power consumption. Finally, we provide a
holistic overview of the optics-informed NN training framework that
incorporates the physical properties of photonic building blocks into the
training process in order to improve the NN classification accuracy and
effectively elevate neuromorphic photonic hardware into high-performance DL
computational settings.
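The interdependence of bit precision and energy mentioned in the abstract above can be illustrated with a back-of-the-envelope model. This is a minimal sketch and not necessarily the analysis used in the tutorial: it assumes the textbook ideal-quantizer relation (SNR of about 6.02 dB per bit plus 1.76 dB) and, as a further assumption, shot-noise-limited detection in which SNR grows linearly with received optical power. The function names are hypothetical.

```python
import math


def required_snr_db(bits: int) -> float:
    """SNR of an ideal b-bit quantizer for a full-scale sine: ~6.02*b + 1.76 dB."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


def relative_optical_power(bits: int, ref_bits: int) -> float:
    """Optical power needed for `bits` relative to `ref_bits`.

    ASSUMPTION: shot-noise-limited detection, so SNR is proportional to
    received optical power. Other noise regimes scale differently.
    """
    delta_db = required_snr_db(bits) - required_snr_db(ref_bits)
    return 10 ** (delta_db / 10)


if __name__ == "__main__":
    for b in (2, 4, 6, 8):
        snr = required_snr_db(b)
        power = relative_optical_power(b, ref_bits=2)
        print(f"{b} bits: SNR {snr:.2f} dB, power x{power:.0f} vs 2 bits")
```

Under these assumptions each additional bit of precision costs roughly four times the optical power, which is why analog precision and energy efficiency have to be traded off against each other.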
Compiler transformations in hardware synthesis of Mpeg2 codes
High-level synthesis is the technique that translates high-level programming language programs into equivalent hardware descriptions. The use of conventional programming languages as input to high-level synthesis is challenging, due to the conceptual differences between software programs and hardware descriptions, but is nonetheless becoming the preferred input to high-level synthesis tools. Compilers play an important role in this process, since they can not only bridge such differences, thus making high-level synthesis tools better accepted by the scientific community, but they can also apply code transformations that target an optimized hardware output. In this paper, we discuss a number of transformations that can be implemented in the C language front end of the CCC high-level synthesis tool. We present experiments of such transformations conducted on the MPEG2 open-source code, which prove that compiler optimizations can have a significant positive impact in high-level synthesis tools. © 2016 IEEE
Source-level compiler optimizations for high-level synthesis
With high-level synthesis becoming the preferred method for hardware design, tools that operate on high-level programming languages and optimize hardware output are crucial for successful synthesis. In high-level synthesis, programs written in conventional programming languages describe hardware behavior and are translated into RTL descriptions by an appropriate tool. Programming language compilers are a common example of tools that not only translate but also optimize code. Compilers can make the transition from software to hardware smooth, allowing programmers to apply their software skills to hardware design without any language compromises. In addition, compilers apply optimization techniques to obtain a better output hardware description. In this paper, we discuss compiler issues for high-level synthesis and present the results of several compiler transformations that can be implemented in the C language compiler front end of the CCC high-level synthesis tool. The results are taken from experiments conducted on the MPEG2 open-source code and demonstrate the importance of such transformations in high-level synthesis.
End-to-end optical packet switching with burst-mode reception at 25 Gb/s through a 1024-port 25.6 Tb/s capacity Hipoλaos Optical Packet Switch
We demonstrate end-to-end 25 Gb/s true optical packet switching featuring burst-mode reception with <50 ns locking time through a 1024-port 25.6 Tb/s capacity Hipoλaos Optical Packet Switch architecture. Error-free performance at 10^-9 was obtained for all validated port combinations.
Performance analysis of a 1024-port Hipoλaos OPS in DCN, HPC, and 5G fronthauling Ethernet applications
The explosive traffic growth of emerging cloud, augmented/virtual reality, artificial intelligence, and 5G applications along with the inherent need for high-bandwidth transport of big data has fueled the expansion of the Ethernet switch market for data centers (DCs), high-performance computers (HPCs), and 5G fronthaul networks. As such, next-generation switches have to be capable of conforming to the requirements of a versatile traffic environment, expanding along DC, HPC, and 5G fronthauling infrastructures while meeting the performance needs of these different application sectors. Within this frame, we experimentally validate for the first time, to the best of our knowledge, the performance of the recently reported Hipoλaos optical packet switch (OPS) within Ethernet-based traffic exchange in DC and 5G fronthaul testbeds, highlighting the credentials of the 1024-port and 10.24 Tb/s capacity OPS to successfully support real-world DC and 5G applications. Error-free unicast and dual-output multicast Ethernet packet transmission at 10 Gb/s are successfully validated for different output ports of the switch, followed by successful server-to-server high-definition video transmission, both when the OPS was employed in the DC as well as in the 5G fronthauling testbed. Network performance and stability over time were confirmed through several measurements carried out through the iperf application suite, revealing submicrosecond end-to-end (Layer 7) latency performance. Finally, an OMNeT++ simulation analysis for an Ethernet-switched Hipoλaos network utilizing real-world application traces collected from the MareNostrum HPC system revealed up to 89% lower latency performance compared to the actual system.
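The capacity figures quoted for the 1024-port Hipoλaos switch in the two entries above are consistent with a simple product of port count and per-port line rate. The sketch below only checks that arithmetic; the function name is hypothetical and the model assumes every port runs at the same line rate.

```python
def aggregate_capacity_tbps(ports: int, line_rate_gbps: float) -> float:
    """Aggregate switch capacity in Tb/s, assuming all ports share one line rate."""
    return ports * line_rate_gbps / 1000.0


if __name__ == "__main__":
    # 1024 ports at 25 Gb/s (burst-mode reception demonstration)
    print(aggregate_capacity_tbps(1024, 25))
    # 1024 ports at 10 Gb/s (Ethernet validation in DC and 5G testbeds)
    print(aggregate_capacity_tbps(1024, 10))
```

The two calls give 25.6 Tb/s and 10.24 Tb/s, matching the capacities stated in the respective abstracts.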
Noise-resilient and high-speed deep learning with coherent silicon photonics
The explosive growth of deep learning applications has triggered a new era in computing hardware, targeting the efficient deployment of multiply-and-accumulate operations. In this realm, integrated photonics has come to the foreground as a promising energy-efficient deep learning technology platform for enabling ultra-high compute rates. However, although integrated photonic neural network layouts have already successfully entered the deep learning era, their compute rate and noise-related characteristics still fall far short of their promise as high-speed photonic engines. Herein, we experimentally demonstrate a noise-resilient deep learning coherent photonic neural network layout that operates at 10GMAC/sec/axon compute rates and follows a noise-resilient training model. The coherent photonic neural network has been fabricated as a silicon photonic chip and its MNIST classification performance was experimentally evaluated to support accuracy values of >99% and >98% at 5 and 10GMAC/sec/axon, respectively, offering 6× higher on-chip compute rates and >7% accuracy improvement over state-of-the-art coherent implementations.
Lossless silicon photonic ROADM based on a Si3N4 platform and a monolithically integrated Erbium Doped Amplifier
The first demonstration of a lossless four-port silicon photonic ROADM node based on a monolithically integrated spiral Al2O3:Er3+ Erbium Doped Waveguide Amplifier and MZI-interleaver layout on a Si3N4 platform is presented, routing a 4×50Gb/s WDM data-traffic capacity.
Silicon integrated photonic-electronic neuron for noise-resilient deep learning
This paper presents an experimental demonstration of the photonic segment of a photonic-electronic multiply-accumulate neuron (PEMAN) architecture, employing a silicon photonic chip with high-speed electro-absorption modulators for matrix-vector multiplications. The photonic integrated circuit has been evaluated through a noise-sensitive three-layer neural network (NN) with 1350 trainable parameters targeting heartbeat sound classification for health monitoring purposes. Its experimental validation revealed F1-scores of 85.9% and 81% at compute rates of 10 and 20 Gbaud, respectively, exploiting quantization- and noise-aware deep learning techniques and introducing a novel activation function slope stretching strategy for mitigating noise impairments. The enhanced noise-resilient properties of this novel training model are confirmed via simulations for varying noise levels, being in excellent agreement with the respective experimental data obtained at 10, 20, and 30 Gbaud symbol rates.
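As a rough illustration of what quantization- and noise-aware evaluation of a single neuron can look like, the sketch below quantizes the weights to a fixed number of bits and adds Gaussian noise to the accumulated sum before the activation. It is a generic sketch under stated assumptions, not the authors' training model: the uniform quantizer, the additive Gaussian noise model, the tanh activation, and all function and parameter names are hypothetical, and the slope stretching strategy from the abstract is not reproduced.

```python
import math
import random


def quantize(value: float, bits: int, lo: float = -1.0, hi: float = 1.0) -> float:
    """Clip `value` to [lo, hi] and snap it to one of 2**bits uniform levels."""
    steps = 2 ** bits - 1
    clipped = min(max(value, lo), hi)
    return lo + round((clipped - lo) / (hi - lo) * steps) * (hi - lo) / steps


def noisy_neuron(inputs, weights, bits=4, noise_std=0.05, rng=random):
    """Multiply-accumulate with quantized weights plus additive Gaussian noise.

    ASSUMPTION: hardware noise is modeled as zero-mean Gaussian noise on the
    accumulated sum; real photonic noise sources may behave differently.
    """
    acc = sum(quantize(w, bits) * x for w, x in zip(weights, inputs))
    acc += rng.gauss(0.0, noise_std)
    return math.tanh(acc)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs are comparable
    x = [0.2, -0.5, 0.9]
    w = [0.33, -0.71, 0.08]
    print(noisy_neuron(x, w, bits=4, noise_std=0.05, rng=rng))
```

Exposing a network to this kind of perturbation during training is one common way to make it tolerate the same perturbation at inference time on analog hardware.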