
    Experimentally realized in situ backpropagation for deep learning in nanophotonic neural networks

    Neural networks are widely deployed models across many scientific disciplines and commercial endeavors ranging from edge computing and sensing to large-scale signal processing in data centers. The most efficient and well-entrenched method to train such networks is backpropagation, or reverse-mode automatic differentiation. To counter an exponentially increasing energy budget in the artificial intelligence sector, there has been recent interest in analog implementations of neural networks, specifically nanophotonic neural networks for which no analog backpropagation demonstration exists. We design mass-manufacturable silicon photonic neural networks that alternately cascade our custom-designed "photonic mesh" accelerator with digitally implemented nonlinearities. These reconfigurable photonic meshes program computationally intensive arbitrary matrix multiplication by setting physical voltages that tune the interference of optically encoded input data propagating through integrated Mach-Zehnder interferometer networks. Here, using our packaged photonic chip, we demonstrate in situ backpropagation for the first time to solve classification tasks and evaluate a new protocol to keep the entire gradient measurement and update of physical device voltages in the analog domain, improving on past theoretical proposals. Our method is made possible by introducing three changes to typical photonic meshes: (1) measurements at optical "grating tap" monitors, (2) bidirectional optical signal propagation automated by a fiber switch, and (3) universal generation and readout of optical amplitude and phase. After training, our classification achieves accuracies similar to digital equivalents even in the presence of systematic error. Our findings suggest a new training paradigm for photonics-accelerated artificial intelligence based entirely on a physical analog of the popular backpropagation technique. Comment: 23 pages, 10 figures
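
    As a rough illustration of the architecture described above, the sketch below simulates a small mesh of Mach-Zehnder interferometers as a cascade of 2x2 unitaries and estimates the loss gradient with respect to each phase by finite differences. This is a purely digital stand-in: the MZI parametrization, mesh topology, and loss are assumed for illustration, and the finite-difference step takes the place of the analog interference measurement the paper performs on-chip.

```python
import numpy as np

def mzi(theta, phi):
    """2x2 transfer matrix of one Mach-Zehnder interferometer with
    internal phase theta and external phase phi (one common
    convention; a real chip's calibration will differ)."""
    return 1j * np.exp(1j * theta / 2) * np.array([
        [np.exp(1j * phi) * np.sin(theta / 2), np.cos(theta / 2)],
        [np.exp(1j * phi) * np.cos(theta / 2), -np.sin(theta / 2)],
    ])

def mesh_forward(x, phases, pairs):
    """Propagate an optical field x through a mesh of MZIs.
    pairs[k] gives the two waveguide indices coupled by MZI k."""
    x = x.astype(complex).copy()
    for (theta, phi), (i, j) in zip(phases, pairs):
        x[[i, j]] = mzi(theta, phi) @ x[[i, j]]
    return x

# Tiny 4-mode mesh; the gradient of a quadratic loss w.r.t. each
# phase is estimated by finite differences, standing in for the
# analog gradient measurement done on the physical device.
rng = np.random.default_rng(0)
pairs = [(0, 1), (2, 3), (1, 2), (0, 1), (2, 3)]
phases = rng.uniform(0, 2 * np.pi, size=(len(pairs), 2))
x = rng.normal(size=4)
target = rng.normal(size=4)

def loss(p):
    y = mesh_forward(x, p, pairs)
    return 0.5 * np.sum(np.abs(y - target) ** 2)

eps = 1e-6
grad = np.zeros_like(phases)
for k in range(len(pairs)):
    for m in range(2):
        dp = phases.copy(); dp[k, m] += eps
        dm = phases.copy(); dm[k, m] -= eps
        grad[k, m] = (loss(dp) - loss(dm)) / (2 * eps)
print(grad)
```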

    Dual adaptive training of photonic neural networks

    The photonic neural network (PNN) is a remarkable analog artificial intelligence (AI) accelerator that computes with photons instead of electrons, featuring low latency, high energy efficiency, and high parallelism. However, existing training approaches cannot address the extensive accumulation of systematic errors in large-scale PNNs, resulting in a significant decrease in model performance in physical systems. Here, we propose dual adaptive training (DAT), which allows the PNN model to adapt to substantial systematic errors and preserves its performance during deployment. By introducing systematic error prediction networks with task-similarity joint optimization, DAT achieves high-similarity mapping between the PNN numerical models and physical systems and highly accurate gradient calculations during the dual backpropagation training. We validated the effectiveness of DAT using diffractive PNNs and interference-based PNNs on image classification tasks. DAT successfully trained large-scale PNNs under major systematic errors and preserved model classification accuracies comparable to those of error-free systems. The results further demonstrated its superior performance over state-of-the-art in situ training approaches. DAT provides critical support for constructing large-scale PNNs with advanced architectures and can be generalized to other types of AI systems with analog computing errors. Comment: 31 pages, 11 figures
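
    A heavily simplified sketch of the adapt-then-train idea: a "physical" linear layer is modeled as the programmed weights plus an unknown additive systematic error, an error estimate is fit to measured outputs, and the task weights are then updated through the adapted model. The additive error model, layer sizes, and learning rates are assumptions made for this sketch; the paper's systematic error prediction networks and task-similarity joint optimization are far more general.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_out, n_samp = 6, 4, 64

# "Physical" layer: the programmed weights W plus an unknown
# systematic error E_true (hypothetical additive error model).
E_true = 0.1 * rng.normal(size=(n_out, n_in))
def physical_forward(W, X):
    return (W + E_true) @ X

X = rng.normal(size=(n_in, n_samp))
T = rng.normal(size=(n_out, n_samp))        # regression targets

W = 0.1 * rng.normal(size=(n_out, n_in))    # task weights
E_hat = np.zeros_like(W)                    # learned error model
lr_err, lr_task = 0.01, 0.01

for step in range(200):
    Y_phys = physical_forward(W, X)         # measure the physical system
    # (1) Adapt the numerical model: fit E_hat so that (W + E_hat) X
    #     matches the measured physical output.
    Y_model = (W + E_hat) @ X
    E_hat -= lr_err * (Y_model - Y_phys) @ X.T / n_samp
    # (2) Train the task through the adapted model, then redeploy W.
    Y_model = (W + E_hat) @ X
    W -= lr_task * (Y_model - T) @ X.T / n_samp

print("task error on physical system:",
      np.mean((physical_forward(W, X) - T) ** 2))
```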

    BPLight-CNN: A Photonics-based Backpropagation Accelerator for Deep Learning

    Training deep learning networks involves continuous weight updates across the various layers of the deep network using a backpropagation (BP) algorithm. This results in expensive computation overheads during training. Consequently, most deep learning accelerators today employ pre-trained weights and focus only on improving the design of the inference phase. The recent trend is to build a complete deep learning accelerator by incorporating the training module. Such efforts require an ultra-fast chip architecture for executing the BP algorithm. In this article, we propose a novel photonics-based backpropagation accelerator for high-performance deep learning training. We present the design of a convolutional neural network, BPLight-CNN, which incorporates the silicon photonics-based backpropagation accelerator. BPLight-CNN is a first-of-its-kind photonic and memristor-based CNN architecture for end-to-end training and prediction. We evaluate BPLight-CNN using a photonic CAD framework (IPKISS) on deep learning benchmark models including LeNet and VGG-Net. Compared to state-of-the-art designs, the proposed design achieves (i) at least 34x speedup, 34x improvement in computational efficiency, and 38.5x energy savings during training, and (ii) 29x speedup, 31x improvement in computational efficiency, and 38.7x energy savings during inference. All of these comparisons are made at 16-bit resolution, and BPLight-CNN achieves these improvements at a cost of approximately 6% lower accuracy than state-of-the-art designs.
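
    To illustrate the 16-bit setting mentioned above, here is a toy training loop in which the forward pass, the backpropagated gradient, and the weight update are all rounded to a fixed-point grid. The Q8.8 split, layer shape, and learning rate are assumptions made for the sketch; the paper states only a 16-bit resolution, not a particular number format.

```python
import numpy as np

def quantize(x, frac_bits=8, total_bits=16):
    """Round to a signed fixed-point grid (assumed Q8.8 split)."""
    scale = 2.0 ** frac_bits
    lo = -2.0 ** (total_bits - 1) / scale
    hi = (2.0 ** (total_bits - 1) - 1) / scale
    return np.clip(np.round(x * scale) / scale, lo, hi)

rng = np.random.default_rng(4)
W = quantize(rng.normal(0, 0.1, (4, 4)))
X = rng.normal(size=(4, 16))
T = rng.normal(size=(4, 16))

for _ in range(100):
    Y = quantize(W @ X)                        # forward pass at 16-bit
    G = quantize((Y - T) @ X.T / X.shape[1])   # BP gradient at 16-bit
    W = quantize(W - 0.05 * G)                 # weight update at 16-bit
print("final loss:", np.mean((W @ X - T) ** 2))
```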

    A training algorithm for networks of high-variability reservoirs

    Physical reservoir computing approaches have gained increased attention in recent years due to their potential for low-energy, high-performance computing. Despite recent successes, there are bounds to what one can achieve simply by making physical reservoirs larger. Therefore, we argue that a switch from single-reservoir computing to multi-reservoir and even deep physical reservoir computing is desirable. Given that error backpropagation cannot be used directly to train a large class of multi-reservoir systems, we propose an alternative framework that combines the power of backpropagation with the speed and simplicity of classic training algorithms. In this work, we report findings from an experiment conducted to evaluate the general feasibility of our approach. We train a network of three Echo State Networks to perform the well-known NARMA-10 task, using intermediate targets derived through backpropagation. Our results indicate that our proposed method is well suited to training multi-reservoir systems efficiently.
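
    For context, the sketch below implements the classic single-reservoir baseline this approach builds on: an echo state network driven by a random input sequence, with a ridge-regression readout fit to the NARMA-10 target. The reservoir size, spectral radius, input scaling, and ridge parameter are illustrative choices; the paper's multi-reservoir scheme instead fits each reservoir's readout to an intermediate target derived through backpropagation, which is only indicated in a comment here.

```python
import numpy as np

rng = np.random.default_rng(2)

def narma10(u):
    """Generate the NARMA-10 target sequence from an input u in [0, 0.5]."""
    y = np.zeros_like(u)
    for t in range(9, len(u) - 1):
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * np.sum(y[t - 9:t + 1])
                    + 1.5 * u[t - 9] * u[t]
                    + 0.1)
    return y

def esn_states(u, n_res=200, rho=0.9, scale=0.5):
    """Run a single echo state network and return its state sequence."""
    w_in = scale * rng.uniform(-1, 1, size=n_res)
    W = rng.normal(size=(n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius
    x = np.zeros(n_res)
    states = np.zeros((len(u), n_res))
    for t in range(len(u)):
        x = np.tanh(w_in * u[t] + W @ x)
        states[t] = x
    return states

u = rng.uniform(0, 0.5, size=3000)
y = narma10(u)
S = esn_states(u)

# Classic single-shot readout training: ridge regression from states to
# targets. In the paper's multi-reservoir scheme, each reservoir in the
# chain would instead be fit to an intermediate target obtained by
# backpropagation through a differentiable surrogate of the chain.
washout = 200
A, b = S[washout:], y[washout:]
ridge = 1e-6
w_out = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ b)
pred = A @ w_out
nrmse = np.sqrt(np.mean((pred - b) ** 2) / np.var(b))
print("NRMSE:", nrmse)
```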

    Towards fully integrated photonic backpropagation training and inference using on-chip nonlinear activation and gradient functions

    Gradient descent-based backpropagation training is widely used in many neural network systems. However, photonic implementation of such a method is not straightforward, mainly because realizing both the nonlinear activation function and its gradient with standard integrated photonic components is challenging. Here, we demonstrate the realization of two commonly used neural nonlinear activation functions and their gradients on a silicon photonic platform. Our method leverages the nonlinear electro-optic response of a micro-disk modulator. As a proof of concept, the experimental results are incorporated into a neural network simulation platform to classify the MNIST handwritten digit dataset, where classification accuracies of more than 97% are achieved, on par with those of ideal nonlinearities and gradients.
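
    As a conceptual sketch of training with an activation function and its gradient, the code below substitutes a smooth, saturating curve for the measured micro-disk modulator response and uses its analytic derivative in an otherwise ordinary backpropagation loop on a toy two-class problem (the paper uses the measured response and the MNIST dataset). The activation shape, network sizes, and data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical electro-optic activation: a saturating transmission
# curve standing in for the measured micro-disk modulator response,
# together with its analytic gradient.
def eo_act(z):
    return 1.0 / (1.0 + np.exp(-4.0 * (z - 0.5)))   # assumed shape

def eo_act_grad(z):
    a = eo_act(z)
    return 4.0 * a * (1.0 - a)

# Tiny two-layer network trained with backprop on two Gaussian blobs.
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(2.0, 1.0, (100, 2))])
y = np.hstack([np.zeros(100), np.ones(100)])

W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(500):
    Z1 = X @ W1 + b1
    A1 = eo_act(Z1)
    logits = (A1 @ W2 + b2).ravel()
    p = 1.0 / (1.0 + np.exp(-logits))
    # Backprop: the hidden-layer gradient uses eo_act_grad, which on
    # the chip would be read out optically rather than computed here.
    d_logits = (p - y) / len(y)
    dW2 = A1.T @ d_logits[:, None]; db2 = d_logits.sum(keepdims=True)
    dA1 = d_logits[:, None] @ W2.T
    dZ1 = dA1 * eo_act_grad(Z1)
    dW1 = X.T @ dZ1; db1 = dZ1.sum(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print("training accuracy:", np.mean((p > 0.5) == y))
```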