Experimentally realized in situ backpropagation for deep learning in nanophotonic neural networks
Neural networks are widely deployed models across many scientific disciplines
and commercial endeavors ranging from edge computing and sensing to large-scale
signal processing in data centers. The most efficient and well-entrenched
method to train such networks is backpropagation, or reverse-mode automatic
differentiation. To counter an exponentially increasing energy budget in the
artificial intelligence sector, there has been recent interest in analog
implementations of neural networks, specifically nanophotonic neural networks
for which no analog backpropagation demonstration exists. We design
mass-manufacturable silicon photonic neural networks that alternately cascade
our custom designed "photonic mesh" accelerator with digitally implemented
nonlinearities. These reconfigurable photonic meshes program computationally
intensive arbitrary matrix multiplication by setting physical voltages that
tune the interference of optically encoded input data propagating through
integrated Mach-Zehnder interferometer networks. Here, using our packaged
photonic chip, we demonstrate in situ backpropagation for the first time to
solve classification tasks and evaluate a new protocol to keep the entire
gradient measurement and update of physical device voltages in the analog
domain, improving on past theoretical proposals. Our method is made possible by
introducing three changes to typical photonic meshes: (1) measurements at
optical "grating tap" monitors, (2) bidirectional optical signal propagation
automated by fiber switch, and (3) universal generation and readout of optical
amplitude and phase. After training, our classification achieves accuracies
similar to digital equivalents even in the presence of systematic error. Our
findings suggest a new training paradigm for photonics-accelerated artificial
intelligence based entirely on a physical analog of the popular backpropagation
technique.
Comment: 23 pages, 10 figures
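The gradient rule behind this in situ scheme can be checked numerically. The
sketch below is a toy model of our own, not the authors' chip or code: the
mesh is reduced to y = U2 P(theta) U1 x with a single phase shifter on mode k,
and dL/dtheta is recovered from the interference term -2 Im(conj(a_k) f_k)
between the forward field f and the back-propagated adjoint field a, which is
the quantity a grating tap would report. All names and values are
illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q

n, k = 4, 1                            # number of modes; shifter on mode k
U1, U2 = random_unitary(n), random_unitary(n)
x = rng.normal(size=n) + 1j * rng.normal(size=n)     # input field
t = rng.normal(size=n) + 1j * rng.normal(size=n)     # target output field

def forward(theta):
    f = U1 @ x
    f[k] = f[k] * np.exp(1j * theta)                 # the phase shifter
    return f, U2 @ f

theta = 0.3
f, y = forward(theta)
loss = np.sum(np.abs(y - t) ** 2)

g = y - t                              # Wirtinger gradient of |y - t|^2
a = U2.conj().T @ g                    # adjoint field at the phase shifter
grad = -2 * np.imag(np.conj(a[k]) * f[k])            # tap interference term

eps = 1e-6                             # finite-difference sanity check
loss_eps = np.sum(np.abs(forward(theta + eps)[1] - t) ** 2)
print(grad, (loss_eps - loss) / eps)   # the two values should agree closely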
Dual adaptive training of photonic neural networks
A photonic neural network (PNN) is a remarkable analog artificial
intelligence (AI) accelerator that computes with photons instead of electrons,
offering low latency, high energy efficiency, and high parallelism. However,
existing training approaches cannot address the extensive accumulation of
systematic errors in large-scale PNNs, resulting in a significant decrease in
model performance in physical systems. Here, we propose dual adaptive training
(DAT), which allows the PNN model to adapt to substantial systematic errors
and preserves its performance during deployment. By introducing systematic
error prediction networks with task-similarity joint optimization, DAT
achieves high-similarity mapping between the PNN numerical models and physical
systems, as well as highly accurate gradient calculations, during the dual
backpropagation
training. We validated the effectiveness of DAT by using diffractive PNNs and
interference-based PNNs on image classification tasks. DAT successfully trained
large-scale PNNs under major systematic errors and preserved model
classification accuracies comparable to those of error-free systems. The
results further demonstrated its superior performance over state-of-the-art
in situ training approaches. DAT provides critical support for constructing
large-scale
PNNs to achieve advanced architectures and can be generalized to other types of
AI systems with analog computing errors.
Comment: 31 pages, 11 figures
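The dual-training loop can be illustrated with a deliberately simplified
stand-in; this is not the paper's DAT protocol, and the elementwise error
model, predictor, and learning rates below are invented for illustration. An
error predictor E_hat is fitted to the gap between the numerical model and
the simulated hardware, and the weights are then trained through the
error-aware model so their gradients reflect the hardware.

import numpy as np

rng = np.random.default_rng(1)
d = 8
A = rng.normal(size=(d, d)) / np.sqrt(d)     # target linear map (the "task")
W = rng.normal(size=(d, d)) / np.sqrt(d)     # numerical PNN weights
E_true = 1 + 0.1 * rng.normal(size=(d, d))   # unknown systematic error
E_hat = np.ones((d, d))                      # learned error predictor

def physical(x):                  # measurement from the simulated chip:
    return (W * E_true) @ x       # programmed weights are perturbed elementwise

for step in range(5000):
    x = rng.normal(size=d)
    # adaptation pass: fit the error predictor to the model/hardware gap
    gap = physical(x) - (W * E_hat) @ x
    E_hat += 0.1 * np.outer(gap, x) * W      # SGD step on ||gap||^2
    # training pass: update W through the error-aware model
    err = (W * E_hat) @ x - A @ x
    W -= 0.05 * np.outer(err, x) * E_hat     # SGD step on ||err||^2

x = rng.normal(size=d)
print(np.linalg.norm(physical(x) - A @ x))   # hardware output tracks the task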
BPLight-CNN: A Photonics-based Backpropagation Accelerator for Deep Learning
Training deep learning networks involves continuous weight updates across the
various layers of the deep network using the backpropagation (BP) algorithm.
This results in expensive computational overhead during training.
Consequently, most deep learning accelerators today employ pre-trained weights
and focus only on improving the design of the inference phase. The recent trend
is to build a complete deep learning accelerator by incorporating the training
module. Such efforts require an ultra-fast chip architecture for executing the
BP algorithm. In this article, we propose a novel photonics-based
backpropagation accelerator for high performance deep learning training. We
present the design for a convolutional neural network, BPLight-CNN, which
incorporates the silicon photonics-based backpropagation accelerator.
BPLight-CNN is a first-of-its-kind photonic and memristor-based CNN
architecture for end-to-end training and prediction. We evaluate BPLight-CNN
using a photonic CAD framework (IPKISS) on deep learning benchmark models
including LeNet and VGG-Net. The proposed design achieves (i) at least 34x
speedup, 34x improvement in computational efficiency, and 38.5x energy savings
during training; and (ii) 29x speedup, 31x improvement in computational
efficiency, and 38.7x energy savings during inference, compared with
state-of-the-art designs. All comparisons are done at 16-bit resolution, and
BPLight-CNN achieves these improvements at the cost of approximately 6% lower
accuracy than the state-of-the-art designs.
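The abstract does not spell out the dataflow, but photonic CNN accelerators
of this kind typically lower convolutions to the dense matrix products a
photonic core natively executes. The generic im2col lowering below is our own
illustration of that mapping, not BPLight-CNN's actual design.

import numpy as np

def im2col(x, k):
    # unfold every k x k patch of a (H, W) image into one row of a matrix
    H, W = x.shape
    return np.stack([x[i:i + k, j:j + k].ravel()
                     for i in range(H - k + 1) for j in range(W - k + 1)])

x = np.random.rand(6, 6)                  # toy single-channel image
kernels = np.random.rand(4, 3, 3)         # 4 output channels
patches = im2col(x, 3)                    # (16, 9)
out = patches @ kernels.reshape(4, -1).T  # one matrix product -> (16, 4)
out = out.T.reshape(4, 4, 4)              # back to (channels, H', W')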
A training algorithm for networks of high-variability reservoirs
Physical reservoir computing approaches have gained increased attention in recent years due to their potential for low-energy, high-performance computing. Despite recent successes, there are bounds to what one can achieve simply by making physical reservoirs larger. Therefore, we argue that a switch from single-reservoir computing to multi-reservoir and even deep physical reservoir computing is desirable. Given that error backpropagation cannot be used directly to train a large class of multi-reservoir systems, we propose an alternative framework that combines the power of backpropagation with the speed and simplicity of classic training algorithms. In this work we report findings from an experiment conducted to evaluate the general feasibility of our approach. We train a network of three Echo State Networks to perform the well-known NARMA-10 task, using intermediate targets derived through backpropagation. Our results indicate that the proposed method is well suited to training multi-reservoir systems efficiently.
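For context, the building block here is standard: an Echo State Network with
a ridge-regression readout, evaluated on NARMA-10. The sketch below reproduces
that single-reservoir baseline with hyperparameters we chose ourselves; the
paper's contribution, chaining several such reservoirs via
backpropagation-derived intermediate targets, is layered on top of this
recipe.

import numpy as np

rng = np.random.default_rng(2)
T, warmup, N = 4000, 200, 200

u = rng.uniform(0, 0.5, T + 10)           # NARMA-10 input sequence
y = np.zeros(T + 10)
for t in range(9, T + 9):                 # standard NARMA-10 recurrence;
    y[t + 1] = (0.3 * y[t]                # occasionally unstable, re-seed
                + 0.05 * y[t] * y[t - 9:t + 1].sum()   # if it diverges
                + 1.5 * u[t - 9] * u[t] + 0.1)
u, y = u[10:], y[10:]

W_in = rng.uniform(-0.5, 0.5, N)
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9

X = np.zeros((T, N))                      # collect reservoir states
s = np.zeros(N)
for t in range(T):
    s = np.tanh(W @ s + W_in * u[t])
    X[t] = s

A, b = X[warmup:], y[warmup:]             # ridge-regression readout
w_out = np.linalg.solve(A.T @ A + 1e-6 * np.eye(N), A.T @ b)
print(np.sqrt(np.mean((A @ w_out - b) ** 2)) / b.std())   # NRMSE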
Towards fully integrated photonic backpropagation training and inference using on-chip nonlinear activation and gradient functions
Gradient descent-based backpropagation training is widely used in many neural
network systems. However, photonic implementation of such a method is not
straightforward, mainly because realizing both the nonlinear activation
function and its gradient with standard integrated photonic components is
challenging.
Here, we demonstrate the realization of two commonly used neural nonlinear
activation functions and their gradients on a silicon photonic platform. Our
method leverages the nonlinear electro-optic response of a micro-disk
modulator. As a proof of concept, the experimental results are incorporated
into a neural network simulation platform to classify the MNIST handwritten
digit dataset, where classification accuracies of more than 97% are achieved,
on par with those of ideal nonlinearities and gradients.
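As a rough illustration of why a micro-disk suits this role: its Lorentzian
transmission, detuned by the applied signal, yields a smooth nonlinearity
with a closed-form derivative. The toy transfer function below is our own
assumption, not the authors' measured device response; it just shows an
activation of this shape together with a finite-difference check of its
analytic gradient.

import numpy as np

v0, gamma = 1.0, 0.4            # assumed resonance offset and linewidth

def transmission(v):            # Lorentzian dip of the micro-disk
    return 1 - 1 / (1 + ((v - v0) / gamma) ** 2)

def act(x):                     # optical power through the detuned disk
    return x * transmission(x)

def act_grad(x):                # analytic derivative, also obtainable on chip
    d = (x - v0) / gamma
    dT = 2 * d / gamma / (1 + d ** 2) ** 2
    return transmission(x) + x * dT

x = np.linspace(-2, 3, 7)
fd = (act(x + 1e-6) - act(x - 1e-6)) / 2e-6
print(np.max(np.abs(fd - act_grad(x))))   # should be near zero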