Abstract-We propose a new approach for neuromorphic computing on a silicon photonic chip, based on the concept of reservoir computing. The proposed reservoir computer consists of a signal-mixing photonic crystal cavity acting as the reservoir, connected to a linear readout layer. The signal-mixing cavity has a quarter-stadium shape, which is known to introduce nontrivial mixing of an input wave. This mixing turns out to be very useful in the context of reservoir computing and is used here to tackle several benchmark telecom tasks. We show that the proposed reservoir computer can perform several digital tasks with a very wide region of operation in terms of bitrate, such as header recognition for headers of up to 6 bits and the XOR between two subsequent bits in a bitstream.
I. INTRODUCTION
The constant demand for high-throughput telecommunication systems that can process massive amounts of data has challenged the traditional digital signal processing systems. To address these challenges, research is considering going back to physical analog processing units using the inherent dynamics of these systems to tackle these processing tasks. An important subset of these are the so-called neuromorphic computing systems, which use brain-inspired physical architectures to perform tasks that are traditionally difficult for digital systems.
Reservoir computing (RC) is such a brain-inspired computing paradigm, first proposed in the early 2000s [1], [2]. It was specifically designed to make the training of recurrent neural networks easier. Traditionally, a reservoir computer consists of a large dynamical system, usually a recurrent neural network with random connections that is not trained: the reservoir. This reservoir is then connected to a linear readout layer, which makes a linear combination of the internal reservoir states. The reservoir itself is tuned to mix the inputs and exhibit a fading memory, which means that the chance of recovering an input after a certain amount of time should go to zero.
This dynamic nature of the reservoir in combination with a simple linear readout layer makes the system as a whole able to perform a multitude of operations by just changing the readout weights. This is a big advantage on its own, as it does not require a complete redesign of a preprocessor to tackle a new kind of problem, which is for example the case in conventional delay lines.
Because of its simple structure, RC has always appealed to photonic telecom applications [3]-[8]. These incarnations, collectively called Photonic Reservoir Computing (PRC), have so far followed the conventional node structure of neural networks quite closely. However, the inherent parallel nature of photonics allows for totally new architectures that depart from this structure. One such design consists of a photonic crystal cavity with a quarter-stadium shape [9], depicted in Fig. 1. In this design, the special shape makes sure that an input signal gets mixed in a complicated manner, after which the mixed light leaks out of the cavity along the connected waveguides.
This new design solves a few issues with the more conventional on-chip photonic reservoir computer [10], [11]. First, it allows for a much richer interconnection topology, by allowing a continuum of routes from the input waveguide (or node) to the output waveguides (nodes), while needing considerably less chip real estate. The dimensions are 30 µm × 60 µm for a cavity with optimal bitrate around 50 Gbps, while in theory, bitrates of 1 Tbps and higher can be achieved if the cavity size is reduced to 6 µm × 4.5 µm. On top of that, this photonic crystal design promises very low loss, combined with excellent performance on several benchmark telecom tasks, such as header recognition and the highly nonlinear XOR task, where the XOR is taken between two subsequent bits in a bit stream. The main benefit of this approach is that the system can be tuned to work for a wide range of bitrates.
II. DESIGN
While designing a photonic crystal cavity for reservoir computing, several important properties have to be taken into account. First of all, the cavity needs to mix the input fields in a sufficiently complex way. This can be accommodated by choosing a quarter-stadium shape, which is known to mix the fields in an almost chaotic manner [12]-[15]. Secondly, the cavity needs to possess a fading memory, i.e. the signal should remain inside the cavity long enough to mix with subsequent input bits, but not so long that it obscures the emerging patterns. This fading memory can be controlled through the Q-factor of the cavity, i.e. by tuning the quality, pitch a and radius r of the holes of the photonic crystal lattice. However, changing the size and number of connected waveguide arms will also yield a non-trivial effect. In this paper a cavity with 7 connected waveguides was chosen (1 input and 6 outputs), while only the cavity size was varied, not the lattice parameters, which were fixed to r = 420 nm and r/a = 0.26.
Fig. 1. A photonic crystal cavity used for reservoir computing. In this case, a single input signal gets mixed inside a photonic crystal cavity. The mixing of the input field can clearly be witnessed by inspecting the field profiles inside the cavity. The mixed light leaks out of the cavity along all the waveguides. By routing this leaked out light to a readout, a reservoir computer can be formed.
Three cavity sizes were used during the simulations: a 30 µm × 60 µm cavity (the standard design) and two smaller cavities of 18 µm × 13 µm and 6 µm × 4.5 µm respectively, which were used to probe the limits of the design in terms of bitrate. A typical photonic crystal cavity reservoir design is shown in Fig. 1.
Fig. 3. By sweeping over the bitrate to find the operation range, we find that the reservoir can distinguish headers up to a header length of L = 6 bits, up to 100 Gbps.
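The link between the Q-factor and the fading memory can be made explicit with the standard photon-lifetime relation. The symbols below (the energy decay time τ, the resonance angular frequency ω_0, the resonance wavelength λ_0, the bitrate B and the approximate number of bits held in memory N_mem) are introduced here for illustration only, and the estimate is an order-of-magnitude sketch rather than a design rule.

```latex
\tau = \frac{Q}{\omega_0} = \frac{Q\,\lambda_0}{2\pi c},
\qquad
N_\mathrm{mem} \approx \tau\, B
```

Under this rough picture, a higher Q-factor or a higher bitrate lets more consecutive bits overlap inside the cavity.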
III. METHOD
The state of the reservoir can be described as a linear combination of its input states combined with a linear combination of the reservoir states.
The output of the reservoir at each of the arms can then be given by applying a nonlinear detection function f to the signal at the output arms and taking a linear combination of this nonlinearly detected signal.
This nonlinear detection function f adds shot noise and thermal noise to the output states and imposes a bandwidth limitation at 50 Gbps, implemented by a low-pass filter with a 3 dB cutoff.
During the simulations, the response to a bitstream of 10^5 bits has to be found. Simulating the propagation of this bitstream through the photonic crystal is not trivial, especially if we were to limit ourselves to pure FDTD simulations. Instead, the response to a single bit is recorded from an FDTD simulation. From this response, the resulting fields are coherently added together according to a pseudo-random bit stream, resulting in a full bit stream response. Note the similarities between the impulse response method and the "bit-level" variation. Here, the latter option was chosen for numerical stability.
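In discrete time, the state update and the readout described above can be sketched as follows. The symbols are assumptions introduced here for illustration: x[n] is the vector of reservoir states, u[n] the input, W_in and W_res the input and internal mixing matrices, and W_out the readout weights.

```latex
\begin{align}
\mathbf{x}[n+1] &= W_\mathrm{res}\,\mathbf{x}[n] + W_\mathrm{in}\,\mathbf{u}[n+1], \\
\mathbf{y}[n]   &= W_\mathrm{out}\, f\!\left(\mathbf{x}[n]\right).
\end{align}
```

In this form, the only nonlinearity sits in the detection function f, and only W_out is trained.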
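One possible form of such a detection function is sketched below. The square-law detection, the first-order filter shape, the single Gaussian noise term standing in for both shot and thermal noise, and all names and default values (`detect`, `responsivity`, `noise_std`) are assumptions made for illustration, not the model used in the simulations.

```python
import numpy as np

def detect(field, sample_rate, cutoff, responsivity=1.0, noise_std=0.0, rng=None):
    """Hypothetical photodetector: square-law detection, low-pass filter, noise.

    field       : complex optical field samples at one output arm
    sample_rate : sampling rate of `field` in Hz
    cutoff      : 3 dB cutoff frequency of the low-pass filter in Hz
    """
    # Square-law detection: the detector responds to optical power |E|^2.
    # This is the nonlinearity of the system.
    current = responsivity * np.abs(field) ** 2

    # First-order low-pass filter applied in the frequency domain.
    # Its magnitude is 1/sqrt(2) (i.e. -3 dB) at f = cutoff.
    freqs = np.fft.rfftfreq(current.size, d=1.0 / sample_rate)
    response = 1.0 / (1.0 + 1j * freqs / cutoff)
    filtered = np.fft.irfft(np.fft.rfft(current) * response, n=current.size)

    # Simplification: one additive Gaussian term lumps shot and thermal noise.
    rng = np.random.default_rng() if rng is None else rng
    return filtered + rng.normal(0.0, noise_std, size=current.size)
```

A more faithful model would make the shot-noise variance depend on the detected signal; the lumped term above only illustrates where the noise enters.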
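The bit-level superposition described above can be sketched as follows. The sketch assumes a linear cavity, on-off keying (a zero bit contributes no field), and a single-bit response sampled on a uniform grid; the function name `bitstream_response` and its arguments are hypothetical.

```python
import numpy as np

def bitstream_response(bits, single_bit_response, samples_per_bit):
    """Coherently superpose time-shifted copies of a single-bit response.

    bits                : sequence of 0/1 values
    single_bit_response : complex field recorded for one isolated bit
                          (e.g. from an FDTD run), may span several bit periods
    samples_per_bit     : number of samples in one bit period
    """
    bits = np.asarray(bits)
    h = np.asarray(single_bit_response, dtype=complex)

    # The last bit starts at (len(bits) - 1) * samples_per_bit and lasts len(h).
    out = np.zeros((bits.size - 1) * samples_per_bit + h.size, dtype=complex)

    # Fields (not powers) are added, so overlapping tails interfere coherently.
    for k in np.flatnonzero(bits):
        start = k * samples_per_bit
        out[start:start + h.size] += h
    return out
```

For example, `bitstream_response([1, 1], [1, 0.5j], 1)` gives the three samples `1`, `1 + 0.5j` and `0.5j`: the tail of the first bit adds to the start of the second.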
The raw reservoir output now has to be interpreted by a simple linear readout layer. In the readout, a set of weights is sought that forms a linear combination of the output streams such that the intended task is performed. In most cases with a limited number of classes (such as the XOR task), this boils down to linear regression (usually with a regularization parameter). For multi-class problems, Linear Discriminant Analysis [16] is more suited.
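For the regression case, training the readout can be sketched with ridge regression in closed form. The function names, the bias column and the default regularization strength are assumptions for illustration; the multi-class LDA readout is not shown.

```python
import numpy as np

def _with_bias(states):
    # Append a constant column so the readout can learn an offset.
    return np.hstack([states, np.ones((states.shape[0], 1))])

def train_readout(states, targets, ridge=1e-6):
    """Ridge regression readout.

    states  : array (n_samples, n_outputs) of detected output-arm signals
    targets : array (n_samples,) of desired outputs
    ridge   : regularization strength (hypothetical default)
    """
    X = _with_bias(states)
    # Solve (X^T X + ridge * I) w = X^T t for the weight vector w.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def apply_readout(states, weights):
    """Linear combination of the detected signals with the trained weights."""
    return _with_bias(states) @ weights
```

Switching to a different task only requires calling `train_readout` again with new targets; the recorded states stay the same.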
IV. HEADER RECOGNITION
As a first task, a PRBS of 10^5 bits with an input power of 1 mW is sent through one of the photonic crystal W1-waveguides (the top waveguide on the left in Fig. 1). The light gets mixed inside the cavity and finally, the responses of the other six waveguides are recorded. On this recorded output, the readout weights are trained to recognize all the different headers present in the bit stream. To do this, each bit location was labeled according to the decimal value of the header formed by the bit at that location and the L − 1 previous bits. This procedure is shown in Table I. Linear Discriminant Analysis (LDA) [16] was then used to find a different weight matrix W_out for each of the different classes.
As can be seen in Fig. 3, the header recognition task works up to 6-bit headers in a wide region of operation. This is to be expected, as by only using 6 arms, the readout space is limited to a 6-dimensional output. The bit error rate was cropped at 10^-3, two orders of magnitude above the inverse of the number of bits used during the simulation, as is the general guideline [17].
To see the separation of the headers visually, we can make a projection from the 2^L-dimensional header space to a lower-dimensional space. This is illustrated in Fig. 4.
Longer headers can more easily be recognized at higher bitrates. This is unsurprising: for longer headers, the reservoir needs to keep more bits in memory, so the bitrate needs to be higher to fit these bits within the memory time of the cavity.
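The labeling step can be sketched as below. The bit ordering (oldest bit as most significant) and the use of `None` for positions without enough history are assumptions made for illustration; `header_labels` is a hypothetical helper.

```python
def header_labels(bits, length):
    """Label each bit position with the decimal value of the header ending there.

    bits   : list of 0/1 integers
    length : header length L
    """
    labels = []
    for i in range(len(bits)):
        if i < length - 1:
            # Fewer than L - 1 previous bits are available: no label.
            labels.append(None)
            continue
        value = 0
        # Header = the L - 1 previous bits followed by the current bit.
        for b in bits[i - length + 1 : i + 1]:
            value = (value << 1) | b  # assumption: oldest bit is most significant
        labels.append(value)
    return labels
```

For a header length L there are 2^L possible labels, so a 6-bit header gives 64 classes for the readout to separate.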
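As a quick check of this threshold, with N the number of simulated bits (a symbol introduced here for illustration):

```latex
\mathrm{BER}_\mathrm{min} = \frac{100}{N} = \frac{100}{10^{5}} = 10^{-3}
```

A lower error rate would correspond to fewer than 100 observed errors in the simulated stream, which is too few for a reliable estimate.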
V. XOR TASK
In the XOR task, a system performs the XOR between two consecutive bits in a bitstream. It is a task at which traditional digital processors excel; however, it also serves as a good benchmark for the nonlinearity and general computing capabilities of the system.
The readout of the system can also be trained to perform the XOR on two consecutive bits in the bitstream with the lowest possible mean squared error. After the ideal weights are found, the BER is calculated as the fraction of predicted bits that differ from the target bits. This procedure can be repeated at different bitrates, after which one can determine the operation range of the cavity. Interestingly, the reservoir can be made to operate over a very wide range of bitrates by just choosing a different set of readout weights, which can be done relatively easily. As can be seen in Fig. 5a, errorless performance can be achieved between 25 Gbps and 67 Gbps on the XOR task on subsequent bits.
However, when the memory requirements of the XOR task are increased by performing the XOR of two bits with one bit in between, the BER immediately increases above 10%. This also suggests that the XOR task is much harder than the header recognition task, where headers up to 6 bits could still be found.
Although the reservoir already works over a wide range of bitrates, it is still interesting to look at its physical limits. This is done by reducing the size of the cavity to the smallest possible size while still keeping the number of connected waveguides fixed to 7. This is of course a much more radical approach than just changing the readout weights, as it requires a complete redesign of the cavity. However, by reducing the cavity size, the achievable bitrates can in theory go up to 2 Tbps, as can be seen in Fig. 5b.
These two extra, very small cavities were of course simulated under the assumption that photodetectors exist that can reach these bitrates. This means that, in theory, one can achieve reservoir computing at bitrates higher than 1 Tbps on a chip footprint smaller than 10^-6 cm^2. Another parameter that has an important effect on the performance is the number of attached waveguides. In Fig. 5d, it can clearly be seen that the reservoir starts operating from 6 connected waveguides (1 input and 5 outputs). One could argue, however, that adding even more waveguides will eventually decrease the performance, as the Q-factor will be reduced so much that the cavity does not have enough memory left.
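The XOR targets and the BER calculation can be sketched as follows. The `delay` parameter, the decision threshold of 0.5 and the function names are assumptions for illustration: `delay=1` corresponds to consecutive bits and `delay=2` to two bits with one bit in between.

```python
def xor_targets(bits, delay=1):
    """Target for position i is bits[i] XOR bits[i - delay].

    The first `delay` positions have no partner bit and are skipped,
    so the result has len(bits) - delay entries.
    """
    return [bits[i] ^ bits[i - delay] for i in range(delay, len(bits))]

def bit_error_rate(predictions, targets, threshold=0.5):
    """Fraction of thresholded readout outputs that differ from the targets."""
    decided = [1 if p > threshold else 0 for p in predictions]
    errors = sum(d != t for d, t in zip(decided, targets))
    return errors / len(targets)
```

The readout outputs are continuous values from the regression, so they are thresholded before being compared with the target bits.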
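The footprint claim can be checked directly for the smallest cavity, using 1 µm² = 10⁻⁸ cm²:

```latex
6\,\mu\mathrm{m} \times 4.5\,\mu\mathrm{m}
  = 27\,\mu\mathrm{m}^2
  = 2.7 \times 10^{-7}\,\mathrm{cm}^2
  < 10^{-6}\,\mathrm{cm}^2
```

This counts the cavity area only; the connected waveguides and the readout are not included.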
What's more, even though we are working with photonic crystals, the reservoir operates in a quite wide wavelength range: 1510 nm−1600 nm, with an exception around 1560 nm, where we probably hit a stop band of the W1-waveguides.
VI. CONCLUSION
The proposed tiny photonic crystal cavity seems to be an excellent candidate for reservoir computing on a silicon photonics chip. It shows excellent performance on the XOR task, while it can also do header recognition up to 6-bit headers.
What's more, it can do these tasks over a very wide range of operating bitrates, between 25 Gbps and 67 Gbps. This is a big advantage over conventional delay lines, which are by definition only designed for a single bitrate.
The operation range can be scaled up further, to bitrates of up to 2 Tbps for the XOR task, by reducing the size of the cavity.
What's more, the existing reservoir can easily be repurposed by retraining its readout weights to perform a different task, as was shown by using the same reservoir to perform both the XOR task and header recognition.
While the tasks performed to date are fairly simple, it is the promise of performing those operations at extremely high bitrates while using little power and little chip real estate that makes this cavity stand out. This is the main difference with previously proposed reservoirs on chip, which yield very similar performance. Additionally, the relatively low power loss in photonic crystals would allow for a very modular structure, in which multiple cavities can easily be cascaded or connected in a network, each solving part of a bigger problem.
