We present a cellular neural network architecture for parallel analog random vector generation, including experimental results from an analog VLSI prototype with 64 channels. Nearest-neighbor coupling between cells produces parallel channels of uniformly distributed random analog values, with statistics that are truly uncorrelated across channels and over time. The cell for each random channel emulates an integrating nonlinearity, essentially implementing a delta-sigma modulator, and measures 100 µm × 120 µm in 2 µm CMOS technology. Applications include analog encryption and secure communications, analog built-in self-test, stochastic neural networks, and simulated annealing optimization and learning.
INTRODUCTION
On-line random analog signal generation is an essential component in many of today's analog VLSI systems for signal or information processing. An on-line supply of random analog vectors comes in handy, for instance, to support testing and characterization of the hardware, or as part of the implemented algorithms. Examples of applications include encryption and secure communications [1], analog VLSI built-in self-test [2], neural computation [3, 4], simulated annealing optimization [5], and stochastic model-free learning [6, 7].
Most commonly used in VLSI are arrays of random binary sources implemented with linear feedback shift registers (LFSR) [8, 9] or cellular automata (CA) [10, 11], which yield compact and scalable parallel VLSI architectures [12]. High-bandwidth, low-power analog noise generators in VLSI are obtained by means of chaotic oscillators [13, 14], or through recursion of a nonlinear map such as the logistic map or a linear congruential map [15, 16].
In this paper, we consider cellular arrays of cascaded delta-sigma modulators for the purpose of random analog vector generation. The particular form of nonlinear coupling between cells not only avoids correlations across cells, but in addition produces a truly random sequence in the sense that the outcome of a cell at a given time is statistically independent of its history. The interactions are nearest-neighbor as in cellular automata, and permit a simple, scalable and parallel VLSI architecture. Our study of these structures is motivated by the remarkable noise-shaping properties observed in "MASH" cascade structures of delta-sigma modulators [17, 18, 19].

The following section introduces the basic cellular architecture and its variants, and relates delta-sigma modulation to a congruential linear analog version of additive cellular automata. Section 3 presents a compact analog VLSI implementation, and Section 4 includes experimental results from a 64-channel (and 65-channel) CMOS prototype. Finally, Section 5 concludes.
NONLINEAR NOISE-SHAPING AND CELLULAR ARCHITECTURE
The general structure we consider combines additive cellular automata with a nonlinear map $f(\cdot)$, defined as the quantization residue in single-bit delta-sigma modulation: $f(x) = x - \mathrm{sgn}(x)$.
It can be shown [17, 18] that this map is functionally equivalent to a modulo operation, which can be analyzed with the standard rules of residue arithmetic.
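The equivalence with a modulo operation can be checked numerically. The sketch below assumes the residue has the form f(x) = x - sgn(x), in line with the sign subtraction described for the cell in Section 3, and adopts the convention sgn(0) = +1. The function names and the random test values are illustrative only.

```python
import random

def sgn(x):
    """Single-bit quantizer decision; sgn(0) = +1 is an assumed convention."""
    return 1.0 if x >= 0.0 else -1.0

def f(x):
    """Quantization residue of a single-bit delta-sigma modulator (assumed form)."""
    return x - sgn(x)

def f_mod(x):
    """The same map written in residue arithmetic, valid for -2 <= x < 2."""
    return (x % 2.0) - 1.0

# Compare the two forms on arbitrary inputs within the valid range.
rng = random.Random(0)  # arbitrary seed, for repeatability only
samples = [rng.uniform(-2.0, 2.0) for _ in range(10_000)]
max_gap = max(abs(f(x) - f_mod(x)) for x in samples)
print("largest |f - f_mod| over the samples:", max_gap)
```

Under these assumptions the two forms agree to within floating-point rounding, and both map the sum of two values in [-1, 1) back into [-1, 1).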
Cascade and Cellular Structures
The general template of nearest-neighbor interactions (1) allows cellular networks of various topologies to be formulated. The simplest case is a neighborhood of two cells: one cell and one of its neighbors. With $\alpha = 0$ and $\beta = 1$ we obtain $x_i(k+1) = f(x_i(k) + x_{i-1}(k))$ (3) [...] not overload, and the "noise" does not appear to correlate with the input, at least for constant and sinusoidal inputs [17] and iid random inputs [19].
We consider two special cases of boundary conditions for the cascade structure of $N$ cells $x_1 \ldots x_N$: a "chain" topology with a constant input supplied to the first element $x_1$, and a "ring" topology with cyclic boundary conditions where the output of the last element $x_N$ feeds into the input of the first element $x_1$ ($x_0 \equiv x_N$ in (3)). The ring structure is preferable because its symmetry provides more uniform random noise properties across the array, although stability of noise-shaping in the feedback loop is an issue which will be addressed below. The linear cascaded chain and ring topologies can be implemented in scalable cellular VLSI architectures on either a 1-D or a 2-D grid. To realize the chain and ring topologies on a 2-D grid, as shown in Figure 1, two sets of linear cascade segments are interleaved in opposing directions, and external connections at the periphery of the array span no more than two adjacent cell spacings on the grid.
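The two boundary conditions can be illustrated with a small noiseless simulation. This is a sketch under stated assumptions rather than a model of the chip: every cell is updated synchronously as x_i <- f(x_i + x_{i-1}), the residue is taken as f(x) = x - sgn(x) with sgn(0) = +1, and the initial states, the seed and the constant chain input of 0.3 are arbitrary illustrative choices.

```python
import random

def f(x):
    """Quantization residue x - sgn(x), with sgn(0) = +1 (assumed convention)."""
    return x - 1.0 if x >= 0.0 else x + 1.0

def step(state, topology, chain_input=0.3):
    """One synchronous update x_i <- f(x_i + x_{i-1}) over all cells."""
    n = len(state)
    new_state = []
    for i in range(n):
        if i > 0:
            left = state[i - 1]
        elif topology == "ring":
            left = state[n - 1]   # cyclic boundary: x_0 is identified with x_N
        else:
            left = chain_input    # chain: constant input to the first cell
        new_state.append(f(state[i] + left))
    return new_state

def run(n_cells, n_steps, topology, seed=0):
    """Iterate the array from arbitrary initial states and record every step."""
    rng = random.Random(seed)
    state = [rng.uniform(-1.0, 1.0) for _ in range(n_cells)]
    history = []
    for _ in range(n_steps):
        state = step(state, topology)
        history.append(state)
    return history

ring_history = run(65, 200, "ring")
chain_history = run(65, 200, "chain")
print(ring_history[-1][:4])
print(chain_history[-1][:4])
```

The example uses 65 cells, one of the two array sizes tested in the experiments. In this noiseless finite-precision model a ring of 64 cells is a degenerate case: for the cyclic shift S, (1 + S)^64 is congruent to 0 modulo 2, so the stored state loses one bit of resolution every 64 steps. Long idealized runs at that size therefore say little about the analog hardware.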
Theory on the statistical and dynamical randomness properties of both topologies is presented in [23]. This paper focuses on circuit architectures and VLSI implementation.
IMPLEMENTATION
Analog VLSI implementation offers potentially higher integration density and higher energy efficiency than equivalent digital VLSI implementations. Unlike more conventional analog designs, where high precision and high noise rejection are primary design constraints, the circuits can be implemented with minimum-size, noisy and imprecise components biased at lower currents.
Architecture
We have adopted an implementation style using low-power, high-density switched-capacitor circuits, producing a voltage output format, $V_i(k)$. In principle, compact alternative realizations using current-mode technology, of the type used in [16], can be derived as well.
The switched-capacitor architecture implementing the MASH cell is shown in Figure 2(a), and the corresponding signal timing diagram is given in Figure 2(b). [...] is presented to the output OUT. The accumulator functions as a standard switched-capacitor non-inverting integrator [21], where in the sampling phase the capacitor $C_1$ is precharged to the input, and in the accumulate phase this charge is transferred onto capacitor $C_2$. Functionally, the first accumulate produces $x_i(k) + x_{i-1}(k)$, and the second accumulate subtracts the sign of the first. The net operation thus yields $f(x_i(k) + x_{i-1}(k))$ as desired.
CMOS Implementation
The transistor-level circuit diagram of the MASH cell is shown in Figure 3. For low-power operation and compatibility with digital interface circuitry, the circuit uses a single supply $V_{dd}$, set to 5 V for the experiments. The signal ground level is set to $V_m = 2$ V, and the signal range is ±1 V as determined by the D/A levels of 1 V and 3 V, symmetric around $V_m$. Thus, $V_i(k) = V_m + V_{\mathrm{range}}\, x_i(k)$ (4) where $V_{\mathrm{range}} = 1$ V.
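The voltage mapping (4) and the two accumulate steps of the cell can be put together in a short idealized model. This is a sketch under assumptions: the capacitor ratio is taken as exactly one, gain errors and noise are ignored, and the way the D/A level enters the second accumulate (subtracted relative to the signal ground) is a modeling choice made here, not taken from the circuit diagram. All function and constant names are illustrative.

```python
V_M = 2.0                  # signal ground level, volts
V_RANGE = 1.0              # signal range, volts
V_LOW, V_HIGH = 1.0, 3.0   # D/A levels, volts, symmetric around V_M

def to_voltage(x):
    """Map a normalized value x in [-1, 1] to a cell voltage."""
    return V_M + V_RANGE * x

def to_normalized(v):
    """Inverse mapping from a cell voltage back to a normalized value."""
    return (v - V_M) / V_RANGE

def cell_update_voltage(v_own, v_left):
    """Two accumulate steps of one cell in the voltage domain (idealized)."""
    # First accumulate: add the neighbor's signal, measured relative to V_M.
    v_acc = v_own + (v_left - V_M)
    # Single-bit decision on the accumulated value selects a D/A level.
    v_dac = V_HIGH if v_acc >= V_M else V_LOW
    # Second accumulate: subtract the selected level, again relative to V_M.
    return v_acc - (v_dac - V_M)

v_next = cell_update_voltage(to_voltage(0.5), to_voltage(-0.25))
print(v_next, to_normalized(v_next))
```

In normalized units the example adds 0.5 and -0.25 and then subtracts the sign of the result, which is the net operation described for the accumulator.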
The amplifier A is implemented as a single, non-cascoded pseudo-nMOS inverter M1-M2. The relatively low gain of this design is adequate for the purpose of a random generator, where linearity and gain errors are less important than power dissipation and size. The virtual ground voltage of the amplifier, used for the precharge in the sampling phase of the accumulator, is obtained from circuit M3-M4, whose bias voltage is generated from $V_m$. The reason for not precharging directly from the unity-gain connected amplifier is that the accumulator output is needed simultaneously to precharge the next cell, occupying the amplifier. This introduces 1/f noise in the accumulator which otherwise would have been cancelled by a correlated double-sampling technique. Finally, the capacitances $C_1$ and $C_2$ are 0.2 pF in 2 µm technology, enough to provide adequate matching, and to avoid excessive switch-injection and clock-feedthrough noise contributed by the switches $\phi_1$ and $\phi_2$.

Figure 4 shows a micrograph of the Tiny (2.22 mm × 2.25 mm) double-metal, double-poly 2 µm CMOS chip prototyped through MOSIS, which integrates a two-dimensional array of 64 MASH cells configured as shown in Figure 1, plus one extra MASH cell and additional test circuitry. The cell for each random channel measures 100 µm × 120 µm in 2 µm technology.
EXPERIMENTAL RESULTS
All experimental results reported in this paper were obtained from this chip. Experiments were performed on chain and ring topologies with 64 and 65 cells, using the array of 8 x 8 modulators plus the extra modulator.
Transfer Characteristics:
The measured transfer characteristic of a single MASH modulator cell, implementing (3), is shown in Figure 5 for a spectrum of input and output voltages spanning the interval between the two D/A levels. The combined gain errors are on the order of 5%, and their effect on the random statistics is evaluated next.
Statistics:
The hypothesis of statistical independence across channels and over time was tested experimentally, as illustrated in Figure 6 for two concurrent neighboring outputs in the ring topology. The scatter plot shows $x_i(k)$ vs. $x_{i-1}(k)$, corresponding to the joint probability density, which is clearly uniform as expected. For the chain topology, the results were qualitatively similar, except for the first few channels, which showed systematic correlations due to transient effects studied next.
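A scatter plot of this kind can be complemented by two simple numerical checks on a pair of recorded channels: the sample correlation coefficient and a coarse joint histogram. The sketch below is illustrative only. The two sequences are pseudo-random stand-ins, not measured chip data, and the helper names and the 8 x 8 bin count are arbitrary choices.

```python
import math
import random

def pearson(xs, ys):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def joint_histogram(xs, ys, bins=8, lo=-1.0, hi=1.0):
    """Count sample pairs in a bins-by-bins grid covering [lo, hi] on each axis."""
    width = (hi - lo) / bins
    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        i = min(max(int((x - lo) / width), 0), bins - 1)
        j = min(max(int((y - lo) / width), 0), bins - 1)
        counts[i][j] += 1
    return counts

# Stand-in data: two independent uniform sequences in place of measurements.
rng = random.Random(0)
chan_a = [rng.uniform(-1.0, 1.0) for _ in range(4096)]
chan_b = [rng.uniform(-1.0, 1.0) for _ in range(4096)]

rho = pearson(chan_a, chan_b)
hist = joint_histogram(chan_a, chan_b)
print("correlation:", rho)
print("min and max bin counts:", min(map(min, hist)), max(map(max, hist)))
```

A correlation near zero alone does not establish independence. The spread of the joint histogram counts around their mean is the closer analogue of inspecting the scatter plot for uniformity.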
Dynamics:
We extensively studied the effect of chain and ring topologies on the transient and steady modulation dynamics. Spectral analysis of experimental data over a 1024-point time window, shown in Figure 7 , reveals that effects of limit cycle oscillations or other colored spectral features are limited to no more than the first three stages of the chain, and are absent in the ring.
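A spectral check over a 1024-point window can be sketched as follows, assuming NumPy is available. The Hann window, the mean removal and the normalization are choices made for this illustration, and the two test signals are synthetic stand-ins: a sinusoid playing the role of a limit cycle, and pseudo-random uniform samples playing the role of a well-behaved channel.

```python
import numpy as np

def periodogram(x):
    """Windowed power spectrum estimate of a real sequence (one-sided)."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(x.size)
    xw = (x - x.mean()) * window          # remove DC, then taper the ends
    spectrum = np.fft.rfft(xw)
    return np.abs(spectrum) ** 2 / np.sum(window ** 2)

def peak_fraction(power):
    """Share of total power held by the single largest bin."""
    return float(power.max() / power.sum())

n = 1024
k = np.arange(n)
tone = np.sin(2.0 * np.pi * 8 * k / n)    # stand-in for a limit cycle
rng = np.random.default_rng(0)            # arbitrary seed
white = rng.uniform(-1.0, 1.0, n)         # stand-in for a random channel

p_tone = periodogram(tone)
p_white = periodogram(white)
print("tone peak bin:", int(np.argmax(p_tone)))
print("peak fractions:", peak_fraction(p_tone), peak_fraction(p_white))
```

A limit cycle concentrates most of the power in a few bins, whereas a white sequence spreads it across the band, so the peak fraction gives a rough single-number indicator of colored spectral features.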
System-Level Issues:
The operation of the chip has been verified over a range of speeds from 2 ksamples/s to 50 ksamples/s per channel. The maximum of 50 ksamples/s per channel that we obtained is limited by external capacitive loading of the (multiplexed) output, which has not been buffered. Measurements of supply currents yield power dissipation levels ranging from 16 µW to 245 µW per cell, corresponding to 6 nJ of energy dissipated per sample.
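The energy per sample follows from dividing power per cell by the sample rate per channel. The short calculation below reads the power figures as microwatts and assumes that the lowest power corresponds to the lowest sample rate and the highest power to the highest rate. That pairing is not stated explicitly in the measurements and is adopted here only for illustration.

```python
def energy_per_sample(power_watts, rate_samples_per_s):
    """Energy dissipated per output sample, in joules."""
    return power_watts / rate_samples_per_s

# Assumed pairing of the reported endpoints (power per cell, rate per channel).
low_rate_energy = energy_per_sample(16e-6, 2e3)
high_rate_energy = energy_per_sample(245e-6, 50e3)

print(f"{low_rate_energy * 1e9:.1f} nJ at 2 ksamples/s")
print(f"{high_rate_energy * 1e9:.1f} nJ at 50 ksamples/s")
```

Under that pairing the two endpoints come to roughly 8 nJ and 4.9 nJ per sample, the same order as the reported 6 nJ.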
REFERENCES
