Analog Neural Programmable Optimizers in CMOS VLSI Technologies by Domínguez Castro, Rafael et al.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 27, NO. 7, JULY 1992
Analog Neural Programmable Optimizers in CMOS VLSI Technologies 
R. Domínguez-Castro, A. Rodríguez-Vázquez, J. L. Huertas, and E. Sánchez-Sinencio
Abstract: A 3-μm CMOS IC is presented demonstrating the
concept of an analog neural system for constrained optimization.
A serial time-multiplexed general-purpose architecture is
introduced for the real-time solution of this kind of problem in
MOS VLSI. This architecture is fully programmable and
reconfigurable, exploiting SC techniques for the analog part
and making extensive use of digital techniques for
programmability.

I. INTRODUCTION
THE constrained optimization problem can be defined as one of
minimizing a multivariable cost function $\Phi(x)$ subject to a
set of constraints $F_k(x) \ge 0$, $1 \le k \le Q$. It has
recently been demonstrated that the constrained optimization
problem can be solved in real time by using analog neural
feedback networks [1]-[3]. These networks show potential for
applications where on-line optimization is required, such as
robotics and satellite guidance.
Up to now, the different proposals for analog neural optimizers
have been demonstrated either by discrete prototypes [1]-[3] or
by simulation results [3]. Moreover, previously reported
optimizers are intended just to demonstrate the concept via the
solution of problems with fixed weights. Weight programmability
is not realistically covered by any of the above-mentioned
proposals; this represents an important drawback for the
practical use of this kind of circuit.
This paper discusses the implementation of digitally
programmable analog neural optimizers using switched-capacitor
(SC) integrated circuit techniques. The use of switched
capacitors for neural network algorithms has been considered by
different authors [3]-[5]. However, the proposed architectures
are of the parallel type and, as a consequence, large area
occupation can be expected if binary-weighted capacitor arrays
are used for digital programmability. In this paper we first
introduce a parallel SC neural optimizer architecture and
discuss the area limitations caused by incorporating
programmability. Because of these limitations, this architecture
is suitable only for low-dimension problems. Then, a serial
time-multiplexed architecture is presented which allows digital
control of the weight values with reasonable area figures.
Finally, a 3-μm CMOS SC prototype is reported, demonstrating
the concept of SC analog neural optimizers via an integrated
circuit.

II. MATHEMATICAL MODEL AND BASIC SC ARCHITECTURE
There are many useful optimization problems that can 
be formulated as linear and quadratic problems. Consider 
the problem of minimizing a quadratic cost function with 
linear constraints: 
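$$\min_{x}\ \Phi(x) \quad \text{subject to} \quad F_k(x) \ge 0, \quad 1 \le k \le Q \tag{1}$$
with $\Phi(x)$ quadratic and each $F_k(x)$ linear in the N variables $x_i$.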
Its solution can be found by using the modified external 
penalty technique [3], [6]. To this purpose an equivalent 
unconstrained optimization problem must first be defined 
having the following cost function: 
$$\Psi(x) = U(F)\,\Phi(x) + p \sum_{k=1}^{Q} U(-F_k)\,|F_k(x)| \tag{2a}$$
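where $p$ is a scale factor called the penalty multiplier, and
$U(F)$ and $U(-F_k)$ are threshold operators defined as follows:
$U(-F_k)$ equals 1 when the kth constraint is violated
($F_k(x) < 0$) and 0 otherwise, while $U(F)$ equals 1 when all
the constraints are satisfied and 0 otherwise.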
This equivalent unconstrained problem can then be minimized by
applying a discrete-time gradient strategy,
$$\ldots, \qquad i = 1, 2, \ldots, N \tag{3}$$
yielding, per state variable,
$$x_i(n + 1) = x_i(n) - \ldots \tag{4}$$
where $T$, the time interval between consecutive time instants,
has been eliminated for simplicity.

Manuscript received December 3, 1991; revised March 3, 1992.
R. Domínguez-Castro, A. Rodríguez-Vázquez, and J. L. Huertas are
with the Department of Design of Analog Circuits, Centro Nacional de
Microelectrónica, University of Seville, 41012 Sevilla, Spain.
E. Sánchez-Sinencio is with the Department of Electrical Engineering,
Texas A&M University, College Station, TX 77843.
IEEE Log Number 9200273.
0018-9200/92$03.00 © 1992 IEEE
Authorized licensed use limited to: Universidad de Sevilla. Downloaded on March 20,2020 at 14:54:34 UTC from IEEE Xplore.  Restrictions apply.

Fig. 1. Fully interconnected parallel neural optimizer architecture.
The solution of (1) can either be inside the region defined by
the problem constraints (feasibility region) or on its border.
This solution is obtained by the companion dynamic gradient
system in (3) and (4) via the process of seeking the minima of
$\Psi(x)$. Notice that the penalty (second term in (2a)) makes
$\Psi(x)$ have a valley floor centered around the feasibility
region, the valley walls becoming steeper as the penalty
multiplier increases. Thus, any trajectory starting outside this
region is forced to enter it. Once the trajectory enters the
feasibility region, the term due to $\Phi(x)$ in (3) makes the
system seek the problem solution point. If this point is inside
the region, the evolution is asymptotic according to the
gradient of $\Phi(x)$. Otherwise, a steady state is reached
where the trajectory remains confined inside a small interval
around the nominal solution point. The size of this interval can
be controlled by design [6].
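The behavior described above can be illustrated numerically. The following Python sketch is not taken from the paper: it assumes an explicit problem form, $\Phi(x) = a^T x + \frac{1}{2} x^T G x$ and $F_k(x) = \sum_j B_{kj} x_j - e_k$, and the coefficient names `a`, `G`, `B`, `e`, the function names, and the values chosen for `p` and `tau` are all hypothetical. Each step subtracts the gradient of the penalized cost in (2a), scaled by `1/tau`.

```python
def threshold(f_k):
    """U(-F_k): 1 when the constraint F_k(x) >= 0 is violated, else 0."""
    return 1 if f_k < 0 else 0


def penalty_gradient_step(x, a, G, B, e, p, tau):
    """One discrete-time gradient step on the penalized cost of (2a).

    Assumed (hypothetical) problem form:
      Phi(x) = a.x + 0.5 * x.G.x   and   F_k(x) = B[k].x - e[k] >= 0.
    """
    n, q = len(x), len(B)
    F = [sum(B[k][j] * x[j] for j in range(n)) - e[k] for k in range(q)]
    violated = [threshold(f) for f in F]
    feasible = 0 if any(violated) else 1  # U(F): NOR of all U(-F_k)
    new_x = []
    for i in range(n):
        grad_cost = a[i] + sum(G[i][j] * x[j] for j in range(n))
        # d|F_k|/dx_i = -B[k][i] wherever F_k < 0
        grad_penalty = -p * sum(violated[k] * B[k][i] for k in range(q))
        new_x.append(x[i] - (feasible * grad_cost + grad_penalty) / tau)
    return new_x


def solve(x0, a, G, B, e, p=5.0, tau=50.0, iterations=2000):
    """Iterate the gradient step from the starting point x0."""
    x = list(x0)
    for _ in range(iterations):
        x = penalty_gradient_step(x, a, G, B, e, p, tau)
    return x


# Hypothetical example: the unconstrained minimum of Phi is at (2, 2),
# but the feasibility region is the unit square 0 <= x1, x2 <= 1.
a = [-2.0, -2.0]
G = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
e = [0.0, 0.0, -1.0, -1.0]
x_final = solve([3.0, -1.0], a, G, B, e)
```

In this sketch the trajectory is first pushed back into the square, then drifts toward the corner (1, 1) and keeps chattering around it with an amplitude on the order of `p / tau`, which mirrors the small steady-state interval mentioned in the text.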
Fig. 1 shows a parallel, fully interconnected architecture for
the implementation of (4), containing N integrating neurons
(shown on the left in Fig. 1) and Q constraint blocks (shown on
the right). Fig. 2(a) and (b) show parasitic-insensitive SC
schematics for the ith integrating neuron and the kth constraint
block, respectively. Notice that these schematics by themselves
allow only positive weight values. Thus, for both negative and
positive values to be implemented, each input signal has to be
multiplied by the sign of the corresponding weight, as indicated
in Fig. 2(a) and (b), where sgn(·) denotes the sign function.
In Fig. 2(a) some switches are controlled by the clock phases,
while others are controlled by the signals resulting from the
logical AND operation between the even clock phase and the
threshold operator outputs (indicated in the figure by the
superscript e). This allows the weights to be modulated by the
threshold operators, as required by (4). The controlling signals
$U(-F_k)$ are obtained at the outputs of the schematics of
Fig. 2(b), while $U(F)$ can be obtained via the logical NOR
operation among the different $U(-F_k)$ (represented by the
block labeled "feasibility region encoder" in Fig. 1).
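The control logic just described can be summarized in a few lines of Python. This is only a behavioral sketch: the function names are hypothetical, and the threshold is assumed to trip exactly when a constraint value is negative.

```python
def violation_flag(f_k):
    """U(-F_k): True when the constraint F_k(x) >= 0 is violated."""
    return f_k < 0


def feasibility_encoder(flags):
    """U(F): logical NOR of all the violation flags."""
    return not any(flags)


def gated_switch(even_phase, threshold_output):
    """Switch control: even clock phase AND a threshold operator output."""
    return even_phase and threshold_output


# Three hypothetical constraint values; only the second one is violated.
flags = [violation_flag(f) for f in (0.3, -0.1, 2.0)]
inside_region = feasibility_encoder(flags)
```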
Notice that the output of Fig. 2(a) is valid only during the
even clock phase of any iteration, whereas it has to be used
during the odd clock phase of the subsequent iteration. Thus, we
must provide a way for the output signals to be stored from the
even to the odd clock phase. Also, a mechanism must be provided
to obtain $-x_i$ from $x_i$, as required for Fig. 2(a) and (b).
These functions can be implemented by using the circuit of
Fig. 2(c). In a similar way, the memory circuit of Fig. 2(d) is
required, since the outputs of the constraint blocks are
obtained during the odd clock phases and used during the even
clock phases.
Fig. 2. SC building blocks for constrained optimization circuits.

Each weight in the blocks of Fig. 2(a) can be made programmable
by just substituting the associated input capacitor by a
binary-weighted capacitor array (BWCA), as illustrated in
Fig. 2(e) for an M-bit encoding of a generic weight $w$, where
$w_{\max}$ is a scaling factor for the weight values. For the
parallel-input approach, however, full digital programmability
has the drawback that the capacitor area increases
proportionally to $N(N + 2Q)2^M$ (M being the number of bits of
the digital word encoding the weights). Also, since N + Q
routing lines per neuron and Q per constraint evaluation block
are required, routing area increases as
$2N^3 + 5N^2Q + 2NQ^3$. As a matter of fact, the latter area
term may become the dominant one even for low-dimension
problems. Taking both area terms into account, area limitations
are severe; hence, for reasonable yield, a limit will exist on
either the problem dimension or the length of the digital word.
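To see why the capacitor area of a BWCA scales with $2^M$, consider the following Python sketch of an M-bit weight encoding. The exact mapping used in Fig. 2(e) is not reproduced in the text, so the normalization by $2^M - 1$, the separate sign bit, the unit capacitor `c_unit`, and all function names are assumptions made for illustration only.

```python
def bwca_code(weight, w_max, bits):
    """Quantize a weight to (sign, code), code in [0, 2**bits - 1].

    Assumption: the magnitude is scaled by w_max and the sign is applied
    separately to the input signal, as the sgn() blocks suggest.
    """
    full_scale = 2 ** bits - 1
    magnitude = min(abs(weight) / w_max, 1.0)
    code = round(magnitude * full_scale)
    sign = -1 if weight < 0 else 1
    return sign, code


def bwca_capacitance(code, bits, c_unit=1.0):
    """Capacitance switched in: bit m connects a capacitor of 2**m units."""
    return c_unit * sum(((code >> m) & 1) * 2 ** m for m in range(bits))


def bwca_weight(sign, code, w_max, bits):
    """Weight value actually realized by a given sign and code."""
    return sign * w_max * code / (2 ** bits - 1)


sign, code = bwca_code(-0.6, w_max=1.0, bits=4)
total_units = bwca_capacitance(2 ** 4 - 1, bits=4)  # all bits set
```

Under these assumptions a 4-bit array needs 15 unit capacitors in total, and every extra bit roughly doubles that figure, which is the origin of the $2^M$ factor in the capacitor area term.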
Operation speed introduces additional constraints on the
parallel architecture. Notice that each neuron output must
simultaneously drive N + Q BWCA's as well as the large
parasitics associated with the routing lines. Hence, operation
speed will be penalized when trying to achieve a reasonable
power consumption for large-dimension problems. In addition,
stability analysis [3], [6] shows that the model time constant
($\tau$ in (4)) must increase as the number of neuron inputs
increases, which further penalizes the operation speed for large
dimensions.
III. A SERIAL ARCHITECTURE FOR SC OPTIMIZERS
The drawbacks discussed above make fully programmable
parallel-input architectures, such as the one in Fig. 1,
appropriate only for low-dimension optimization problems (about
N + Q < 10). For larger-dimension problems requiring
programmability, a serial-input architecture such as the one
shown at the conceptual level in Fig. 3(a) can be used. In this
architecture, a number (N + Q in the most general case) of
identical single-input single-output analog processing units are
connected together in a neural feedback scheme. Fig. 4(a) shows
a parasitic-insensitive SC conceptual schematic for the
processing units. Notice that just one BWCA per unit is
required. Each processing unit can be configured via the control
signals $S_1, S_2, \ldots$ to perform either as an integrating
neuron or as a constraint evaluation block.

Fig. 3. Concept for a programmable time-multiplexed analog optimizer circuit.
Fig. 4. SC building blocks for the serial architecture.
Fig. 5. Comparative area figures for the parallel and the serial architectures.
Each processing unit performs weighted summations serially,
under the control of the set of clock signals of Fig. 3(b). At
the end of the nth operating cycle (understood as the time
required for one iteration of (4)), the corresponding
integrating neuron outputs $x_i(n)$ and constraint threshold
signals $U(-F_k(n))$ are made available and are stored in the
analog sequential buffer. Also, the feedback capacitor of the
processing units performing as constraint evaluation blocks is
zeroed. At the subsequent operating cycle, the N + Q analog
values stored in the sequential buffer are provided serially,
one by one, to the BWCA's driving the integrators, together with
their corresponding digital weight codes. For this purpose the
digital weights are stored in a cyclic memory. The weighted
summations required to evaluate $x_i(n + 1)$ and
$U(-F_k(n + 1))$ according to (4) are hence completed
sequentially after N + Q clock subcycles.
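The serial operation described above amounts to a matrix-by-vector product evaluated one column per clock subcycle. The Python sketch below models only that accumulation; the function name and the data layout are hypothetical, and the unit configuration, the threshold modulation of the weights, and the reset of the constraint blocks are left out.

```python
def serial_operating_cycle(buffer_values, weight_rows):
    """Accumulate one weighted input per unit in each clock subcycle.

    buffer_values: the N + Q analog values read out of the sequential buffer.
    weight_rows[u][s]: weight applied by processing unit u in subcycle s.
    """
    accumulators = [0.0] * len(weight_rows)
    for subcycle, value in enumerate(buffer_values):
        # A single analog line broadcasts this value to every unit.
        for unit, row in enumerate(weight_rows):
            accumulators[unit] += row[subcycle] * value
    return accumulators


# Hypothetical values for three units and three buffered signals.
values = [1.0, 2.0, -1.0]
weights = [[1.0, 0.0, 2.0],
           [0.5, 0.5, 0.0],
           [-1.0, 1.0, 1.0]]
sums = serial_operating_cycle(values, weights)
```

Each unit needs a single weighted input per subcycle, which is why one BWCA per unit is enough; the price is that a full iteration takes as many subcycles as there are buffered values.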
The digital memory storing the weight codes consists of M shift
registers, each of length N + Q. The schematic for a single
stage in these registers is shown in Fig. 4(b). In addition,
although not explicitly shown in Fig. 4(a), the integrator time
constants have to be electrically controlled as well; this is
needed to resolve the stability versus speed trade-off properly.
Fig. 4(c) shows the schematic to be used for this purpose,
consisting of a C-2C ladder which is connected on one side to
the top electrode of the binary-weighted array and on the other
to the virtual ground of the op amp. With this scheme the
feedback capacitor around op amp OA1 of the processing unit can
be fixed to a small value. This solution results in important
area savings as compared to the use of an additional BWCA to
control $\tau$.
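The cyclic weight memory can be modeled as M recirculating shift registers, one per bit of the weight code. The Python sketch below is a behavioral illustration under that reading; the class and method names are hypothetical and it says nothing about the circuit-level stage of Fig. 4(b).

```python
from collections import deque


class CyclicWeightMemory:
    """M recirculating shift registers, one per bit of the weight code."""

    def __init__(self, codes, bits):
        # Register m holds bit m of every weight code, in readout order.
        self.registers = [
            deque((code >> m) & 1 for code in codes) for m in range(bits)
        ]

    def read_and_shift(self):
        """Return the code at the register heads, then advance one stage."""
        code = sum(reg[0] << m for m, reg in enumerate(self.registers))
        for reg in self.registers:
            reg.rotate(-1)  # the head bit recirculates to the tail
        return code


# Four hypothetical 4-bit weight codes, read over two operating cycles.
memory = CyclicWeightMemory([5, 0, 15, 9], bits=4)
readout = [memory.read_and_shift() for _ in range(8)]
```

Because the bits recirculate, the same sequence of codes is presented to the BWCA in every operating cycle without any addressing logic.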
The analog buffer in Fig. 3 is composed of N + Q iden- 
tical units, one per basic processing unit, hence ensuring 
complete modularity of the architecture. Fig. 4(d) is a 
schematic for the ith section of this buffer. 
Fig. 6. 3-μm CMOS demonstrator.
Fig. 7. Experimental results for Fig. 6.

Observe that since the BWCA's in the serial architecture are
time multiplexed, the capacitor area term for this architecture
is proportional to $(N + Q)2^M$. Also, area due to routing
increases just as N + Q. Fig. 5 shows comparative area figures
for the parallel and the serial architectures. Area has been
estimated assuming identical op amps and capacitors for both
architectures. As can be seen, the area is much lower for the
serial architecture than for the parallel one.
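The scaling expressions quoted in the text for both architectures can be compared directly. The Python sketch below evaluates them as printed; the function names are hypothetical, the results are dimensionless proportionality terms rather than areas, and the word length M = 8 used in the example is an assumption, since the text gives no proportionality constants.

```python
def parallel_area_terms(n, q, bits):
    """Scaling terms quoted for the parallel architecture."""
    capacitor = n * (n + 2 * q) * 2 ** bits
    routing = 2 * n ** 3 + 5 * n ** 2 * q + 2 * n * q ** 3
    return capacitor, routing


def serial_area_terms(n, q, bits):
    """Scaling terms quoted for the serial architecture."""
    capacitor = (n + q) * 2 ** bits
    routing = n + q
    return capacitor, routing


# Two variables and four constraints, with an assumed 8-bit weight word.
parallel = parallel_area_terms(2, 4, 8)
serial = serial_area_terms(2, 4, 8)
```

For these values the expressions give 5120 and 352 for the parallel capacitor and routing terms against 1536 and 6 for the serial ones. The capacitor and routing terms carry different, unstated constants, so only the growth trends are meaningful.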
Notice also that in the serial architecture there is only one
analog signal line simultaneously driving N + Q BWCA's. This
allows a much more flexible power versus speed trade-off than in
the parallel-input case. In fact, for a given power consumption,
the global operation speed of the parallel-input and
serial-input architectures may be of the same order of magnitude
[6]. Finally, since only one analog input and one analog output
are required, the serial architecture makes the layout easier
for multichip circuits and wafer-scale integration.
IV. A 3-μm CMOS DEMONSTRATOR
Fig. 6 is a microphotograph of a 3-μm CMOS SC neural optimizer
prototype. It can be reconfigured for different second-order
quadratic and linear problems. Four linear constraints have been
implemented, defining a trapezoidal feasibility region whose
boundaries can also be externally modified. Fig. 7(b)-(d) shows
measured trajectories for a case where the feasibility region is
as shown in Fig. 7(a). The cost function is externally
controlled in such a way that the solution (which corresponds to
the equilibrium of the model algorithm) can be on any of the
corners of the feasibility region (points labeled A, B, C, and D
in Fig. 7(a)). For all the experimental pictures, the point at
the origin corresponds to the autozero phase of the SC circuits.
In the experiment of Fig. 7(b) the cost function is initially
set for its solution at point A. The circuit is allowed to reach
equilibrium in this situation. An external control signal is
then used to change the cost function in such a way that the new
theoretical solution coincides with point B. Fig. 7(b)
illustrates the evolution of the circuit in the dynamic process
of seeking the new solution point starting from the original
one.
As theoretically predicted by the model algorithm of (4), the
evolution is on the upper border of the feasibility region.
Fig. 7(c) is similar to the previous one, the initial and final
solution points now being C and D, respectively. Dynamic
evolution is now on the lower border of the feasibility region,
as expected. Fig. 7(d) is qualitatively different. It
corresponds to a case where the cost function is controlled for
the theoretical solution to coincide with point C, and shows the
dynamic circuit evolution from an initial point outside the
feasibility region until reaching equilibrium at the theoretical
solution point. As can be seen, the trajectory is forced to
enter the feasibility region by first following a direction
perpendicular to the border of the unfulfilled constraint until
the border $\Gamma_2$ is reached. From this point, evolution
towards equilibrium at C is on $\Gamma_2$.
The maximum operating frequency for the circuit was about
200 kHz, with the time required for the circuit to reach
equilibrium in the cases shown in Fig. 7 being about 500 μs at
this maximum frequency.
V. CONCLUSIONS 
As opposed to the general class of networks, the functionality
of analog neural-like optimizers does not emerge exclusively
from system complexity. Hence, small-dimension as well as
large-dimension problems can be interesting from the application
point of view. Results in this paper show the possibility of
designing optimizer IC's by using analog/digital circuit
techniques. Although the presented prototype is in CMOS
technology, the proposed architecture can be implemented as well
in any other technology supporting switched-capacitor
applications. The optimization strategy in the proposed
architectures is based on the use of a penalty gradient
technique. However, the architectures can be very easily
modified for other strategies such as, for instance, Lagrange
multipliers [6]. From a more general point of view, the proposed
serial architecture can also be applied to the design of
mixed-mode coprocessors for matrix-by-vector multiplication with
moderate accuracy (≤ 10 b).
REFERENCES 
[1] D. A. Tank and J. J. Hopfield, "Simple neural optimization networks:
An A/D converter, signal decision circuit, and a linear programming
circuit," IEEE Trans. Circuits Syst., vol. CAS-33, pp. 533-541, May
1986.
[2] M. P. Kennedy and L. O. Chua, "Neural networks for nonlinear
programming," IEEE Trans. Circuits Syst., vol. 35, pp. 554-562, May
1988.
[3] A. Rodríguez-Vázquez et al., "Nonlinear switched-capacitor
'neural' networks for optimization problems," IEEE Trans. Circuits
Syst., vol. 37, pp. 384-397, Mar. 1990.
[4] Y. Tsividis and D. Anastassiou, "Switched-capacitor neural
networks," Electron. Lett., pp. 958-959, Aug. 1986.
[5] B. J. Maundy and E. I. El-Masry, "Feedforward associative memory
switched capacitor artificial neural networks," Analog Integrated
Circuits and Signal Processing, vol. 1, pp. 321-338, Dec. 1991.
[6] R. Domínguez-Castro et al., "Modeling and design of analog VLSI
'neural' optimizers," in Proc. 1991 ECCTD, pp. 489-497.
