Abstract. We propose and analyze the error and timing of solvers consisting of both analog and digital circuitry for sparse linear systems of equations. We obtain high speed, but low precision from the analog circuits. We combine this with low speed, but high precision from the digital circuits. The hybrid circuit should be faster than digital circuits alone. As a preconditioner to standard iterative solution methods, the hybrid circuit makes the cost of the preconditioning step negligible. We also apply the hybrid circuit to a standard multilevel algorithm.
1. Introduction. We study a fast equation solver consisting of both analog and digital circuitry. We expect this hybrid combination to give better results than digital techniques alone. The basic idea is to use an analog solver as a preconditioner for a digital iterative process. For a related study, see [12]. Thus, we can obtain both high speed from a fast exchange of information in analog circuitry and high precision from digital circuitry. Eventually, both types of circuits should be integrated onto a single chip.
In §2, we define an analog defect correction algorithm and discuss the sources of error. We also provide an error analysis. In §3, we define a basic model for a simple analog solver. We analyze its response speed and its precision. A general example is examined in detail. In §4, we define and motivate a two-stage analog solver. As in §3, we analyze it and examine an example. Finally, in §5, we define and analyze a multilevel solver. We use the term multilevel in the abstract sense of multigrid solution of partial differential equations.
The applicability of the method is essentially limited by the condition $\varepsilon\kappa < 1$, where $\varepsilon$ is the relative precision of the analog circuitry and $\kappa$ is the condition number of the linear system. For the multilevel solver, the condition number involved is the one for the linear system on the coarsest level. The time required for the analog part of the method also depends on the condition number. We conclude that this time is negligible in comparison to that for the digital part of the method when $\varepsilon\kappa < 1$.
The limit of this theory is technology dependent; currently it is $\kappa < 1024$.
A sequel to this paper [9] considers preconditionings and modifications to the analog-digital algorithm in §2 to apply this theory to problems where $\kappa \ge 1024$.
We expect the principal benefits of the proposed method to manifest themselves with advances in technology: analog circuitry has the potential to avoid the information exchange bottleneck of massively parallel digital computation. Essentially, we are trading recoverable precision for fast dissemination of information between the processors.
We expect that these techniques will be advantageous for large but moderately conditioned positive definite problems with well-defined sparsity structures. Systems arising from either finite element or finite difference discretizations of partial differential equation problems are one possible application. For a general, non-sparse system, the number of connections required is prohibitive.
The technique used in the analysis is classical and can be found in any electrical engineering textbook. We believe the defect correction approach to the hybrid method is new and offers as yet unexplored possibilities in massively parallel computation. For related studies, see [11, 2, 7].
The precision of current analog circuitry is up to 10 bits [13] using capacitors as basic circuit components. Optical processors have the same or lower precision [10]. A purely optical analog method for solving linear systems was presented in [4]. See [6] for a survey of basic concepts of optical computing.
2. Analog Defect Correction: Error analysis. Consider a system of linear equations in matrix notation:
$$Ax = f,$$
where $A$ is an $n \times n$ nonsingular matrix and $x$ is the exact solution. We will sometimes require the hypothesis that $A$ is symmetric, positive definite. In the following algorithm, we use a standard residual correction technique to solve the system, except that part of the computation is performed digitally and part is performed with an analog solver.
The Hybrid Algorithm (Iterative)
Step 1. For given $x_1$, compute $r = f - Ax_1$ digitally.
Step 2. Convert $r$ to $n$ parallel analog signals and, using an analog solver, solve the equation $Ay = r$. Convert $y$ in parallel to digital output.
Step 3. Compute $x_2 = x_1 + y$ digitally.
Step 4. Set $x_1 \leftarrow x_2$ and go to Step 1.
We can analyze the error reduction per step using techniques found in [14]. Assume that the precision of digital computation is very high relative to the analog computation. The quality of the analog computation is described by the following backward characterization:
Remark on sources of error: Analog computation will have three principal sources of error:
1. Digital-analog and analog-digital conversion of input and output, contained in (2.2).
2. Digital-analog conversion error in the representation of the matrix $A$, contained in (2.3).
3. Effects of finite amplification and finite time, contributing to both (2.2) and (2.3).
The third source of error will be analyzed in the next section.
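The four steps of the hybrid algorithm can be simulated in software. The following is a minimal sketch, not the authors' implementation. The analog error model is an assumption made only for illustration: each entry of $A$ is realized with a random relative error of at most `eps`, and the perturbed system is then solved exactly. The names `analog_solve`, `hybrid_solve`, and `eps` are hypothetical.

```python
import random


def matvec(A, x):
    """Dense matrix-vector product (the digital part of the computation)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def analog_solve(A, r, eps, rng):
    """ASSUMED model of the analog solver: A is realized with relative
    error at most eps per entry, then A_analog y = r is solved exactly."""
    A_analog = [[a * (1.0 + eps * rng.uniform(-1.0, 1.0)) for a in row]
                for row in A]
    return solve(A_analog, r)


def hybrid_solve(A, f, eps, iters, seed=0):
    """Residual correction with a low-precision inner solver."""
    rng = random.Random(seed)
    x = [0.0] * len(f)
    for _ in range(iters):
        # Step 1: residual, computed digitally in full precision.
        r = [fi - ai for fi, ai in zip(f, matvec(A, x))]
        # Step 2: low-precision solve of A y = r (analog part).
        y = analog_solve(A, r, eps, rng)
        # Steps 3 and 4: digital update, then repeat.
        x = [xi + yi for xi, yi in zip(x, y)]
    return x
```

For a small, well-conditioned tridiagonal matrix and `eps = 2**-10` (roughly 10 bits), a few iterations of `hybrid_solve` should recover the solution to near double precision, even though each inner solve is accurate to only about three digits.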
In the following section we show that the time spent on the analog part of the algorithm is negligible, so that the Hybrid method is much faster than Jacobi.

Table 1. Contraction Factors for the Hybrid Method

| $\varepsilon$ \ $\kappa$ | 50 | 100 | 200 | 400 |
|---|---|---|---|---|
| 1/50 | 1.0000 | 2.0000 | 4.0000 | 8.0000 |
| 1/100 | 0.5000 | 1.0000 | 2.0000 | 4.0000 |
| 1/200 | 0.2500 | 0.5000 | 1.0000 | 2.0000 |
| 1/400 | 0.1250 | 0.2500 | 0.5000 | 1.0000 |
| 1/800 | 0.0625 | 0.1250 | 0.2500 | 0.5000 |

Table 2. Contraction Factors for Optimally Overrelaxed Jacobi

Tables 1 and 2 contain the contraction factors (the error reduction per iteration factors) for the Hybrid and Jacobi methods, respectively, for some sample values of $\varepsilon$ and $\kappa$. The ratio of logarithms of contraction factors, $\log(\varepsilon\kappa) / \log((\kappa - 1)/(\kappa + 1))$, equals the ratio of the number of iterations required to attain a specified precision. Table 3 contains the ratios when the Hybrid method converges, i.e., when $\varepsilon\kappa < 1$. The numbers represent speedup factors of the Hybrid method over a comparable fully parallel digital simple iterative method, since an iteration of each method requires the same time. As the table demonstrates, the Hybrid method can be more than a factor of 100 faster than optimally overrelaxed Jacobi. More sophisticated digital methods (e.g., conjugate gradients) are faster than Jacobi, but are not usually fully parallelizable.
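The comparison between the two methods can be reproduced numerically. The sketch below reads the Hybrid contraction factor as the product of the analog precision and the condition number, which matches the entries of Table 1, and takes $(\kappa - 1)/(\kappa + 1)$ as the contraction factor of optimally overrelaxed Jacobi. The function names and the symbols `eps` and `kappa` are illustrative choices, not taken from the paper.

```python
import math


def hybrid_factor(eps, kappa):
    """Contraction factor of the Hybrid method, read off Table 1 as eps * kappa."""
    return eps * kappa


def jacobi_factor(kappa):
    """Contraction factor of optimally overrelaxed Jacobi, (kappa - 1)/(kappa + 1)."""
    return (kappa - 1.0) / (kappa + 1.0)


def speedup(eps, kappa):
    """Ratio of Jacobi iterations to Hybrid iterations for the same accuracy.

    Returns None when the Hybrid factor is not below 1 (no convergence).
    """
    rho_hybrid = hybrid_factor(eps, kappa)
    if rho_hybrid >= 1.0:
        return None
    return math.log(rho_hybrid) / math.log(jacobi_factor(kappa))
```

Calling `speedup(1 / 800, 400)` corresponds to the lower right entry of Table 1 (contraction factor 0.5) and gives a ratio above 100, consistent with the claim in the text.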
A great deal of research in the past twenty years has been devoted to developing useful preconditioners for digital iterative methods (see [5]). Generally, an approximation to $A$ is constructed which is close to $A$ in some sense, but much easier to factor. The corresponding preconditioned iterative method converges faster than the original method, but costs more per iteration. These preconditioners typically reduce the error by considerably less than a digit per iteration. The analog solve step can be thought of as a preconditioning step, where the preconditioner is the original matrix $A$ rounded off to nearly three digits. We can do this because we can prove that the analog step takes almost no time in comparison to the digital steps. Table 1 demonstrates when we can expect to reduce the error by at least one digit per iteration.
3. A Simple Analog Solver.
3.1. Basic model. The basic component of an analog circuit is an amplifier, i.e., a pseudo-differential operator.
The transmission function of the amplifier is defined as
where $\omega$ is the angular frequency and $j$ is the imaginary unit. For (3.1),
the so-called one-pole transmission function. This relates our approach to conventional engineering terminology.
Consider the network in Figure 1. Here, the amplifier part consists of $n$ identical amplifiers acting on $n$ signals in parallel. Each amplifier is assumed to have the same transmission function, depending on the angular frequency $\omega$ and corresponding to a linear differential operator $M$. The output $x$ of the amplifiers is processed by a passive network implementing multiplication by $A$, and then the residual $Ax - f$ is fed back to the input. In fact, this residual determination will be merged with the amplifiers into one circuit; Figure 1 presents just a convenient equivalent model.
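The behavior of such a feedback network can be illustrated with a small time-domain simulation. This is only a sketch under assumptions that are not stated in the text: each amplifier is modeled as an inverting one-pole element with DC gain `gain` and time constant `tau`, so that the output obeys $\tau \dot{x} + x = -K(Ax - f)$ with $K$ the gain, and the equation is integrated by the explicit Euler method. The names `simulate_feedback`, `gain`, `tau`, and `dt` are hypothetical.

```python
def simulate_feedback(A, f, gain, tau, dt, steps):
    """Explicit Euler integration of tau * dx/dt = -x - gain * (A x - f).

    ASSUMED one-pole amplifier model; for symmetric positive definite A
    and a sufficiently small dt the state settles to (A + I/gain)^{-1} f.
    """
    n = len(f)
    x = [0.0] * n
    for _ in range(steps):
        # Passive network: multiplication by A.
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Amplifiers driven by the fed-back residual A x - f.
        x = [x[i] + (dt / tau) * (-x[i] - gain * (Ax[i] - f[i]))
             for i in range(n)]
    return x


# Small example: A = [[2, -1], [-1, 2]], f = [1, 0], exact solution [2/3, 1/3].
A = [[2.0, -1.0], [-1.0, 2.0]]
f = [1.0, 0.0]
x = simulate_feedback(A, f, gain=1.0e4, tau=1.0, dt=1.0e-5, steps=2000)
```

In this model the two error sources named in the remark of §2 appear separately: a finite gain leaves a steady-state offset of relative size about $1/(K \lambda_{\min}(A))$, and a finite settling time leaves a transient that decays at rate about $(1 + K \lambda_{\min}(A))/\tau$. With the values above the simulated time is only $0.02\,\tau$, yet the state has already settled.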
Note that (3.6) and (3.7) imply (3.4). Thus the time $t$ will never be significant when our estimates for the hybrid method are applicable.
3.3. Sample Embodiment. We now consider a specific idealized circuit schema which embodies the analog part of the hybrid algorithm. We make use of classical devices: programmable resistors and operational amplifiers. The analog computational network will consist of $n$ identical nodes as in Figure 2. The resistors should be capable of attaining the values $0$ and $\infty$. Each node has $n + 1$ inputs: $x_1, \ldots, x_n$ (the components of $x$) and $f_i$ (one component of $f$). The output is $x_i$. It is connected to all inputs $x_i$ of all nodes. In a practical implementation, most of the connections and most of the resistors will be missing. A fixed sparsity structure of $A$ will be assumed. Such a sparsity structure may correspond to the discretization of a problem on a 2-dimensional or 3-dimensional mesh, or it might be a band structure.
The output $x_i$ is given by the transmission function of the operational amplifier,
Assuming zero output impedance and infinite input impedance of the operational amplifiers, the current balance at the inputs of the operational amplifier is
All quantities $x_1, \ldots, x_n, v_+, v_-, f_i$ are voltages. By expressing $a_{ij}$ in terms of $R_{ji}$ and $R_0$, and noting that resistances are nonnegative, we can show that
$$\sum_j |a_{ij}| \le 2.$$
For a practical realization, a network using capacitors instead of resistors might be required. Capacitors of high precision are easy to construct using MOS IC techniques. An entire hybrid circuit using 2-D mesh geometry could look similar to the one in Figure 3. An overflow/underflow detection two-line bus must be added to adjust the scaling of the residual fed into the analog solver. If the size of the analog output is too large, then the node which detects the overflow condition signals it on the bus, and the scaling is adjusted to guarantee that no $|x_i|$ is larger than $v_{\max}$ and at least one is larger than $v_{\min}$, thus making full use of the available precision. The scaling factor is then used in the output analog-digital conversion and stored for the next iteration as a good initial guess. This scaling can be easily implemented by a voltage multiplier/attenuator at the output of the conversion unit. An analog bus, using continuous adjustments of the scale, could also be considered.
4. A Two-Stage Analog Circuit.
4.1. Motivation. In the one-stage circuit, the scales of the output $x$ and of the input $f$ are, in general, different. Since $x \approx A^{-1} f$, $x$ will be much larger than $f$. This can be a source of errors. Therefore, we consider an implementation of the product $Ax$ using another amplifier. The right hand side $f$ is then easily combined with the output using a differential amplifier. The scaling of $A$ can also be used to increase the speed of the circuit if necessary.
Remark: For the one-pole model of the amplifiers, the circuit is stable; however, compensation of the amplifiers so that they are well approximated by the one-pole model may be more critical here, because of both much stronger feedback and the presence of two amplifiers in the feedback loop.
We see then that the hybrid realization here is similar to the one-stage solver.
5. A Two Level Algorithm. A more powerful version of the residual correction technique is the multilevel variant (see [1, 8]). In this section we consider the two level version of the latter (see [3]), and we show how to apply the hybrid analog-digital methods to it.
The two level method for solving the system Ax = f is described as follows:
Step 1. Repeat $p$ times: $x \leftarrow x - G(Ax - f)$.
Step 2. $x \leftarrow x - P B^{-1} R (Ax - f)$.
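The two steps can be sketched for a one-dimensional model problem. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the smoother $G$ is taken to be damped Jacobi with weight `omega`, $P$ is linear interpolation, $R = P^T$, and $B = RAP$. The function names and the `coarse_solve` hook are hypothetical; in the hybrid setting the coarse solve is presumably where an analog solver would be used, but here it is done exactly.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def linear_interpolation(n_coarse):
    """P maps n_coarse coarse values to 2 * n_coarse + 1 fine values."""
    P = [[0.0] * n_coarse for _ in range(2 * n_coarse + 1)]
    for j in range(n_coarse):
        i = 2 * j + 1  # fine index of coarse point j
        P[i - 1][j], P[i][j], P[i + 1][j] = 0.5, 1.0, 0.5
    return P


def two_level_solve(A, f, P, cycles, p=2, omega=2.0 / 3.0, coarse_solve=solve):
    R = transpose(P)                 # assumed choice: R = P^T
    B = matmul(matmul(R, A), P)      # assumed choice: B = R A P
    x = [0.0] * len(f)
    for _ in range(cycles):
        # Step 1: p smoothing iterations with G = omega * D^{-1} (damped Jacobi).
        for _ in range(p):
            r = [ai - fi for ai, fi in zip(matvec(A, x), f)]
            x = [xi - omega * ri / A[i][i] for i, (xi, ri) in enumerate(zip(x, r))]
        # Step 2: coarse correction x <- x - P B^{-1} R (A x - f).
        r = [ai - fi for ai, fi in zip(matvec(A, x), f)]
        y = coarse_solve(B, matvec(R, r))
        x = [xi - ci for xi, ci in zip(x, matvec(P, y))]
    return x
```

For a 7 by 7 tridiagonal matrix with diagonal 2 and off-diagonals $-1$, `two_level_solve(A, f, linear_interpolation(3), cycles=30)` should converge to the solution; replacing `coarse_solve` with a low-precision solver is the natural place to experiment with the hybrid idea.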
Here the first step consists of $p$ smoothing iterations using a scaled iterative procedure $G$ (e.g., Jacobi, symmetric Gauss-Seidel, or conjugate gradients). The matrix $R$ interpolates from the solution space onto a coarser space and $P$ interpolates from the coarse space into the solution space. Typically, $R$ is a linear interpolation method and $P$ is $R^T$. A customary choice for $B$ is $B = RAP$, with the dimension of $B$ being considerably less than that of $A$.
6. Conclusions. We have analyzed a hybrid digital-analog algorithm and shown that it reduces the error per iteration by a considerable amount. In fact, most preconditioned iterative methods reduce the error by a fraction of this amount, and at a greater cost. The cost of the analog step has been shown to be negligible in comparison to that of the digital step. Further, the cost of one iteration of the hybrid method is comparable to that of one iteration of a fully parallelized digital optimally overrelaxed Jacobi method. However, the hybrid method can be over 100 times faster than the corresponding Jacobi method for a fixed accuracy requirement. The technology exists now to build such a hybrid machine, either as a standalone computer or as a coprocessor board for a workstation.
