Introduction
Since the 1960s, the electronic components that make up digital systems have been packaged as tiny integrated circuits (ICs), or chips. Since then, there has been steady progress in making the components smaller and thereby packing more components onto each chip. This has led to several generations of IC devices, from small-scale integration (SSI), with only a couple of components per chip, to medium-scale integration (MSI) and large-scale integration (LSI), with tens of thousands of components per chip (Myers 1980). As chip-fabrication technology improves, achievable circuit densities continue to grow. The present level of density extends into very-large-scale integration (VLSI).
The Present State of the Art
New digital sound processor technologies are made possible by the introduction of VLSI technology. In a recent survey of integrated circuits and signal processing, Hoff (1980) points out the increasing complexity that comes with VLSI. At present, chips like the Motorola MC68000, a 16-bit microprocessor with 68,000 transistors, are at the upper limit of mass production (Fig. 1). As the size of circuit features decreases further in the next few years, one can expect to see dramatic increases in chip density and functional capacity. Some estimates place as many
There are five principal limitations in VLSI design: chip size, device density, circuit speed, complexity of device interconnects on the chip, and number of pinouts (the number of pins emanating from a chip's package). When designing a VLSI sound processor, one must consider various tradeoffs. For example, one can conserve pins by adding multiplexers to the chip, allowing any one pin to be used for more than one purpose. The demands of real-time computation put an extra burden on digital processors. For such real-time processors, duplicating chip circuitry can increase the speed of signal-processing algorithms, at the cost of dedicating additional "real estate" on the chip to the parallel circuits.
Current Sound Processors
I began my research into VLSI and digital sound processors by considering existing processors implemented using MSI technology. The 4B Machine at the Institut de Recherche et Coordination Acoustique/Musique (IRCAM) (Alles 1976) and the Systems Concepts Digital Synthesizer (Samson 1980) were examined for ideas usable in a VLSI implementation.
The 4B is a multiplexed machine; that is, it runs fast enough to compute a number of voices of sound in the time between sample periods. Translating this machine into a single chip would not be a terribly difficult job. The machine does have a problem, however. When generating line segments for functions (e.g., envelope curves), the 4B allows timer interrupts to be set for the endpoints of the line segments. This can put a considerable load on the host processor, which feeds parameters from the score to the 4B to control the sound synthesis. In the case of the 4B, the host computer is a slow LSI-11. The LSI-11 is constantly being interrupted and asked to supply more data; this is known as the parameter-update problem.
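The load described above can be illustrated with a minimal Python sketch. It is not the 4B's actual mechanism: the function names, the breakpoint format, and the load model (one host interrupt per segment endpoint) are assumptions made only for illustration.

```python
def expand_envelope(breakpoints, sample_rate):
    """Expand (time_sec, level) breakpoints into linearly interpolated samples.

    Returns the samples and the number of segment endpoints reached, which
    in this hypothetical model is the number of host interrupts.
    """
    samples = []
    interrupts = 0
    for (t0, y0), (t1, y1) in zip(breakpoints, breakpoints[1:]):
        n = round((t1 - t0) * sample_rate)  # samples in this segment
        for i in range(n):
            samples.append(y0 + (y1 - y0) * i / n)
        interrupts += 1  # endpoint reached: host must supply new parameters
    return samples, interrupts


def host_interrupt_rate(voices, segments_per_voice_per_sec):
    """Interrupts per second the host must service (assumed load model)."""
    return voices * segments_per_voice_per_sec


# A three-segment envelope (attack, decay, release) at a 1-kHz control rate.
envelope = [(0.0, 0.0), (0.01, 1.0), (0.03, 0.5), (0.05, 0.0)]
samples, interrupts = expand_envelope(envelope, 1000)
print(len(samples), interrupts)
```

Because the interrupt rate grows with the product of voices and segments per voice, a slow host saturates quickly even when the synthesizer itself has cycles to spare.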
The Systems Concepts Digital Synthesizer is a large, pipelined processor built using MSI technology. It has 256 generators, 128 modifiers, and a mixing memory called sum memory. Unfortunately, the machine appears to be designed mainly for frequency modulation (FM) and additive synthesis, rather than as a truly general-purpose synthesizer. Attempts to use the machine for synthesis techniques other than additive and FM synthesis are sometimes successful, but implementation is somewhat contrived. As a stream processor, the Systems Concepts synthesizer has another problem. Commands to the synthesizer are read from memory and executed until a command is given to stop temporarily and allow processing of sound to take place. The problem arises when one must integrate a command stream that comes from diverse input sources such as knobs and keyboards being played by musicians. This real-time input problem must be solved if such processors are to be used in live performance. Designers of digital sound processors will face other problems that will come up as the designs become more refined.
Current Algorithms
As more processing power becomes available on each chip and as the complexity of what musicians want to accomplish with such chips increases, it becomes necessary to think about processors differently. Commonly used signal-processing algorithms can be committed to firmware, thus simplifying the software environment considerably. Rather than consider, "What algorithms can I implement using this beast?" the chip designer must consider, "What algorithms should I implement in silicon?" Therefore, the designer of any new digital sound processor must consider what algorithms the user wants to implement. These might include lattice filtering, convolution, waveshaping, additive synthesis, and other algorithms useful for sound generation. In the following sections, two VLSI architectures for music processing will be discussed that address the issues pointed out so far.
A Command-Stream Processor
The purpose of the command-stream processor is to solve the real-time input problem. One needs to be able to mix many external sources of commands and data into a single command stream. One immediate difficulty is the instability of analog components often used in input devices such as potentiometers. In particular, the reference levels of analog-to-digital converters (ADCs) tend to wander. Use of a hysteresis register is one way to solve this problem. The contents of this register are compared against the current input (which is masked) and will match if the masked bits equal the last known
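The masked comparison described above can be sketched in Python. This is an illustration under stated assumptions, not the article's circuit: the class name, the 12-bit input width, and the choice of mask (discarding the four low-order bits) are all hypothetical.

```python
class HysteresisRegister:
    """Suppress small wanderings of an ADC reading by comparing masked values."""

    def __init__(self, mask=0xFF0):
        self.mask = mask   # assumed: keep the high 8 bits of a 12-bit sample
        self.last = None   # last masked value that was reported

    def update(self, raw):
        """Return the masked value if it changed, or None if it matches."""
        masked = raw & self.mask
        if masked == self.last:
            return None    # match: no command enters the command stream
        self.last = masked
        return masked      # mismatch: report the new value


# A potentiometer reading that wanders in its low-order bits, then moves.
reg = HysteresisRegister()
for reading in [0x123, 0x125, 0x12A, 0x131, 0x130]:
    print(hex(reading), reg.update(reading))
```

Only the first reading and the genuine change to 0x131 produce output; the intermediate jitter is filtered out before it reaches the command stream.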
A Sound Processor
So far, I have discussed ways to overcome the limitations in the bandwidth brought about by multiple, real-time command lists. In this section, I will discuss some considerations in the design of the signal processor that does the work of musical sound generation.
One can begin by considering exactly what algorithms will be implemented using the sound processor. This is a synthesis machine, not to be used for sound analysis. By and large, the computational demands of analysis and synthesis machines differ. Analysis algorithms are mostly block algorithms. These require an entire array (block) of samples before processing can take place. Although such algorithms are useful, in this article I will not discuss processors that implement them. Fig. 4 shows the data path of a rather ordinary sound processor that can be implemented in NMOS. Notice that the processor has an "outboard" memory and an internal multiplier. Once again, the processor is microprogrammed. This is more than a consideration for user programming of the control store. As processors get more complicated, it gets more difficult to design the processor to be logically correct. Using random logic only exacerbates this problem. The MC68000 is an example in which microprogramming was used to avoid bugs in the control section of the processor (Stritter and Tredennick 1979).
Each unit of the processor has latches on the input and the output so that processing may proceed without a wait for the results of a given operation (i.e., overlapped execution). The machine is synchronous, and results are latched when the execution unit finishes its action. Notice that the result bus can be connected directly to the execution units. This allows the current results on the bus to "bypass" the latches on the input side of the execution units.
The One-Voice One-Processor Doctrine
In the processor organizations discussed so far, it has been assumed that only one high-speed processor is available. With the advent of low-priced processors, lack of processor cycles should not be permitted to be a problem. Unfortunately, some of the problems of computer networking are then introduced. If we modify the existing architecture of the processor shown in Fig. 1 to have multiple processors, the interconnection of input modules and the sound-processor modules becomes a problem. It is not sufficient just to restrict each sound-processor module to one input module, because one might want a single input device to affect many processors. A crossbar switch such as is shown in
Another solution to the problem is to use a bus and arbitrate access to it. Suppose the processors are connected in a chain. Then each processor could pass the control token to the next processor when it has finished putting data on the bus. But then the sound processor must know which real-time input placed the data on the bus. Therefore, a second bus is needed that carries the ID of the input whose data is on the data bus. All processors can sample this bus to find out which input is there. The problem with this is that each processor must filter out the real-time requests on the data bus. This again introduces the parameter-update problem! Other possible bus organizations include a time-division multiplexed (TDM) bus or an Ethernet-like network, where each real-time input has a time slot (an address in the Ethernet scheme) and the inputs communicate by broadcasting to the other processors (Metcalfe and Boggs 1976). Of course, this too has its problems: the TDM bus requires that processors wait for the proper time slot. Ethernet was designed to be "unreliable" in the sense that it keeps retransmitting a message over the network until it receives an acknowledgment; it does not assume the message got through on the first transmission.
Unfortunately, there is no upper bound on the time it can take a packet of information to be transmitted and received. Because of the basically slow rate of change (with respect to the processor) involved in parameter update, a TDM-bus scheme was used in the command-stream processor.
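The appeal of the TDM scheme can be seen in a small Python sketch. The function names, the slot-per-input assignment, and the subscription table are illustrative assumptions, not details of the command-stream processor itself. Because the slot number identifies the source, no separate ID bus is needed, and the wait for a slot is bounded by one frame.

```python
def tdm_frame(inputs, subscriptions):
    """Simulate one TDM frame.

    inputs: list indexed by slot; each entry is a value or None (slot idle).
    subscriptions: dict mapping processor name -> set of slots it listens to.
    Returns a dict mapping processor name -> list of (slot, value) received.
    """
    received = {proc: [] for proc in subscriptions}
    for slot, value in enumerate(inputs):
        if value is None:
            continue  # nothing broadcast in this slot
        for proc, slots in subscriptions.items():
            if slot in slots:  # the slot number identifies the source
                received[proc].append((slot, value))
    return received


def worst_case_wait(num_slots, slot_time):
    """Upper bound on the wait for an input's own slot: one full frame."""
    return num_slots * slot_time


# Slot 0 is a knob, slot 1 an idle keyboard, slot 2 a pedal (hypothetical).
frame = [0.5, None, 0.9]
listeners = {"A": {0, 2}, "B": {2}}
print(tdm_frame(frame, listeners))
print(worst_case_wait(8, 2))  # 8 slots of 2 time units each
```

A single input can drive several processors simply by appearing in more than one subscription set, which is the one-to-many connection the crossbar was meant to provide.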
Notice that in Fig. 6 a new mixing processor has been added to the output of the three sound processors. There must be one mixing processor per channel of output sound. With a fast mixing processor, this could be simulated through the use of multiplexing, since even at a 50-kHz sampling rate for audio, the samples would have to be added at a rate of 20 μsec each, well within the range of the processor's speed.
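The timing argument can be checked with a short Python sketch. The function names and the data layout (one tuple of per-channel values from each sound processor) are assumptions made for illustration.

```python
def sample_period_usec(sample_rate_hz):
    """Time available per output sample, in microseconds."""
    return 1_000_000 / sample_rate_hz


def budget_per_channel_usec(sample_rate_hz, channels):
    """Time one multiplexed mixer can spend on each channel per sample."""
    return sample_period_usec(sample_rate_hz) / channels


def mix_frame(processor_outputs):
    """Sum the outputs of all sound processors, channel by channel.

    processor_outputs: one tuple per sound processor, one value per channel.
    """
    return [sum(channel) for channel in zip(*processor_outputs)]


print(sample_period_usec(50_000))           # period at a 50-kHz sampling rate
print(budget_per_channel_usec(50_000, 2))   # one mixer shared by two channels
print(mix_frame([(1, 2), (3, 4), (5, 6)]))  # three processors, two channels
```

A single multiplexed mixer serving several channels must fit all of its additions into one sample period, so the per-channel budget shrinks in proportion to the number of channels.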
[Figure: Command-List Processors; Sound Processors]

Designing the Processor for the Algorithm Instead of Vice Versa
In the design of VLSI processors, a prominent concern is to design the processor for the algorithm. This is based on the belief that processors will decrease in cost, and that speed can be obtained by customizing the processor for the algorithm. An example of customizing the processor can be found in the work of Kung (1980). Systolic algorithms run on rectilinear arrays of processors, each of which communicates with its neighbors. For example, Kung has designed second-order filters, a convolution box, and a discrete-Fourier-transform (DFT) box. A special-purpose reverberation box could use a convolution box to implement reverberation using the impulse response of a hall. As more research is done into sound-generation algorithms, perhaps more of them can be placed into systolic form. More research is needed into the structure of algorithms used by computer musicians, particularly the implementation of algorithms that are computationally expensive. Multiprocessor implementation of complex algorithms would be of considerable interest to VLSI designers. If computer musicians engage in a dialogue with VLSI designers, then profitable results for both sides will surely follow. I hope this article will start some of that dialogue.
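To make the idea of a convolution box concrete, here is a minimal Python simulation of a linear array of multiply-accumulate cells. It is a simplified sketch, not Kung's design: each cell holds one fixed weight, partial sums pass only between neighboring cells, and the input sample is broadcast to all cells (a simplification that a fully systolic design would avoid). The function name and data layout are assumptions.

```python
def systolic_fir(weights, samples):
    """Convolve samples with weights using a simulated array of cells.

    Cell i holds weights[i] and one partial-sum register. On every clock
    tick, each cell multiplies the broadcast input by its weight and adds
    the partial sum held by its right-hand neighbor on the previous tick.
    """
    k = len(weights)
    partial = [0] * k  # partial[i] is the register of cell i
    out = []
    for x in samples:
        # All cells update at once, reading the neighbors' previous values.
        partial = [
            weights[i] * x + (partial[i + 1] if i + 1 < k else 0)
            for i in range(k)
        ]
        out.append(partial[0])  # the leftmost cell emits the finished sum
    return out


# Feeding an impulse through the array reproduces the weights in order.
print(systolic_fir([1, 2, 3], [1, 0, 0, 0]))
```

In a reverberator of this kind, the weights would be the samples of a hall's impulse response, so the array length equals the length of that response.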
Conclusion
