An asynchronous architecture for modeling intersegmental neural communication by Patel, Girish N. et al.
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 2, FEBRUARY 2006 97
An Asynchronous Architecture for Modeling
Intersegmental Neural Communication
Girish N. Patel, Member, IEEE, Michael S. Reid, Member, IEEE, David E. Schimmel, Senior Member, IEEE, and
Stephen P. DeWeerth, Senior Member, IEEE
Abstract—This paper presents an asynchronous VLSI archi-
tecture for modeling the oscillatory patterns seen in segmented
biological systems. The architecture emulates the intersegmental
synaptic connectivity observed in these biological systems. The
communications network uses address-event representation
(AER), a common neuromorphic protocol for data transmission.
The asynchronous circuits are synthesized using communicating
hardware processes (CHP) procedures. The architecture is
scalable, supports multichip communication, and operates independently
of the type of silicon neuron (spiking or burst envelopes).
A 16-segment prototype system was developed, tested, and imple-
mented; data from this system are presented.
Index Terms—Address event representation (AER), asyn-
chronous circuits, central pattern generator (CPG), neurobiolog-
ical modeling, neuromorphic engineering, silicon neuron, VLSI
architecture.
I. INTRODUCTION
BIOLOGICAL systems perform robust neural computations largely
because of their highly complex communication networks.
For example, the neural systems that generate
and modulate axial locomotion in segmented invertebrates are
facilitated by short- and long-distance synaptic connections
along the length of the animal [1]. We develop neuromorphic
implementations [2] of these segmented systems with a focus
on communication architectures that model the intersegmental
connections.
The design of an architecture that is capable of producing the
variety of oscillatory patterns seen in segmented biological sys-
tems must draw inspiration from the observed and hypothesized
properties present in biology. An architecture that incorporates
these properties can be used to validate the principles underlying
intersegmental coordination and can result in the development
of engineered systems that reproduce the complex behaviors of
their biological counterparts. The properties in question fall into
two categories: 1) intersegmental connectivity and 2) intraseg-
mental neuronal properties and synaptic interactions.
Intersegmental connectivity in segmented systems creates co-
ordination among the neural oscillators in the spinal cord. These
oscillators, often called central pattern generators (CPG), are
present at each segment along the animal’s body; CPGs generate
rhythmic, oscillatory patterns to produce complex activation
of different groups of motor neurons. The pattern-generating
circuits are densely connected. For example, in the lamprey
swim system, axonal projections can extend up to nearly
half the length of the body (up to 50 segments), implying that a
typical segment may receive hundreds of connections [1].
Manuscript received August 30, 2004; revised May 18, 2005. This work was
supported by the National Science Foundation (NSF) under Grant IBN-9511721
and Grant IBN-0131612.
The authors are with the Georgia Institute of Technology, Atlanta, GA 30332
USA (e-mail: steve.deweerth@ece.gatech.edu).
Digital Object Identifier 10.1109/TVLSI.2005.863762
A realistic model of the biological system should include
the ability to create different CPG configurations and to imple-
ment both short- and long-distance neural connections. These
synaptic interconnections are essential for the coordination of
segmental oscillators resulting in the movement of the animal.
The intersegmental connectivity and parameter space can be
greatly simplified by assuming that the neural signals exhibit
uniform delay and translational invariance of synaptic con-
nections. This translational invariance, referred to as synaptic
spread in the biological literature, implies that for each connec-
tion a neuron makes with other neurons in its own segment, the
neuron makes the same connections with homolog neurons in
neighboring segments [3]. This simple connectivity rule exists
in intersegmental biological systems.
A prohibitively large amount of silicon real estate would be
necessary to emulate the same level of connectivity with indi-
vidual wires in a very large scale integration (VLSI) system. In
addition, the conduction velocity of signals in metallic wires is
too fast to mirror the delays present in the biological systems.
Clearly, wires in biology and VLSI systems are grossly mismatched:
whereas wires in biology are slow and densely packed,
their VLSI counterparts are fast and occupy valuable silicon real
estate [2].
In order to resolve this mismatch between the wiring in bi-
ological and VLSI systems, neuromorphic engineers often use
address-event representation (AER) [4], [5], a mechanism that
has been utilized in neuromorphic analog VLSI systems. In this
protocol, action potentials (or other events such as burst en-
velopes) are encoded and time-multiplexed over a high-speed
communications channel. Because the timing of events in biological
systems is unquantized, AER architectures typically
transmit data asynchronously; thus, information is represented
by the origin and timing of individual events. This informa-
tion is preserved as long as the bandwidth of the communica-
tions system is much greater than the rate at which events are
generated.
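The time-multiplexing idea can be illustrated with a minimal Python sketch. It assumes that each neuron's event times are already sorted, and the names used (`aer_multiplex`, integer addresses starting at 0) are hypothetical; the sketch models only the ordering of address-events on a shared channel, not the asynchronous hardware.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Event = Tuple[float, int]  # (event time in seconds, source address)


def _tag(address: int, times: Iterable[float]) -> Iterator[Event]:
    """Attach a source address to every event time of one neuron."""
    for t in times:
        yield (t, address)


def aer_multiplex(spike_trains: List[Iterable[float]]) -> List[Event]:
    """Merge per-neuron event times into one time-ordered address-event stream.

    Each inner iterable must already be sorted in increasing time.
    """
    tagged = (_tag(address, times) for address, times in enumerate(spike_trains))
    return list(heapq.merge(*tagged))


# Three hypothetical neurons sharing one channel.
trains = [[0.001, 0.004], [0.002], [0.003, 0.005]]
stream = aer_multiplex(trains)
```

Each entry of `stream` carries only an origin (the address) and a time, which is the information the protocol is said to preserve when the channel is much faster than the event rate.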
In this paper, we describe a custom communications archi-
tecture that we have developed for specific use in hardware
models of intersegmental coordination [6]. To match its bio-
logical counterpart, including the constraints due to interseg-
mental coordination, we implemented a unique AER architec-
ture that uses a pipelined broadcast scheme to emulate a large
number of intersegmental connections with distance-dependent
delays. The architecture is scalable, supports multichip communication,
and operates independently of the type of silicon
neuron (spiking or burst envelopes).
Fig. 1. Conceptual views of the (A) biological and (B) artificial segmental
architectures. The intersegmental communications network of the artificial
system facilitates communication among the intrasegmental units with
pipelined stages.
II. SYSTEM ARCHITECTURE
Our architecture and the biological system on which it is
based are shown in Fig. 1. The system is divided into intraseg-
mental units and an intersegmental connection network. The
intrasegmental units consist of a number of interconnected
bursting neurons, each of which generates action potentials (or
burst envelopes) and has the potential to receive both intraseg-
mental and intersegmental synaptic input. The intersegmental
connection network converts events from the intrasegmental
units (action potentials or burst envelopes) into packets of data
that are stored and then transmitted to other intrasegmental
units, where the information contained in the packets has an
effect on the intersegmental activity.
Researchers have developed address-event systems that use
either partitioned or global bus architectures that broadcast each
event to a large number (or all) of the other neurons [4], [5].
These architectures are not efficient for our application because
our system requires that each event be capable of synapsing on
every other segment (for generality) with a delay that is linear
in distance. This delay requirement would make the decoding
in a broadcast bus scheme very difficult. Therefore, to fulfill our
requirements, we developed a pipelined broadcast scheme that
is a direct parallel to its biological counterpart. The principal
novelty of our address-event scheme is that both addresses and
delays are generated implicitly from the system architecture.
In the architecture, each event is passed from segment to
neighboring segment bidirectionally down the length of the
one-dimensional communications network. By delaying each
event at every segment, the pipeline architecture facilitates the
creation of distance-dependent delays. These delays model the
delays present in the neural fibers that exist along the length of
the animal’s body.
Fig. 2. Relative addressing scheme showing the origination of
distance-dependent delays.
Fig. 3. Block-level diagram of a communications node illustrating how events
enter and exit each stage of the pipeline.
The other primary advantage of this architecture is that it can
easily generate a relative addressing scheme (as opposed to an
absolute addressing scheme that would be required in a global
bus architecture). By using relative addressing in our architec-
ture, we are able to implement synaptic spread—translational
invariance of synaptic connections.
Fig. 2 illustrates the event-passing architecture with respect
to the relative addressing and distance-dependent delays. Each
event, generated at a particular node (the center node, in this example),
is transmitted bidirectionally down the length of the network.
It is delayed by a fixed intersegmental delay at each segment, not including
the initiating segment. By using a sorted queue, multiple delays
can be achieved; thus, axons with different conduction veloci-
ties can be implemented. If all conduction velocities are equal,
the sorted queue can be replaced by the simpler first-in-first-out
(FIFO) queue. In either queue, as an event arrives at each new
segment, it is time stamped, its relative address is incremented
(or decremented), and then it is stored in a queue for the duration
of the intersegmental delay. As the event exits the queue, its data is decoded by
the intrasegmental units, and synaptic inputs are applied to the
appropriate intrasegmental neurons. To evoke a correct type of
response, the address of each event also contains information
about its event type, which can represent information such as
which neuron fired and whether the event should propagate in
the ascending or descending direction.
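As a rough illustration of how relative addresses and distance-dependent delays arise from hop-by-hop forwarding, the following Python sketch propagates a single event along a chain. It is a behavioral model under stated assumptions: segment 0 is taken to be the head, ascending means toward lower indices, the relative address is modeled as a hop count, all conduction velocities are equal, and the names `Delivery` and `broadcast` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Delivery:
    segment: int            # segment that receives the event
    relative_address: int   # number of hops from the originating segment
    direction: str          # "ascending" (toward head) or "descending"
    time: float             # delivery time


def broadcast(num_segments: int, source: int, delay: float,
              t0: float = 0.0) -> List[Delivery]:
    """Forward one event from `source` in both directions, one hop at a time."""
    deliveries = []
    for step, direction in ((-1, "ascending"), (+1, "descending")):
        segment, hops, t = source + step, 0, t0
        while 0 <= segment < num_segments:
            hops += 1     # relative address grows by one at every stage
            t += delay    # every stage holds the event for one delay interval
            deliveries.append(Delivery(segment, hops, direction, t))
            segment += step
    return deliveries


result = broadcast(num_segments=5, source=2, delay=1.0)
```

No stage needs to know the absolute address of the source: the relative address and the total delay both accumulate as a side effect of forwarding.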
A. Routing Events
A block-level diagram of a single communications node il-
lustrating how events enter and exit each stage of the pipeline is
shown in Fig. 3. Because events arriving from neighboring seg-
ments are merged and inserted into the queue of a local segment,
and events that exit the queue are sent back to the neighboring
segments, it is possible for events to circulate around segments
indefinitely. To prevent this possibility, it is necessary to detect
circling events, and when detected, to prevent them from en-
tering the queue. This is accomplished by tagging events with a
direction bit that describes whether the event is ascending
(propagating toward the head) or descending
(propagating toward the tail). An event is
dropped, for example on the descending port (a port designated
for receiving descending events from the rostral segment), if the
event is of type ascending.
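The drop rule can be sketched in a few lines of Python. The numeric encoding of the direction bit and the names `accept_on_port` and `filter_port` are assumptions made for illustration only.

```python
from typing import List, Tuple

ASCENDING = 0   # assumed encoding: event travels toward the head
DESCENDING = 1  # assumed encoding: event travels toward the tail

Event = Tuple[int, str]  # (direction bit, opaque payload)


def accept_on_port(port_direction: int, event_direction: int) -> bool:
    """Accept an event only if it travels in the direction the port serves."""
    return port_direction == event_direction


def filter_port(port_direction: int, events: List[Event]) -> List[Event]:
    """Drop events that would otherwise circulate back the way they came."""
    return [e for e in events if accept_on_port(port_direction, e[0])]


# The descending port receives events from the rostral neighbor.
incoming = [(DESCENDING, "a"), (ASCENDING, "b"), (DESCENDING, "c")]
kept = filter_port(DESCENDING, incoming)
```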
PATEL et al.: AN ASYNCHRONOUS ARCHITECTURE FOR MODELING INTERSEGMENTAL NEURAL COMMUNICATION 99
The benefit of a single queue instead of two queues (for
events propagating in opposite directions) is manifested when
the utilization of the queue is considered at each segment. If
two queues are used, the queues at the boundaries of the system
would either be overutilized or underutilized, depending on the
direction and the position of the queue. For example, in the
head segment of the system, the queue storing the descending
events would remain unoccupied, while the queue storing the
ascending events would be heavily occupied. Thus, by storing
both ascending and descending events in a single queue, the
occupancy of the queues along the length of the system will, on
average, be uniform.
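The uniform-occupancy argument can be checked with a small counting sketch in Python. It assumes, for illustration, that every segment generates events at the same rate and that every event travels to the end of the chain; segment 0 is taken to be the head, and `queue_sources` is a hypothetical name.

```python
from typing import List, Tuple


def queue_sources(num_segments: int) -> List[Tuple[int, int, int]]:
    """Count, per segment, how many other segments feed its queue.

    Returns (ascending_sources, descending_sources, total) for each segment,
    with index 0 taken to be the head.
    """
    loads = []
    for k in range(num_segments):
        ascending = num_segments - 1 - k   # sources located toward the tail
        descending = k                     # sources located toward the head
        loads.append((ascending, descending, ascending + descending))
    return loads


loads = queue_sources(16)
```

With separate queues, the per-direction counts range from 0 to 15 across a 16-segment chain, whereas the combined count is 15 at every segment.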
Another advantage of using a single queue is that the number
of I/O signals between each pipeline stage is reduced. The use of
two queues would require two input ports and two output ports,
whereas in the single-queue design, two input ports and a single
output port are required. In our design, the events that exit the
stage are received by three input ports: the input port of the local
intrasegmental unit and the input ports of the two neighboring
segments.
Although local synaptic connections can be hard-wired in the
intrasegmental units, generally, it should also be possible to es-
tablish local CPG connections via the communications network.
Since local connections are fast, the CPG events are inserted
at the tail of the queue, bypassing the intersegmental delay.
These events are immediately received by the local intraseg-
mental units in which appropriate action can be taken. An ad-
vantage of this feature is that the circuit implementation for both
local and long-distance weights is the same.
B. Event Types
One of our goals is to use different intrasegmental units
without redesigning the network. This constraint implies a
generic interface between the intrasegmental units and the
intersegmental communications network. One method we use
to implement a generic interface is to append additional bits
of data to each event address. The additional bits encode an
event type that carries information about which intrasegmental
neuron generated the event and what kind of response should
be evoked. Because the event types are meaningful only to the
intrasegmental units, they are encoded by the intrasegmental
units and are transmitted from segment to segment without
interpretation by the intersegmental communications network.
We use silicon Morris–Lecar neurons, whose outputs are
burst envelopes, to construct the half-center
oscillators that form the intrasegmental units. In order to evoke
an appropriate synaptic response, we encode events as rising or
falling based on the change in a neuronal membrane potential.
This information is encoded by feeding both inverted and
noninverted versions of the neuronal membrane potential into
the input of an encoder. The encoding of event types includes
information describing which neuron fired, whether
the event was a rising-edge or a falling-edge event,
and whether the event should propagate in the ascending or
descending direction. Because we want to establish
connections in both directions, each transition in a neuron’s
membrane potential generates both ascending and descending
event types. The relative address of all events generated by
local intrasegmental units is zero.
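One possible packing of an event word is sketched below in Python. The bit layout is an assumption chosen for illustration (one bit each for neuron, edge, and direction, followed by a four-bit relative address); the source does not specify the ordering of the fields, and the function names are hypothetical.

```python
TYPE_BITS = 3      # assumed: one bit each for neuron, edge, and direction
REL_ADDR_BITS = 4  # assumed width of the relative-address field


def encode_event(neuron: int, rising: bool, ascending: bool,
                 relative_address: int = 0) -> int:
    """Pack an event type and a relative address into one integer word."""
    if neuron not in (0, 1):
        raise ValueError("this sketch models two neurons per segment")
    if not 0 <= relative_address < (1 << REL_ADDR_BITS):
        raise ValueError("relative address out of range")
    event_type = (neuron << 2) | (int(rising) << 1) | int(ascending)
    return (relative_address << TYPE_BITS) | event_type


def decode_event(word: int) -> dict:
    """Unpack a word produced by encode_event."""
    event_type = word & ((1 << TYPE_BITS) - 1)
    return {
        "neuron": (event_type >> 2) & 1,
        "rising": bool((event_type >> 1) & 1),
        "ascending": bool(event_type & 1),
        "relative_address": word >> TYPE_BITS,
    }


# A locally generated event starts with relative address zero.
word = encode_event(neuron=1, rising=True, ascending=False)
```

Two neurons, two edges, and two directions give eight distinct event types, so three type bits suffice under these assumptions.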
Fig. 4. Processing of event data at each node of the communications network
(handshaking signals were omitted for clarity).
C. Processing Events
In each stage of the pipeline, events must be stored in a queue
for the intersegmental delay interval, and before they exit the stage, their relative
addresses must be incremented. The section of the stage that
processes the events is shown in Fig. 4. This section contains
two counters, a queue, a comparator, and an incrementer. The
two counters run at the same clock speed; however, because they
are initialized to different values, their outputs display a constant
offset. This offset corresponds to the intersegmental
delay.
An event is processed as follows. When an event arrives from
a neighboring segment, it is time stamped with the contents
of the first counter and is subsequently inserted
into the head of the queue. At the tail of the queue, the event’s
time stamp is compared with the contents of the second counter.
When the two values match, the event is allowed to exit the queue.
Because we are emulating axons with uniform conduction ve-
locities, events exit the queue in the same order as they enter.
However, before exiting the stage, the event’s relative address
is incremented by the incrementer. We implement the delaying
of events with two counters instead of one counter because the
implementation is simplified. With a single-counter design, an
asynchronous adder would be necessary to add an offset
to each time stamp. In our implementation, a similar design is
used for the two counters.
In our current implementation, the width of the counters, the
comparator, and the incrementer is four bits. The queue is com-
posed of 18 stages.
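The two-counter scheme can be mimicked in software. The Python sketch below is a clocked behavioral model, not the asynchronous circuit: it assumes four-bit wrap-around counters, an equality comparator, a FIFO queue, and an offset between 1 and 15 counts. The class name `DelayStage` and its interface are hypothetical.

```python
from collections import deque
from typing import Iterable, List

WIDTH = 4
MOD = 1 << WIDTH  # four-bit counters wrap at 16


class DelayStage:
    """Behavioral model of one pipeline stage with two offset counters."""

    def __init__(self, offset: int) -> None:
        if not 0 < offset < MOD:
            raise ValueError("offset must be between 1 and 15")
        self.stamp_counter = offset  # first counter, initialized ahead
        self.release_counter = 0     # second counter, lags by `offset`
        self.queue = deque()         # entries are (time stamp, relative address)

    def step(self, arrivals: Iterable[int] = ()) -> List[int]:
        """Advance one clock tick; return relative addresses of released events."""
        released = []
        # Release every event at the exit whose stamp equals the second counter.
        while self.queue and self.queue[0][0] == self.release_counter:
            _, rel_addr = self.queue.popleft()
            released.append((rel_addr + 1) % MOD)  # incrementer on the way out
        # Stamp new arrivals with the first counter and enqueue them.
        for rel_addr in arrivals:
            self.queue.append((self.stamp_counter, rel_addr))
        self.stamp_counter = (self.stamp_counter + 1) % MOD
        self.release_counter = (self.release_counter + 1) % MOD
        return released


stage = DelayStage(offset=3)
outputs = [stage.step([0]) if t == 0 else stage.step() for t in range(5)]
```

Because the stamp is taken from a counter that already leads by the offset, no addition is needed per event, which mirrors the stated reason for using two counters rather than one counter and an adder.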
D. Analog/Digital Partitioning
To test and debug our system, we make the inputs and
outputs of each segment accessible by partitioning the system.
Although each segment can be implemented with a single
custom analog/digital chip, we have partitioned the intraseg-
mental system, as shown in Fig. 5. This partitioning facilitates
the testing and debugging at the intrasegmental level and de-
composes the segment into smaller analog and digital sections.
The CPG chip contains two silicon Morris–Lecar neurons
(i.e., burst-envelope neurons) [7], 32 synapses (16 per neuron),
and an asynchronous decoder that interfaces the communi-
cations network with the synapses. Each synapse emulates
a graded synaptic transmission that is described by a sharp
and fast thresholding function.
Fig. 5. Intrasegmental partitioning of analog pattern-generating circuit and digital communications network. Note that the output of the AER chip has three
destinations.
The AER chip contains an
eight-input arbiter/encoder section, an event processing sec-
tion, and input and output merge sections. The arbiter/encoder
section provides an interface between neurons and the com-
munications network. This section detects events from neurons
(rising edges), arbitrates between the events, and encodes the
event types. It creates the appropriate handshaking signals that
are compatible with the output merge section. The input merge
section accepts events from neighboring segments, samples
their direction bit, and makes a decision whether to accept or drop
the event. After the decision, it forwards accepted events from
both the rostral and caudal segments into the event processing
section. As described in Section II-C, the event processing sec-
tion time stamps, stores, and processes the events. The output
merge section merges events arriving from the queue and the
CPG chip and sends them to both the neighboring segments
and the local CPG chip.
We use a microcontroller in each segment to program the
analog synapses in the CPG chip and to deliver a programmable
clock to the AER chip. A host processor delivers commands to
the individual microcontrollers via an Inter-IC (I2C) interface.
Section III describes the communication between each of
these circuits.
III. ASYNCHRONOUS METHODOLOGY
Because biological systems operate asynchronously (the
generation of action potentials between neurons is not synchro-
nized), the choice to make the design asynchronous is another
step toward mapping the architecture to the biological systems.
Since segmented biological systems contain varied numbers of
segments (from about 20 segments in the leech to about 100
in the lamprey), an asynchronous implementation facilitates
the addition of segments without a major redesign and without
having to redistribute global clock signals over a large number
of segments.
A. Synthesis Methodology
We have utilized the synthesis methodology described by
Martin [8]. Although many techniques for synthesizing asyn-
chronous circuits exist, the following are the primary reasons
for using this methodology.
• The rules for implementing processes are simple and in-
tuitive, allowing for manual synthesis and optimization of
the circuits.
• A high-level description or a communications program is
decomposed into smaller processes, making the synthesis
of complex modules tractable.
• The synthesis methodology yields primitives that di-
rectly translate to implementations at the transistor level,
resulting in compact circuits suitable for VLSI implemen-
tation.
• This methodology has been demonstrated successfully in
another neuromorphic system [4].
The first step in the compilation process is to write a
high-level description of each process in the variant of CSP [9]
known as communicating hardware processes (CHP). Each
high-level description is then decomposed into smaller processes,
and the communication actions in each simplified process are
expanded into a four-phase handshake sequence. The elements
in this sequence consist of wait statements and corresponding
transitions of request and acknowledge lines of all the ports
in the process. A production rule set is generated from this
sequence for the design of pull-up and pull-down networks. To
optimize the circuit implementation, signal transitions in the
handshake-expansion sequence may be reshuffled as long as
the four-phase handshake protocol is adhered to and program
functionality remains unaltered. In addition, the predicates
guarding each transition can be strengthened or weakened as
long as they remain noninterfering (that is, the guards do not
introduce signal contention problems).
We now briefly review some notation. To begin with:
• $x\uparrow$: drive $x$ to a high value;
• $x\downarrow$: drive $x$ to a low value;
• $S_1; S_2$: sequential activity, complete $S_1$
before beginning $S_2$;
• $[B]$: wait for $B$ to become TRUE.
In CHP, two processes sharing a common channel commu-
nicate with each other by using communications commands on
their ports. The communications channel (and port) consists of
a request line, an acknowledge line, and data lines. A high-level
description of a program is made up of multiple processes that
are composed by one of three operators: the sequential oper-
ator, represented by a semicolon (;); the concurrent or parallel
operator, represented by parallel bars ($\parallel$); and the coincident
operator, represented by a bullet ($\bullet$). When processes are to be
executed in parallel, as in $S_1 \parallel S_2$,
each subprocess can be executed in any order. In $S_1; S_2$,
the subprocesses are executed in the order specified. When using
a bullet operator, however, subprocesses are executed and com-
pleted at the same time. Note that the definition of the coincident
operation is unambiguous if the subprocesses are independent or
noninterfering (i.e., the processes do not share variables). For-
mally, the definition is that both subprocesses complete in the
same state of the computation, where completion is defined as
the point at which all possible continuations of the computation
(traces) contain the remainder of the subprocess.
The execution of processes may be guarded by a Boolean
predicate. When the program is designed such that only one
guard is TRUE at any given time, the selection of the command is
deterministic. The deterministic selection operator is a hollow
bar ($\square$). The deterministic choice in
$[G_1 \rightarrow S_1 \;\square\; G_2 \rightarrow S_2]$
is feasible if and only if at most one guard evaluates to TRUE.
When several guards may be TRUE at the same time, the selection
of the command is nondeterministic. The nondeterministic
selection operator is a single bar ($|$).
A program is repeated forever by using the repetition operator,
as in $*[S]$. A probe command at a given port $X$, written $\overline{X}$, is to
check whether communication is pending. $\overline{X}$ can be considered
a guarded expression that checks whether the request on port
$X$ is TRUE. A guarded command in closed brackets, $[\overline{X}]$, indicates
“hold until $\overline{X}$ is TRUE.” The statement $X?b; Y!b$ indicates that
data on port $X$ is to be read and stored in buffer $b$ and subsequently
output on port $Y$. In the situation in which buffering is
unnecessary, the read and the write are combined into a single
statement that passes the data directly from the input port to the output port.
B. Synthesis Example: The Merge Process
We use the MERGE process to demonstrate the synthesis
methodology. We start with the definition of a circuit that performs
a MERGE operation [8]. The circuit, whose block-level
diagram is shown in Fig. 6(a), is used for merging events arriving
from neighboring segments at two input ports and
inserting them into a stage (e.g., a queue) at an output port. The control
signals of the two input ports consist of request lines that are inputs
and acknowledge lines that are outputs; the output port contains a
request line that is an output and an acknowledge line that is an input.¹
Fig. 6. (A) Block-level diagram of the MERGE process and
(B) decomposition into smaller subprocesses.
To increase the throughput of the system, the data arriving at
the two input ports may be buffered locally; however, because
compactness, and not speed, is important in our application, we
do not buffer the data. Thus, the high-level description is written
as
Because the events are asynchronous, a nondeterministic choice
is necessary to gain access to the queue (the output channel). In addition,
as the data is not stored locally, a request at an input port should
not be acknowledged until the subsequent request at the output port is
acknowledged.
As shown in Fig. 6(b), this problem is made tractable by decomposing
the MERGE process into four subprocesses:
two input-handshake processes (one per input port), an arbiter
process, and a multiplexer process. The input-handshake processes
make requests for gaining access to the output channel and coordinate
communication among their input port, the arbiter, and the
multiplexer. The arbiter process makes the nondeterministic choice
and grants permission for the access of the output channel. The
multiplexer process multiplexes data from either input port onto the
output port and initiates communication on the output port.
The CHP description for these processes is
as follows:
The role of the two input-handshake subprocesses is understood by studying
the sequence of atomic actions that occur if a request is received,
for example, on the first input port. In this situation, the corresponding
subprocess will make a request to the arbiter; when the arbiter
acknowledges, the subprocess makes a request to the multiplexer;
when the multiplexer acknowledges, the subprocess
releases control of the channel by dropping its request
to the arbiter; it then drops its request to the multiplexer;
and finally, it acknowledges the original
request at the first input port. During this process, if a request were
received at the second input port, the request would be blocked by the arbiter. To
guarantee that the data on the output port remains stable, the acknowledge
to the arbiter must be delayed until the acknowledge from
the queue is received.
¹ To maintain a consistent notation, the request, acknowledge, and
data signals of a port X are each written as X with a distinguishing mark,
using one set of marks for passive ports and another for active ports.
Fig. 7. Circuit implementation of the MERGE process. Note the symbols
used for state-holding elements (staticizer circuits) and transmission gates
(at output of Q). The gain of the feedback element of the staticizer is
approximately 1/8 the gain of the driving circuit.
Fig. 8. Four-phase protocol and bundled data convention. The data input is
captured (seen as a transition on data in) prior to the rising edge of req in.
For each of the two input-handshake processes, although the subprocesses
joined by the coincident operator are interrelated (there is dependence
through a variable), we use the coincident operator to signify
that the two subprocesses are to be executed and completed
at the same time. The ambiguity in the definition will be re-
solved when we perform handshake expansion on the program.
Following the synthesis compilation process defined in [8], the
resulting circuit is shown in Fig. 7.
C. Handshaking Protocol
We assume that data signals are valid when the control signal
becomes valid (i.e., the signals have settled to their final values).
Specifically, when a request line is raised by a preceding stage,
the stage receiving the request assumes that the data on its data
port is valid. This assumption is known as the bundled data con-
vention [10]. It is illustrated in Fig. 8 using a four-phase hand-
shaking protocol. To ensure that the timing assumption is valid,
we have carefully engineered and analyzed the delays on data
and handshaking signals. A data request is propagated to a sub-
sequent stage following the rising acknowledgment of the cap-
ture of the data in the current stage. This corresponds to the
narrow data release scheme [11].
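The ordering of transitions in one four-phase, bundled-data transfer can be written out as a short Python sketch. It is a sequential trace under the stated timing assumption (data are valid before the request rises); the function name and the trace format are hypothetical, and the sketch does not model circuit delays.

```python
from typing import List, Tuple

Step = Tuple[str, int, int]  # (description, req level, ack level)


def four_phase_transfer(value: int) -> Tuple[List[Step], int]:
    """Return the transition trace and the captured value for one transfer."""
    req, ack = 0, 0
    trace: List[Step] = []

    data = value                        # sender drives data first (bundled data)
    trace.append(("data valid", req, ack))

    req = 1                             # phase 1: request rises
    trace.append(("req rises", req, ack))

    captured = data                     # receiver samples; req high implies valid data
    ack = 1                             # phase 2: acknowledge rises
    trace.append(("ack rises", req, ack))

    req = 0                             # phase 3: request falls
    trace.append(("req falls", req, ack))

    ack = 0                             # phase 4: acknowledge falls, channel idle
    trace.append(("ack falls", req, ack))
    return trace, captured


trace, captured = four_phase_transfer(0b1010)
```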
IV. IMPLEMENTATION
In order to implement this architecture, the full processing
of an event must be considered. First, we need to generate an
event from the rapid rise or fall in the membrane potential of a
spiking neuron. The event must then be arbitrated, encoded, and
inserted into the communications channel. Next, the event must
be transmitted to all of the other segments, and finally the event
needs to be decoded such that it affects one synaptic module.
The details of these event handling procedures are presented in
this section.
A. Event Generation, Encoding, and Arbitration
Events in this communication architecture are generated by
silicon neurons. Many such neurons have been presented in the
literature [12]–[14]; they generate either spike
trains (i.e., action potentials) or burst envelopes that represent
these spike trains, and either output can be encoded as events. CPG circuits are
created from a set of these neurons. For our system, the neurons
generate burst envelopes, which are represented by high voltages during
spiking and low voltages during silence. As a result, two event
types are encoded—one by the rising edge of an envelope (i.e.,
the beginning of a spike train) and the other by the falling edge
of an envelope (i.e., the end of a spike train).
To interface neurons to asynchronous circuits, we must
transmit each event with a pair of request/acknowledge lines
that adhere to the four-phase handshake protocol that we have
adopted. Note that there is an implicit timing assumption in
this specification which is required for synthesis. Specifically,
there must be sufficient delay between the rising and falling
events in order for the communication action to occur and to
be able to discern between both events; this is ensured because
the periodicity of the events is on biological time scales (i.e.,
on the order of milliseconds).
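A software analogue of this edge-based event generation is sketched below in Python. It assumes a sampled envelope and a fixed threshold, both introduced here for illustration; the silicon neurons produce continuous voltages, and `envelope_events` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def envelope_events(samples: Sequence[float],
                    threshold: float) -> List[Tuple[int, str]]:
    """Return (sample index, "rising" | "falling") for each envelope edge."""
    events: List[Tuple[int, str]] = []
    if not samples:
        return events
    previous = samples[0] > threshold
    for index in range(1, len(samples)):
        current = samples[index] > threshold
        if current != previous:
            # A low-to-high crossing marks the start of a burst,
            # a high-to-low crossing marks its end.
            events.append((index, "rising" if current else "falling"))
        previous = current
    return events


envelope = [0, 0, 1, 1, 1, 0, 0, 1]
edges = envelope_events(envelope, threshold=0.5)
```

In the hardware, each detected edge would additionally start a request/acknowledge exchange, which is why consecutive edges must be separated by enough time for the handshake to complete.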
Once an event is generated by a neuron in the CPG chip, the
event must be inserted into the communications channel. The ar-
biter/encoder section of the AER chip accomplishes this task by
arbitrating, encoding, and inserting the events into the commu-
nications channel. The decomposition of this process is shown
in Fig. 9.
The actual insertion of events into the communications
channel is controlled by the interaction between the arbiter tree
and the neuron/arbiter modules, as shown in Fig. 9. The
high-level descriptions of the neuron/arbiter module and the arbiter processes are
When an event is received, the corresponding neuron/arbiter module
makes a request to an asynchronous arbiter for access to the
communications channel. The temporal ordering of multiple
events is preserved by the arbiter, and in the case of nearly
simultaneous events, a nondeterministic² choice is made. After
receiving access, the module is allowed to make a request
to the communications channel. As a result of the bundled data
convention, however, the request is not made until the data on
the output port is valid. This condition, fulfilled by ensuring that
the delay in the control path is greater than the delay in the data
path, is guaranteed by adding sufficient delay in the acknowledge
lines of the arbiter (see Fig. 9). The bullet operator in the
above description signifies that the two subprocesses complete
in the same state. The implementation of the arbiter/encoder
section is similar to the $N$-input merge; however, instead of
multiplexing data, we are creating data. Because the acknowledge
lines of the arbiter are mutually exclusive, the event address is
encoded by using these lines. In addition, the mutually exclusive
acknowledge lines facilitate the use of a common acknowledge
line for the neuron/arbiter modules. We use a wired-OR
configuration for the generation of an output request.
² The choice, however, may be biased because mismatches in the arbiter might
cause some inputs to be favored.
Fig. 9. Arbiter/encoder section illustrating arbitration and encoding of events.
We implement an $N$-input arbiter with a tree of two-input arbiters.
The tree, whose depth is $\log_2 N$, contains $N-1$ arbiter
cells. The eight-input arbiter used in the arbiter/encoder section
is shown in Fig. 10. The high-level description of an individual
arbiter cell, containing two input ports and a single
output port, is given as
The implementation of the arbiter cell is based on a previ-
ously published design [15]. (Note: To adhere to the four-phase
protocol, the original design has since been improved [16].)
To complete the arbitration process, the acknowledge signal
on top of the tree is generated from the output request of the
Fig. 10. Eight-input arbiter used in the arbiter/encoder section.
last stage. Because the acknowledge signals are active low, an
inverted version of the output request is connected to the ac-
knowledge line.
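As a rough illustration of the arbiter/encoder structure described above, the following Python sketch computes the shape of a binary arbiter tree and models a single grant followed by address encoding. The function names and the lowest-index tie-break are assumptions made for illustration only; the hardware arbiter resolves nearly simultaneous requests nondeterministically.

```python
def arbiter_tree_shape(n_inputs):
    """Return (depth, cell_count) for a tree of two-input arbiters.

    Assumes n_inputs is a power of two, as in the eight-input example.
    """
    if n_inputs < 2 or n_inputs & (n_inputs - 1):
        raise ValueError("n_inputs must be a power of two >= 2")
    depth = n_inputs.bit_length() - 1   # log2(n_inputs)
    cells = n_inputs - 1                # one cell per internal tree node
    return depth, cells


def grant(requests):
    """Return one-hot acknowledge lines for a list of request booleans.

    Assumption: the lowest-index pending request wins. A real arbiter
    makes a nondeterministic choice among near-simultaneous requests.
    """
    acks = [False] * len(requests)
    for i, pending in enumerate(requests):
        if pending:
            acks[i] = True
            break
    return acks


def encode_address(acks):
    """Encode mutually exclusive acknowledge lines as an event address."""
    if sum(acks) != 1:
        raise ValueError("acknowledge lines must be one-hot")
    return acks.index(True)
```

For an eight-input arbiter, `arbiter_tree_shape(8)` gives a depth of three and seven cells. Because only one acknowledge line is active at a time, the address can be derived directly from those lines.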
Because a neuron’s axon projects to both sides of the seg-
ment, every rise and fall in the neuron’s membrane potential
should generate two events: an ascending event and a de-
scending event. Two events are produced by connecting the
neuron’s output to two handshake modules, which in turn,
connect to two inputs of the arbiter/encoder section.
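The pairing of events per edge can be sketched as follows. The `Event` record and its field names are hypothetical and do not reflect the actual bit layout of the event data; the sketch only shows that each edge yields one ascending and one descending event, both starting with a relative address of zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    relative_address: int  # hypothetical field: segments travelled so far
    neuron: int            # hypothetical field: index of the source neuron
    rising: bool           # True for a rising edge, False for a falling edge
    ascending: bool        # direction of travel along the chain


def events_for_edge(neuron, rising):
    """Return the two events produced by one edge of a neuron's output."""
    return [
        Event(relative_address=0, neuron=neuron, rising=rising, ascending=True),
        Event(relative_address=0, neuron=neuron, rising=rising, ascending=False),
    ]
```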
B. Event Transmission
Once the events are generated by the individual neurons and
inserted into the communications network, they must be trans-
mitted to all other segments. Events arriving from ascending and
descending segments are filtered and then merged by the input
merge section, and subsequently, inserted into the event-pro-
cessing section. In the event-processing section, the events are
time stamped, stored in a queue for a length of time corre-
sponding to the intersegmental delay, and their addresses are
incremented as the events exit the queue. The events are then
merged with events arriving from the CPG chip by the output
merge section. After the events exit the output merge section,
the events are transmitted to the neighboring segments and the
local segmental unit.
1) Merging and Time-Stamping Events: The output merge
section simply merges two event streams (without processing
the events). The input merge section, however, has the addi-
tional task of dropping the circulating events. This task is ac-
complished by checking the polarity of the direction bit in the
event data. Descending events arriving from the ros-
tral segment and ascending events arriving from
the caudal segment are allowed to pass. In contrast, ascending
events arriving from the rostral segment and descending events
arriving from the caudal segment are rejected. The process de-
composition of the input merge section is shown in Fig. 11. The
events are filtered by the following two subprocesses:
Fig. 11. Process decomposition of the input merge section. The direction of
events is checked by ASC DIR and DES DIR.
Fig. 12. Circuit implementation for DES DIR.
The handshake expansion of is
The production rule set obtained from the above expansion is
Using this production rule set, the circuit schematic of DES DIR is shown in Fig. 12. Because the input events travel across chip boundaries, to guarantee that our bundled-data assumption holds, we add delay to the input request. The implementation of ASC DIR is similar to that of DES DIR; however, the polarity of the direction bit is opposite to that in DES DIR.
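The filtering rule of the input merge section can be summarized in a few lines of Python. This is a behavioral sketch only; the string labels for the two input sides and the function name are assumptions, and the hardware implements the check on the direction bit with the handshake circuits described above.

```python
ROSTRAL = "rostral"
CAUDAL = "caudal"


def passes_input_merge(source, ascending):
    """Return True if an event arriving from `source` may enter the segment.

    Descending events are accepted only from the rostral neighbor and
    ascending events only from the caudal neighbor; all other
    combinations are circulating events and are dropped.
    """
    if source == ROSTRAL:
        return not ascending
    if source == CAUDAL:
        return ascending
    raise ValueError("source must be ROSTRAL or CAUDAL")
```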
The first step in processing the events is appending a time
stamp to the event data. The output of a synchronous counter is
used as the time stamp. Because a request from the input merge
section and an upward transition in may occur within
an arbitrarily small interval, a nondeterministic choice is neces-
sary. Thus, the high-level description of the time stamp process
is given as
where events arrive at port and exit at port . The above process either delays a transaction on the output port until the counter has finished updating or delays the clock for the counter until the transaction on the output port is finished. In the latter case, the assumption is that the delay will be small relative to the period of the clock. If the delay were large relative to the clock period, the time stamp would drop a clock tick, affecting only items already in the queue, since the delay is measured relative to that register. In practice, this never occurs because the clock period is very large relative to electronic delays.
Fig. 13. Process decomposition of the TIME STAMP process.
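A behavioral sketch of the time-stamping step is given below. The class name and the counter width are assumptions. In this model, a clock tick and an incoming event are never processed at the same instant because each is a separate method call, which plays the role of the mutual exclusion that the arbiter provides in hardware.

```python
class TimeStamper:
    """Append the current value of a free-running counter to each event."""

    def __init__(self, bits=8):
        # Assumed counter width; the counter wraps around at 2**bits.
        self.modulus = 1 << bits
        self.count = 0

    def tick(self):
        """Advance the counter on a rising clock edge."""
        self.count = (self.count + 1) % self.modulus

    def stamp(self, event):
        """Return the event paired with the current counter value."""
        return (event, self.count)
```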
The process decomposition of the TIME STAMP process
is shown in Fig. 13. The clock signal, which is an internally
buffered signal, is converted to a pair of handshake signals by
the module. To satisfy the bundled-data assumption (that is,
to ensure that the output of the counter is valid before sampling
its contents), we add delay to the acknowledge signal on port
of the module. The undelayed version of this acknowledge
signal is used to update the counter (on every rising edge). Sub-
process controls communication among the
input port (port ), the arbiter (port ), and the queue (port ).
Its high-level description is
The handshake expansion can be written as
The production rule set for the above expansion yields three
wires:
The wires are shown as dashed lines in Fig. 13.
2) Sorted Queue: Within a single communication channel,
events may arrive at the QUEUE out of temporal order. Thus,
a reordering of events in the QUEUE is necessary. Although
not implemented with the architecture described in this paper, a
bubble sort algorithm is sufficient to maintain proper ordering:
each event progresses through the queue, overtaking events
ahead of it, until its time stamp is greater than the time stamp
of the event in the stage ahead of it [6].
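The reordering rule can be sketched as an insertion step of a bubble sort. This is an illustration of the rule stated above, not part of the implemented architecture; the tuple layout `(time_stamp, payload)` is an assumption, and counter wraparound is ignored for simplicity.

```python
def insert_sorted(queue, event):
    """Insert `event` at the tail and let it overtake later-stamped events.

    `queue[0]` is the head (next event to exit). Each entry is a
    (time_stamp, payload) tuple. The new event moves toward the head
    while its time stamp is smaller than that of the event ahead of it.
    """
    queue.append(event)
    i = len(queue) - 1
    while i > 0 and queue[i][0] < queue[i - 1][0]:
        queue[i], queue[i - 1] = queue[i - 1], queue[i]
        i -= 1
    return queue
```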
3) Storing Events—Asynchronous FIFO: A single stage of
the queue is shown in Fig. 14. To minimize area, we use latches
as the basic storage element in the register. In our implementa-
tion, the latches are transparent and state holding for low and
high levels of the clock, respectively. When the queue stage
is empty, the latches are transparent and ready to capture the
input data. After the input request is raised, the data is cap-
tured by the latches. The storage and transmission of data are
Fig. 14. Single stage of the queue.
Fig. 15. Implementation of REG CONT.
controlled by process REG CONT. The design is straightforward and has been previously described in [10]. Its implementation is shown in Fig. 15. Note that in [10], the circuit is used
to control a two-phase pipeline, but we used the same design
to control a four-phase pipeline. Also note that the maximum
throughput with this pipeline control is about one half of that
achievable with more complex designs because adjacent stages
cannot be concurrently active in the steady state. Since we are
using the queue to delay events on biological time scales, this
consequence is not detrimental to our design, and as a result,
the queue throughput is sufficient to move all events forward
one place during one inter-arrival time. In addition, because the
data validity scheme is narrow (early release), events propagate
forward in the queue at the maximum rate corresponding to the
delay of one C-element, provided no preceding events exist in
the queue. This ensures that head-of-line events reach the head
of the queue as early as possible, minimizing the probability of
queue-induced jitter in the desired delay.
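The forward propagation of events through the queue can be illustrated with a simple stage-level model. The list representation and function names are assumptions, and one call to `step` stands for one stage delay. An event advances only if the stage ahead was empty at the start of the step, so adjacent stages are never active in the same step.

```python
def step(stages):
    """Advance every event by at most one stage toward the head (index 0).

    `None` marks an empty stage. Movement is decided from a snapshot of
    the queue, so an event directly behind a moving event waits one step.
    """
    nxt = list(stages)
    for i in range(1, len(stages)):
        if stages[i] is not None and stages[i - 1] is None:
            nxt[i - 1] = stages[i]
            nxt[i] = None
    return nxt


def settle(stages):
    """Apply `step` until no event can move any further."""
    current = list(stages)
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt
```

With an empty queue ahead of it, an event reaches the head in as many steps as there are empty stages, which is the behavior relied on to keep queue-induced jitter small.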
4) Delaying Events: At the tail of the queue, the time stamp
of an emerging event is compared with the output of a
second counter . When , the event is sent to the
incrementer , where its relative address is incremented.
Thus, the intersegmental delay is determined by the offset be-
tween the two counters.
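The two-counter delay mechanism can be sketched as follows. The class name, the counter width, and the choice that the stamping counter leads the comparison counter are assumptions; the sketch also assumes the guard is an equality test between the stored time stamp and the second counter. The delay of 13 ticks in the usage note is only an example value.

```python
class DelayLine:
    """Release each event a fixed number of clock ticks after it is stamped."""

    def __init__(self, delay_ticks, bits=8):
        self.modulus = 1 << bits
        if not 1 <= delay_ticks < self.modulus:
            raise ValueError("delay must satisfy 1 <= delay < 2**bits")
        # Assumption: the stamping counter leads by the desired delay.
        self.stamp_counter = delay_ticks
        self.compare_counter = 0
        self.queue = []  # entries are (time_stamp, relative_address)

    def push(self, relative_address):
        """Time-stamp an incoming event and store it in the queue."""
        self.queue.append((self.stamp_counter, relative_address))

    def tick(self):
        """Advance both counters; return the addresses of released events."""
        self.stamp_counter = (self.stamp_counter + 1) % self.modulus
        self.compare_counter = (self.compare_counter + 1) % self.modulus
        released = []
        while self.queue and self.queue[0][0] == self.compare_counter:
            _, address = self.queue.pop(0)
            released.append(address + 1)  # the incrementer adds one hop
        return released
```

For example, with `DelayLine(13)`, an event pushed with relative address 0 is released on the thirteenth tick with relative address 1.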
The process that receives an event and compares its time
stamp with the output of the second counter is described as
where events are received at port and exit at port . Also,
the guard is a Boolean-valued comparison operation ob-
tained by sampling the event time stamp and the output of the
counter; its value is TRUE when and is FALSE other-
wise. The skip operation, which is read as “do nothing,” prevents
the process from completing until the guard is TRUE. As
illustrated in Fig. 16, we decompose the process into two subprocesses, COMP REQ and DO COMP. The latter subprocess, which contains the second counter, does the comparison and completes the communications process when the guard is TRUE. COMP REQ makes a request for a comparison and coordinates the communication between the queue and the incrementer. The high-level descriptions of the subprocesses are
Fig. 16. Events exit the queue through the control of COMP REQ and DO COMP.
Fig. 17. Implementation of the DO COMP process.
The handshake expansion of DO COMP is
The production rule set is
Note that the sampling should occur every time an input re-
quest arrives and each time the counter is updated. However, to
prevent erroneous output during the update of the counter, the
comparison is disabled at the onset of the rising edge of the clock
for a sufficient time needed for the output of the counter to settle.
The circuit implementation of DO COMP is shown in Fig. 17. The
signal used to disable the comparison during the update of the
counter, , is obtained from the handshake module. This
signal is low at the onset of and remains low for suf-
ficient time to allow the outputs of the counter to become valid.
The guard that expresses the requirement that is
where
The handshake expansion of COMP REQ is
Fig. 18. Implementation of the COMP REQ process.
To simplify the implementation, we have shuffled the atomic
actions. Specifically, is the last atomic action so that can
be implemented with a single wire. The production rule set is
The implementation is shown in Fig. 18.
The final process in the event-processing section is respon-
sible for incrementing the relative address. Its process descrip-
tion is given as
where events enter at port and exit at port . An intermediate
register, , is used to increment the relative address. For the im-
plementation of the above process, we use the same circuit used
for the REG CONT process (a C-element). However, because
of the additional processing, we use edge-triggered storage de-
vices with appropriate combinational logic at their inputs in-
stead of latches.
5) Output Events: Events emerging from and the ar-
biter/encoder section are merged and sent, via the output merge
section, to the neighboring AER chips and the local CPG chip.
Because the intersegmental delay is much greater than
the time necessary to transmit the events across the segments
, we do not store (or pipeline) the events after they exit the
output merge section. Because the output merge section sends
events to three passive ports, we use a wired-AND configuration
for the output acknowledge signal. This connection reduces the
pad count, but at the cost of a steady-state current.
C. Event Reception
Each event is represented by an address that encodes the
location of its origin relative to its current position in the
system, the neuron in the CPG from which it was generated,
and whether the event is a rising-edge or falling-edge event.
As events enter the CPG chip, they are decoded such that
they affect one synaptic module in the CPG chip. The role of
each synaptic module is to appropriately excite or inhibit the
CPG neurons. If the synaptic weight is configured positive, it
will excite the neuron in the CPG; if it is configured negative,
it will inhibit the neuron. In our prototype system, all the
synaptic weights are inhibitory. Each synaptic module contains
a floating-gate transistor for storing an analog weight, and the
weights are programmed using Fowler–Nordheim tunneling
and hot-electron injection [17], [18]. Although every neuron
in the system is connected to every other neuron, connections
between neurons can effectively be broken by programming a
synaptic weight of zero.
Fig. 19. Decoding event data and establishing inter- and intrasegmental
connections.
The interface between the communications channel and the
synapses of the neurons is shown in Fig. 19. The interface con-
tains a decoder and intermediate modules that in-
terface the decoder to the synaptic modules. The synaptic mod-
ules use the signals, which are generated in the respec-
tive modules, to turn on or to turn off the inhibitory
currents. The module generates the signal
by sampling the bit of the event data. The description of the
decoder is given as
where events are received at port and are directed to one of
33 ports, – , depending on the mutually exclusive guards,
– . The handshake expansion below illustrates how we
implement two rows of the decoder:
Because of the bundled-data convention, the guards are
obtained by sampling the event address at the same time the
input request arrives. The Boolean variables – represent
inverted or noninverted versions of the individual bits of the
event address depending on the address to be decoded by
row . Because the guards are mutually exclusive and only a
single module is active at any given time, we use a
wired-OR connection for the acknowledge line at port .
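The row-selection logic of the decoder can be sketched in Python. The address width of six bits and the mapping of row index to address are assumptions chosen so that 33 rows can be distinguished; the sketch only illustrates that each guard is the AND of inverted or noninverted address bits and that at most one guard is true.

```python
def address_bits(address, width):
    """Return the bits of `address`, most significant first."""
    return tuple((address >> (width - 1 - k)) & 1 for k in range(width))


def decode(event_address, n_rows=33, width=6):
    """Return one boolean per row; row i is selected if its guard is true.

    Assumption: row i decodes address i. The guard for a row uses each
    address bit directly where the row pattern holds 1 and inverted
    where the pattern holds 0.
    """
    bits = address_bits(event_address, width)
    selected = []
    for row in range(n_rows):
        pattern = address_bits(row, width)
        guard = all((b if p else 1 - b) for b, p in zip(bits, pattern))
        selected.append(bool(guard))
    return selected
```

Because the guards are mutually exclusive, the acknowledge signals of the rows can share a single line without contention.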
The DEC SYN module, whose sole purpose is to generate
the signal, is described as follows:
Fig. 20. Circuit implementation of the DEC SYN module.
Fig. 21. Scope trace showing output events in a single communication node
when the node is injected with eight simultaneous events.
The handshake expansion is
The circuit implementation is shown in Fig. 20. After the
completion of a communications process, the state of the
variable is held to the appropriate value (depending on
) until the start of another communications process. This
facilitates the implementation of graded synaptic transmission.
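The state-holding behavior described above can be modeled with a minimal latch. The class name is hypothetical, and the sketch assumes that a rising-edge event turns the inhibitory current on and a falling-edge event turns it off, which is one plausible reading of the sampled bit.

```python
class SynapseControl:
    """Hold the on/off state of a synaptic current between events."""

    def __init__(self):
        self.on = False  # assumed initial state: current off

    def receive(self, rising):
        """Update the held state from the edge bit of an incoming event."""
        self.on = bool(rising)
        return self.on
```

The state changes only when an event arrives, so the synapse remains active for the full duration between a rising-edge event and the next falling-edge event.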
V. SYSTEM ANALYSIS
In order to understand the functionality and the scalability of
our system, we have looked at three critical issues: the channel
capacity, the queue size, and the synaptic connections. We ana-
lyzed the capacity of the channel as well as the size of the queue
in order to support data rates that we expected. We also consid-
ered the scalability of this architecture to larger systems through
the addition of synaptic connections.
A. Channel Capacity
The speed of the channel is limited by the interchip commu-
nication as a result of the wired-AND connection (introduced in
Section IV-B5). Therefore, in order to measure the channel ca-
pacity, we need to measure the maximum event rate between the
local CPG network and the neighboring segments. The events
are generated by applying a step input to the inputs of the ar-
biter/encoder section. When events are generated, they propa-
gate through the arbiter/encoder section and the output merge
section and are then received by the neighboring segments and
the local CPG chip. The rate of the output requests is an approx-
imate measure of the channel capacity. Fig. 21 shows a scope
trace of the output request line when a step input is applied to
the eight pads that receive events from the local CPG network.
The channel capacity, which is primarily limited by pad capac-
itance and drivers, is shown in the figure to be approximately
eight output events in 1.6 µs, or 5 × 10^6 events per second.
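The capacity estimate is a simple ratio of event count to observation window. The short check below takes the window in Fig. 21 to be 1.6 microseconds, which is an assumption about the time scale of the scope trace; the function name is illustrative.

```python
def channel_capacity(n_events, window_seconds):
    """Return the average event rate in events per second."""
    return n_events / window_seconds


# Eight output events observed within an assumed 1.6-microsecond window.
capacity = channel_capacity(8, 1.6e-6)
```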
B. Queue Size and Scaling
The queue in our $N$-segment lamprey system stores events for a fixed time corresponding to the intersegmental delay $\tau$, as defined by
$$\tau = \frac{T}{N} \tag{1}$$
where $T$ is the time required for an event to travel the entire length of the system.
Assuming the events are uniformly distributed and the system is in stochastic equilibrium (the average input rate into the communications network is equal to the average output rate), the total number of events in the system is given by
$$Q_{\mathrm{total}} = N \, r \, T = N f e \, T \tag{2}$$
where $r = f e$ is the average event rate per segment, $f$ is the lamprey swim frequency, and $e$ is the number of events per swim cycle. In our case, each segment in our burst-envelope implementation will output four events per swim cycle ($e = 4$): one event for each rising and falling edge for each of the two neurons in the segment.3 In (2), the total number of events generated per second is given by the product of the first two terms, $N r$, and $T$ gives the average time that an event resides in the queue. The total number of events in a single segment is, therefore, given by
$$Q = \frac{Q_{\mathrm{total}}}{N} = f e \, T \tag{3}$$
If we use this architecture to create a compartmental model of the locomotion system of an entire animal, each compartment (segment in our model) represents a portion of the animal’s body. The total time delay through the system, $T$, represents the axonal time delay down the total length of the animal’s body, and, thus, is a constant independent of $N$; in other words, $\tau$ is inversely proportional to $N$ as described by (1). Given that the total number of events in the queue, $Q$, sets the necessary size of the segmental queue, this queue size is also independent of $N$ and is a function only of $f$, $e$, and $T$ as described by (3).
In an intact lamprey, the swim frequency can vary from 0.25 to 10 Hz [19], thus generating an event rate of 1–40 events per second per segment (for $e = 4$), and the total axonal time delay can vary from 0.1 to 4.0 s. Therefore, the maximum queue size $Q_{\max}$ in the worst case scenario for our burst-envelope implementation is given by the largest swim frequency ($f = 10$ Hz) and the longest total delay ($T = 4.0$ s), as follows:
$$Q_{\max} = f e \, T = (10\ \mathrm{Hz})(4)(4.0\ \mathrm{s}) = 160\ \text{events} \tag{4}$$
This worst case represents the largest event rate and the case in which the events are stored for the longest possible time in the queue.
In order to reduce the cost of silicon for a prototype system,
we assumed a slightly more limited, but reasonable, swim
3A rising (or falling) edge actually generates two events in the queue: one in
the ascending direction and one in the descending direction. These two events
are each stored in the queue for an average time of T/2, resulting in a combined
average time of T.
frequency of 2 Hz and a maximum total delay of 2 s. Using
these numbers, we would need 16 queue elements per segment
to handle the resulting expected number of events. We built a
queue with 18 elements per segment in order to demonstrate
a proof-of-concept for our system. A queue of this size will
enable us to model both juvenile lampreys (larger , smaller )
and adult lampreys (smaller , larger ).
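The sizing argument above reduces to a product of swim frequency, events per swim cycle, and total delay. The following sketch reproduces the two cases discussed in the text; the function and parameter names are chosen for illustration.

```python
def queue_size(swim_hz, events_per_cycle, total_delay_s):
    """Expected number of events stored per segment."""
    return swim_hz * events_per_cycle * total_delay_s


def intersegmental_delay(total_delay_s, n_segments):
    """Delay per segment when the total delay is shared equally."""
    return total_delay_s / n_segments


# Worst case: 10 Hz swim frequency, 4 events per cycle, 4.0 s total delay.
worst_case = queue_size(10, 4, 4.0)

# Prototype assumptions: 2 Hz swim frequency, 4 events per cycle, 2 s delay.
prototype = queue_size(2, 4, 2)
```

The prototype case gives 16 events per segment, which is covered by the 18-element queue that was built, and neither result depends on the number of segments.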
The assumptions and implementations (spiking or burst
envelopes) under which we are working affect the size of the
queue. The following three variations on the assumptions are
particularly important.
1) If the events arriving at a particular segment are not uni-
formly distributed and actually arrive in bursts, then more
queue stages would be required to faithfully preserve the
spatiotemporal patterns in the system.
2) In a spiking-neuron implementation, the segmental
event-generation rate is equal to the average spike rate
per neuron multiplied by the number of neurons. This
value—which would likely be much larger than that for a
burst-envelope implementation—would be independent
of the swim frequency of the animal and, thus, would
change the system assumptions.
3) If only a portion of the animal’s body were being mod-
eled while holding constant the intersegmental delay ,
the total number of events to be stored in a single seg-
ment would vary proportionally with the number of seg-
ments .
C. Synaptic Connections
The intersegmental architecture that we have developed is
inherently modular, making the resulting system easily scal-
able through the addition or subtraction of segments. Although
having no impact on the queue size, changes in the number of
segments change the number of synapses per segment,
which is given by
(5)
where is the percentage of the number of segments down the
length of the body in which synaptic connections are made and
is the number of neurons per segment; in our case, .
Therefore, complete connectivity results in
synaptic connections. Note that the number of synapses per seg-
ment increases linearly with the number of segments.
In our present system, we implemented 16 segments
, two neurons per segment , 32 synapses per seg-
ment ( , ), and a queue size of 18 events per
segment . The size of an individual synapse is approximately
30 µm × 800 µm, or less than 0.5% of the total CPG-chip
area. Thus, to implement a fourfold increase in the number of
segments (to 64 segments), we would need to increase the size
of the CPG chip by approximately a factor of two to accommo-
date the additional synapses.
VI. SYSTEM VERIFICATION
We have designed and successfully tested a prototype communication
system fabricated in a 2-µm process that is adequate
to model a system of 16 segments with uniform axonal delay and
a queue size of 18 events/segment. Each segment includes an
Fig. 22. Photomicrograph of the communication network chip.
Fig. 23. Eight segments of the prototype system.
AER chip and a CPG chip that contains the silicon neurons and
synapses. The layout of the AER chip, shown in Fig. 22, con-
tains an eight-input arbiter/encoder section, an event processing
section, input and output merge sections, and an asynchronous
decoder that interfaces the communications network with the
synapses. Fig. 23 shows a part of the hardware implementation
of the prototype system. Although intrasegmental oscillators
with spiking neurons can be utilized with the communication
network, we chose to model only slow dynamics of the bursting
neurons through the implementation of the Morris–Lecar model
of a neuron (i.e., burst-envelope neurons) [7], [13].
A. Component Operation
To test the input merge section, we monitored the events and
their direction bits in three consecutive segments ( , ,
). Fig. 24 shows the request signals that exit from the
output merge section of each segment ( , ,
), their corresponding relative addresses ( ,
, ), and their corresponding direction
bits ( , , ). We observe that the
temporal patterns displayed by the events are preserved in the
ascending direction (from to ). Because the direction
bits at each segment are , the observed pattern is
consistent with our design—the events should propagate in
Fig. 24. Logic analyzer test results of a communication module—input merge
section.
Fig. 25. Logic analyzer test results of a communication module—the portion
of the event-processing section that is responsible for time stamping, storing,
and delaying events.
the ascending direction when . The events do not
propagate in the descending direction, demonstrating that the
input merge section of each segment is effectively filtering the
events. Fig. 24 also demonstrates that the events are delayed
and their relative addresses are incremented at each segment.
To test the portion of the event-processing section that is re-
sponsible for time stamping, storing, and delaying events, we
monitored an event as it entered the input merge section and
counted the number of clock cycles before the event exits at the
output merge section. Fig. 25 shows an event that originated in
and then propagated to and . The event is of de-
scending type and it starts with a relative address of
zero (at ). The event resided in the queue of for the
appropriate number of clock cycles—13 rising edge transitions
occurred in between the input request
and the output request .
B. System Operation
To demonstrate the system behavior without intersegmental
connections, we show in Fig. 26 the autocorrelations of a
neuron in two consecutive individual segments ( , ).
The values of the autocorrelations are significant and are only
present at the times corresponding to the harmonic frequencies
of the burst envelopes of the neurons. Because the synaptic
weights are set equal to zero, the synaptic connections should
not have an effect on system behavior, which is evident from
the figure.
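For readers who wish to reproduce this kind of analysis, a normalized autocorrelation of an idealized burst envelope can be computed as sketched below. The square-wave envelope, the circular (wrap-around) form of the correlation, and the function names are assumptions made for illustration; they are not the processing applied to the measured data.

```python
def burst_envelope(period, cycles):
    """Idealized burst envelope: high for the first half of each period."""
    half = period // 2
    return [1.0 if (i % period) < half else 0.0 for i in range(period * cycles)]


def autocorrelation(signal, lag):
    """Normalized circular autocorrelation of the mean-removed signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    energy = sum(x * x for x in centered)
    total = sum(centered[i] * centered[(i + lag) % n] for i in range(n))
    return total / energy
```

For a periodic envelope, the result peaks at lags that are multiples of the burst period and reaches its minimum at half-period lags.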
Fig. 26. Autocorrelation of a single neuron in (A) segment four and
(B) segment five.
Fig. 27. Cross correlation between contralateral neurons in (A) segment four
and (B) segment five and (C) between homolog neurons in segment four and
segment five.
In another demonstration of the reliability of the commu-
nication system without intersegmental connections, we show
in Fig. 27 cross correlations between contralateral neurons in
two consecutive segments ( , ) and homolog neurons
in these same two segments. The correlations are computed on
periodic signals that are approximately 400 cycles long. This
result shows that for reasonable time intervals, the reliability of
the communications system is good. Fig. 27(c) also indicates, as
expected, that synchronization did not occur when no interseg-
mental connections were in place; instead, only noise is evident.
Intersegmental connections, however, will result in antiphasic
bursting behavior. These experiments indicate that the commu-
nication network is functioning properly.
VII. CONCLUSION
We developed an asynchronous architecture for modeling
intersegmental neural communication that maintains neuro-
biological realism. This unique AER architecture includes a
pipelined broadcast scheme that emulates a large number of
intersegmental connections with distance-dependent delays.
The intrasegmental units are half-center oscillators composed
of silicon neurons; synaptic spread governs the interneuronal
synaptic connections, and its implementation is simplified by
using a relative addressing scheme, as opposed to a global bus
architecture. The architecture is scalable, supports multichip
communication, and operates independently of the type of
silicon neuron (spiking or burst envelopes).
We are using this network to develop full intersegmental coor-
dination systems that combine neural encodings with mechan-
ical actuation [20]. In addition, we are developing more com-
plex CPG circuits [14] and constructing hybrid neural systems
that are composed of one of these more recently improved sil-
icon neurons and a living heart interneuron from the heartbeat
timing network of the medicinal leech [21].
ACKNOWLEDGMENT
The authors would like to thank R. Calabrese and A. Cohen
for providing the biological expertise and inspiration for this
system.
REFERENCES
[1] J. Buchanan, “Identification of interneurons with contralateral, caudal
axons in the lamprey spinal cord: Synaptic interaction and morphology,”
J. Neurophys., vol. 47, pp. 961–975, 1982.
[2] C. A. Mead, Analog VLSI and Neural Systems. Reading, MA: Ad-
dison-Wesley, 1989.
[3] T. Williams, “Phase coupling and synaptic spread in chains of coupled
neuronal oscillators,” Science, vol. 258, pp. 662–665, 1992.
[4] K. A. Boahen, Communicating Neuronal Ensembles Between Neuro-
morphic Chips, ser. Neuromorphic Systems Engineering. Boston,
MA: Kluwer, 1997, ch. 11, pp. 229–261.
[5] M. A. Mahowald, “VLSI analogs of neuronal visual processing: a
synthesis of form and function,” Ph.D. dissertation, California Inst.
Technol., Pasadena, 1992.
[6] S. DeWeerth, G. Patel, M. Simoni, D. Schimmel, and R. Calabrese, “A
VLSI architecture for modeling intersegmental coordination,” in Proc.
17th Conf. Advanced Research in VLSI, A. Ishii and R. Brown, Eds.,
1997, pp. 182–200.
[7] C. Morris and H. Lecar, “Voltage oscillations in the barnacle giant
muscle fiber,” Biophys. J., vol. 35, no. 1, pp. 193–213, 1981.
[8] A. J. Martin, “Synthesis of asynchronous VLSI circuits,” California Inst.
Technol., Pasadena, 1991.
[9] A. J. Martin, “Tomorrow’s digital hardware will be asynchronous and verified,”
in Proc. IFIP 12th World Computer Congr., vol. 1, J. van Leeuwen, R.
Aiken, and V. Vogt, Eds., Madrid, Spain, Sep. 1992, pp. 684–695.
[10] I. E. Sutherland, “Micropipelines,” Commun. ACM, vol. 32, no. 6, pp.
720–738, 1989.
[11] F. Prosser, D. Winkel, and E. Brunvand, “A comparison of modular self-
timed design styles,” Comput. Sci. Dept., Indiana Univ., Indianapolis,
IN, Tech. Rep. TR-420, 1994.
[12] M. A. Mahowald and R. Douglas, “A silicon neuron,” Nature, vol. 354,
no. 6354, pp. 515–518, Dec. 1991.
[13] G. Patel and S. DeWeerth, “Analogue VLSI Morris-Lecar neuron,” Elec-
tron. Lett., vol. 33, no. 12, pp. 997–998, 1997.
[14] M. F. Simoni, G. S. Cymbalyuk, M. E. Sorensen, R. L. Calabrese, and
S. P. DeWeerth, “A multiconductance silicon neuron with biologically
matched dynamics,” IEEE Trans. Biomed. Eng., vol. 51, no. 2, pp.
342–354, Feb. 2004.
[15] J. Lazzaro, J. Wawrzynek, M. Mahowald, M. Sivilotti, and D. Gille-
spie, “Silicon auditory processors as computer peripherals,” IEEE Trans.
Neural Netw., vol. 4, no. 3, pp. 523–528, May 1993.
[16] K. A. Boahen, “Point-to-point connectivity between neuromorphic chips
using address events,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal
Process., vol. 47, no. 5, pp. 416–434, May 2000.
[17] C. Diorio, “Floating-gate MOSFETs,” in Analog VLSI: Circuits and
Principles, S.-C. Liu, J. Kramer, G. Indiveri, T. Delbrück, and R. Dou-
glas, Eds. Cambridge, MA: MIT Press, 2002, ch. 4, pp. 93–120.
[18] R. R. Harrison, J. A. Bragg, P. Hasler, B. A. Minch, and S. P. DeWeerth,
“A CMOS programmable analog memory-cell array using floating-gate
circuits,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.,
vol. 48, no. 1, pp. 4–11, Jan. 2001.
[19] S. Grillner, P. Wallen, and L. Brodin, “Neuronal network generating lo-
comotor behavior in lamprey: Circuitry, transmitters, membrane proper-
ties, and simulation,” Annu. Rev. Neurosci., vol. 14, pp. 169–199, 1991.
[20] M. F. Simoni, “Synthesis and analysis of a physical model of biological
rhythmic motor control with sensorimotor feedback,” Ph.D. dissertation,
Georgia Inst. Technol., Atlanta, GA, 2002.
[21] M. Sorensen, S. DeWeerth, G. Cymbalyuk, and R. L. Calabrese, “Using
a hybrid neural system to reveal regulation of neuronal network activity
by an intrinsic current,” J. Neurosci., vol. 24, no. 23, pp. 5427–5438,
Jun. 2004.
Girish N. Patel (S’98–M’99) received the B.S.E.E. degree from California
Polytechnic, San Luis Obispo, in 1988, and the Ph.D. degree from the Georgia
Institute of Technology, Atlanta, in 1999.
He has worked for Texas Instruments Incorporated and Microtune, and is
currently with Alereon, Austin, TX.
Michael S. Reid (M’98) received the B.E.E. degree
from Auburn University, Auburn, AL, in 1988, the
M.B.A. degree from Carnegie Mellon University,
Pittsburgh, PA, in 1994, and the M.S.E.E. degree
from the Georgia Institute of Technology, Atlanta, in
2000. He is currently pursuing the Ph.D. degree in
the School of Electrical and Computer Engineering
at the Georgia Institute of Technology.
David E. Schimmel (S’82–M’90–SM’03) received
the B.S.E.E. (with distinction) and Ph.D. degrees
from Cornell University, Ithaca, NY, in 1984 and
1991, respectively.
He is currently Associate Professor in the School
of Electrical and Computer Engineering at the
Georgia Institute of Technology, Atlanta. He has
been a Visiting Researcher at the University of
Linkoping, Linkoping, Sweden, and a member of
the summer faculty at NASA’s Jet Propulsion Labo-
ratory. He has also been a consultant to a number of
corporations, including IBM Almaden Research Center and Intel. His research
interests include parallel computer architecture and algorithms, VLSI design,
asynchronous systems, and network hardware and software technologies.
Dr. Schimmel is a member of Tau Beta Pi and Eta Kappa Nu.
Stephen P. DeWeerth (S’85–M’90–SM’03) re-
ceived the M.S. degree in computer science and
the Ph.D. degree in computation and neural sys-
tems from the California Institute of Technology,
Pasadena, in 1987 and 1991, respectively.
He is a Professor in the Wallace H. Coulter Depart-
ment of Biomedical Engineering and in the School of
Electrical and Computer Engineering at the Georgia
Institute of Technology and at the Emory University
School of Medicine, Atlanta, GA. His research fo-
cuses on the implementation of neuromorphic elec-
tronic and robotic systems, the development of neural interfacing technologies,
and the study of the biological control of movement.
