An architecture for a Cell Switch Fabric (CSF) with a new bus assignment strategy is presented. The proposed architecture has a modular structure, with a chip partitioning intended to prevent a total system failure, thus achieving expandability and increasing reusability. Different algorithms for the bus assignment are discussed. The shared bus is assigned in a cyclic and rotative way, switching microcells instead of cells. The CSF is built on four PCBs, each with a capacity of four ports at 164.323 Mbps; the external port rate is 155.52 Mbps. A microcontroller performs some tests and communicates with a PC, which runs a test, verification and CSF configuration program. Some parameters of the CSF behavior are also measured. Each port was implemented on a FLEX10K100 CPLD and the Switch Control Block on a MAX7128 device, both from Altera.
E1/T1 [3].
The kernel of every broadband switch is the cell switch fabric, and its design is the major goal of the present work. The main objectives of this paper are:
1. to explain the selected architecture and how it works,
2. to design and implement the CSF on complex CPLD circuits,
3. to discuss the switch tests.
CELL SWITCH FABRIC ARCHITECTURE
The CSF is a system that must be able to process, per port, one cell every 2.7 µs. An application of this switch fabric is a Data-Voice Switch (see Figure 1) or an Ethernet Switch, which can be created if all line card interfaces are of the Ethernet type [1]. The CSF consists of a maximum of $N = M = 16$ ports, with an internal transfer rate $V = 164.323$ Mbps and a capacity $C = NV$.

Architectures [4, 5] based on shared memory [6, 7] and on a common bus [8, 9], as well as their variants [10], have been studied and reported in the literature. To select a specific architecture, general design criteria were taken into account.
A switch, or the designed cell switch fabric, can in general be represented as a 7-tuple $(N, M, C, P, S, T, f_r)$, where $N$ is the number of inports, $M$ is the number of outports, $C$ is its capacity, $P$ is the performance (a set of parameters $p_i \in P$), $f_r$ is the switch routing function, $S$ is a set of cells and $T$ is a set of tags related to the routing function.
In a switch, the routing function $f_r$ can be represented in the following way: given a set of cells $S$ and a set of tags $T$, the switch routes the cells $s_i \in S$ to the corresponding output ports (one or more), i.e., $f_r : S \times T \to \beta \mid \beta \subseteq S$, where every $\sigma_i \in \beta$ is obtained from $s_i$ and carries the same information content, so that each routed cell $s_k \in S$ has a corresponding $\sigma_j$ at the output [2].

The design and implementation of a CSF is rather difficult, because it is a very hard real-time system. The general design criteria included simplicity, to achieve a rapid implementation in hardware; expandability, to achieve modularity and scalability; and possibility [...]

An object-oriented conception of this element helped to master the design complexity [12]. For example, a buffer is a queue with the basic operations Put($Q_n$, $x$) and Get($Q_n$), but for an implementation in hardware these operations are too general. Starting from this point, the functionality of an OE can be seen as a queue of $n$ cells, $Q_n$, where each cell corresponds to a buffer space. Any cell can in turn be seen as a queue of $m$ microcells, $q_m$ (in our case $m = 7$). The total number of queues is then $m \times n$, and consequently the number of read/write pointers needed to manage these queues is excessively large. The two basic operations therefore take the form:

Put($Q_n$, $i$, $j$, $x$) stores a microcell $x$ into the queue $Q_n$ at position $j$ of the cell buffer space $i$;
Get($Q_n$, $i$, $j$) returns a microcell $x$ from the queue $Q_n$ at position $j$ of the cell buffer space $i$ and sends it through the interface to the associated line circuit.

These functions are more specific, and hence more directly implementable, than their ancestors. To avoid a laborious implementation with $2 \times m \times n$ pointers, we used only two reloadable pointers (one for read and one for write operations), implemented as FSMs. This type of pointer must be initialized with $(i, j)$ before each operation, but at the same time it must keep its old content, because this value is used later to calculate a new pair $(i, j)$.
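The reloadable-pointer idea can be illustrated with a small software model. The Python sketch below is only a model under stated assumptions, not the FSM implementation used in the CSF: the class name `OutputBuffer`, the helper `successor`, and the rule that a new pair (i, j) is obtained by advancing a pointer by one microcell are all hypothetical.

```python
M_MICROCELLS = 7  # m = 7 microcells per cell, as stated in the text


class OutputBuffer:
    """Hypothetical model of an OE buffer: n cell spaces of m microcells."""

    def __init__(self, n_cells, m=M_MICROCELLS):
        self.n, self.m = n_cells, m
        self.mem = [[None] * m for _ in range(n_cells)]
        # Only two reloadable pointers instead of 2 x m x n dedicated ones.
        self.wr_ptr, self.wr_old = (0, 0), None
        self.rd_ptr, self.rd_old = (0, 0), None

    def put(self, i, j, x):
        """Reload the write pointer with (i, j), keep its old content, store x."""
        self.wr_old, self.wr_ptr = self.wr_ptr, (i, j)
        self.mem[i][j] = x

    def get(self, i, j):
        """Reload the read pointer with (i, j), keep its old content, return x."""
        self.rd_old, self.rd_ptr = self.rd_ptr, (i, j)
        return self.mem[i][j]


def successor(ptr, n, m):
    """Assumed update rule: next microcell, wrapping to the next cell space."""
    i, j = ptr
    return (i, j + 1) if j + 1 < m else ((i + 1) % n, 0)


# Write one complete cell (7 microcells) into cell buffer space 2.
buf = OutputBuffer(n_cells=4)
pos = (2, 0)
for k in range(M_MICROCELLS):
    buf.put(*pos, f"mc{k}")
    pos = successor(buf.wr_ptr, buf.n, buf.m)

print(buf.mem[2])         # the seven microcells of the stored cell
print(buf.wr_old, pos)    # retained old pointer and the next computed pair
```

In this model the memory is addressed only through the pair loaded into the pointer, so the amount of pointer state stays constant regardless of m and n.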
In this way, we can control successive write or read operations to specific memory locations.

The cyclic assignment constitutes the easiest implementation, but it does not take into account that not all $IE_k$ are active simultaneously, and it therefore wastes time slots. The main disadvantages of this assignment policy are that the bus can be assigned to an IE that is not active and that, if the OE has space for only one cell, the probability of success is not equal for all IEs. From the traffic point of view, the cyclic assignment leads to the following: as each port has an index, given by a number from $0, \ldots, N-1$, ports with a low index have a slightly greater probability of finding an empty cell buffer space in the destined OE than the others. Strictly speaking, the cell loss probability depends on the port index.
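The index dependence can be made visible with a toy simulation. The sketch below assumes a deliberately simplified traffic model that is not taken from the paper: in every frame each port is active with probability `p_active`, all active ports target the same OE, and that OE has room for exactly one cell. The function name `cyclic_wins` is hypothetical. Because contention is forced in every frame, the bias shown here is much stronger than the slight effect described above.

```python
import random


def cyclic_wins(n_ports, frames, p_active, seed=0):
    """Count, per port, how often it obtains the single free cell space
    when the bus is always scanned in the fixed order 0, 1, ..., N-1."""
    rng = random.Random(seed)  # seeded for reproducibility
    wins = [0] * n_ports
    for _ in range(frames):
        active = [rng.random() < p_active for _ in range(n_ports)]
        for port in range(n_ports):      # fixed cyclic scan order
            if active[port]:
                wins[port] += 1          # first active port takes the space
                break
    return wins


wins = cyclic_wins(n_ports=4, frames=10_000, p_active=0.5)
print(wins)  # counts decrease with the port index
```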
Rotative Cyclic Assignment
For this algorithm, the control block introduces specific sequences. In the proposed CSF, the bus is assigned not only in cyclic form but also in rotative form. In this way, if many cells arrive simultaneously at the switch fabric, the probability that an IE is selected does not depend on its position in the switch. Therefore, the probability of finding an empty cell buffer space in the destined OE is equal for all input ports. This algorithm thus avoids the unfairness present in the cyclic assignment, while its circuit complexity is only slightly higher. Time slots are assigned through a round-robin algorithm with time quantum $q$: the bus is assigned according to $(i + k) \bmod N$, where $i$ is the time slot number, $k$ is the frame number, and $N$ is the number of ports.
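Proof. If the time slot number (a time slot being the time interval of a microcell) is $i \in \{0, 1, \ldots, N-1\}$ and the frame number is $k \in \{0, 1, \ldots\}$, then $(i + k) \bmod N$ generates sequences of the form $\langle 0, 1, 2, \ldots, N-1 \rangle$; $\langle 1, 2, \ldots, N-1, 0 \rangle$; $\langle 2, \ldots, N-1, 0, 1 \rangle$; ...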
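The sequences in the proof can be generated directly from the expression $(i + k) \bmod N$. The following Python sketch does so for a small example; the function name `rotative_schedule` and the choice $N = 4$ are illustrative assumptions.

```python
def rotative_schedule(n_ports, n_frames):
    """Port served in time slot i of frame k is (i + k) mod N."""
    return [[(i + k) % n_ports for i in range(n_ports)]
            for k in range(n_frames)]


frames = rotative_schedule(n_ports=4, n_frames=4)
for k, frame in enumerate(frames):
    print(k, frame)
# 0 [0, 1, 2, 3]
# 1 [1, 2, 3, 0]
# 2 [2, 3, 0, 1]
# 3 [3, 0, 1, 2]

# Over N consecutive frames every port occupies every slot position once,
# which is why no port index is favoured.
slot_columns = [sorted(col) for col in zip(*frames)]
print(all(col == [0, 1, 2, 3] for col in slot_columns))  # True
```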
Dynamic and Sequential Assignment
The principle of dynamic assignment is to distribute time only among the active inports. This is the most complex form, because it is necessary [...]

This work deals with a CSF prototype to validate a design, using CPLDs, a microprocessor, internal and external memories, and software components. System-level partitioning [16] into hardware and software components is a very important task in order to achieve fault tolerance, testability, scalability and reliability, among other factors. Identifying the critical segments of code to assure performance on the programmable hardware was, and remains, a very important task. Functions were implemented using FSMs and interconnections of some of them. For the CSF implementation, the rotative cyclic bus assignment strategy was selected. In addition to the above-mentioned factors, the following requirements were decisive for the selection of the architecture: the bandwidth, the internal parallelization level on the shared bus ($L = 64$ bits), and the memory needs for the temporary storage of the cells. The major characteristics of our design are (see Figure 7):
Architecture with four PCBs, each PCB card having four CPLDs. Each port was implemented on a FLEX10K100 CPLD [17, 18], occupying 211 pins, 50% of the logic resources and 67% of the memory capacity. The Switch Fabric Control Block was implemented on a MAX7128 device. The shared bus implementation is a hard problem due to the high operating frequency and the appearance of its harmonics.
Two or more PCB cards can be connected together to enhance the CSF bandwidth.
Partitioning of the hardware and software and their concurrent design.
Use of an on-board microcontroller for some tests and for continuous monitoring.
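The rate figures quoted in the text can be cross-checked with a few lines of arithmetic. The Python sketch below rests on two assumptions that are not stated in the paper: that a cell is a standard 53-byte ATM cell, and that one microcell occupies one 64-bit word of the shared bus, so that an internal cell is 7 x 64 bits. All constant and function names are illustrative.

```python
ATM_CELL_BITS = 53 * 8        # assumption: standard 53-byte ATM cell
EXTERNAL_RATE_MBPS = 155.52   # external port rate (from the text)
INTERNAL_RATE_MBPS = 164.323  # internal transfer rate V (from the text)
PORTS_PER_PCB = 4             # from the text
MICROCELLS_PER_CELL = 7       # m = 7 (from the text)
BUS_WIDTH_BITS = 64           # L = 64 bits (from the text)

# Time available per cell and per port: bits / (Mbit/s) gives microseconds.
cell_time_us = ATM_CELL_BITS / EXTERNAL_RATE_MBPS


def capacity_mbps(n_pcbs):
    """Capacity C = N * V for N = 4 ports per connected PCB."""
    return n_pcbs * PORTS_PER_PCB * INTERNAL_RATE_MBPS


# Assumption: one microcell equals one bus word, so an internal cell
# has 7 * 64 = 448 bits and the internal rate scales by 448 / 424.
internal_cell_bits = MICROCELLS_PER_CELL * BUS_WIDTH_BITS
implied_internal_rate = EXTERNAL_RATE_MBPS * internal_cell_bits / ATM_CELL_BITS

print(f"cell time: {cell_time_us:.3f} us")
for n_pcbs in range(1, 5):
    print(f"{n_pcbs} PCB(s): C = {capacity_mbps(n_pcbs):.3f} Mbps")
print(f"implied internal rate: {implied_internal_rate:.3f} Mbps")
```

Under these assumptions the cell time comes out at about 2.73 µs, in line with the 2.7 µs quoted for the per-port processing deadline, and the implied internal rate rounds to the 164.323 Mbps given for V. With all four PCBs connected (N = 16), the capacity is about 2.63 Gbps.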
Switch Tests
The importance of test methodologies grows continuously due to increasing circuit complexity. One example is the Built-In Self Test (BIST), in which test functions are embedded into the circuit itself [19]. Reference [20]
