With the drive for software defined radio systems, FPGAs are playing a key role in handling the higher data rates to and from the analog front end of these systems. Furthermore 
Introduction
The traditional building block for software defined radio architectures has been the microprocessor or, more specifically, the DSP processor. This is a device which has a fixed processing architecture and executes different algorithms based on a sequence of instructions typically stored in memory: the software. Although this is a very flexible mechanism and has enabled the creation of powerful processing systems, the inherent fixed architecture does place constraints on the initial design of any system that uses it. The advent of Field Programmable Gate Arrays (FPGAs) greatly relaxes the processing architecture constraints and enables the systems engineer to define custom processing architectures while still having the flexibility of processor software.
Another key element of microprocessor-based systems is the operating system, which typically offers real-time kernels for embedded applications. The operating system provides an abstracted software programming interface to manage the processor, hardware peripherals, memory, software tasks, and interprocess and external communications. When an FPGA is employed within the system, an interesting question arises: is an operating system required, and is the concept of an operating system even relevant to FPGA-based systems?
Furthermore, in the same way that we can have multi-processor systems, i.e. systems containing more than one processor, the same scenario is available with FPGA-centric systems. Again the question is raised of what a suitable architecture for multi-FPGA management would be.
FPGA-centric systems for DSP Algorithms
FPGAs have the ability to supersede the processor. FPGAs are software configurable and are capable of providing a superset of architecture implementations when compared to the fixed processor architecture. Take the fundamental example of implementing a FIR filter with 'n' taps on a processor and on an FPGA, shown in Figure 1. On a processor, which has a fixed processing structure, we have to use software instructions to sequence through the data 'n' times to get the output result. On an FPGA, by contrast, we can create a processing structure to match the algorithm we need to process. In this case it is possible to implement the 'n' tap FIR filter directly on the FPGA and hence carry out the complete filter operation in one clock cycle.
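The difference in operation count can be illustrated with a minimal Python sketch. It assumes a processor that performs one multiply-accumulate (MAC) per step, which is a simplification of this illustration rather than a property of any particular device; the function name `fir_sequential` is hypothetical.

```python
def fir_sequential(samples, coeffs):
    """Direct-form FIR filter computed one MAC at a time.

    Returns the filtered samples and the number of MAC steps used,
    which grows as n (number of taps) per input sample.
    """
    n = len(coeffs)
    history = [0.0] * n          # delay line, most recent sample first
    outputs = []
    mac_steps = 0
    for x in samples:
        history = [x] + history[:-1]   # shift the new sample into the delay line
        acc = 0.0
        for k in range(n):             # n sequential MACs per output sample
            acc += coeffs[k] * history[k]
            mac_steps += 1
        outputs.append(acc)
    return outputs, mac_steps


if __name__ == "__main__":
    out, steps = fir_sequential([1.0, 0.0, 0.0, 0.0], [0.5, 0.25, 0.25])
    print(out, steps)
```

On an FPGA the inner loop would instead be unrolled into n multipliers and an adder tree operating at the same time, so the cost per output sample is one clock cycle rather than n steps.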
Another key element of this feature is the ability to maintain data throughput on the FPGA even if the complexity of the algorithm increases, i.e. if 'n' is increased we can change the processing structure and still carry out the filter operation in one clock cycle. The processor architecture would require an increased number of clock cycles and thus reduce the data throughput rate. The processor data throughput rate is given by
R = 1 / (nT)
where R is the data throughput rate, n is the number of iterations per data sample and T is the period of the processor clock.
Comparing the throughput of FIR filters of varying length implemented on a 1 GHz processor and a 200 MHz FPGA, it can be clearly seen that the real data throughput performance rapidly degrades on the processor architecture.
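The comparison can be reproduced numerically with a short sketch. It assumes the throughput relation R = 1/(nT) implied by the definitions of R, n and T, one iteration per processor clock cycle, and an FPGA with enough multipliers to produce one output per clock regardless of n. The function names and the tap counts are illustrative choices, not values from the original comparison.

```python
def processor_throughput(n_taps, clock_hz):
    """Samples per second for a processor: R = 1 / (n * T), with T = 1 / clock_hz."""
    return clock_hz / n_taps


def fpga_throughput(clock_hz):
    """Samples per second for a fully parallel FPGA filter: one output per clock."""
    return clock_hz


if __name__ == "__main__":
    PROCESSOR_CLOCK_HZ = 1e9    # 1 GHz processor
    FPGA_CLOCK_HZ = 200e6       # 200 MHz FPGA
    for n in (1, 5, 16, 64, 256):
        r_proc = processor_throughput(n, PROCESSOR_CLOCK_HZ)
        r_fpga = fpga_throughput(FPGA_CLOCK_HZ)
        print(f"n={n:4d}  processor={r_proc / 1e6:8.2f} MS/s  fpga={r_fpga / 1e6:8.2f} MS/s")
```

Under these assumptions the two devices break even at n = 5 taps, and for longer filters the processor throughput falls in inverse proportion to n while the FPGA rate stays constant.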
The role of the operating system on processors
The goal of DSP software developers for processors is the generation of an application that executes the desired data processing algorithm. This is achieved through integrated development environments which manage the code and the underlying compilers and assemblers to create the application for the processor.
A key component, which also accelerates the development of the final application, is the operating system. This is highly optimised code that manages the 'house-keeping' of the processor and the associated system of software and peripherals. The main categories are:
• Processor booting and initialisation
• Peripheral Control
• Task management
• Inter-process communication
• External communication (for networked processor systems)
While the application algorithm provides the main data processing part of the system, the operating system provides the fundamental control elements of the system. Off-the-shelf operating systems are widely available and are generally used in preference to home-grown operating systems. Using a standard operating system provides greater flexibility, portability and interoperability with other applications that may be included on the processor.
Multiprocessor Systems
To increase system performance, the architecture model can be extended to include a collection of processors forming a multiprocessor system. Different topologies are applied, ranging from symmetrical multiprocessing through to distributed processing. To reflect this, some operating systems, such as Windows, Linux and VSPworks, are extended to support platforms with multiple processors natively, while others rely on extensions to create loosely coupled systems based on protocols such as Ethernet, for example to generate Beowulf systems.
FPGAs and operating systems
Since working with FPGAs is typically looked upon as designing hardware, operating systems and FPGAs do not at first sight appear to have a connection. This would be true for the early generation of FPGAs, whose main function was glue logic for systems. Today's FPGAs are a different beast, offering the ability to create very sophisticated systems within the devices. FPGAs have the ability to implement a wealth of different processing architectures. At one extreme, one or more processors can be implemented within the device [1], which closely represents our traditional processor models. At the other extreme, we can implement one or more algorithms directly on the logic resources to effectively create a custom machine for carrying out the algorithms [2]. Many hybrid versions are also possible, together with dynamic reconfiguration of the devices themselves, such that the processing architecture also has a temporal dimension. In the same way that a number of house-keeping functions were carried out by the operating system on the processor, a comparable requirement is also important to FPGA-based computing systems. In a comparable list to processors, FPGA systems require the following house-keeping functions:
• FPGA booting and initialisation
• Peripheral Control and interfacing
• Task partitioning
• External communication (for networked FPGA systems)
Addressing the operating system requirements for FPGAs
To address the requirement for control management within the FPGA we will examine tools which address the demands of FPGA-centric systems. Firstly, the key component is the management of the configuration, initialisation and, with SRAM-based FPGAs, the dynamic reconfiguration of the FPGAs. The dynamic reconfiguration can be thought of as task management, since the controller needs to manage the sequence of loading the new configurations into the FPGAs in the same way that tasks are scheduled on processors from virtual memory. Task allocation within the FPGA is generally done at design time, with the placement of algorithms done during the place and route process, although research is ongoing in this area [3].
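The analogy between dynamic reconfiguration and task scheduling from virtual memory can be made concrete with a small sketch. Everything here is hypothetical and for illustration only: the class `ReconfigController`, the least-recently-used eviction policy and the configuration names are assumptions, not a description of how any particular tool manages reconfiguration.

```python
from collections import OrderedDict


class ReconfigController:
    """Toy controller that treats FPGA configurations like pages in virtual memory.

    A requested configuration that is already loaded is reused; otherwise it is
    loaded into a free FPGA, or into the FPGA holding the least recently used
    configuration.
    """

    def __init__(self, num_fpgas):
        self.num_fpgas = num_fpgas
        self.loaded = OrderedDict()   # configuration name -> FPGA index, in LRU order
        self.reconfig_count = 0       # number of (re)configurations performed

    def run(self, config_name):
        """Return the index of the FPGA that will execute config_name."""
        if config_name in self.loaded:
            self.loaded.move_to_end(config_name)      # mark as most recently used
            return self.loaded[config_name]
        if len(self.loaded) < self.num_fpgas:
            fpga = len(self.loaded)                   # a free device is available
        else:
            _, fpga = self.loaded.popitem(last=False) # evict least recently used
        self.loaded[config_name] = fpga
        self.reconfig_count += 1
        return fpga


if __name__ == "__main__":
    ctrl = ReconfigController(num_fpgas=2)
    for name in ("fir", "fft", "fir", "ddc"):
        print(name, "-> FPGA", ctrl.run(name))
    print("reconfigurations:", ctrl.reconfig_count)
```

A real controller would also have to account for configuration load time and for saving any state held in the device being replaced, both of which are omitted here.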
The Field Upgradeable Software Environment (FUSE) [4] provides extensions to the traditional operating system on a controlling processor to control and manage FPGA-centric systems. FUSE handles the management at the system level, but we also need to look more closely at the management within the FPGA itself. As FPGA capacities continue to increase, the system complexity within the device is increasing. The complexity arises from the larger number of algorithms placed within the device, which require communication between the different operations. Much development effort has gone into creating tools for generating data processing algorithms, such as System Generator from Xilinx [5] and AccelFPGA from AccelChip [6]. However, this leaves a need for creating the system control and communication, in other words the inter-process communication both within a device and external to the device. Creating the control plane for a complex system can be as time-consuming as creating the algorithms themselves. It is comparable to creating the operating system every time we need to create a software application for a processor.
Managing algorithm communication and control within FPGA systems
DIMEtalk [7] is a tool which can handle the communication of data between different algorithms operating within an FPGA or on different FPGAs. Furthermore, it provides a GUI to enable the quick development of the control and communications framework.
DIMEtalk provides the four main components that are required to create the runtime framework equivalent to the inter-process and multi-processor communication of processor-based systems. The four main building blocks are:
DIMEtalk Nodes
These are the points where data can enter or leave the network. They are the connection points to the various algorithms within the system.
DIMEtalk Routers
These resources provide the 'intelligence' within the system. They manage the routing of the data from node to node and determine the route that the data takes.
DIMEtalk Bridges
Bridges provide the mechanism to expand to multi-FPGA systems, thus enabling easy expansion to large scalable systems.
DIMEtalk Edges
Within large systems, different protocols may be used for different parts of the system. Edges provide the component for interfacing DIMEtalk to other networks so that data can traverse from the internal DIMEtalk network to an external network, e.g. PCI.
Large systems can now be easily created using these building blocks, ranging from a low complexity single FPGA system to a very high complexity multiple FPGA system. An example is shown in Figure 5. In this example we observe the use of all the main components, resulting in a system that spans 2 FPGAs and has 6 nodes to which algorithms can be interfaced, and an edge which enables the system to be interfaced across PCI to a host PC, for example.
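The roles of the four building blocks can be illustrated with a toy graph model. This is not the DIMEtalk API or its routing algorithm: the topology (three nodes per FPGA, one router per FPGA, one bridge and one PCI edge), the element names and the breadth-first route search are all assumptions made for this sketch.

```python
from collections import deque


def build_example_network():
    """Build an assumed 2-FPGA network: 6 nodes, 2 routers, 1 bridge, 1 PCI edge."""
    links = {}

    def connect(a, b):
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)

    for i in range(3):                 # nodes 0..2 attach to the router on FPGA A
        connect(f"node{i}", "routerA")
    for i in range(3, 6):              # nodes 3..5 attach to the router on FPGA B
        connect(f"node{i}", "routerB")
    connect("routerA", "bridge")       # the bridge joins the two FPGAs
    connect("bridge", "routerB")
    connect("pci_edge", "routerA")     # the edge links the network to the host over PCI
    return links


def route(links, src, dst):
    """Return a shortest path from src to dst using breadth-first search, or None."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(links[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    net = build_example_network()
    print(route(net, "pci_edge", "node4"))
```

In this model, data from the host enters through the edge, is forwarded by the router on the first FPGA, crosses the bridge, and is delivered by the second router to the node where the target algorithm is attached.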
Building FPGA centric systems for software radio
With the drive to multi-antenna and multi-channel systems [8] [9], DIMEtalk provides a rapid mechanism for creating the control infrastructure for systems such as those used in software radio. Figure 6 shows an FPGA-based cPCI platform with 8 input and 8 output analog channels capable of over 3.5 Gbytes/sec data transfer. Overlaid in the figure is an illustration of the system that needs to be implemented with the FPGA computing architecture. The software radio algorithms are created within each FPGA, and many tools are available for creating them, ranging from VHDL-based tools to MATLAB and C-based tools. Also illustrated is the necessary DIMEtalk network, which provides the complete infrastructure that manages the system. The system can also be controlled over the PCI bus from the main controller that manages the complete base station, for example.
Conclusions
FPGAs are proving to be the next architecture of choice for creating the high performance DSP systems for next generation software radio systems. Typically it has been assumed that the system tools are embryonic or non-existent. A new suite of tools, FUSE and DIMEtalk, has been introduced to show the availability of system-level tools for creating the infrastructure framework for scalable FPGA-centric systems. The output from these tools complements the range of advanced tools available for implementing the algorithms from a wide range of languages.