This paper describes a behavioral synthesis tool called the MATCH compiler, developed as part of the DARPA Adaptive Computing Systems program. The MATCH compiler reads in high-level descriptions of DSP applications written in MATLAB and automatically generates synthesizable RTL models in VHDL. The RTL models can be synthesized onto FPGAs using commercial logic synthesis and place-and-route tools. By linking the two design domains of DSP and FPGA hardware design, the MATCH compiler provides DSP design teams a significant reduction in design labor and time, elimination of misinterpretations and costly design rework, automatic verification of the hardware implementation, and the ability of systems engineers and algorithm developers to perform architectural exploration in the early phases of their development cycle. The paper describes how powerful directives are used to provide high-level architectural tradeoffs for the DSP designer. The MATCH compiler has been transferred to a startup company called AccelChip, which has developed a commercial version of the compiler called AccelFPGA. Experimental results are reported using AccelFPGA on a set of nine MATLAB benchmarks that are mapped onto the recent Xilinx Virtex-II and Altera Stratix FPGAs. The benchmark programs range in complexity from 20 lines to 170 lines of MATLAB code and produce VHDL code ranging from 1500 to 4500 lines. The compilation times range from 3 seconds to 40 seconds.
Introduction
The performance requirements of today's communication systems, such as 3G and 4G wireless communication systems, MPEG4 video and Video over IP, now exceed the capabilities of general-purpose processors. With the introduction of advanced Field-Programmable Gate Array (FPGA) architectures such as the Xilinx Virtex-II [14] and the Altera Stratix [2], a new hardware alternative is available for DSP designers that combines all the benefits of general-purpose processors with the performance advantage of ASICs.
DSP design has traditionally been divided into two types of activities: systems/algorithm development and hardware/software implementation. The majority of DSP system designers and algorithm developers use the MATLAB language [9]. The first step in this flow is the conversion of the floating-point MATLAB algorithm into a fixed-point version using quantizers from the Filter Design and Analysis (FDA) Toolbox for MATLAB. Algorithmic tradeoffs, such as the precision of filter coefficients and the number of taps used in a filter, are made at the MATLAB level. Hardware design teams take the specifications created by the systems engineers and algorithm developers (in the form of fixed-point MATLAB code) and create a physical implementation of the DSP design. If the target is an FPGA, PLD or ASIC, the first task is to create a register transfer level (RTL) model in a hardware description language (HDL) such as VHDL or Verilog. The RTL HDL is synthesized by a logic synthesis tool, and placed and routed onto an FPGA using backend tools. The process of creating an RTL model and a simulation testbench takes about one to two months with the tools currently in use.
Figure 1.
MATCH also allows users to perform quick iterations of hardware designs, enabling area and speed tradeoffs and architecture exploration. Some of the unique and challenging features of the MATLAB language are its support for array operations (operating on matrices instead of scalars), an interpretive environment where the types and shapes of variables are not declared at compile time but inferred at runtime, and a very powerful set of built-in library functions.
Related Work

Directives in MATCH Compiler
Each directive is prefixed by "%!ACCEL". This makes the directives appear as comments to other environments that process MATLAB, since all comments in MATLAB start with %. Some of these directives are described in more detail below.
TARGET Directive
BEGIN-HARDWARE Directive
MATCH allows the user to use hardware partitioning directives to demarcate the parts of the input source that are targeted for hardware synthesis and the parts that are not. The BEGIN-HARDWARE and END-HARDWARE directives enclose a section of MATLAB code that is intended for hardware synthesis.
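The partitioning idea can be sketched in a few lines of Python. This is only an illustration, not the MATCH front end: the exact spelling of the directive text after the "%!ACCEL" prefix is an assumption, the function name `hardware_sections` is hypothetical, and the MATLAB lines in `EXAMPLE` (including `load_samples`) are made up for the demonstration.

```python
# Illustrative sketch only: the text after "%!ACCEL" is an assumed spelling,
# and hardware_sections is a hypothetical helper, not part of MATCH.
PREFIX = "%!ACCEL"

def hardware_sections(source):
    """Return the MATLAB lines enclosed by BEGIN-HARDWARE / END-HARDWARE."""
    sections, current = [], None
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(PREFIX):
            words = stripped[len(PREFIX):].split()
            name = words[0] if words else ""
            if name == "BEGIN-HARDWARE":
                current = []                 # open a hardware region
            elif name == "END-HARDWARE" and current is not None:
                sections.append(current)     # close the region
                current = None
            continue  # to MATLAB itself a directive is just a comment
        if current is not None:
            current.append(stripped)
    return sections

# Made-up MATLAB input; only the middle two statements are marked for hardware.
EXAMPLE = "\n".join([
    "x = load_samples();",
    "%!ACCEL BEGIN-HARDWARE",
    "y = x * 2;",
    "z = y + 1;",
    "%!ACCEL END-HARDWARE",
    "plot(z);",
])

print(hardware_sections(EXAMPLE))
```

Because every directive line starts with %, the same file still runs unchanged in a plain MATLAB interpreter; only a tool that looks for the prefix sees the partition.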
SHAPE Directive
MATLAB is designed to be an interpretive environment where the types and shapes of variables are determined at 
STREAM Directive
The purpose of the STREAM directive is the specification of the type of data flow that inputs and outputs of the synthesized hardware will handle. Streaming data is defined as data with a regular rate of flow through the hardware. For systems that will handle streaming data, MATCH supports the automatic creation of ports with the required buffering mechanisms to sustain the regular flow of data with the use of the STREAM directive. These mechanisms include 'double-buffering' to allow concurrent processing of data and buffering of new data samples. 
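As a rough software analogy for the double-buffering mentioned above, the following Python sketch keeps two banks: one collects incoming samples while the other, already full, is handed to the processing function. The class and function names are hypothetical, and the model is sequential; in hardware the filling and the processing would proceed concurrently.

```python
class DoubleBuffer:
    """Two banks: one being filled with new samples, one ready for processing."""

    def __init__(self, size):
        self.size = size
        self.fill = []      # bank currently receiving incoming samples
        self.ready = None   # last full bank, available to the datapath

    def push(self, sample):
        """Store one sample; return a full bank when the banks swap, else None."""
        self.fill.append(sample)
        if len(self.fill) == self.size:
            # Swap roles: the full bank becomes readable, a fresh bank is filled.
            self.ready, self.fill = self.fill, []
            return self.ready
        return None


def stream(samples, size, process):
    """Feed samples one at a time and process each block as soon as it is full."""
    buf = DoubleBuffer(size)
    results = []
    for s in samples:
        block = buf.push(s)
        if block is not None:
            results.append(process(block))
    return results


print(stream(range(8), 4, sum))
```

Samples that do not complete a block simply stay in the fill bank, which mirrors the requirement that streaming data arrive at a regular rate.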
UNROLL Directive
The UNROLL directive is a mechanism to expand the source MATLAB so that it contains more copies of a loop body, thereby exposing more opportunities for performance optimization. Let us consider an example MATLAB for loop.
for i = 1:16
Without the UNROLL directive, the data flow graph of the loop's basic block contains one addition and one multiplication operation, so the MATCH compiler generates RTL VHDL or Verilog that uses one adder and one multiplier, and the computation takes 16 cycles. If the code is unrolled as shown below, the loop body is replicated four times and the loop index is incremented in successive copies. In addition, scalars that carry values from one iteration to the next are renamed. For example, the scalar "sum" is renamed in successive copies. This exposes opportunities for the compiler to chain operations.
for i = 1:4:16
MATCH now recognizes four addition and four multiplication operations in each basic block, so it schedules the computation across four cycles using four adders and four multipliers in parallel. The user can therefore apply the UNROLL directive to generate hardware alternatives with different area-delay tradeoffs. This is illustrated in Figure 3.
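The effect of unrolling on the schedule can be illustrated with a small Python model. The loop body did not survive in the text above, so the multiply-accumulate body used here (`total = total + a[i] * b[i]`) is an assumption chosen to match the description of one addition, one multiplication and a carried scalar. The model also assumes that each execution of the basic block takes one cycle; the function names are hypothetical.

```python
def dot_rolled(a, b):
    """One adder and one multiplier: one multiply-accumulate per cycle."""
    total, cycles = 0, 0
    for i in range(len(a)):
        total = total + a[i] * b[i]
        cycles += 1          # assumed: one basic-block execution per cycle
    return total, cycles


def dot_unrolled4(a, b):
    """Loop unrolled by four: four multipliers and four chained adders per cycle."""
    assert len(a) % 4 == 0, "sketch assumes the trip count is a multiple of 4"
    total, cycles = 0, 0
    for i in range(0, len(a), 4):
        # The carried scalar is renamed in each copy of the body (s1, s2, s3),
        # which makes the chain of additions explicit to a scheduler.
        s1 = total + a[i] * b[i]
        s2 = s1 + a[i + 1] * b[i + 1]
        s3 = s2 + a[i + 2] * b[i + 2]
        total = s3 + a[i + 3] * b[i + 3]
        cycles += 1
    return total, cycles


a = list(range(1, 17))
b = [2] * 16
print(dot_rolled(a, b))      # same result, 16 cycles in this model
print(dot_unrolled4(a, b))   # same result, 4 cycles in this model
```

Both versions compute the same value; the unrolled one trades four times the arithmetic resources for a quarter of the cycles, which is the area-delay tradeoff the directive exposes.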
PIPELINE Directive
Pipelining increases the throughput of a datapath by introducing registers in the datapath. This increase in throughput is particularly important when the datapath is iterated in the overall design. The PIPELINE directive is placed just before the loop whose body is to be pipelined.
end;
The PIPELINE directive is illustrated in Figure 4.
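The throughput gain from pipeline registers can be seen in a small cycle-level Python simulation. This is a generic textbook model rather than the schedule MATCH produces: it assumes each stage takes one cycle and that a new iteration can enter every cycle, and the three stage functions are arbitrary placeholders.

```python
def cycles_unpipelined(iterations, stages):
    """Without pipelining, each iteration occupies the whole datapath in turn."""
    return iterations * stages


def simulate_pipeline(inputs, stage_fns):
    """Shift values through one register per stage; return (outputs, cycles)."""
    EMPTY = object()                      # marks a register holding no valid data
    regs = [EMPTY] * len(stage_fns)
    pending = list(inputs)
    outputs, cycles = [], 0
    while len(outputs) < len(inputs):
        # Update from the last stage back so each value moves one stage per cycle.
        for k in range(len(stage_fns) - 1, 0, -1):
            regs[k] = stage_fns[k](regs[k - 1]) if regs[k - 1] is not EMPTY else EMPTY
        regs[0] = stage_fns[0](pending.pop(0)) if pending else EMPTY
        cycles += 1
        if regs[-1] is not EMPTY:
            outputs.append(regs[-1])
    return outputs, cycles


# Placeholder three-stage datapath: add, multiply, subtract.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
print(simulate_pipeline([1, 2, 3, 4], stages))
print(cycles_unpipelined(4, len(stages)))
```

In this model N iterations through S stages finish in S + N - 1 cycles instead of N * S, so the benefit grows with the number of times the datapath is iterated.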
Results on Benchmarks
The MATCH compiler has been commercialized by a company called AccelChip [15] in a product called AccelFPGA.
We now report experimental results on various benchmark MATLAB programs using the AccelFPGA compiler. Table 2 shows the experimental results of the AccelFPGA version 1.4 compiler on nine MATLAB benchmarks on a Xilinx Virtex-II device XC2V250. Results are given in terms of resources used and performance obtained, as estimated by the Synplify Pro 7.1 tool executed on the RTL Verilog that was output by AccelFPGA. The resource results are reported in terms of LUTs, multiplexers, embedded multipliers, ROMs and BlockRAMs used. Performance was measured in terms of the clock frequency of the design, as estimated by the Synplify Pro 7.1 tool, and the latency and throughput of the design in clock cycles, as obtained with the ModelSim 5.5e RTL simulator. Table 3 shows similar results for the nine MATLAB benchmark examples on an Altera Stratix EP1S10 device. Resources are measured in LUTs, ATOMs, MACs, and DSP blocks, and performance is again measured in clock frequency, latency and throughput.
Conclusions
This paper described a behavioral synthesis tool called MATCH which reads in high-level descriptions of DSP applications written in MATLAB, and automatically generates synthesizable RTL models and simulation testbenches in VHDL or Verilog. The RTL models can be synthesized using commercial logic synthesis tools and place and route tools onto FPGAs. By linking the two design domains of DSP and FPGA hardware design, MATCH provides DSP design teams a significant reduction in design labor and time, elimination of misinterpretations and costly design rework, automatic verification of the hardware implementation, and the ability of systems engineers and algorithm developers to perform architectural exploration in the early phases of their development cycle.
The paper described how powerful directives are used to provide high-level architectural tradeoffs for the DSP designer. Experimental results were reported on a set of nine MATLAB benchmarks that are mapped onto the recent Xilinx Virtex-II and Altera Stratix FPGAs.
Acknowledgements
References
[
