Abstract: Model predictive control (MPC) has been widely applied in industrial process control since the last century, for example in petroleum refining and other chemical processes. However, it is rarely used in field controllers because of the complexity of online optimization. This paper introduces a new MPC system based on a field programmable gate array (FPGA) chip. To make the MPC controller more efficient, a hybrid hardware/software approach is adopted to design the quadratic programming (QP) solver on the FPGA chip, so that the controller can solve the MPC optimization problem in a very short time. The whole control system is implemented on a Virtex-4 FPGA chip and applied to the angle servo control of two motors with satisfactory results. The experiment shows that the proposed design makes it possible to apply MPC in field controllers.
INTRODUCTION
Owing to its capability of handling constraints explicitly, MPC has been widely adopted in industrial process control. However, since the MPC controller generally needs to solve a constrained optimization problem online, the computational complexity means that most MPC systems are implemented on high-performance computers. This restricts the application of MPC in field controllers and its extension to other application fields (see Qin et al. [2003] for details).
In order to implement the MPC algorithm in field controllers, the online optimization procedure must be completed in a very short time by a low-cost computational device. Among candidate hardware platforms such as advanced RISC machine (ARM) processors, digital signal processors (DSP), and application-specific integrated circuits (ASIC), FPGA technology stands out for combining flexibility with computing efficiency. FPGA devices contain many programmable logic resources, which can be configured to perform complex functions directly in hardware. With a well-designed architecture that exploits pipelining and parallel computing, an FPGA may achieve very high processing speed (cf. Tessier [2001]), much faster than traditional software implementations. This feature makes FPGAs suitable for computing-intensive tasks. Recent FPGA products integrate basic logic blocks with embedded microprocessors and related peripherals to form a complete embedded system. This not only guarantees the system's performance, but also makes the design of the system more flexible and adaptable, which can greatly reduce the system design period (cf. Kuon et al. [2007]).
⋆ Supported by the National Science Foundation of China (Grant No. 60934007, 61074060), China Postdoctoral Science Foundation (Grant No. 20090460627), Shanghai Postdoctoral Scientific Program (Grant No. 10R21414600), and China Postdoctoral Science Foundation Special Support (Grant No. 201003272).
In recent years, implementing the MPC algorithm on FPGA has attracted particular interest. Bleris et al. [2006] and Ling et al. [2006] have reported some developments on this topic. In Bleris et al. [2006], a co-processor is adopted to complete most of the computation of the MPC algorithm. However, the computation is done sequentially rather than in parallel, which restricts the processing performance. Ling et al. [2006] propose a MATLAB/Handel-C codesign procedure to implement MPC on FPGA, where the interior point method is adopted to solve a fixed optimization problem. However, this design lacks flexibility and thus requires considerable effort to port to real applications.
In this paper, we first analyze the solving procedure of the active set method, which is used to solve the QP problem of MPC. This analysis forms the basis for designing the hardware/software structure of the FPGA controller. In the proposed design, the online optimization problem is solved in a hybrid way, where most of the computation is done by a highly modularized hardware QP solver. This QP solver is scalable, so that it can maximize logic resource utilization as well as parallel processing capability, and it suits a range of FPGA devices. The whole design is implemented on a Virtex-4 FPGA device and is tested on an angle tracking system. The experimental results verify the effectiveness of the proposed design and show the potential of MPC in field applications. This paper is organized as follows. In Section 2, the background on MPC and the active set method is introduced. Section 3 describes the design of the MPC system on FPGA in detail. An experiment on an angle servo system is presented in Section 4 to test our design.
CONTROL PROBLEM STATEMENT

Model predictive control
The system controlled by MPC can be commonly described as
$$x(k+1) = Ax(k) + Bu(k), \tag{1}$$
where $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^l$ are the state and control input vectors respectively, $A \in \mathbb{R}^{n \times n}$ is the state matrix, and $B \in \mathbb{R}^{n \times l}$ is the input matrix. As introduced in Morari et al. [1999], the MPC controller can obtain the optimal control inputs $U(k) = [u(k)^T, \ldots, u(k+m-1)^T]^T$ by solving the following optimization problem
where $p$ is the length of the optimization horizon, $m$ is the length of the control horizon, $Q_i \in \mathbb{R}^{n \times n}$ and $R_i \in \mathbb{R}^{l \times l}$ are weighting matrices, $P_f \in \mathbb{R}^{n \times n}$ is the terminal weighting matrix and
In (2), the system constraints on control inputs and states are generally linear constraints. With (1) and (2), the above optimization problem can be described in the standard QP form
where $c(k) \in \mathbb{R}^{lm}$ is a function of $x(k)$, $H \in \mathbb{R}^{lm \times lm}$ is a constant matrix determined by the weighting matrices $P_f$, $Q_i$ and $R_i$, and $G$ is a constant matrix determined by $E$ and $F$.
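The step from the prediction model to the QP form (3) can be illustrated with a small numerical sketch. It is not the authors' derivation: it assumes constant weights Q and R, a cost that penalizes the predicted states x(k+1) to x(k+p) with the terminal weight on the last one, and an input that is held at its last value after the control horizon. The function name `condense_mpc` and the matrix name `F` (so that c(k) = F x(k)) are chosen here for illustration.

```python
import numpy as np

def condense_mpc(A, B, Q, R, Pf, p, m):
    """Return H and F such that the MPC cost equals
    0.5 * U @ H @ U + (F @ x0) @ U + const for the stacked inputs U.

    Assumptions of this sketch (not taken from the paper): constant
    weights Q and R, and the input is held at its last value once the
    control horizon m is exhausted.
    """
    n, l = B.shape
    # Phi maps x(k) to the stacked predictions [x(k+1); ...; x(k+p)].
    Phi = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(p)])
    # Gamma maps the stacked inputs U to the same stacked predictions.
    Gamma = np.zeros((n * p, l * m))
    for i in range(p):              # row block for x(k+i+1)
        for t in range(i + 1):      # input applied at step k+t
            j = min(t, m - 1)       # inputs beyond the control horizon reuse u(k+m-1)
            Gamma[i * n:(i + 1) * n, j * l:(j + 1) * l] += (
                np.linalg.matrix_power(A, i - t) @ B
            )
    # Block-diagonal weights: Q on x(k+1)..x(k+p-1), Pf on x(k+p), R on each input.
    Qbar = np.kron(np.eye(p), Q)
    Qbar[-n:, -n:] = Pf
    Rbar = np.kron(np.eye(m), R)
    H = Gamma.T @ Qbar @ Gamma + Rbar   # constant, can be computed offline
    F = Gamma.T @ Qbar @ Phi            # c(k) = F @ x(k) changes every sample
    return H, F
```

Under these assumptions only c(k) depends on the measured state, which is why H (and its inverse) can be prepared offline while c(k) is refreshed at every sampling instant.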
According to the receding horizon principle of MPC, the optimization problem (3) generally has to be solved right after each sampling instant, and $u(k)$ is then applied to the system. The whole procedure is repeated at the next sampling instant. Therefore, a powerful computational device is normally required to solve the online optimization problem (3) in real time, which limits the application of the MPC algorithm in field controllers.
Active set method
For MPC controllers, the major task of online optimization is to solve the QP problem (3). Several methods are available, such as the interior point, active set, and conjugate gradient methods. Compared with the other algorithms, the active set method requires less computational effort for small-scale problems (see Bartlett et al. [2000] for details), and the problems handled by field controllers are commonly small. Hence, we choose the active set method to solve (3). The active set method converts an inequality constrained QP problem into a series of equality constrained QP problems, and then uses the Lagrange method to solve each problem iteratively. In this section, we use $x$ to denote the optimization variable and introduce the method in brief. Let $x^{(k)}$ be a feasible solution after the $k$th iteration and $I^{(k)}$ the active constraint set; then the equality constrained QP problem is
Denote the optimal solution and Lagrange multiplier be
is the optimal solution of (3). If one or more elements in $\lambda^{(k)}$ are less than zero, then the corresponding constraint is unnecessary and should be excluded from
, update the active constraint set $I^{(k)}$ and begin the next iteration.
Chen [2005] proposes a way to solve (4) by computing
where
here G and b represent the active constraints, and may vary in each iteration.
Note that the active set method requires an initial feasible solution $x^{(0)}$. Usually this solution can be obtained by a linear programming (LP) method (see Fletcher [1970]).
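The iteration described in this section can be sketched in a few lines. This is a textbook primal active set scheme under stated assumptions, not a transcription of the authors' formulas: the QP is taken as minimizing 0.5 x'Hx + c'x subject to Gx ≥ b, the multipliers follow the Lagrangian f(x) − λ'(Gx − b) (so a negative multiplier marks a constraint to release), and each equality constrained subproblem is solved through the precomputed inverse of H. The function names are illustrative.

```python
import numpy as np

def solve_eqp(Hinv, c, Ga, ba):
    """Minimise 0.5*x'Hx + c'x subject to Ga x = ba, given Hinv = inv(H).

    Uses the Lagrangian f(x) - lam'(Ga x - ba), so a negative
    multiplier flags a constraint that can be released.
    """
    if Ga.shape[0] == 0:                       # no active constraints
        return -Hinv @ c, np.zeros(0)
    S = Ga @ Hinv @ Ga.T                       # the only matrix inverted online
    lam = np.linalg.solve(S, ba + Ga @ Hinv @ c)
    x = Hinv @ (Ga.T @ lam - c)
    return x, lam

def active_set_qp(H, c, G, b, x0, max_iter=50, tol=1e-9):
    """Minimise 0.5*x'Hx + c'x subject to G x >= b from a feasible x0."""
    Hinv = np.linalg.inv(H)                    # constant, could be computed offline
    x = np.asarray(x0, dtype=float)
    work = [i for i in range(len(b)) if abs(G[i] @ x - b[i]) <= tol]
    for _ in range(max_iter):
        x_eq, lam = solve_eqp(Hinv, c, G[work], b[work])
        d = x_eq - x
        if np.linalg.norm(d) <= tol:
            if lam.size == 0 or lam.min() >= -tol:
                return x                       # all multipliers nonnegative: optimal
            work.pop(int(np.argmin(lam)))      # release the most negative multiplier
            continue
        alpha, blocking = 1.0, None            # longest feasible step along d
        for i in range(len(b)):
            gd = G[i] @ d
            if i not in work and gd < -tol:
                a = (b[i] - G[i] @ x) / gd
                if a < alpha:
                    alpha, blocking = a, i
        x = x + alpha * d
        if blocking is not None:
            work.append(blocking)              # the blocking constraint becomes active
    raise RuntimeError("active set iteration did not converge")

# Illustrative two-variable problem with the box -5 <= x <= 8 written as G x >= b.
H = np.diag([2.0, 2.0])
G = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([-5.0, -5.0, -8.0, -8.0])
x_opt = active_set_qp(H, np.array([-20.0, 2.0]), G, b, np.zeros(2))
```

In the example the unconstrained minimizer lies outside the box, so the upper bound on the first variable enters the working set and stays there. The split mirrors the hybrid design discussed later: `solve_eqp` is pure matrix arithmetic, while the tests and working-set updates in `active_set_qp` are branch-heavy control flow.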
IMPLEMENTATION OF MPC CONTROL SYSTEM ON FPGA
Structure of MPC control system
The MPC control system generally provides the following functions, which correspond to the blocks in Figure 1.
1) Data Acquisition: Sample the outputs of the controlled plant.
2) Preparation: Prepare data to form the optimization problem (3).
3) Optimization: Solve the optimization problem (3).
4) Post-processing: Calculate control inputs from the optimal solution.
5) Execution: Apply the control inputs to the controlled plant.
Steps 1, 2, 4 and 5 vary from system to system. In order to make our design flexible, we use a hybrid structure that makes good use of the capability of FPGA systems. A hardware QP solver is designed to accelerate the optimization process (see the following sections for details), and a microprocessor with embedded software is used to control the work flow. With this hybrid structure, only the software needs to be modified to adapt to different control environments and objects.
Design of QP solver
The hardware QP solver is the core of our design, and is implemented by the logic resources in FPGA. The software runs in the embedded microprocessor of the FPGA. Memory controllers are used to connect external memory devices. All components in this design are connected via the FPGA internal bus. The system interconnection is shown in Fig. 2 .
Since the active set method is adopted to solve the optimization problem (3), the QP solver is designed in a hybrid way that comprises a software part and a hardware part. Considering the characteristics of FPGAs, division and branch operations should be avoided in hardware to maximize performance and resource utilization, according to Xilinx [2008]. Therefore the software part is used to test the stop criterion, update the active constraint set, and control the optimization flow. The software part is also used to store control parameters because they may be changed at run-time. The hardware part is carefully designed to calculate the optimal solution and the Lagrange multiplier of (4) in each iteration. In addition, the QP solver needs to handle QP problems of different scales to meet the requirements of the MPC algorithm and to simplify the maintenance and tuning of MPC controllers.
1) Matrix operation
By observing (4) and (5), the basic operations are matrix addition, subtraction, multiplication and inversion, where the dimensions of the matrices are arbitrary. Considering the resource limitations of the FPGA, we leave the inversion operation to the software part because of its complexity. Because $H^{-1}$ is a constant matrix that can be obtained offline, only one online matrix inversion is needed per iteration, so this choice does not severely affect the overall performance.
For the other matrix operations, we design a hardware matrix multiplier and a hardware matrix adder to improve efficiency. Following Dou et al. [2005], where a bandwidth and memory optimized matrix multiplier design is proposed, we decompose matrix multiplication into vector products of each row and column and use a multiplier-accumulator (MAC) to calculate these products. With a pipelined structure, the multiplier takes only one clock cycle to do one floating-point multiplication and accumulation. A special direct memory access (DMA) block is designed to make sure that the correct data is delivered and processed on every cycle. The total calculation time is approximately $n^3$ cycles (if all matrices are $n \times n$). With a similar structure, the matrix adder uses a floating-point adder to perform addition and subtraction, with a calculation time of approximately $n^2$ cycles. In these modules, we adopt IEEE-754 compliant single-precision floating point to reduce resource consumption and improve compatibility. Compared with Dou et al. [2005], our design requires fewer resources and fewer control signals, which reduces the complexity of implementation and the resource consumption.
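The cycle estimate for the multiplier follows directly from the row-by-column decomposition. The software sketch below only mimics that decomposition with a single accumulator and counts the multiply-accumulate operations; it assumes an ideal pipeline that issues one MAC per cycle and ignores pipeline fill and memory latency. The function name is illustrative.

```python
def matmul_mac(A, B):
    """Multiply two matrices with a single multiply-accumulate unit.

    Returns the product and the number of MAC operations issued, which
    stands in for the clock cycles of an ideal, fully fed pipeline.
    """
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0.0] * cols for _ in range(rows)]
    macs = 0
    for i in range(rows):
        for j in range(cols):
            acc = 0.0                        # accumulator for row i times column j
            for k in range(inner):
                acc += A[i][k] * B[k][j]     # one multiply-accumulate
                macs += 1
            C[i][j] = acc
    return C, macs
```

For two n × n operands the counter ends at n³, and an element-wise adder built the same way would visit n² entries, in line with the estimates given above.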
2) E-QP solver
Besides the calculation in matrix operations, data transfer between the microprocessor and the hardware via the data bus is also time-consuming. To illustrate this, we investigate matrix multiplication at different scales and count the time spent on calculation and on transfer separately. The results are listed in Table 1, where the elapsed time is counted in clock cycles and compared with a pure software approach. Although the hardware approach is more efficient than software, a considerable amount of time is wasted on data transfer. Inspecting (5) shows that, apart from the input variables ($H^{-1}$, $G$, $c$, and $b$) and the output variables ($x$ and $\lambda$), all other variables are temporary and can be stored locally rather than transferred away and back, provided that a single module solves (5) as a whole. We name this module the E-QP solver. By default, the E-QP solver contains one matrix multiplier and one adder, which can operate in parallel. Based on this structure, we decompose the solving procedure of (5) into the 10 steps listed in Table 2. Each step contains at most one multiplication and one addition/subtraction, which means that they can be calculated simultaneously. The designer may add further multipliers or adders to increase computing efficiency.
In Table 2, there are 16 temporary variables to be stored in the E-QP solver. Eight of them are square matrices and the others are vectors. Each variable requires a unique internal address to avoid access conflicts. We use on-chip RAM blocks as local memory to store these variables. Several blocks can be combined to form a large memory region, and each memory region has two data ports that can be accessed independently. The E-QP solver uses four memory regions to store all 22 variables (the 16 temporary variables plus the 6 input and output variables). Two memory regions are connected to the FPGA data bus for data transfer. We use a multiplexer array to make the other data ports accessible to the adder and the multiplier. Fig. 3 shows the block diagram. Table 3 compares the E-QP solver with a solver that uses an independent multiplier and adder to solve (5). The time for matrix inversion is not included. The results show that the E-QP solver significantly reduces the total time needed to solve (5).
Implementation on Virtex-4
The design introduced in the last section is implemented on the ML403 development board from Xilinx, Inc. As it is a low-cost platform, the implementation is closer to the actual situation of field controllers. This platform has a Virtex-4 series XC4VFX12-FF668-10 FPGA chip with a hard core
The hardware E-QP solver discussed above is written in Verilog HDL. Combined with other hardware modules provided by Xilinx, the whole hardware circuit design is downloaded into the FPGA. The embedded software is developed in C++.
A CASE STUDY
Test system
The test system is an angle servo control system. As shown in Fig. 5, there are two DC motors, a master and a slave, and a needle is fixed on each rotor to indicate the angle of the motor. The controller needs to drive the slave motor's needle to track the master motor's needle, while the rotation speed of the master motor is unknown and unmeasurable. Fig. 6 shows the block diagram of the angle servo control system. The control input is the driving voltage of the slave motor and the output is the angle difference between the two needles, which is the only feedback signal. This system can simulate many practical systems, such as radar tracking systems.
The test system has a differential synchro to measure the angle difference between the two needles. Due to the principle of the differential synchro, there is an induced disturbance on the output signal. Although a low-pass filter is used in the differential synchro, it cannot eliminate the disturbance. For example, Fig. 7 clearly shows the existence of the disturbance when $\Delta\theta = 30^\circ$, where the dashed line is the desired output and the solid line is the actual output of the differential synchro.
System modeling and controller design
In order to simplify the design, we characterize the slave motor as an inertial system as follows.
where θ is the angle of the rotor, ω is the angular speed, u is the driving voltage, τ is the time constant of the motor, and α is the steady speed factor.
The relation between the steady rotation speed and the driving voltage of the slave motor is shown in Fig. 8. The constraint on the driving voltage is $\bar{u} \ge u \ge \underline{u}$, where $\bar{u} = 8$ V and $\underline{u} = -5$ V. Besides, we obtain $\alpha_s = 0.59$ and $\tau_s = 9$ ms from the motor manual. The master motor's characteristics are slightly different from those of the slave motor, but we intentionally keep this information unknown to our controller and do not use it in the design process.
For the angle servo system, let the system state be $\Delta\theta = \theta_m - \theta_s$ and $\Delta\omega = \omega_m - \omega_s$, where $\theta_m$ and $\omega_m$ represent the angle and the angular speed of the master motor respectively, and $\theta_s$ and $\omega_s$ represent the angle and the angular speed of the slave motor respectively. Since the angular speed of the master motor cannot be measured, we assume that it is constant. Then the system model can be expressed as
where $\alpha_s$ and $\tau_s$ are the slave motor's parameters. In model (7), $\hat{u}_m$ is the driving voltage at which the slave motor would run at the same angular speed as the master motor. An estimate of $\hat{u}_m$ is given later.
By choosing the sample time T = 5ms, we can get the discrete system model as follows.
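$$x(k+1) = \begin{bmatrix} 1 & 0.004 \\ 0 & 0.58 \end{bmatrix} x(k) + \cdots \tag{8}$$
where $x(k) = [\Delta\theta(k)\ \ \Delta\omega(k)]^T$ is the system state. $\Delta\omega(k)$ can be approximated by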
Then the optimization problem of MPC is derived as
where H and F are constant matrices determined by system model (8) and weighting matrices P f , Q i , and R i .
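The state matrix of the discrete model (8) can be cross-checked with a short calculation. The sketch below assumes that the inertial motor model has the form θ' = ω, τω' = −ω + αu and that the input is held constant over each sample (zero-order hold); both are assumptions made here, and the sign with which the input enters the error model is not considered. The function name is illustrative.

```python
import math

def discretize_motor(tau, alpha, T):
    """Zero-order-hold discretization of theta' = omega, tau*omega' = -omega + alpha*u.

    Returns the 2x2 state matrix and the 2-element input vector of
    x(k+1) = Ad x(k) + Bd u(k) with x = [theta, omega].
    """
    e = math.exp(-T / tau)                      # decay of the speed over one sample
    Ad = [[1.0, tau * (1.0 - e)],
          [0.0, e]]
    Bd = [alpha * (T - tau * (1.0 - e)),        # angle gained from the input
          alpha * (1.0 - e)]                    # speed gained from the input
    return Ad, Bd

# Slave motor data from the text: tau = 9 ms, alpha = 0.59, sample time T = 5 ms.
Ad, Bd = discretize_motor(tau=0.009, alpha=0.59, T=0.005)
```

With these values the off-diagonal entry is about 0.0038 and the lower-right entry about 0.57, close to the 0.004 and 0.58 printed in (8); the small gap may come from rounding or from a slightly different discretization in the original work.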
Since the control goal is angle tracking, the control horizon $m$ is chosen to be 3, which keeps the online computation simple. Meanwhile, in order to optimize the performance, the optimization horizon $p$ is chosen to be 10. Because $\Delta\theta$ is the only measurable output, we choose the weighting on $\Delta\theta$ larger than that on $\Delta\omega$ to enhance performance. In addition, since (8) is a critically stable system, we adopt a terminal cost function $\frac{1}{2} x^T P_f x$ in (2) to improve the performance and ensure closed-loop stability rather than using a terminal set (cf. Mayne et al. [2000]). $P_f$ is determined by solving the discrete Riccati equation of system (7).
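One simple way to obtain such a terminal weight numerically is to iterate the discrete Riccati recursion until it stops changing. The sketch below is only an illustration: the state matrix uses the entries printed in (8), while the input matrix `B` and the weights `Q` and `R` are assumed values, since the paper does not list them. The function name is hypothetical.

```python
import numpy as np

def terminal_weight(A, B, Q, R, max_iter=100000, rtol=1e-10):
    """Fixed-point iteration of P = Q + A'PA - A'PB (R + B'PB)^-1 B'PA."""
    P = Q.copy()
    for _ in range(max_iter):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # feedback gain for current P
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) <= rtol * np.max(np.abs(P_next)):
            return P_next
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")

A = np.array([[1.0, 0.004], [0.0, 0.58]])   # entries printed in (8)
B = np.array([[0.0007], [0.25]])            # assumed input matrix
Q = np.diag([10.0, 1.0])                    # assumed: angle weighted more than speed
R = np.array([[1.0]])                       # assumed input weight
Pf = terminal_weight(A, B, Q, R)
```

Because the angle state behaves like an integrator, the recursion needs on the order of a thousand iterations here; this is an offline computation, so the cost does not affect the controller.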
Since there are no constraints on system states, U (k) = 0 is a feasible solution for (10), which simplifies the online computation.
The last problem is how to estimate $\hat{u}_m$. From (8), we have
Since the sample rate is much faster than the change of $u_m$, it is appropriate to let $\hat{u}_m(k) \approx \hat{u}_m(k-1)$. In order to reduce the influence of noise, we use a moving average method to filter the high-frequency noise and get a more accurate estimate of $\hat{u}_m$:
where $L$ is the moving length. A larger $L$ results in a smoother control signal but a slower response in angle tracking, and vice versa. In our case, we choose $L = 5$.
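The effect of the moving length can be tried out with a few lines of code. This is a generic moving average over the last L raw estimates, given as a sketch: the class name is illustrative, and averaging over fewer samples until the window has filled is an assumption about start-up behaviour, not something stated in the paper.

```python
from collections import deque

class MovingAverage:
    """Average of the most recent `length` samples."""

    def __init__(self, length):
        self.window = deque(maxlen=length)   # oldest sample drops out automatically

    def update(self, sample):
        self.window.append(sample)
        # Until the window is full, average over the samples seen so far.
        return sum(self.window) / len(self.window)

# With L = 5, a single outlier among the raw estimates moves the output
# by only one fifth of its size.
estimator = MovingAverage(5)
filtered = [estimator.update(v) for v in [5.0, 5.0, 5.0, 5.0, 5.0, 0.0]]
```

The same property explains the trade-off noted above: a step change in the true value needs L samples to pass fully through the filter.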
Test result
We use the ML403 FPGA platform to perform the experiment. It is connected to an A/D converter to sample the angle difference $\Delta\theta$, and to a D/A converter with a driving circuit to drive the slave motor. The driving voltage of the master motor is set manually.
In our tests, the QP solver takes approximately 0.02 ms to solve (5) and (6) in every sample interval. If these equations are solved in software, the time grows to 0.4 ms, so the hardware solver is roughly 20 times faster. This result indicates that our QP solver can greatly improve the computing performance of MPC controllers. If the scale of the optimization problem increases, the performance improvement is expected to be even more significant.
The control result of our controller is also very satisfactory. Fig. 9 shows that when the master motor runs with a driving voltage of 5 V (about 100 r/min), our controller keeps the angle difference within ±0.03 rad (±1.7°). Fig. 10 shows that when the driving voltage of the master motor changes suddenly from 0 to 5 V, the angle difference first reaches its maximum of 0.5 rad (29°) because of the driving voltage constraint on the slave motor, and then returns to 0 under the optimal control inputs calculated by our controller. The regulation time is less than 0.5 second. The driving voltages of the two motors are slightly different due to the different mechanical parameters.
CONCLUSION
This paper focuses on how to achieve high computing performance on FPGA, and introduces a hybrid design to implement MPC on a low-cost FPGA platform. We have designed a QP solver to accelerate the optimization process in the MPC algorithm. This QP solver is platform independent, and in our tests it is roughly 20 times faster than a pure software implementation. Meanwhile, since the hardware QP solver in our design can be re-used, the design and maintenance procedure is simplified. This implementation method is also well suited to an ASIC design flow, which may greatly cut chip costs and improve performance.
In order to verify our design, we have carried out an experiment on an angle servo system. With appropriately chosen MPC parameters, the presented MPC control system achieves satisfactory performance. This reflects the effectiveness of our design and the potential of the MPC algorithm in field controller applications.
