Abstract-This paper describes a design methodology for implementing piecewise-affine (PWA) functions on FPGAs, based on representation methods from lattice theory. An automatic off-line process starts from the algorithmic formulation of the problem, obtains the parameters required by a parameterized digital architecture, and ends with the bitstream that programs the FPGA. The methodology has been validated by implementing PWA functions on Xilinx FPGAs. The results are compared with other approaches to FPGA implementation of PWA functions.
I. INTRODUCTION
A piecewise-affine (PWA) function, $f_{PWA} : D \to \mathbb{R}$, provides a linear (affine) output for each region into which the input domain, $D$, is partitioned ($D \subset \mathbb{R}^n$):
where $\Phi_i \in \mathbb{R}^n$, and $P_i$ are $P$ non-overlapping regions, called polytopes, that induce a polyhedral partition of the domain.
Each polytope is a closed set of points delimited by E edges:
where $h_j \in \mathbb{R}^n$, $k_j \in \mathbb{R}$, and the $E$ edges are $(n-1)$-dimensional hyperplanes of the form $h_j^T x + k_j = 0$. Fig. 1(a) illustrates an example of a PWA function whose input domain is partitioned into 9 polytopes.
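The definition above can be illustrated with a minimal sketch that evaluates a PWA function by searching for the polytope that contains the input. The sign convention (a point belongs to a polytope when $h_j^T x + k_j \le 0$ for all its edges), the data layout, and the names `inside` and `eval_pwa` are assumptions made for illustration only.

```python
def dot(a, b):
    """Inner product of two equally sized sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def inside(edges, x, tol=1e-12):
    # Assumed convention: x is in the polytope when h^T x + k <= 0 on every edge.
    return all(dot(h, x) + k <= tol for h, k in edges)

def eval_pwa(regions, x):
    """Return the affine output of the first region that contains x."""
    for edges, gain, offset in regions:
        if inside(edges, x):
            return dot(gain, x) + offset
    raise ValueError("x lies outside the domain")

# Illustrative 1-D function on D = [0, 1]:
# f(x) = x on [0, 0.5] and f(x) = 0.5 on [0.5, 1].
regions = [
    ([([-1.0], 0.0), ([1.0], -0.5)], [1.0], 0.0),  # 0 <= x <= 0.5
    ([([-1.0], 0.5), ([1.0], -1.0)], [0.0], 0.5),  # 0.5 <= x <= 1
]
```

This direct search needs every edge of every polytope to be stored, which is the cost that more compact representations try to avoid.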
Since PWA functions can approximate any nonlinear function, they have been employed in many application domains [1] - [3]. The need for circuits that implement PWA functions with small size, low power consumption, and high speed has encouraged the development of different solutions. Although the earliest hardware realizations of PWA functions were analog [4], several digital approaches have been proposed recently [5] - [10]. Digital implementations play a relevant role in embedded control and are competitive in terms of area, power consumption, and response time. A digital architecture to implement a PWA function based on a generic form is presented in [6]. This architecture employs a binary search tree, and it may not be adequate for PWA functions that require a very deep tree. In [7], a digital implementation based on a simplicial partition is proposed. This implementation approximates a generic PWA function and offers a good trade-off between approximation capability and hardware resources. Its main drawback is the curse of dimensionality, that is, the complexity increases exponentially with the number of inputs of the PWA function. Recently, a digital implementation called hierarchical PWA functions has been proposed in [8]. It is obtained by interconnecting simple PWA modules with few inputs or a small number of polytopes. The main disadvantage of the hierarchical implementation is that there is no standard configuration of modules that can approximate any PWA function, so each PWA function has to be studied separately.
Lattice forms can implement any continuous PWA function with potentially no error and with the minimum number of parameters to store [9]. Another advantage of lattice forms is that these parameters can be obtained by a systematic procedure that lends itself to automation [10]. To exploit these advantages, the work presented in this paper describes how this systematic procedure has been automated and included in a design flow that connects the abstract algorithmic level with its physical FPGA implementation.
The paper is organized as follows. Section II summarizes the advantages introduced by the lattice PWA approach. A computer-aided design methodology for its FPGA implementation is presented in Section III. The application domain considered is the growing field of model predictive control [11]. At the algorithmic level, Matlab and its Hybrid Toolbox are employed to obtain the explicit PWA function that approximates an optimal model predictive controller, as well as its corresponding lattice PWA representation. A Matlab

According to [9], any continuous and explicit PWA function, $f_{PWA}(x)$, can be represented by a lattice PWA function $l(x|\phi, \psi)$, such that $f_{PWA}(x) = l(x|\phi, \psi)$, in the form:
where $\phi$ is a $P \times (n + 1)$ parameter matrix whose rows are the coefficients of the affine functions, $l_j(x)$, of the $P$ polytopes, and $\psi = [\psi_{ij}]$ is a $P \times P$ zero-one structure matrix defined as follows.
Assume that $P_i$ and $P_j$ are two n-dimensional polytopes, where $l_i(x)$ and $l_j(x)$ are the values of the local affine functions corresponding to those polytopes, then:
where $v_k$ are the vertices of $P_i$, with $k$ ranging up to the number of vertices of $P_i$. The main advantage of the lattice representation is that it is not an approximation, that is, the analytic expression in (3) provides an exact representation of the function. The work published in [10] describes a way to simplify the analytic expression in equation (3) while maintaining an exact representation. This technique analyzes the parameter matrix ($\phi$) and simplifies it by deleting redundant rows. A row of the simplified parameter matrix corresponds to a super-region, which is a merge of several polytopes. Furthermore, there are inactive regions that can be removed from the analytic expression because they do not contribute to the PWA function value. After these simplifications, the lattice expression is described as follows:
where $\tilde{\psi} \in \mathbb{R}^{Q \times S}$ is the simplified structure matrix and $\tilde{\phi} \in \mathbb{R}^{S \times (n+1)}$ is the simplified parameter matrix. The simplification proposed in [10] is more evident in the case of PWA functions with a high number of dimensions. Let us consider, for example, the PWA functions that approximate the optimal model predictive controllers that stabilize a double integrator plant and a triple integrator plant. These are two typical plants employed in nonlinear control theory. They have two and three inputs, respectively. In the case of the double integrator, simplification reduces the parameter matrix from 25 to 6 rows. In the case of the triple integrator, the rows are reduced from 27 to 5.
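As an illustration of the lattice representation, the following sketch builds a structure matrix from the vertices of each polytope and evaluates the resulting lattice function. It assumes the min-max form commonly used for lattice PWA functions (a minimum over the rows of $\psi$ of the maximum over the affine functions selected by the ones in each row) and assumes that $\psi_{ij} = 1$ when $l_i(v_k) \ge l_j(v_k)$ at every vertex $v_k$ of $P_i$. The function names and the 1-D example are hypothetical.

```python
def affine(row, x):
    # row = [a_1, ..., a_n, b] encodes l(x) = a^T x + b
    return sum(a * xi for a, xi in zip(row, x)) + row[-1]

def structure_matrix(phi, vertices, tol=1e-9):
    """psi[i][j] = 1 when l_i >= l_j at every vertex of polytope i (assumed rule)."""
    P = len(phi)
    psi = [[0] * P for _ in range(P)]
    for i in range(P):
        for j in range(P):
            if all(affine(phi[i], v) >= affine(phi[j], v) - tol
                   for v in vertices[i]):
                psi[i][j] = 1
    return psi

def eval_lattice(phi, psi, x):
    """Assumed min-max form: min over rows of the max over selected planes."""
    values = [affine(row, x) for row in phi]
    return min(max(values[j] for j, bit in enumerate(row) if bit)
               for row in psi)

# Illustrative 1-D example on [0, 3]: x on [0, 1], 2 - x on [1, 2], x - 2 on [2, 3].
phi = [[1.0, 0.0], [-1.0, 2.0], [1.0, -2.0]]
vertices = [[[0.0], [1.0]], [[1.0], [2.0]], [[2.0], [3.0]]]
psi = structure_matrix(phi, vertices)
```

In this example the second and third rows of `psi` coincide, which is the kind of redundancy that a later simplification step can remove without changing the function.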
III. DESIGN METHODOLOGY
The design methodology to implement a PWA function based on the lattice representation is divided into two main parts: the software realization and the hardware realization. Between them, an interface adapts the output of the software realization to the parameters required by the hardware realization. This is illustrated in Fig. 2.
A. Software realization
The software realization first computes the explicit PWA function for a model predictive control problem with the Hybrid Toolbox [12]. Once the parameter matrix, $\phi$, and the structure matrix, $\psi$, have been obtained, the lattice representation of the PWA function is inferred.
1) Hybrid Toolbox: Hybrid Toolbox for Matlab is used as a multiparametric Quadratic Programming (mpQP) solver to obtain the explicit and optimum PWA functions.
2) Parameter and Structure Matrices: Given a continuous and explicit PWA function, the authors in [10] provide an algorithm to find the simplest lattice representation with the form in (3). The methodology therefore continues as follows:
First, a Matlab function generates the parameter matrix, $\phi$, and the structure matrix, $\psi$, from the explicit PWA function given by the Hybrid Toolbox. Then, a row simplification is carried out to obtain a simplified structure matrix $\psi \in \mathbb{R}^{(P-1) \times P}$. Finally, a column simplification is performed to obtain a simplified structure matrix $\tilde{\psi} \in \mathbb{R}^{(P-1) \times P}$ and a simplified parameter matrix $\tilde{\phi} \in \mathbb{R}^{(P-1) \times (n+1)}$.
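The kind of reduction involved can be sketched as follows. This is not the algorithm of [10]; it is a minimal illustration under the assumption of a min-max lattice form, using two generic rules: a row whose set of ones contains the set of another row cannot lower the minimum and can be dropped, and a column that no remaining row selects corresponds to an affine function that no longer contributes. All function names are hypothetical.

```python
def affine(row, x):
    # row = [a_1, ..., a_n, b] encodes l(x) = a^T x + b
    return sum(a * xi for a, xi in zip(row, x)) + row[-1]

def eval_lattice(phi, psi, x):
    """Assumed min-max lattice form."""
    values = [affine(row, x) for row in phi]
    return min(max(values[j] for j, bit in enumerate(row) if bit)
               for row in psi)

def simplify_rows(psi):
    """Drop rows whose ones are a superset (or a later duplicate) of another row."""
    sets = [frozenset(j for j, bit in enumerate(row) if bit) for row in psi]
    keep = []
    for i, s in enumerate(sets):
        redundant = any((t < s) or (t == s and k < i)
                        for k, t in enumerate(sets) if k != i)
        if not redundant:
            keep.append(i)
    return [psi[i] for i in keep]

def simplify_columns(phi, psi):
    """Drop affine functions that no remaining row selects."""
    used = [j for j in range(len(phi)) if any(row[j] for row in psi)]
    return [phi[j] for j in used], [[row[j] for j in used] for row in psi]

# Illustrative data: three affine functions of one input and a 3 x 3 structure matrix.
phi = [[1.0, 0.0], [-1.0, 2.0], [1.0, -2.0]]
psi = [[1, 0, 1], [0, 1, 1], [0, 1, 1]]
psi_rows = simplify_rows(psi)
phi_s, psi_s = simplify_columns(phi, psi_rows)
```

For this example the structure matrix shrinks from 3 x 3 to 2 x 3 while the evaluated function is unchanged.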
B. Interface Software/Hardware
A Matlab function generates the parameters required for the hardware realization. It sets the number of bits of the inputs and outputs, the parameters that the memory should store, and the features of the control circuitry.
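A rough Python sketch of what such an interface step might compute is given below. The fixed-point format (signed words with `n_bits` fractional bits, rounded to nearest), the row-major memory layout, and the names `to_fixed`, `memory_image`, and `hardware_parameters` are assumptions for illustration; they do not describe the actual Matlab function.

```python
import math

def to_fixed(value, frac_bits):
    """Round a real coefficient to a signed integer with frac_bits fractional bits."""
    return math.floor(value * (1 << frac_bits) + 0.5)

def memory_image(phi, frac_bits):
    """Flatten the parameter matrix row by row into the words to be stored."""
    return [to_fixed(c, frac_bits) for row in phi for c in row]

def hardware_parameters(phi, psi, n_bits):
    """Collect the values a parameterized architecture could be configured with."""
    return {
        "n_bits": n_bits,
        "memory": memory_image(phi, n_bits),
        # One step per one in psi plus one step per row (assumed schedule).
        "control_steps": sum(map(sum, psi)) + len(psi),
    }

# Illustrative data: two affine functions of one input, x and 1 - x.
params = hardware_parameters([[1.0, 0.0], [-1.0, 1.0]], [[1, 0], [0, 1]], 12)
```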
C. Hardware realization
The architecture employed implements a PWA function based on the lattice representation described in equation (5). It is composed of the following main blocks, shown in Fig. 3(a).
1) Compute: This block calculates the affine expressions for a given input. It may contain a multiplier-accumulator, in the case of a serial implementation, or n multipliers and n + 1 adders, in the case of a parallel implementation. It also contains a memory that stores the $S \times (n + 1)$ coefficients of the simplified parameter matrix $\tilde{\phi} \in \mathbb{R}^{S \times (n+1)}$. A parallel version is used herein in order to exploit the resources (multipliers) of the FPGA. For the same reason, the parameters are stored in the block RAMs (BRAMs) included in the FPGA.
2) Control: This block serializes the max/min operations in equation (5), taking as many steps as there are ones in the structure matrix plus its number of rows. It determines the inputs to the max min block (in1, in2), decides the operator (maximum or minimum) to be applied (max min), and addresses the memory of the Compute block to calculate the affine function (plane). It also indicates, with an enable signal, when the last state is reached, so that the output of the max min block is provided as a valid output of the system (enable).
3) max min: This block calculates the maximum or the minimum of two affine values. Its inputs, controlled by the Control block, can be the output of the Compute block, an initialization value (0 or 1), or the output of the max min block in a previous state.
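The interplay of the three blocks can be mimicked with a small behavioral model in integer arithmetic. The schedule shown (one step per one in the structure matrix to accumulate a maximum, plus one step per row to update the running minimum), the accumulator initial values (0 for the maximum, full scale for the minimum), and the fixed-point format are assumptions made for this sketch, which is not a description of the actual circuit.

```python
def compute(phi_q, plane, x_q, frac_bits):
    """Compute block: one affine value a^T x + b in fixed point."""
    row = phi_q[plane]
    acc = sum(a * xi for a, xi in zip(row, x_q))  # products of the n multipliers
    return (acc >> frac_bits) + row[-1]           # rescale, then add the offset

def max_min(in1, in2, use_max):
    """max min block: maximum or minimum of its two inputs."""
    return max(in1, in2) if use_max else min(in1, in2)

def lattice_datapath(phi_q, psi, x_q, frac_bits):
    """Control block: serialize the max/min operations and count the steps."""
    outer, steps = 1 << frac_bits, 0   # minimum accumulator starts at 1.0
    for row in psi:
        inner = 0                      # maximum accumulator starts at 0.0
        for plane, bit in enumerate(row):
            if bit:
                value = compute(phi_q, plane, x_q, frac_bits)
                inner = max_min(inner, value, True)
                steps += 1
        outer = max_min(outer, inner, False)
        steps += 1
    return outer, steps

# Illustrative data with 12 fractional bits: l_1 = x, l_2 = 1 - x, psi = identity.
phi_q = [[4096, 0], [-4096, 4096]]
psi = [[1, 0], [0, 1]]
result, steps = lattice_datapath(phi_q, psi, [1024], 12)  # x = 0.25
```

With two ones and two rows in `psi`, the model takes four steps, which matches the step count stated for the Control block.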
The inputs can be loaded in parallel or serially, according to the structure of the Compute block. The designer can select one implementation or the other depending mainly on time and area restrictions. Again, to exploit the resources of the FPGA, the parallel implementation is used herein.
In order to simplify the microelectronic realization, the range of the input and output values is normalized to the interval [0, 1], and the coefficients are consequently evaluated for this range. Also to simplify the implementation, two auxiliary blocks can be employed at the input (Pre) and the output (Post) of the system. The goal of these blocks is to exploit any symmetry of the PWA function to be implemented. This strategy reduces the number of states of the Control block and, hence, the resources required by the implementation.
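One way such auxiliary blocks could work for a function with odd symmetry, $f(-x) = -f(x)$, is sketched below. The sketch uses signed, non-normalized values and folds the input onto the half-space where the first coordinate is non-negative; both choices, as well as the names `pre`, `post`, and `eval_with_symmetry`, are assumptions for illustration.

```python
def pre(x):
    """Pre block: map x onto the half-space x[0] >= 0 and remember the flip."""
    negate = x[0] < 0
    folded = [-xi for xi in x] if negate else list(x)
    return folded, negate

def post(u, negate):
    """Post block: undo the flip on the output of an odd function."""
    return -u if negate else u

def eval_with_symmetry(core, x):
    """Evaluate an odd function using a core that covers only half the domain."""
    folded, negate = pre(x)
    return post(core(folded), negate)

def core(x):
    # Stand-in core defined only for x[0] >= 0: a saturated sum of two inputs.
    if x[0] < 0:
        raise ValueError("core is only defined for x[0] >= 0")
    return max(-1.0, min(1.0, x[0] + x[1]))
```

Since the core only has to cover half of the domain, fewer affine functions, and therefore fewer control states, are needed.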
The circuit architecture described above has been implemented in Xilinx System Generator, a design tool for DSPs that is integrated into Simulink (Fig. 3(b)).
D. Simulation and Verification
The Xilinx System Generator toolbox allows the implemented function to be simulated by introducing the inputs through the Matlab environment. Since the PWA functions considered belong to the control domain, Xilinx System Generator also allows the controller implemented in the FPGA to interact with a plant model described in Matlab. This verification is known as hardware-in-the-loop testing (Fig. 4).
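The structure of such a closed-loop test can be sketched in software as follows. The plant is a discrete-time double integrator with unit sample time, and the controller is a stand-in saturated linear feedback rather than the explicit model predictive controller; the gains, the sample time, and the function names are all assumptions made for illustration.

```python
def controller(x):
    # Stand-in for the implemented controller: saturated linear feedback.
    u = -(0.5 * x[0] + 1.0 * x[1])
    return max(-1.0, min(1.0, u))

def plant_step(x, u):
    # Discrete-time double integrator with unit sample time (assumed model).
    return [x[0] + x[1] + 0.5 * u, x[1] + u]

def closed_loop(x0, steps):
    """Alternate controller and plant, recording the states and the inputs."""
    x = list(x0)
    states, inputs = [list(x0)], []
    for _ in range(steps):
        u = controller(x)
        x = plant_step(x, u)
        inputs.append(u)
        states.append(x)
    return states, inputs

states, inputs = closed_loop([1.0, 0.0], 40)
```

In a hardware-in-the-loop setup, the call to `controller` would be replaced by the response of the FPGA, while the plant update stays in software.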
IV. CASE STUDIES
The functionality of the proposed architecture and design flow has also been analyzed with the application example of regulating to the origin the triple integrator system described in [13]. The design of the optimal PWA control function to be implemented is performed with the Hybrid Toolbox for Matlab. The resulting PWA function is defined over the domain $[-1.5, 1.5]^3$, as shown in Fig. 5. It has odd symmetry with respect to the vertical axis. The implementation takes advantage of this symmetry, so that only five affine functions ($\tilde{l}_1$ to $\tilde{l}_5$) take part in the lattice PWA implementation. The functions $\tilde{l}_2$ and $\tilde{l}_1$ are constants equal to the minimum and maximum of the output values, respectively, so they are implemented implicitly. The resulting function to implement in hardware is the following:
The module implementing the software/hardware interface allows analyzing how the number of bits in the hardware implementation affects the generated control surface. Table I shows its influence on the root mean square error (RMSE), calculated by evaluating the function at 125 ($N_{pts}$) points distributed homogeneously over the domain and applying:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{pts}} \sum_{i=1}^{N_{pts}} \left(\hat{u}(x_i) - \tilde{u}(x_i)\right)^2}$$

where $\hat{u}(x_i)$ is the result given by the Hybrid Toolbox (the desired output) and $\tilde{u}(x_i)$ is the output provided by the lattice PWA function implemented in hardware. In this example, 12 bits are selected.

Open-loop simulations allow evaluating the control surfaces provided by the FPGA implementation as well as their differences with respect to the optimal control surfaces obtained by the software solution (Fig. 6). The results obtained from closed-loop simulations illustrate the evolution of the plant state (Fig. 4(a)), the control variable (Fig. 4(b)), and the output (Fig. 4(c)).
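The error evaluation can be reproduced in outline with the sketch below, which computes the RMSE between a reference function and a quantized version of it over a homogeneous grid of 5 x 5 x 5 = 125 points. The reference function is a stand-in and not the controller of this case study, and quantizing only the output is a crude assumption about where the finite word length acts; the names `grid`, `rmse`, and `quantized` are hypothetical.

```python
import itertools
import math

def grid(lo, hi, per_axis, dims):
    """Homogeneous grid with per_axis points along each of dims axes."""
    step = (hi - lo) / (per_axis - 1)
    axis = [lo + i * step for i in range(per_axis)]
    return list(itertools.product(axis, repeat=dims))

def rmse(reference, implemented, points):
    """Root mean square error between two functions over a set of points."""
    total = sum((reference(x) - implemented(x)) ** 2 for x in points)
    return math.sqrt(total / len(points))

def quantized(f, bits):
    """Round the output of f to a grid of 2**-bits (assumed error model)."""
    scale = 1 << bits
    return lambda x: math.floor(f(x) * scale + 0.5) / scale

def reference(x):
    # Stand-in control law: saturated linear feedback of three states.
    return max(-1.0, min(1.0, -(0.8 * x[0] + 1.1 * x[1] + 0.6 * x[2])))

points = grid(-1.5, 1.5, 5, 3)
errors = {bits: rmse(reference, quantized(reference, bits), points)
          for bits in (8, 10, 12)}
```

Under this error model the RMSE for a given word length cannot exceed half of the least significant bit, which gives a quick sanity bound when choosing the number of bits.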
Table II compares the features of the proposed lattice implementation with those of other existing PWA implementations (all of them with 12 bits) on a Xilinx FPGA (xc3s200-5ftp256). The results obtained offer a good trade-off between area occupation, throughput, and approximation error.
V. CONCLUSIONS
The design methodology presented for FPGA implementation of continuous PWA functions based on the lattice representation has been automated with Matlab&Simulink and ISE tools. The parameters required by the digital architecture are obtained from the algorithmic description of the problem. FPGA implementation results for applications in the control domain offer small size, high speed, and potentially no approximation error with respect to the optimal solution.
