Dynamically reconfigurable hardware has been identified as a promising solution for the design of energy-efficient embedded systems. However, its adoption is limited by costly design effort, including verification and validation, which is even more complex than for non-dynamically reconfigurable systems. In this article, we propose a tool-supported formal method to automatically design a correct-by-construction control of the reconfiguration. By representing system behaviors with automata, we exploit automated algorithms to synthesize controllers that safely enforce reconfiguration strategies formulated as properties to be satisfied by control. We design generic modeling patterns for a class of reconfigurable architectures, taking into account both hardware architecture and applications, as well as relevant control objectives. We validate our approach on two case studies implemented on FPGAs.
INTRODUCTION
Dynamically reconfigurable hardware has been identified as a promising solution for the design of energy-efficient embedded systems [Hinkelmann et al. 2009]. A common argument in favor of this kind of architecture is the specialization of processing elements, which can be adapted to application functions in order to minimize the delay and the control cost, and to improve data locality. Another key benefit is the hardware reuse to minimize the area, and therefore the static power and cost. Further advantages, such as hardware updates in long-life products and self-healing capabilities [Paulsson et al. 2006], are also often mentioned. In the presence of context changes (e.g., environment or application functionality), self-adaptive techniques can be applied as a solution to fully benefit from the runtime reconfigurability of a system. Dynamic Partial Reconfiguration (DPR) of FPGA is another accessible solution to implement and experiment with reconfigurable hardware. It has been widely explored and detailed in the literature. However, it appears that such solutions are not extensively exploited in practice for two main reasons: (i) the design effort is extremely high and strongly depends on the available chip and tool versions; and (ii) the simulation process, which is already complex for nonreconfigurable systems, becomes prohibitively large for reconfigurable architectures. Thus, new adequate methods are required to fully exploit the potential of dynamically reconfigurable and self-adaptive architectures. Here, we propose a design methodology for self-adaptive embedded systems. On the one hand, our approach considers reconfigurable architectures as implementations of execution platforms by exploiting their features. On the other hand, for the design of adaptation decisions, it relies upon a formal method related to automata-based verification and, more originally, on discrete controller synthesis. It is important to note that, in this article, synthesis is not used in the sense of hardware synthesis. Instead, it is considered under the meaning of controller synthesis as a formal operation on automata, as explained in the sequel.
This work is supported by the French ANR project Famous and the Chinese NSFC under grant 61502140. Authors' addresses: X. An, Hefei University of Technology, China; email: xin.an.fr@gmail.com; E. Rutten, Ctrl-A team, INRIA Grenoble, France; email: eric.rutten@inria.fr; J.-P. Diguet, CNRS/Lab-STICC, Lorient, France; email: jean-philippe.diguet@univ-ubs.fr; A. Gamatié, CNRS/LIRMM, Montpellier, France; email: abdoulaye.gamatie@lirmm.fr. © 2016 ACM 1539-9087/2016/05-ART51. DOI: http://dx.doi.org/10.1145/2873056
Reconfigurable Architecture Validation Problem
The validation of a reconfigurable architecture includes, on the one hand, the separate validation of each possible configuration of the architecture and, on the other hand, the validation of each transition between pairs of configurations, since the behavior of the system can depend on the memory content and the I/O status modified from one configuration to another. Current validation approaches are based on simulation. For reconfigurable systems, the number of scenarios to deal with rapidly becomes intractable, making simulation less efficient for addressing the design-correctness issue.
Verification is central in electronic system-level design in order to ensure that a system implements its functionality in a correct, efficient, and cost-effective manner [Martin et al. 2010]. This is particularly true for FPGA-based design [Maxfield 2004]. A very popular verification approach is simulation, which can be either cycle-based or event-based. Compared to the latter, the former provides extremely good visibility into designs for debugging, but it is potentially more compute-intensive and slower. Another important approach is design testing, which is very useful for large and complex system verification. Its quality depends significantly on the size and relevance of the test benches used. The last relevant approach is formal verification, which is either static, such as model checking and theorem proving to formally verify given design properties, or dynamic, by combining simulation and static formal analysis. Here, dynamic techniques consider assertions and check their possible violations during simulation.
Dynamical reconfiguration requires making decisions about the choice of new configurations, depending on occurring events in a system, on past events and sequence history, and on predictive knowledge about possible outcomes of reconfigurations. Such decision components are difficult to design because of the combinatorics of possible choices, the transversal constraints between them to be respected, and, even more, the history aspects. Formal approaches to the design of reconfiguration controllers can provide tool-supported assistance to this difficult design. In the embedded system domain, formal techniques have been designed largely for safe software design. The evolution of DPR systems makes them amenable to the same kind of techniques and models. Labelled Transition Systems (LTS) or automata are typical effective specification models that can be verified by model checking.
This article aims to deal with the correct dynamical management of reconfigurations. It advocates the design of control loops addressed by Control Theory, which covers both continuous and discrete systems in general. Here, only the latter is considered. The class of Discrete Event Systems is modeled using Petri nets or automata. Based on notions of supervisory control, automated techniques have been defined for Discrete Controller Synthesis (DCS).
The contribution of this article is to exploit the advantages of DCS for designing controllers to manage reconfigurable architectures, with practical implementation on dynamically partially reconfigurable FPGAs. This enables us to solve the problems mentioned earlier by (1) automatically generating the code of controllers to be implemented on a processor in charge of the reconfiguration management, considering that the reconfigurable architecture can be generated with a recent model-based design flow [Vidal et al. 2009]; and (2) relying on automated generation that is guaranteed correct by the synthesis algorithm, instead of relying on simulation for verification.
This article relies on previous preliminary results [An et al. 2013a, 2013b] and brings new improvements in the following directions: (i) it extends the reconfiguration management methodology by describing the modeling concepts, including their generation procedure, at all system levels (architecture, application, and so forth) for applying the DCS technique; (ii) it draws a global design flow applied to two real and concretely developed case studies with runtime image processing and reconfiguration to demonstrate clearly how our advocated DCS technique can be applied; and (iii) quantitative evaluations and analysis regarding the scalability of our advocated approach are reported.
Outline. In the remainder of this article, we first discuss some related work in Section 2. Then, we define the class of reconfigurable architecture that we target, as well as its reconfiguration policies in Section 3. We introduce the modeling and DCS concepts used throughout this article in Section 4. These concepts are considered in Section 5 for building the general behavioral model of the target class of architectures, and for formally specifying their associated control objectives. The resulting general design flow of our approach is presented in Section 6. Its validation is illustrated in Section 7 via some case studies involving a concrete FPGA platform. A discussion of the proposed solution regarding the obtained results is addressed in Section 8. Concluding remarks and perspectives are provided in Section 9.
RELATED WORK
In the literature, most of the existing approaches dealing with the management of reconfigurable embedded systems target the runtime scheduling of application tasks onto a reconfigurable architecture (e.g., Noguera and Badia [2004] and Ghaffari et al. [2007] ) or an architecture including a reconfigurable fabric (e.g., Nollet et al. [2008] ). Since these approaches perform scheduling analysis online, they usually resort to heuristic algorithms to generate fast and lightweight solutions, and are able to deal with the scheduling of applications that are unknown a priori. However, such approaches cannot guarantee optimal solutions and/or strict system constraints due to unknown situations, and are usually validated by (limited) simulations.
Beyond the usual simulation techniques [Aylward et al. 2011; Gong and Diessel 2011] , formal methods provide attractive verification techniques that are applicable to reconfigurable embedded-system designs. Singh and Lillieroth [1999] present a typical study addressing the correctness of reconfigurable cores, such as a 64b adder and an 8b counter. They consider a formalization based on propositional logic and integer arithmetic. They use a theorem-prover at runtime to check whether the dynamically calculated circuits are correct. Their solution applies mainly to circuits that take seconds to verify. Other approaches [Dahmoune and de B. Johnston 2010; Madlener et al. 2010 ] suggest model-checking techniques for the verification of FPGAs and dynamically reconfigurable embedded systems in general. However, none of these solutions deals with the correct design of the reconfiguration control.
The reconfiguration management in DPR technologies is usually addressed by considering manual encoding and analysis, which is tedious and error-prone [Gohringer et al. 2008] . With the foreseeable increase in complexity of such technologies, automatic techniques appear more suitable to better solve the limitations related to the manual approach. Other existing approaches dedicated to self-management of adaptive or reconfigurable systems use, for instance, heuristics and machine-learning techniques. In Sironi et al. [2010] , a system built on a reconfigurable architecture exploits self-adaptivity. It adopts application heartbeats as the monitoring framework and a heuristic mechanism to switch between different configurations. Self-management in the form of self-healing that exploits FPGAs is also proposed in other studies [Paulsson et al. 2006; Jovanović et al. 2008] .
Regarding design infrastructures, an architectural proposal [Majer et al. 2007 ] provides a slot-based organization of a reconfigurable hardware as well as an elaborate communication framework with good reconfiguration support. In Ye et al. [2010] , an adaptive system is implemented on FPGA by means of a programming model and environment for the development of reconfigurable multiprocessor architectures.
Beyond the previous aspects, reconfiguration control is one major issue for adequate system behaviors. Maggio et al. [2012] discuss some approaches applying standard control techniques such as Proportional Integral and Derivative (PID) controller or Petri nets-based control. The same kind of control has also been used for processor and bandwidth allocation in servers [Lu et al. 2002] . A closed-loop control has been applied in Eustache and Diguet [2008] to select hardware/software configurations on an FPGA with a configuration control based on a dataflow model and diffusion mechanisms. We note that such a solution relies on heuristics and empirical laws that prevent instability and select the suitable configurations. In Quadri et al. [2010] , a design flow is proposed from high-level models to automatic code generation for the implementation of reconfigurable FPGA-based systems-on-chip. The system control is modeled manually and integrated into the flow.
Compared to these reconfiguration control techniques, a major advantage of the discrete control approach considered in this article is the enabled formal correctness. In addition, the advocated controllers are generated automatically at design time. To the best of our knowledge, there is no existing work addressing the automatic generation of correct reconfiguration controllers, starting from design to implementation, for FPGA systems. An earlier work [Guillet et al. 2012 ] explored only simple models for tasks, and invariance control objectives that concern state exclusions. Here, we have an extended model structure with application, tasks, and architecture; we use reachability and optimal control beyond invariance. Optimal control concerns the optimization of costs or weights associated with states and/or transitions. By enforcing optimal control, for example, minimizing the system WCET, our approach is thus also able to guarantee performance.
DPR CONTROL PROBLEM
We present informally the considered class of systems with an illustrative example.
Hardware Architecture Model
We consider a multiprocessor architecture implemented on a reconfigurable device composed of a general-purpose processor A0 (e.g., ARM core) and a reconfigurable area (e.g., FPGA-like with power-management capabilities) divided into n reconfigurable tiles. Figure 1(a) shows an illustrative example of four reconfigurable tiles: A1-A4. The communications between architecture components are achieved by a Network-on-Chip (NoC). Each processor and reconfigurable tile implements a NoC Interface (NI). A fixed dual port memory buffer is associated with each tile, which means that at most two tasks can simultaneously access data stored in the shared memory. Reconfigurable tiles can be combined and configured to implement and execute tasks by loading predefined bitstreams, such as tiles A1 and A2 of Figure 1 (a).
The architecture is equipped with a battery supplying the platform with energy. Regarding power management, an unused reconfigurable tile Ai can be put into sleep mode with a clock-gating mechanism such that it consumes a minimum static power.
Application Software
We consider system functionality described as a directed, acyclic task graph (DAG). A DAG consists of a set of nodes representing the set of tasks to be executed, and a set of directed edges representing the precedence constraints between tasks. Note that the chosen graph-based representation is seen as a generic representation that enables the description of a large number of applications. It is also a useful abstraction level for dealing with the safe control of tasks by using formal techniques. The coarse grain tasks considered at this abstraction level avoid the burden of associated unnecessary low-level details for defining a suitable solution to the problem. There exist a number of works that show how such a task graph can be derived from an initial application specification described in programming languages such as C [Vallerio and Jha 2003] or some high-level specification languages such as UML MARTE [Guillet et al. 2014] . Dealing with such transformations is out of the scope of this article. Figure 1(b) shows an illustrative example consisting of four tasks: A, B, C, and D.
In our framework, unless otherwise specified, we suppose that each task performs its computation with the following four control points:
- being requested or invoked;
- being delayed: requested but not yet executed;
- being executed: to be executed on the architecture;
- notifying execution finish, once it reaches its end.
Occurrences of control points being requested and notifying finishes depend on runtime situations, and are thus uncontrollable. The way of delaying and executing tasks is controlled by a runtime manager designed to achieve system objectives.
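The four control points and the controllable/uncontrollable split described above can be pictured as a small state machine. The following is only a minimal sketch under stated assumptions: the state names, the function `task_step`, and the single controllable input `go` are hypothetical illustrations, not the task model defined later in the article.

```python
def task_step(state, requested, finished, go):
    """One reaction of a hypothetical task automaton.

    requested, finished: uncontrollable inputs (they depend on the runtime situation).
    go: controllable input, chosen by the runtime manager.
    """
    if state == "idle" and requested:
        # The manager decides whether to execute right away or to delay.
        return "executing" if go else "delayed"
    if state == "delayed" and go:
        return "executing"
    if state == "executing" and finished:
        return "idle"
    return state  # no relevant event: stay in the same state


# Example run: the request is first delayed, then executed, then finishes.
trace = ["idle"]
for requested, finished, go in [(True, False, False),
                                (False, False, True),
                                (False, True, False)]:
    trace.append(task_step(trace[-1], requested, finished, go))
print(trace)
```

Only `go` is under the manager's control in this sketch; the request and finish events are treated as inputs that the manager can observe but not prevent.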
Task Implementations
Given a hardware architecture, a task can be implemented in various ways characterized by various parameters of interest, such as the set of used reconfigurable tiles (rs).
System Reconfiguration
Figure 2 shows three system configuration examples. In configuration 1, task A is running on tiles A3 and A4, while tiles A1 and A2 are set to the sleep mode. Configurations 2 and 3 show two scenarios, with tasks B and C running in parallel. Once task A finishes its execution according to the graph of Figure 1(b), the system can go to either configuration 2 or configuration 3 depending on the system requirements. For example, if the current state of the battery level is low, the system would choose configuration 2, as configuration 3 requires the complete circuit surface, therefore consuming more power. On the contrary, when the battery level is high, configuration 3 would be chosen if the user expects a better performance.
System Objectives
System objectives define the system functional and nonfunctional requirements. This section gives the objectives considered in this article for a general dynamically reconfigurable architecture with power-management capabilities. It categorizes them as logical and optimal control objectives. Generally speaking, logical objectives concern exclusions, whereas optimal objectives concern weights and costs.
Considered logical control objectives are as follows:
(1) resource usage constraint: exclusive uses of reconfigurable tiles A1-A4;
(2) dual accesses to the shared memory: by at most two functions running in parallel;
(3) energy reduction constraint: switch tiles to (a) sleep mode when executing no task; (b) active mode when needed;
(4) reachability: DAG execution can always finish once started;
(5) power peak of hardware platform is constrained with regard to battery levels.
Optimal control objectives of interest are as follows:
(6) minimize power peak of hardware platform;
(7) minimize WCET of DAG executions;
(8) minimize worst-case energy consumption of system executions.
These objectives will be formalized further in Section 5 in terms of the formalisms recalled next in Section 4. Some will be used and validated in Section 7.
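To make the logical objectives (1) to (3) concrete, the sketch below checks them on a single system configuration. It is an illustration only: the data layout (a dictionary from running tasks to the tiles they occupy, and a dictionary of tile modes) and the function names are assumptions, and it supposes that every running task accesses the shared memory. The first example encodes configuration 1 of Figure 2 as described in the text (task A on tiles A3 and A4, with A1 and A2 asleep).

```python
from itertools import combinations


def exclusive_tiles(running):
    """Objective (1): no tile is used by two running tasks at once."""
    return all(not (running[a] & running[b]) for a, b in combinations(running, 2))


def memory_ok(running, ports=2):
    """Objective (2): at most `ports` tasks run in parallel (dual-port memory)."""
    return len(running) <= ports


def modes_ok(running, modes):
    """Objective (3): a tile is active exactly when some task uses it."""
    used = set().union(*running.values())
    return all((modes[tile] == "active") == (tile in used) for tile in modes)


# Configuration 1 of Figure 2, as described in the text.
config1 = {"A": {"A3", "A4"}}
modes1 = {"A1": "sleep", "A2": "sleep", "A3": "active", "A4": "active"}

# A hypothetical invalid configuration: tasks B and C overlap on tile A2.
overlap = {"B": {"A1", "A2"}, "C": {"A2", "A3"}}

print(exclusive_tiles(config1), memory_ok(config1), modes_ok(config1, modes1))
print(exclusive_tiles(overlap))
```

Such checks only classify individual configurations as valid or invalid; deciding which controllable choices keep the system inside the valid ones is the role of the controller.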
MODELING FORMALISM AND DCS
We apply the formal technique of DCS to address the control problem of a dynamically reconfigurable architecture. Before presenting DCS, we first introduce the automata-based synchronous modeling formalism, which is adopted in the article to describe the control problem. We adopt such a modeling formalism because (1) automata-based modeling formalisms are quite natural to model system reconfiguration behaviors, (2) the mathematical foundation of the synchronous model favors system formal analysis, and (3) there exist DCS tools that can exploit synchronous parallel automata.
Modeling Formalism
We adopt the formal framework defined in detail elsewhere [Altisen et al. 2003; Dumitrescu et al. 2010] for the automata definition.
Definition 1 (Automaton). An automaton is a tuple $S = \langle Q, q_0, I, O, T \rangle$, where:
- $Q$ is a finite set of states;
- $q_0 \in Q$ is the initial state of $S$;
- $I$ is a finite set of input events;
- $O$ is a finite set of output events;
- $T$ is the transition relation, a subset of $Q \times Bool(I) \times O^* \times Q$, where $Bool(I)$ is the set of Boolean expressions of $I$ and $O^*$ is the power set of $O$.
Each transition, denoted by $q \xrightarrow{g/a} q'$, has a label of the form $g/a$, where guard $g \in Bool(I)$ must be true for the transition to be taken, and action $a \in O^*$ is a conjunction of output events, emitted when the transition is taken. State $q$ is the source of the transition, and state $q'$ is the destination. A path is a sequence of consecutive transitions, in which the destination of each transition is the source of the next.
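The composition of two automata put in parallel is the synchronous composition, denoted by $\|$. Given two automata $S_1$ and $S_2$ with state sets $Q_1$ and $Q_2$, each state $(q_1, q_2) \in Q_1 \times Q_2$ of the composition $S_1 \| S_2$ is called a macro state, for which $q_1$ and $q_2$ are its two component states.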
The encapsulation operation, defined in Altisen et al. [2003], is used to enforce the synchronization between two composed automata by means of a variable that is an input on one side and an output on the other side. Let $S = \langle Q, q_0, I, O, T \rangle$ be an automaton, and $\Gamma \subseteq I \cup O$ be a set of inputs and outputs of $S$. The encapsulation of $S$ with regard to $\Gamma$ is the automaton $S \setminus \Gamma = \langle Q, q_0, I \setminus \Gamma, O \setminus \Gamma, T' \rangle$, where $T'$ is defined from $T$ using the following two sets: $g^+$ is the set of variables that appear as positive elements in the monomial $g$, that is, $g^+ = \{x \in g \mid (x \wedge g) = g\}$, and $g^-$ is the set of variables that appear as negative elements in the monomial $g$, that is, $g^- = \{x \in g \mid (\neg x \wedge g) = g\}$. Figure 3 gives an example of using encapsulation to enforce the synchronization of two automata A and B that are composed by a synchronous composition through variable b.
The automata states can be associated with weights, characterizing quantitative features. We define a cost function C : Q → N to map each state of an LTS to a positive integer value. Costs can also be defined on execution paths across an LTS. For instance, a cost function of path p can be the sum of all the costs of its traversed states. When composing LTSs, the cost values with regard to the resulting global states/transitions can be defined on the basis of the local costs as their sum or the maximal/minimal value.
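As an illustration of macro states and of combining local costs, the following sketch runs two automata in lockstep on the same set of input events. It makes several assumptions that are not taken from the article: guards are encoded as Python predicates over the set of present inputs, a state without an enabled transition implicitly stays where it is, and the two-mode tile automaton (event names `wake1`, `rest1`, outputs `on1`, `off1`, and the cost values) is hypothetical.

```python
def step(aut, state, inputs):
    """Fire the first enabled transition of one automaton; stay put otherwise."""
    for src, guard, action, dst in aut["trans"]:
        if src == state and guard(inputs):
            return dst, action
    return state, frozenset()  # assumption: implicit self-loop with no output


def step_parallel(a1, a2, macro, inputs):
    """One synchronous step: both components react to the same input set."""
    q1, q2 = macro
    d1, o1 = step(a1, q1, inputs)
    d2, o2 = step(a2, q2, inputs)
    return (d1, d2), o1 | o2


def macro_cost(a1, a2, macro, combine=sum):
    """Combine the local state costs of a macro state (sum, max, or min)."""
    q1, q2 = macro
    return combine([a1["cost"][q1], a2["cost"][q2]])


def tile(i):
    """Hypothetical two-mode tile; event names and costs are illustrative."""
    return {
        "init": "sleep",
        "trans": [
            ("sleep", lambda ins: f"wake{i}" in ins, frozenset({f"on{i}"}), "active"),
            ("active", lambda ins: f"rest{i}" in ins, frozenset({f"off{i}"}), "sleep"),
        ],
        "cost": {"sleep": 1, "active": 5},
    }


t1, t2 = tile(1), tile(2)
state = (t1["init"], t2["init"])
state, out = step_parallel(t1, t2, state, {"wake1"})
print(state, sorted(out), macro_cost(t1, t2, state), macro_cost(t1, t2, state, max))
```

The macro state is simply the pair of component states, and the choice between `sum` and `max` mirrors the two ways of deriving global costs from local ones mentioned above.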
Based on this definition of automata and other automata-based modeling formalisms presented in this section, formal analysis and verification techniques, such as model checking and discrete controller synthesis, can be applied. In this work, we adopt the discrete controller synthesis technique, which is presented in the next section.
Discrete Controller Synthesis
DCS, introduced in the 1980s [Ramadge and Wonham 1989], was proposed to deal with the control and coordination problems of discrete event systems. A discrete event system (DES) [Ramadge and Wonham 1989] is a discrete-state, event-driven dynamic system that evolves in accordance with the occurrences of discrete events at possibly irregular intervals. An event, for example, may correspond to the invocation or completion of a task, or the failure or frequency switch of a processor. Such systems arise in various domains of our daily life, such as manufacturing, transport, automotive, embedded systems, and health care. These applications have their own design requirements, and require control and coordination to ensure their desired behavior.
The main advantage of the theory is that it separates the concept of open-loop dynamics (i.e., the DES) from feedback control, and allows the autonomic analysis and control of DESs with regard to a given specification of control objectives.
DCS is an operation that applies to a DES represented as, for example, an automaton as defined in Section 4.1. In order to control a DES, that is, to enable a controller to influence the evolution of the DES behavior, the occurrences of certain events are placed under control. The set of events $Y$ of a DES is thus partitioned into two subsets, $Y_{uc}$ and $Y_c$, representing, respectively, the uncontrollable and controllable event sets. Figure 4 shows the principle of DCS. It is applied with a given control objective: a property that has to be enforced by control. The objective is expressed in terms of the system's outputs $X$. The controller, denoted by $C$, is obtained automatically from a system model $S$ and an objective, both specified by a user, via appropriate synthesis algorithms. The synthesis algorithms, which are related to model-checking techniques, automatically compute, by exploring the system state space, a constraint on controllable variables $Y_c$, that is, the controller. Its purpose is to constrain the values of controllable variables $Y_c$, as a function of outputs $X$ and uncontrollable inputs $Y_{uc}$, such that all remaining behaviors satisfy the given objective.
There can be several controllers that meet the same control objective. In the extreme case, a controller can forbid any state transition in order to avoid the invalid states. This is apparently not desirable for target systems. We are interested in maximally permissive controllers, which ensure the largest possible set of correct behaviors of the original uncontrolled system.
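The idea of a maximally permissive controller for an invariance objective can be sketched with an explicit-state fixpoint computation. This is a simplified illustration, not the algorithm of any particular DCS tool: the function `synthesize_invariance`, the toy model, and all state and event names are hypothetical. The toy model loosely echoes the earlier choice between configurations 2 and 3: committing to the full-surface configuration is unsafe because an uncontrollable battery drop may then lead to an overload.

```python
def synthesize_invariance(states, unc, ctl, delta, safe):
    """Greatest set of states from which the safe set can be kept invariant.

    delta(q, u, c) gives the next state for uncontrollable input u and
    controllable input c. The returned controller maps (state, u) to the
    set of all controllable values that keep the system in the good set.
    """
    good = set(safe) & set(states)
    while True:
        # Keep q only if, whatever u occurs, some c leads back into `good`.
        keep = {q for q in good
                if all(any(delta(q, u, c) in good for c in ctl) for u in unc)}
        if keep == good:
            break
        good = keep
    controller = {(q, u): {c for c in ctl if delta(q, u, c) in good}
                  for q in good for u in unc}
    return good, controller


def delta(q, u, c):
    """Hypothetical toy dynamics."""
    if q == "idle":
        return "low_power_cfg" if c == "pick_cfg2" else "full_cfg"
    if q == "low_power_cfg":
        return "idle"
    if q == "full_cfg":
        return "overload" if u == "battery_down" else "idle"
    return "overload"


states = ["idle", "low_power_cfg", "full_cfg", "overload"]
good, controller = synthesize_invariance(
    states,
    unc=["steady", "battery_down"],
    ctl=["pick_cfg2", "pick_cfg3"],
    delta=delta,
    safe={"idle", "low_power_cfg", "full_cfg"},
)
print(sorted(good))
print(sorted(controller[("idle", "steady")]))
```

Note that `full_cfg` is itself a safe state, yet it is removed because no controllable choice can prevent the overload once the battery drops. The controller then forbids only `pick_cfg3` in `idle` and leaves every other choice free, which is what maximal permissivity means in this setting.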
More generally, advantages of DCS are high-level specification with declarative control policies, automated synthesis of correct-by-design/construction controllers, and optimality in the sense of maximal permissivity (minimally constraining controller).
In this work, we use DCS operations corresponding to different algorithms to synthesize controllers [Marchand and Samaan 2000] for invariance with regard to a subset of states, reachability of a subset of states, one-step optimization of a cost on the next state, and path optimization of a cost of the bounded path to a target state [Dumitrescu et al. 2010]. Such operations have been implemented in the Sigali DCS tool [Marchand and Samaan 2000]. In particular, we use the user-friendly tool BZR [Delaval et al. 2010], whose compilation involves the Sigali tool to perform DCS. It employs the automata-modeling formalism of Section 4.1 to describe the target system behavior and a dedicated contract construct to specify the control objectives to be enforced in a declarative style. The compilation of a BZR program automatically synthesizes a controller enforcing the specified objectives. This controller is then reinjected automatically into the initial BZR program so that an executable program can be generated (in C or Java) for execution. This executable code is used in the experiments described in Section 7.
As for other formal verification techniques, such as model checking, in which complexity is in the worst case polynomial in the size of the state space to handle, DCS is also concerned with the scalability issue (discussed in Section 8). However, compared to them, the main advantage of the DCS is that it is more constructive and is able to produce a maximally permissive and correct solution, while other formal techniques, such as model checking, require a possibly error-prone and over-constraining manual encoding phase before a tedious verification phase [Gamatié et al. 2009 ].
MODELING RECONFIGURATION MANAGEMENT COMPUTATION AS A DCS PROBLEM
We specify the modelling of the computing system behavior and control in terms of labelled automata. System objectives are defined based on the models. We focus on the management of computations on the reconfigurable tiles and dedicate the processor area A0 exclusively to the execution of the resulting controller.
Architecture Behavior
The architecture consists of a processor A0, n reconfigurable tiles {A1, . . . , An}, and a battery (see Figure 1(a), for which n = 4). Each tile has two execution modes, and the mode switches are controllable. The battery behavior model takes its inputs from a battery sensor, which emits level up and down events, and keeps track of the current battery level through output st.
It can be observed that the architecture behavior, including the behaviors of the reconfigurable tiles and battery, can be described systematically, and could be specified with some high-level specification languages given some syntactic sugar. As an example, a systematic way to generate such automata of Figure 5 from the UML profile MARTE [Object Management Group 2013] can be found in Guillet et al. [2014] .
Application Behavior
Informal Description.
The software application is described as a DAG, which specifies the tasks to be executed and their execution sequences and parallelism. We capture its behavior by defining a scheduler automaton representing all possible execution scenarios. It does so by keeping track of application execution states and emitting the start requests of tasks in reaction to the task finish notifications. Figure 6 shows the scheduler automaton of the application DAG in Figure 1(b). It starts the execution of the application by emitting event $r_A$, which requests the start of task A, upon receipt of application request event req in the idle state I. Upon receipt of $e_A$ notifying the finish or end of A's execution, events $r_B$ and $r_C$ are emitted together to request the execution of tasks B and C in parallel. Task D is not requested until the execution of both B and C is finished, denoted by events $e_B$ and $e_C$. It reaches the final state T, implying the end of the application DAG execution, upon receipt of $e_D$.
In fact, Figure 6 models the execution behavior of an iteration of the example task graph. Our approach can also deal with pipelined executions of streaming applications described by dataflow models such as Synchronous Data Flow Graphs (SDFGs). For SDFGs, we can build execution models similarly by (1) identifying events that fire task or actor computations when their input tokens are ready, and those that notify ends of computations with corresponding output tokens produced; and (2) representing all the relevant states on which control should be based.
Such a scheduler automaton can be constructed algorithmically from a DAG-described application, as described systematically in the following section.
Scheduler Automaton Derivation.
As shown in the example in Figure 6 , the derived scheduler automaton from an application DAG captures the dynamic execution behavior of the application. Its states represent the tasks that are being executed. They are denoted and labeled by the names of these tasks. It has an initial state I, that is, the idle state, which means that the application has not been invoked, and an end state T , which means that the application has finished its execution. The automaton input events are the task end events e 1 , e 2 , . . . , e i , . . . and the application request event req, while its output events are the task request events r 1 , r 2 , . . . , r i , . . . . Its transitions are of the form g/a, where g is a firing condition and a is an action. A firing condition is a Boolean expression of input events, and an action is a conjunction of output events. Note that (1) we suppose that the application is only invoked once. If it is allowed to be repeatedly invoked, the end state would be the same as the initial state. (2) If the graph has a task that has more than one instance, the instances are then seen as different tasks.
Algorithm 1 illustrates how to construct the scheduler automaton for a DAG. It derives the automaton from initial state I to end state T by exploring the state space of the application execution with regard to the DAG.
- Inputs: a directed, acyclic task graph <T, C>, where T and C represent, respectively, the set of tasks and the set of edges.
- Local variables and functions used in the algorithm:
  - s or nextState: a state, with element taskSet representing the set of tasks associated to the state (i.e., the tasks executing in the state);
  - drawState(s): a function that draws state s, labeled by s.taskSet;
  - drawTrans(source, sink, transition label): a function that draws a transition from state source to state sink guarded by transition label;
  - drawnStates: the set of states that have been drawn out;
  - stateQueue: a FIFO queue, keeping track of the states to be processed, with function popup() to return and delete the first state element, and function add(s) to add state s to the end of the queue;
  - t_i.prec: the set of tasks that immediately precede task t_i;
  - readyTaskSet: the set of tasks that are enabled to execute;
  - tc: a set of tasks, or a task combination;
  - powerSet(a set of tasks): returns the power set of the set of tasks without ∅;
  - traversed(s): returns the set of states traversed by some path from state I to state s (states I and s included) with regard to the current drawn automaton, with element taskSet to return the union of the tasks associated with all the states of the set.
Lines 1 to 8 deal with Phase 1, the drawing of the initial state I and the initialization of local variables. At Line 1, the initial state, idle state I, is drawn, denoted by drawState(I). The set of drawn states drawnStates is thus initialized to {I} at Line 2. State queue stateQueue stores the states that have been drawn but not processed. It is initialized to have element I at Line 3. Variable readyTaskSet represents the set of tasks that are enabled to execute once some event happens. A task is enabled if all of its precedent tasks have finished their executions. Lines 4 to 8 set readyTaskSet to the set of tasks that have no precedent tasks, as such tasks can be executed immediately once the application is invoked/requested, denoted by the receipt of event req.
Lines 9 to 44 deal with Phase 2, the sequential processing of the states stored in stateQueue. The automaton derivation process finishes when the queue becomes empty. Three types of states are distinguished and processed accordingly: initial state I, end state T, and the rest. Due to space limitation, their detailed explanations are omitted here. We refer the readers to Section 5.3 of An [2013] for more details.
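The kind of state-space exploration described above can be illustrated with a minimal sketch. It is not a transcription of Algorithm 1, whose Phase 2 details are omitted here: it assumes that every enabled task is requested as soon as its predecessors have finished, it tracks the set of finished tasks explicitly instead of using traversed(s), and the names build_scheduler and nonempty_subsets as well as the four-task diamond graph are hypothetical.

```python
from collections import deque
from itertools import combinations


def nonempty_subsets(tasks):
    """Yield every non-empty subset of `tasks` (the power set without the empty set)."""
    items = sorted(tasks)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def build_scheduler(tasks, edges):
    """Explore a DAG breadth-first and return (states, transitions).

    A state is a pair (running, finished); "I" and "T" stand for the idle
    and end states. A transition is (source, guard, requested tasks, sink).
    Assumption: enabled tasks are requested immediately.
    """
    preds = {t: {a for (a, b) in edges if b == t} for t in tasks}
    ready = frozenset(t for t in tasks if not preds[t])
    first = (ready, frozenset())
    transitions = [("I", "req", sorted(ready), first)]
    seen = {first}
    queue = deque([first])
    while queue:
        state = queue.popleft()
        running, finished = state
        # Any non-empty combination of running tasks may end at the same instant.
        for tc in nonempty_subsets(running):
            done = finished | tc
            still = running - tc
            newly = frozenset(
                t for t in tasks
                if t not in done and t not in still and preds[t] <= done
            )
            nxt = (still | newly, done) if (still | newly) else "T"
            transitions.append((state, sorted(tc), sorted(newly), nxt))
            if nxt != "T" and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen | {"I", "T"}, transitions


# Hypothetical diamond graph: A precedes B and C, which both precede D.
states, transitions = build_scheduler(
    {"A", "B", "C", "D"},
    {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")},
)
print(len(states), len(transitions))
```

The guard of each transition is the set of end events received, and the action is the set of newly requested tasks, which mirrors the g/a form of the transitions.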
Task Execution Behavior
Before executing a task on a reconfigurable architecture, the task implementation (i.e., a bitstream in the case of FPGA) should be loaded to reconfigure the corresponding tiles if required. The reconfiguration operations inevitably involve some overhead regarding, for example, time and energy. The worst case that can be imagined is that a reconfiguration operation is always required before a task is executed. In this case, the reconfiguration operation and task execution can be treated as a whole. In general, however, whether a reconfiguration operation is required or not before a task is executed depends on the runtime situation, that is, whether the corresponding task implementation is already configured. In this case, the reconfiguration operation and task execution are independent, and should be distinguished and treated accordingly. In the following, we describe the modeling in consideration of the first case (i.e., the worst case), and refer the readers to An et al. [2013b] for the modeling of the second case.
In the worst case, a task implementation is always loaded before being executed. These two consecutive operations are thus combined and treated as one executing operation. In consideration of the four control points of task executions (see Section 3.2), the execution behavior of task A associated with two implementations (see Section 3.3) can be modelled as Figure 7 A . When the execution of task A finishes, that is, the finish or end notification event e A is received, the automaton goes back to idle state I A . Output es represents its execution state. Similar to the architecture behavior in Section 5.1, the task execution behavior can also be described systematically, and be specified with some high-level specification languages. Therefore, the automaton model for task execution can also be generated systematically from some high-level specification languages, as in Guillet et al. [2014] .
Local execution costs. The reconfiguration and execution costs of different task implementations are different. As task reconfiguration operation and execution are combined, their costs need to be combined as well. Therefore, three cost parameters are considered here (see Section 3.3). We capture them by associating cost values denoted by a tuple (rs, wt, pp) with the states of task models, where rs ∈ 2^RA (RA is the set of architecture resources), wt ∈ N (the sum of a reconfiguration time value and a WCET value) and pp ∈ N (a power peak). The costs associated with executing states are the values associated with their corresponding implementations. For idle and wait states, rs = ∅, wt = 0, pp = 0. Figure 7 gives the complete model of task A.
Global System Behavior Model
where q(Id) denotes an arbitrary state of automaton Id.
Global Costs.
The costs defined locally in each task execution model need to be combined into global costs. A system state q is a composition of local states (denoted by $q_1, \ldots, q_n$). We define its cost from the local ones as follows:
-used resources: union of the values for local states, that is, $rs(q) = \bigcup_{1 \le i \le n} rs(q_i)$;
-worst-case execution time: this indicates how much time the system takes at most in the current state. It is defined as the minimal WCET of all executing tasks in this state, that is, $wt(q) = \min\{wt(q_i) \mid wt(q_i) \neq 0,\ 1 \le i \le n\}$; otherwise, if no task is executing in the state, that is, $\forall\, 1 \le i \le n,\ wt(q_i) = 0$, then $wt(q) = 0$;
-power peak: sum of the values for local states, that is, $pp(q) = \sum_{1 \le i \le n} pp(q_i)$;
-worst-case energy consumption: the product of the worst-case execution time and power peak of the system state, that is, $we(q) = pp(q) \cdot wt(q)$.
Now, we need to define costs associated with paths to capture the characteristics of system execution behaviors. Given path $p = q_i \to q_{i+1} \to \cdots \to q_{i+k}$, and costs associated with system states, we define costs on path p as follows:
-WCET: sum of the WCETs of the states on the path, that is, $wt(p) = \sum_{j=i}^{i+k} wt(q_j)$;
-power peak: maximum over the states along the path, that is, $pp(p) = \max_{i \le j \le i+k} pp(q_j)$;
-worst-case energy consumption: the sum of the worst-case energy consumptions of the states along the path, that is, $we(p) = \sum_{j=i}^{i+k} we(q_j)$.
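The state and path cost definitions above can be made concrete with a short sketch. The class name LocalCost, the function names, and the numeric values are hypothetical; a system state is represented simply as a tuple of local cost tuples (rs, wt, pp).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalCost:
    rs: frozenset  # tiles used by the local state
    wt: int        # reconfiguration time plus WCET (0 when idle or waiting)
    pp: int        # power peak


IDLE = LocalCost(frozenset(), 0, 0)


def rs(q):
    """Used resources: union of the local resource sets."""
    return frozenset().union(*(c.rs for c in q))


def wt(q):
    """WCET of a state: minimum over executing tasks, 0 if none executes."""
    active = [c.wt for c in q if c.wt != 0]
    return min(active) if active else 0


def pp(q):
    """Power peak of a state: sum of the local power peaks."""
    return sum(c.pp for c in q)


def we(q):
    """Worst-case energy of a state: power peak times WCET."""
    return pp(q) * wt(q)


def path_wt(path):
    return sum(wt(q) for q in path)


def path_pp(path):
    return max(pp(q) for q in path)


def path_we(path):
    return sum(we(q) for q in path)


# Hypothetical two-state path of a three-task system.
q1 = (LocalCost(frozenset({"A1"}), 5, 100),
      LocalCost(frozenset({"A3", "A4"}), 3, 150),
      IDLE)
q2 = (LocalCost(frozenset({"A1"}), 2, 100), IDLE, IDLE)
print(wt(q1), pp(q1), we(q1))                                  # 3 250 750
print(path_wt([q1, q2]), path_pp([q1, q2]), path_we([q1, q2]))  # 5 250 950
```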
System Objectives
Based on the formal model presented earlier, we formalize the reconfiguration policies of Section 3.5. The two types of system objectives, logical and optimal, are described in terms of the states and costs defined on the states or paths of the model. Logical control objectives. For any system state q, we want to enforce the following:
(1) exclusive uses of reconfigurable tiles by tasks: $\forall q_i, q_j \in q,\ i \neq j,\ rs(q_i) \cap rs(q_j) = \emptyset$;
(2) dual accesses to shared memory, that is, at most two tasks access at the same time, where $X_i$ represents the set of executing states of the corresponding task;
(3.a) switch tile $A_i$ to sleep when executing no task: $\nexists q_j \in q,\ A_i \in rs(q_j) \Rightarrow act_i = false$;
(3.b) switch tile $A_i$ to active when executing task(s): $\exists q_j \in q,\ A_i \in rs(q_j) \Rightarrow act_i = true$;
(4) reachability: $Q_f$ is always reachable;
(5) battery-level constrained power peak (given threshold values $P_0$, $P_1$, $P_2$): $pp(q) < P_0$ (resp. $P_1$ and $P_2$) when battery level is high (resp. medium and low).
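As an illustration, the state-based objectives (1), (2), (3) and (5) can be read as predicates over a system state. The sketch below checks them on explicit data; it is not how BZR encodes them, and the function names, the threshold values, and the representation of a state as a list of per-task tile sets are assumptions made for the example.

```python
def exclusive_tiles(local_rs):
    """Objective (1): no tile is used by two tasks at the same time."""
    seen = set()
    for tiles in local_rs:
        if seen & set(tiles):
            return False
        seen |= set(tiles)
    return True


def memory_access_ok(executing_flags, limit=2):
    """Objective (2): at most `limit` tasks execute (and access memory) at once."""
    return sum(executing_flags) <= limit


def tile_modes_ok(local_rs, act):
    """Objectives (3.a) and (3.b): a tile is active exactly when some task uses it."""
    used = set().union(*map(set, local_rs))
    return all(act[tile] == (tile in used) for tile in act)


def power_ok(pp, battery, thresholds):
    """Objective (5): the power peak stays strictly below the battery-level threshold."""
    return pp < thresholds[battery]


# Hypothetical state: task A on tiles A3 and A4, task B idle, task C on tile A1.
local_rs = [{"A3", "A4"}, set(), {"A1"}]
act = {"A1": True, "A2": False, "A3": True, "A4": True}
thresholds = {"high": 500, "medium": 400, "low": 300}  # hypothetical P0, P1, P2
print(exclusive_tiles(local_rs), tile_modes_ok(local_rs, act),
      power_ok(250, "low", thresholds))
```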
Optimal control objectives. They can be classified into two types: one-step optimal and optimal control on path objectives. We use pseudo-functions max and min in the following to represent the maximization and minimization objectives, respectively.
One-step optimal objectives. One-step optimal objectives aim to minimize or maximize costs associated with states and/or transitions in a single step [Marchand and Samaan 2000] . Objective 6 of Section 3.5 belongs to this type. Optimal control on path objectives. They aim to drive the system from the current state to the target states Q f at the best cost [Dumitrescu et al. 2010] , as in items 7 and 8.
(7) minimize remaining WCET wt from state q: min(wt, q, Q f ); (8) minimize remaining energy consumption we from q: min(we, q, Q f ).
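To give an intuition for the on-path objectives, the following sketch computes, for every state, the minimal sum of state costs along a path to a target state. It is a simplification: it assumes that all transitions are controllable and uses a plain shortest-path search, whereas DCS must also account for uncontrollable events. The function name, the graph, and the costs are hypothetical.

```python
import heapq


def min_remaining_cost(succ, cost, targets):
    """Return best[q]: minimal sum of state costs on a path from q to a target.

    succ maps a state to its successors; cost maps a state to a
    non-negative cost (e.g. wt or we). Dijkstra's algorithm is run
    backward from the target states.
    """
    pred = {}
    for q, nexts in succ.items():
        for n in nexts:
            pred.setdefault(n, []).append(q)
    best = {t: cost[t] for t in targets}
    heap = [(c, t) for t, c in best.items()]
    heapq.heapify(heap)
    while heap:
        c, q = heapq.heappop(heap)
        if c > best.get(q, float("inf")):
            continue  # stale heap entry
        for p in pred.get(q, []):
            candidate = c + cost[p]
            if candidate < best.get(p, float("inf")):
                best[p] = candidate
                heapq.heappush(heap, (candidate, p))
    return best


# Hypothetical model: from s0, implementation a (cost 5) or b (cost 3) leads to T.
succ = {"s0": ["a", "b"], "a": ["T"], "b": ["T"], "T": []}
cost = {"s0": 0, "a": 5, "b": 3, "T": 0}
best = min_remaining_cost(succ, cost, {"T"})
print(best["s0"])  # 3
```

Under these assumptions, a controller would pick in each state a successor with the smallest remaining cost.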
These models can be encoded in BZR to generate automatically a controller satisfying the defined system objectives. The BZR compiler also allows the designers to simulate their designed models, which will be shown in Section 6.1. Implementing such models in two real case studies will be presented in Section 7.
DESIGN FLOW
In this section, we present our design flow for self-adaptive embedded systems from their system specifications toward final system implementations on reconfigurable architectures (see Figure 8) , in which the upper branch, that is, controller generation and simulation by BZR, deals with system reconfiguration management. It models the system reconfiguration control problem by BZR models, and performs the BZR compilation to derive automatically a controller in C code. Along with the generated C code of the controller, BZR also generates some other executable C codes to allow the users to use an associated simulator to perform simulations. The other branch deals with the hardware implementation, which selects and organizes the hardware components according to system specification to realize the system functionality. The final system implementation is derived by integrating the generated controller on the hardware implementation, that is, by the controller integration process. To be more specific, the final implementation implements the generated controller as a software task running on a soft core (i.e., A0) of the hardware implementation.
In the article, we have focused on the modeling of the reconfiguration control problem (as illustrated in Section 3) by BZR-style models (as in Section 5). In the rest of this section, we describe briefly the controller generation and simulation, which includes the description of the generated C code of the controller, controller integration, and a typical experimental setup that describes a typical hardware implementation.
Controller Generation and Simulation
As shown in Figure 8, the BZR compiler takes the BZR models as input and produces a controller (in C code) satisfying the defined system objectives. This code is synthesized into executable C code, which can be compiled for the embedded processor on the target hardware architecture. The C code consists of two functions: a reset function sys_reset, which initializes the system state variables, and a step function sys_step, which performs system state transitions according to the values of the uncontrollable system inputs and states, and the computed values of the controllable variables. Two additional C files, named main.c and main.h, are generated by the compiler for simulation purposes. All the generated C code can be fed to the graphical display tool sim2chro (from the Verimag research center) associated with BZR to perform simulations of the controlled system. This enables the designers to validate and adjust their designs at an early stage, before going to the final system implementation. Figure 9 shows a simulation scenario of the models in Section 5, in which the 5 logical control objectives of Section 5.5 are illustrated (see Table I for the implementation characteristics of the tasks). At instant 3, labeled 1 in the figure, variables a_onA3 and a_onA4 become true, which implies that the second implementation of A, which uses tiles A3 and A4, is chosen by the manager. At the same instant, tiles A3 and A4 are switched to the active mode, that is, act3 and act4 become true, which corresponds to objective (3.b). At instant 9, as shown by label 2 in the figure, task C finishes its execution by releasing tiles A3 and A4, that is, c_onA3 and c_onA4 become false. At the same instant, tiles A3 and A4 are switched to the sleep mode, that is, act3 and act4 become false. This corresponds to objective (3.a). As shown by label 3, the system power peak pp is always less than 300, even though the battery level is high. This is because, first, the tasks cannot change their implementation once executed, and second, the down and up events are uncontrollable. The power peak value is thus always kept under 300 to avoid the system going to an invalid state in which a task uses an implementation with a power peak bigger than the value that the lower level allows, that is, 300, and the battery level goes low before the task finishes. The exclusive usage of all the tiles (i.e., objective (1)) can also be seen from the figure; for example, for tile A1, the variables t_onA1, t ∈ {a, b, c, d}, do not have value true at the same time during the simulation. The variable num_active_func, representing the number of active tasks, is always 1 during the simulation, which means that objective (2) is met. Objective (4) is also met, as variable target, which represents whether the end state T is reached, becomes true at instant 15.
Controller Integration
With the C code of the controller generated by BZR and described in Section 6.1, we can integrate it (represented by box Controller in Figure 10) with the system hardware implementation by using the glue code (right-hand side box of Figure 10), which consists of two parts. The initialization part initializes the system state variables by invoking the reset function sys_reset(), and starts the processing of data (e.g., a video stream processed by FPGA-reconfigurable tiles) by invoking processing_start(). The second part is an infinite loop, which performs the following steps:
(1) processing_control() monitors the data processing and checks the timing or conditions to be respected before the reconfiguration controller can be invoked; for example, it waits for the arrival of a new frame of type I or simply waits 10 ms;
(2) get_yuc() collects the uncontrollable input values from the running system;
(3) sys_step takes as input the values of the uncontrollable variables (denoted by Y_uc) and the system state variables (denoted by X), and computes the values of the defined controllable variables and, consequently, the new state variables (X);
(4) configure_hw_sw(X) performs the reconfiguration by interpreting the computed values of the output variables as system (reconfiguration) actions: it loads the right bitstreams from a remote server or from a Flash memory and invokes the ICAP driver to execute the FPGA reconfiguration.
Finally, this written code, together with the C code of the controller generated by BZR, is deployed on the CPU managing the reconfigurable hardware, for example, a MicroBlaze.
Fig. 11. Global structure of the implementation.
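The structure of the glue code can be summarized by the following sketch. The real glue code and the generated controller are C code; Python is used here only to show the control flow. The ToyController class, its transition rule, and the run function are hypothetical stand-ins: reset and step play the roles of the generated reset and step functions, the input list stands in for the repeated collection of uncontrollable inputs, and the configure callback stands in for the hardware reconfiguration step.

```python
class ToyController:
    """Hypothetical stand-in for the generated reset and step functions."""

    def reset(self):
        # Initialize the system state variables (X).
        self.state = "idle"

    def step(self, y_uc):
        # Compute the new state from the uncontrollable inputs (Y_uc)
        # and return the output values to be interpreted as actions.
        if y_uc["req"] and self.state == "idle":
            self.state = "run"
        elif y_uc["end"] and self.state == "run":
            self.state = "idle"
        return {"load_bitstream": self.state == "run"}


def run(controller, inputs, configure):
    """Initialization part followed by the control loop.

    `inputs` is a finite iterable here so that the sketch terminates;
    on the target, the loop is infinite and each iteration first waits
    for the invocation condition and then samples the inputs.
    """
    controller.reset()
    for y_uc in inputs:
        outputs = controller.step(y_uc)
        configure(outputs)  # e.g. load bitstreams and trigger reconfiguration


log = []
run(
    ToyController(),
    [
        {"req": True, "end": False},
        {"req": False, "end": False},
        {"req": False, "end": True},
    ],
    log.append,
)
print([entry["load_bitstream"] for entry in log])  # [True, True, False]
```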
A Typical Experimental Setup
We consider an ML605 board from Xilinx as our hardware execution platform. It includes a Virtex-6 FPGA (XC6VLX240T) and several I/O interfaces, such as switches, buttons, a Compact Flash reader, and an external 512MB DDR3 memory. An Avnet extension card (DVI I/O FMC Module) with 2 HDMI connectors (In and Out) has been plugged into the platform so that it can receive and send video streams through the connectors. Figure 11 illustrates the global structure of our implementation. We have divided the FPGA surface into two regions: a static region and a reconfigurable region. Nine independent reconfigurable tiles are specified in the reconfigurable region. The reconfigurable tiles are in charge of the executions of reconfigurable tasks. The MicroBlaze (MB) is synthesized on the static region of the FPGA (like A0 in Figure 1(a)). It executes two main system tasks: the computed controller and the management of the configuration bitstreams. The latter task involves the control of related peripherals (i.e., Compact Flash memory, I/O interrupts, DDR3, ICAP) through corresponding implemented controllers. The external DDR3 memory is used to buffer the input data (for example, frame pixel data of video streams) and to store the software executable, typically the computed controller, to be launched by the MB. We use a Compact Flash card to store the bitstreams of the different reconfigurable task implementations on each reconfigurable tile. The C code of the computed controller is deployed on the MB as an infinite loop. It is invoked whenever the MB is interrupted. Two additional interrupt controllers (GPIO switches and GPIO buttons) are added for the platform to generate interrupts. They monitor the states of the buttons and switches, and generate interrupts when these states change. Once the controller is invoked, it reads the system states and computes a new configuration for the nine reconfigurable tiles. The MB then selects the appropriate bitstreams from the Compact Flash card, and sends them to the ICAP to reconfigure the associated reconfigurable tiles.
Fig. 12. The video processing system case study. Each processed image is divided into 9 areas for processing, with those covered by grids called corner areas, and the rest called cross areas.
CASE STUDIES
We describe two experimental case studies to demonstrate the previous control models on real FPGAs and focus on the modeling and controller generation aspect.
Case Study I: A Video Processing System
7.1.1. Case Study Description. We consider a video processing system to be implemented on an FPGA board, so that the partial reconfigurations of an FPGA controlled by a synthesized controller can be tested and visualized. The processing system (see Figure 12) consists of a camera that captures images to be processed on the FPGA, a dispatcher that feeds 9 reconfigurable tiles, a compositor aggregating pixels, and a screen displaying the processed images. Each captured image is divided into 9 areas, which are processed in parallel by 9 processing elements dynamically configured in the 9 tiles (as we had four in Figure 1(a) ). In this way, when a tile is reconfigured, one can see it on the screen. We consider three filtering algorithms (red, green, and blue ones) that can be implemented on each reconfigurable tile to process images. When configured to process the same image, they have different performance values regarding some characteristics such as power peak and execution time. In the study, we suppose that the power peaks of each tile for running the red, blue, and green filters are 3, 2, and 1, respectively.
We then introduce events that will induce state transitions and thus reconfigurations. First, the processing system can work in two different modes, high and low, controlled by the user through a switch on the platform. The user can also demand the use of the red filters to process the four corner areas of images by means of another switch. Apart from the user demands, the system also needs to respect the following three rules: (1) the four corner areas of the images to be displayed are of the same color, the five cross areas are of the same color, and the color of the four corner areas is different from the color of the five cross areas; (2) the global power peak of the platform is bounded by 30 (resp., 20) in the high (resp., low) mode; and (3) the power peak of the next state is minimized. A runtime manager is thus required to configure each reconfigurable tile of the FPGA with one of the three algorithms to filter images in a way that satisfies the aforementioned requirements.
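The effect of these rules can be checked by hand on this small example. The sketch below enumerates the admissible color pairs by brute force; it is only an illustration of the requirements, not the synthesized controller. The function name choose_filters is hypothetical, and the power bound is assumed to be inclusive.

```python
from itertools import permutations

# Power peak of one tile running each filter, as given in the case study.
POWER = {"R": 3, "B": 2, "G": 1}


def choose_filters(high_mode, red_corners):
    """Return (corner color, cross color, global power peak) with minimal peak.

    Distinct colors for corner and cross areas are enforced by enumerating
    ordered pairs of different colors. The bound (30 or 20) is assumed to
    be inclusive.
    """
    bound = 30 if high_mode else 20
    best = None
    for corner, cross in permutations(POWER, 2):
        if red_corners and corner != "R":
            continue  # the user demands red filters on the corner areas
        peak = 4 * POWER[corner] + 5 * POWER[cross]  # 4 corner + 5 cross tiles
        if peak > bound:
            continue
        if best is None or peak < best[2]:
            best = (corner, cross, peak)
    return best


print(choose_filters(high_mode=False, red_corners=False))  # ('B', 'G', 13)
print(choose_filters(high_mode=False, red_corners=True))   # ('R', 'G', 17)
```

With red corners, the only alternative pair (red corners, blue cross areas) has a power peak of 22, which exceeds the bound of the low mode.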
7.1.2. System Modeling and Controller Generation. We model the system reconfiguration behavior by using synchronous parallel automata, as in Section 5. DCS is then performed to generate a controller by using BZR. Once the system gets started (modeled as the emission of event s), the controller should decide on the system initial state and configure the nine reconfigurable tiles of the FPGA accordingly. The behaviors of the two switches, denoted by ModeSwitch and CornerColorSwitch, are captured by two Boolean variables ms and gr, respectively, where value true means that the switch is on. The mode behavior of the system is captured by a three-state automaton (Figure 13(a)), with Boolean input ms capturing ModeSwitch, and Boolean output h representing whether it is in mode high. Initially in idle state I, once it is started by s, it goes to either High or Low depending on ms.
As the colors of the four corner areas are required to be the same, they always need the same filtering algorithm. We thus use one single automaton (see Figure 13(b)) to model their choices among the three filtering algorithms. Boolean inputs s and gr represent, respectively, whether the system gets started or not and whether the user has switched on or off the corner color switch. Outputs fc ∈ {cor R, cor B, cor G, cor I} and w ∈ {12, 8, 4, 0} represent, respectively, the current state and the weight associated with the state. At the beginning, it is in state I. Once the system gets started, that is, event s is received, it goes to state R, G, or B, meaning that the red, green, or blue filtering algorithm is used for processing the four corner areas of images, depending on the values of controllable variables c1, c2, c3. As running the red filter in a reconfigurable tile has cost 3, and R represents that all the four corner areas run the red filter, we associate state R with cost 12. The same applies to the costs of states G and B. The automaton goes to state R upon the receipt of event gr (i.e., the user switches on CornerColorSwitch), when it is in states G or B. The rest of the transitions (e.g., between G and B) are managed by the controller by evaluating the values of controllable variables c1, c2, c3 according to system requirements.
The modeling for choosing filters for processing the five cross areas is done similarly. The main difference is that the user now has no control over the usage of some filter for processing the five cross areas, that is, the choice among the filters is made by the controller through controllable variables c1, c2, c3. Figure 13(c) shows the model.
Finally, all the aforementioned models are composed to derive the global system behavior. We then encode them and the control rules in BZR and employ BZR to automatically synthesize a controller satisfying the control rules. It generated the C code of the controller (with overall size 77.8KB) within 5s (see Table II).
Case Study II: A Smart Camera Object Detection System
7.2.1. Case Study Description. We have used an advanced industrial conveyor simulator [Bévan et al. 2011 ] and a use case in which parcels can be conveyed from one location to another. Compared with the previous case, the camera is disconnected and the HDMI output of the PC running the simulator is connected to the board video input instead.
The object detection system detects the moving objects on the conveyor belts; characterizes the moving objects in terms of speed, size, color, and moving direction; and makes task implementation choice decisions according to the characteristics of the objects. Figure 14 describes how it works. A camera captures the video frames (Acquisition task) and sends them to the detection algorithm, which is implemented on an FPGA (represented by the gray rectangle). The detection algorithm is specified as a dataflow application in which circles represent tasks and arrows represent the communication channels. Numbers are labelled on channels to represent the corresponding numbers of input and output data tokens. Typically, a data token is one pixel or an integer. Rectangles represent buffers with numbers denoted above to represent buffer sizes. Each video frame is of N × M = 1280 × 720 pixels, and each pixel is 32b. After acquisition, the frame is duplicated and sent to tasks Cleaning and Filtering. Cleaning first applies erosion and dilation filters, then compares pixel-by-pixel the current frame with the previous one. Then, task Labelization identifies the possible object movements according to the comparison results of pixel values and gets the coordinates of the object rectangle: top-left (x, y) position, height (h), and width (w). Task Filtering filters the frame before task OSD (On Screen Display) can be applied.
With the results of task Labelization, the four following tasks (denoted by dotted circles) are used to compute area, direction, speed, and acceleration of moving objects:
-Task Size/Area multiplies the resulting height and width to get the object area. -Task Direction computes direction by comparing current and previous positions. -Task Speed computes the speed of objects by using the two previous positions. -Task Acceleration computes the acceleration by using previous speed and positions.
The task Classify takes the analysis results of the four preceding tasks, and classifies them accordingly. The analysis results of tasks speed, size, and acceleration are classified into one of the three levels: low, medium, and high. The result of direction is classified into one of the four directions: north, west, east, and south. As a result, task Classify produces three events esz, esp, eac to represent the categories (i.e., high, medium, or low) of the size, speed, and acceleration of the moving object, and event edi to represent its direction: north, west, east, or south. Task OSD (On Screen Display) displays a rectangle surrounding the moving object, with input data from the two branches.
In this example, we consider a reconfigurable architecture composed of 9 tiles that can execute the four tasks (size, speed, direction, acceleration) with three configurations with regard to different precisions (QoS). The three configurations, high (H), medium (M), and low (L), require 3, 2, and 1 tiles, respectively. It means that 4 out of 9 tiles are used if all tasks are running with a low resolution, but the four tasks cannot all run simultaneously with a high resolution. In particular, we suppose that, depending on the moving direction of the detected object, the four reconfigurable tasks are given corresponding preferences (modeled by weights) to use high QoS implementations. We consider the following reconfiguration constraints: the number of available tiles is fixed at 9; if no object is detected, low-precision implementations are used for all tasks; if speed is high, there is no need to use high precision for size; the overall QoS is optimized through the weighted function $QoS = \sum_i w_i \cdot QoS(t_i)$, where the weight values $w_i$ for tasks $t_i$ depend on the moving direction of the detected object; the configuration bitstream imposes adjacent tiles (2 or 3) in the vertical or horizontal direction for the medium and high resolutions.
Fig. 15. The task implementation model task_impl, and the weight evaluation model weight_eval.
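The constraints of this case study can be explored with a brute-force sketch that selects a precision level per task. It is only an illustration: it ignores the tile adjacency constraint, and the QoS values per level, the weights, and the function name best_configuration are hypothetical.

```python
from itertools import product

TILES = {"L": 1, "M": 2, "H": 3}  # tiles required per precision level
QOS = {"L": 1, "M": 2, "H": 3}    # hypothetical QoS value per level
TASKS = ("size", "speed", "direction", "acceleration")


def best_configuration(weights, speed_high=False, object_detected=True,
                       total_tiles=9):
    """Pick one level per task, maximizing the weighted QoS sum.

    Tile adjacency is not modeled in this sketch.
    """
    if not object_detected:
        # No object detected: low precision for all tasks.
        return dict.fromkeys(TASKS, "L")
    best, best_score = None, -1
    for levels in product("LMH", repeat=len(TASKS)):
        cfg = dict(zip(TASKS, levels))
        if sum(TILES[level] for level in levels) > total_tiles:
            continue  # exceeds the 9 available tiles
        if speed_high and cfg["size"] == "H":
            continue  # high speed: no high precision for size
        score = sum(weights[t] * QOS[cfg[t]] for t in TASKS)
        if score > best_score:
            best, best_score = cfg, score
    return best


# Hypothetical weights favoring speed, then direction.
weights = {"size": 1, "speed": 3, "direction": 2, "acceleration": 1}
cfg = best_configuration(weights, speed_high=True)
print(cfg, sum(TILES[level] for level in cfg.values()))
```

With these weights, speed and direction obtain the high-precision implementation, and the remaining tiles are shared between size and acceleration.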
7.2.2. System Modeling and Controller Generation. Each task has three implementations corresponding to three precision levels. Figure 15 (a) models the implementation model. The choices between them are controllable by variables c1, c2, c3. Outputs lp, hp represent which implementation is used, and integer outputs qos and t num represent, respectively, the QoS and number of used tiles of the current implementation.
Depending on the moving direction of the detected object, the four reconfigurable tasks are given corresponding preferences (modeled by weights) to use high QoS implementations. Figure 15(b) models the weight evaluation model. It has four states, E, W, S, and N, corresponding to the four directions. The Boolean inputs east, south are used to represent the moving directions. The integer outputs w_size, and so on, represent the correspondingly evaluated weights for the four tasks. We then encode the models and control objectives in BZR and employ BZR to automatically synthesize a controller satisfying the control rules. By feeding the resulting program to the BZR compiler, we obtained the C code of the controller (with overall size 298.4KB) within 25s (see Table II). In both case studies, the partial bitstreams are relatively small (about 50KB) and can be configured in less than 0.5ms with a 100MHz ICAP clock. The maximum camera resolution is 1080x1920 and the frame rate is 60fps, which means a period of about 16.7ms. The complete bitstream of a project is about 1.6MB.
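These orders of magnitude can be cross-checked with a short calculation. The sketch assumes that the ICAP transfers 4 bytes per clock cycle (a 32-bit data port), which is not stated in the text, and it ignores any software and memory-access overhead; the function names are hypothetical.

```python
def icap_reconfig_time_ms(bitstream_bytes, clock_hz=100e6, bytes_per_cycle=4):
    """Ideal transfer time of a bitstream through the ICAP, in milliseconds.

    bytes_per_cycle=4 is an assumption (32-bit ICAP data port).
    """
    return bitstream_bytes / (clock_hz * bytes_per_cycle) * 1e3


def frame_period_ms(fps):
    """Duration of one video frame, in milliseconds."""
    return 1e3 / fps


partial = icap_reconfig_time_ms(50_000)  # about 50KB partial bitstream
period = frame_period_ms(60)
print(round(partial, 3), round(period, 2))
```

Under these assumptions, the ideal transfer time of a partial bitstream is well below both the 0.5ms figure and the frame period.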
DISCUSSION ON SCALABILITY
The major concern of our approach is the scalability issue, which is common to other formal techniques such as model checking. We have carried out extensive experiments to evaluate the scalability of our framework. Table III shows our experimental results to compute the controllers. It gives the time costs for different DCS operations corresponding to different system objectives with regard to different system models and state space sizes. The state space size of each system model is computed by simply multiplying the state space sizes of its composed automata. The size of the synthesized controllers varies from 50KB (objective (2) on model 4:(2,3,2,3)) to 28MB (objective (7) on model 6:(1^6)). We have started our experiments from the task graph of Figure 1(b). We then refine B to 3 tasks to increase the system model to 6 tasks, and finally refine C to 3 tasks as well to obtain an 8-task model, as shown in Figure 16. We use the notation n:(m_1, ..., m_n) to represent the models, where n denotes the number of tasks, and m_i the number of possible implementations of task i. In addition, we use m^k to represent k consecutive ms. For example, 4:(4^4) denotes 4:(4, 4, 4, 4). All experiments are performed on a computer with an Intel(R) Core(TM)2 Duo CPU of 2.33GHz and a 3.8GB main memory.
In our experiments, the DCS of invariance constraints, that is, objectives (1) to (3) and (5), is applied directly to the original system model. On the basis of the resulting controller, the DCS of the optimal and reachability objectives is then performed. The synthesis times for the invariance and reachability objectives appear promising, while those for the optimal objectives grow explosively. An interesting observation is that the time cost does not always increase as the state space size grows. System models consisting of more tasks but fewer possible implementations can have shorter synthesis times; for example, DCS operations for the 6:(1^6) model take less time than those for the 4:(4^4) model. These observations can be explained as follows by the nature of the DCS algorithms corresponding to different control objectives.
The invariance objectives aim to make a subset E of system states invariant. The synthesis algorithm explores the system states from the initial state(s) and their transitions to return a controllable system such that the controllable transitions (1) leading to states that are not in E are inhibited, and (2) leading to states from which a sequence of uncontrollable transitions can lead to states not in E are inhibited. The computational cost of this DCS operation thus depends not only on the system state space but also on the size of the target state set E, as well as on the number and the guard type of the transitions associated with these states. Since more implementations of tasks mean more choices/transitions to explore, this explains why the computational cost with regard to these objectives can be lower for more tasks with fewer implementations.
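The principle of invariance synthesis can be illustrated on an explicit, enumerated model with the textbook fixed-point computation below. This is a sketch of the general idea only, not the algorithm implemented by the tools used in this work. It assumes a synchronous step in which an uncontrollable input arrives and the controller then picks a controllable choice; the function name and the toy model are hypothetical.

```python
def synthesize_invariance(trans, safe):
    """Return (winning states, controller) for the invariance of `safe`.

    trans[q][u][c] is the successor of state q under uncontrollable
    input u and controllable choice c. A state is kept only if, for
    every uncontrollable input, some choice leads to a kept state.
    """
    win = set(safe)
    changed = True
    while changed:
        changed = False
        for q in list(win):
            for choices in trans[q].values():
                if not any(nxt in win for nxt in choices.values()):
                    win.discard(q)  # some input cannot be answered safely
                    changed = True
                    break
    # The controller keeps only the choices that stay in the winning set.
    controller = {
        q: {u: [c for c, nxt in choices.items() if nxt in win]
            for u, choices in trans[q].items()}
        for q in win
    }
    return win, controller


# Toy model: "risky" is in the safe set, but every move from it leads to "bad".
trans = {
    "ok": {"tick": {"c0": "ok", "c1": "risky"}},
    "risky": {"tick": {"c0": "bad"}, "down": {"c0": "bad"}},
    "bad": {"tick": {"c0": "bad"}},
}
win, controller = synthesize_invariance(trans, {"ok", "risky"})
print(win, controller["ok"]["tick"])  # {'ok'} ['c0']
```

In the toy model, the choice c1 is inhibited in state ok because it leads to a state from which the safe set cannot be maintained.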
The reachability objective aims to make a subset E of system states always reachable from the current states. Its corresponding algorithm thus explores the system states from the initial state(s) and their transitions to return a controllable system such that the controllable transitions entering states from which E is not reachable are disabled. Compared to the synthesis algorithm for invariance objectives, it generally takes more time as, first, it needs to explore states and transitions to search for E and, second, it needs to explore the remaining states and their transitions to compute the values of the controllable variables. This explains why the time costs with regard to the reachability objective are higher than those corresponding to invariance objectives.
The optimal control objectives aim to optimize the costs or weights defined on the states and/or transitions of system automaton models. Their time costs get much higher compared to the two aforementioned objective types, as their DCS algorithms perform not only the state and transition exploration, but also cost computations and comparisons. To improve the scalability of our approach and address systems of more tasks, especially when optimal control objectives are enforced, one can employ more powerful PCs and spend more time on the one hand, or improve the efficiency of employed synthesis tools and employ modular DCS presented in Delaval et al. [2010] on the other hand. The main idea of the modular DCS is to break the system into subsystems by structuring task graphs into hierarchical subgraphs, and perform local DCS for each subsystem before performing the global DCS for the whole system.
CONCLUSION AND PERSPECTIVES
We described the management of dynamically partially reconfigurable FPGAs, for which formal guarantees are given on the behavior of the reconfigurable system in terms of reachable state space. Our contribution consists of a tool-supported method to design safe controllers for dynamically reconfigurable architectures, and its experimental validation on two case studies using an FPGA board. Our approach is to formalize the behaviors of the DPR FPGAs as automata, following a modeling methodology, distinguishing the different levels of hardware architecture, task implementation, and application software. We formulate the reconfiguration policy as properties on the state space of the model, and the reconfiguration control as a Discrete Controller Synthesis problem. The BZR language and compiler is used to implement the models, solve the control problems, and generate executable C code.
Concerning architecture aspects, this formal approach allows the design of complex self-adaptive SoCs that are correct by construction. This point is crucial to avoid intractable scenario-based simulations. Our formal approach paves the way to the safe use of future reconfigurable architectures with efficient power gating capabilities (e.g., MTJ-based FPGA [Suzuki et al. 2013]), in which reconfiguration can be exploited to finely adjust power consumption to application requirements. Perspectives are in different directions. Concerning formal modeling and control, the exploitation of modular compilation and DCS can improve the scalability of the approach on large systems, provided that they can be structured hierarchically. The extension of DCS to logico-numeric aspects is being integrated in BZR and can support some quantitative aspects of systems, and more elaborate control objectives. Finally, the automatic generation of a hardware implementation corresponding to the controllers built from our approach is an interesting perspective. For this purpose, the back end of the BZR compiler needs to be extended to generate VHDL programs and FPGA bitstreams.
