One of the major challenges of computer system design is the management and conservation of energy while satisfying QoS requirements. Recently, Dynamic Voltage and Frequency Scaling (DVFS) has been integrated into various embedded processors as a means to increase battery life without affecting the responsiveness of tasks. This paper proposes an enhancement of the I-codesign methodology [1] that optimizes the energy consumption of the designed system. We propose an energy-aware real-time scheduling algorithm that combines a deferrable server for the scheduling of aperiodic tasks with DVFS. Simulation results demonstrate a decrease in energy consumption compared to the previously published work.
INTRODUCTION
Energy consumption is one of the major limiting factors of battery-powered real-time systems. In this context, optimizing energy consumption without affecting performance while satisfying real-time constraints is of major interest. To meet the timing constraints of the system, a scheduler must coordinate a set of tasks in different states (idle, blocked, running) and ask the run-time system to allocate the resources required for their execution. Several objectives must be considered in the design of a scheduling algorithm: (i) guarantee that tasks with hard timing constraints always meet their deadlines, (ii) attain a high degree of schedulable utilization for hard-deadline tasks, and (iii) provide a fast average response time for tasks with soft deadlines (aperiodic tasks).
To obtain an energy-efficient design, the Dynamic Voltage and Frequency Scaling (DVFS) feature is widely adopted in modern processors (Horowitz et al., 1994). The basic idea of the DVFS strategy is to reduce a processor's operating frequency as long as the tasks' timing constraints are not violated. Indeed, the power consumption of the processor is a polynomial of the processing frequency, generally with a degree no less than 2 (Li, 2012), while the overall execution time of a task is only inversely proportional to the processing frequency. DVFS therefore makes it possible to minimize energy consumption under a given performance/timing requirement.
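As a rough illustration of why lowering the frequency pays off, assume (purely for this sketch) that dynamic power follows P(f) = k f^α with α ≥ 2 and that a task needs C processor cycles. The symbols k, α and C are illustrative and are not parameters taken from the cited works.

```latex
\begin{align*}
t(f) &= \frac{C}{f}, \\
E(f) &= P(f)\, t(f) = k f^{\alpha} \cdot \frac{C}{f} = k\, C\, f^{\alpha-1}, \\
\frac{E(f/2)}{E(f)} &= \left(\tfrac{1}{2}\right)^{\alpha-1} = \tfrac{1}{4}
  \quad \text{for } \alpha = 3 .
\end{align*}
```

Under these assumptions, halving the frequency doubles the execution time but cuts the energy of the task to a quarter, which is why slack can be traded for energy as long as deadlines still hold.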
In an earlier work, we proposed a methodology called I-codesign for reconfigurable co-design (Ghribi et al., 2016a). I-codesign presents an abstract model for hardware/software systems that allows early exploration of hardware/software trade-offs and evaluation of design alternatives. This model supports incremental refinement and evaluation at multiple abstraction levels. Its aim is to lead to an efficient implementation and to improve overall system performance. The entry point for I-codesign is a hardware/software specification modeled by a DAG (Directed Acyclic Graph) whose nodes are software functions. I-codesign maps this specification onto a hardware architecture that is mainly an MPSoC. It also defines new partitioning and mapping techniques for the proposed hardware/software model: a functional algorithm, followed by a constructive algorithm and finally an iterative optimization algorithm. Based on several design constraints such as inclusion/exclusion, communication costs, energy, memory, real-time feasibility and probabilistic estimations, I-codesign decides on a near-optimal placement of software functions onto the target hardware units.
In this paper, we investigate the opportunity of reducing the power consumption of systems designed according to the I-codesign methodology by introducing the DVFS feature into the proposed scheduling algorithm. We study the scheduling of a heterogeneous task set modeled and partitioned according to the I-codesign methodology. An energy-aware real-time scheduling algorithm for probabilistic heterogeneous task sets is introduced in order to improve the I-codesign power consumption and response time metrics. DVFS is applied to periodic tasks in order to reduce energy consumption without compromising the deadlines of periodic functions or the responsiveness of aperiodic functions. In order to apply the proposed scheduling algorithm, the priorities of periodic functions are redefined according to two constraints: (i) the probability on the edges connecting the function to its predecessors in the task DAG, and (ii) the function hierarchy, which refers to the level of the function in the task DAG representation. The originality of this work resides in including the probabilistic estimation of task execution not only in the mapping process but also in the scheduling algorithm. Accounting for precedence constraints through the proposed DAG hierarchy rule further improves the overall scheduling performance of the system.
The paper proceeds as follows. Section II reviews related work. Section III presents the I-codesign methodology. In Section IV, the system formalization and the notations used in this paper are developed. Section V presents the proposed algorithm. Section VI shows simulation results for the scheduling of real-time tasks, and Section VII concludes the paper.
RELATED WORK
This section reviews the main approaches for scheduling a mixture of aperiodic tasks and periodic hard real-time tasks. The easiest way to prevent aperiodic tasks from interfering with periodic hard real-time tasks is to schedule them as background tasks that execute only when no periodic task is ready for execution. Although this method guarantees the schedulability of periodic tasks, the execution of aperiodic tasks may be delayed and their response times prolonged unnecessarily. The polling server is a periodic task with a period Ts, a capacity Cs and the highest priority (Li-yong et al., 2010). At each activation, the server checks whether there are any pending aperiodic tasks; if there are, it uses its capacity to service them until either the task is finished or the capacity is depleted. However, if there is no pending aperiodic task, the server remains idle until its next activation. Consequently, an aperiodic request that arrives in the middle of the server's period is not treated until the next period, because the server is already inactive (Li-yong et al., 2010). The Priority Exchange (PE) and Deferrable Server (DS) algorithms, introduced by Strosnider et al. (1995), overcome the drawbacks associated with polling and background servicing of aperiodic requests. As with polling, the PE and DS algorithms create a periodic task (usually of high priority) for servicing aperiodic requests. However, unlike polling, these algorithms preserve the execution time allocated for aperiodic service if no aperiodic requests are pending when the server task is invoked. These algorithms can yield improved average response times for aperiodic requests because of their ability to provide immediate service for aperiodic tasks. The DS algorithm maintains its aperiodic execution time for the duration of the server's period.
Thus, aperiodic requests can be serviced at the server's high priority at any time, as long as the server's execution time for the current period has not been exhausted. At the beginning of the DS's period, the server's high-priority execution time is replenished to its full capacity. Unlike the DS algorithm, the PE algorithm preserves its high-priority execution time by exchanging it for the execution time of a lower-priority periodic task (Desokey et al., 2006). The DS algorithm can provide better aperiodic responsiveness than polling because it preserves its execution time until it is needed by an aperiodic task. The DS algorithm is also simpler to implement than the PE algorithm, because it always maintains its high-priority execution time at its original priority level and never exchanges it with lower priority levels, as the PE algorithm does. It also requires less memory than PE and has a much lower computational complexity.
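The budget behaviour that distinguishes the deferrable server from polling can be sketched in a few lines. This is a minimal discrete-time illustration, not the implementation used in this work: the class name `DeferrableServer`, the helper `serve`, and the one-unit time step are assumptions, and periodic tasks are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class DeferrableServer:
    period: int      # replenishment period of the server
    capacity: int    # budget restored at the start of each period
    budget: int = 0  # remaining budget in the current period

    def tick(self, now: int, pending: int) -> int:
        """Advance one time unit; return the aperiodic work served (0 or 1)."""
        if now % self.period == 0:
            self.budget = self.capacity  # replenish to full, no accumulation
        if pending > 0 and self.budget > 0:
            self.budget -= 1             # budget is consumed only while serving
            return 1
        return 0                         # budget is preserved when nothing is pending


def serve(server, arrivals, horizon):
    """arrivals maps arrival time -> units of aperiodic work.

    Returns the list of time units in which aperiodic work was served.
    """
    pending, served_at = 0, []
    for now in range(horizon):
        pending += arrivals.get(now, 0)
        if server.tick(now, pending):
            pending -= 1
            served_at.append(now)
    return served_at


server = DeferrableServer(period=5, capacity=2)
print(serve(server, {3: 3}, 10))  # [3, 4, 5]
```

In this run, three units of work arrive at time 3, in the middle of the first server period. Two units are served immediately from the preserved budget and the third right after the replenishment at time 5, whereas a polling server would not have started serving before time 5.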
During the past two decades, a great deal of work has been done on energy-aware scheduling for DVFS-enabled platforms. The application of DVFS algorithms to periodic task sets is a well-known research area (Tchamgoue et al., 2012; Ansari et al., 2013). However, few works in the literature focus on DVFS applied to heterogeneous task sets comprising both periodic and aperiodic tasks (Dongkun and Jihong, 2004; Shin and Kim, 2006). DVFS algorithms focus on the usage and distribution of the available slack time. The total time required by a task to run to completion, i.e. its actual execution time (aet), is typically less than, and never exceeds, its worst-case execution time (wcet). The difference between the two is the slack, which is in turn used to reduce the voltage and frequency dynamically. This paper proposes a new scheduling algorithm that uses original evaluation metrics for priority calculation together with the deferrable server. In this work, DVFS is incorporated into the scheduling process in order to dynamically redefine the scheduled element's periodicity and reduce the energy consumption.
I-CODESIGN METHODOLOGY
The goal of I-codesign is to achieve a concurrent hardware/software system design. It maps a probabilistic task model onto a hardware architecture in a manner that fulfills all the system requirements and respects the design constraints. I-codesign deals with a set of models and transformations. The main idea behind I-codesign is the use of the probabilistic task model, which embeds useful data for the mapping and for the subsequent optimization steps. The first step is the functional partitioning algorithm. It evaluates the inclusion/exclusion constraints between task functions and creates clusters depending on these constraints. Pairs of functions subject to an inclusion or an exclusion constraint are placed in the same cluster or in different clusters, respectively. Once all the inclusions and exclusions are evaluated, a feasibility analysis is performed. If every clustered function set is schedulable on one of the available processors, the schedulability test is validated. Otherwise, the functional partitioning is applied again to create new clusters with schedulable function sets. Since any inclusion/exclusion constraint is hard, the clustered functions are locked and cannot be moved any more. The second phase is the hierarchical partitioning algorithm. It clusters the remaining functions, which have no inclusion/exclusion constraints. The functions are evaluated according to the probabilities on their connecting edges, and high probability values are treated first. The available memory space is evaluated at each iteration. Once all the remaining functions are placed into clusters, a feasibility analysis is performed. If every function set in the created clusters is schedulable on one of the available processors, the schedulability test is validated. Otherwise, the hierarchical clustering is applied again to generate clusters with schedulable function sets. The last phase is the Kernighan-Lin optimization algorithm.
This step evaluates both the probability and the communication cost on the edges connecting functions by means of a gain calculation. If the gain is positive, the function is moved to another cluster, provided that its energy consumption on the other cluster is less than or equal to its energy consumption on the original cluster. Otherwise, it is left on the original cluster.
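The first of the three steps, functional partitioning, can be sketched as follows. This is a minimal illustration under stated assumptions, not the I-codesign implementation: inclusion constraints are merged with a union-find structure, unconstrained functions stay in singleton clusters, and exclusion pairs that end up in the same cluster are only reported as conflicts. The function name and the conflict-reporting behaviour are hypothetical, and the feasibility analysis is omitted.

```python
def functional_partition(functions, inclusions, exclusions):
    """Cluster functions so that every inclusion pair shares a cluster.

    Returns (clusters, conflicts), where conflicts lists the exclusion
    pairs that were forced into the same cluster by inclusion constraints.
    """
    parent = {f: f for f in functions}

    def find(f):
        # Union-find lookup with path halving.
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    # Inclusion: both functions must run on the same computing unit.
    for a, b in inclusions:
        parent[find(a)] = find(b)

    clusters = {}
    for f in functions:
        clusters.setdefault(find(f), []).append(f)

    # Exclusion: both functions must run on different computing units.
    conflicts = [(a, b) for a, b in exclusions if find(a) == find(b)]
    return list(clusters.values()), conflicts


clusters, conflicts = functional_partition(
    ["F1", "F2", "F3", "F4"],
    inclusions=[("F1", "F2")],
    exclusions=[("F2", "F3")],
)
print(clusters)   # [['F1', 'F2'], ['F3'], ['F4']]
print(conflicts)  # []
```

A non-empty conflict list would indicate contradictory hard constraints in the specification, which this sketch leaves to the designer to resolve.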
SYSTEM MODEL AND PROBLEM DEFINITION

Task Model
The software model comprises a set of tasks, each task T i being represented by a directed acyclic graph with node set V i and arc set E i , where (i) V i is a set of nodes that correspond to functions, and (ii) E i is a set of arcs that describe the connections between functions. The edges are weighted with a couple ≺ Pr,Cc ≻ where Pr is the probability of executing this edge and Cc is the communication cost of the data transfer between the two nodes connected by the edge. A task T i is a set of n periodic functions F= {F 1 , F 2 , .., F n }. Each function [...] We also define an inclusion/exclusion constraint. It requires a pair of functions and/or behaviors to be executed either on the same computing unit or on different ones. The exclusion constraint is modeled within the task representation by marking the symbol ⊂ on the function F i , which means that F i must not be executed with its predecessor on the same computing unit. The inclusion constraint is modeled by marking the symbol ⊂ on F i , which means that F i must be executed with its predecessor on the same computing unit.
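One possible in-memory representation of this task model is sketched below. The class and field names (`Edge`, `Task`, `pr`, `cc`) are assumptions made for illustration, and the DAG level is taken here to be the length of the longest path from a source node, which is one plausible reading of the function hierarchy rather than a definition given in this paper. The edge weights in the usage lines are made-up values.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    pr: float  # probability that this edge is executed
    cc: float  # communication cost of the data transfer


@dataclass
class Task:
    functions: list                            # node set V_i
    edges: list = field(default_factory=list)  # arc set E_i

    def predecessors(self, f):
        """Edges entering function f."""
        return [e for e in self.edges if e.dst == f]

    def level(self, f):
        """DAG level of f: 0 for sources, else 1 + the deepest predecessor."""
        preds = self.predecessors(f)
        if not preds:
            return 0
        return 1 + max(self.level(e.src) for e in preds)


# Illustrative values only, not taken from the paper's figures or tables.
t1 = Task(
    functions=["F1", "F2", "F3", "F4"],
    edges=[
        Edge("F1", "F2", pr=0.7, cc=2.0),
        Edge("F1", "F3", pr=0.3, cc=1.0),
        Edge("F2", "F4", pr=1.0, cc=1.5),
    ],
)
print([t1.level(f) for f in t1.functions])  # [0, 1, 1, 2]
```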
Problem Definition
I-codesign is described in detail in a previous work (Ghribi et al., 2016a). For a given task set, applying I-codesign yields an optimized mapping of the system tasks onto the hardware processing elements (PEs). The resulting mapping allows the execution of all possible reconfiguration scenarios of the designed system. It reduces inter-PE communication and guarantees the schedulability of tasks. I-codesign was developed under the assumption that all functions are periodic. In this work, we address the scheduling of heterogeneous task sets while reducing the overall energy consumption of the system.
PROPOSED ALGORITHM AND EXAMPLE
In this section we present a scheduling algorithm for heterogeneous task sets comprising periodic and aperiodic tasks. This algorithm relies on a deferrable server for aperiodic tasks. Aperiodic functions are executed at the maximum frequency and priority in order to achieve the lowest response time, whereas the utilization of periodic functions is updated according to the DVFS algorithm. Thus, we define a deferrable server T DS with a period denoted P DS and a capacity denoted C DS . For periodic functions, a priority definition is proposed as follows:
Scheduling Algorithm
The operating frequency selected is the lowest one for which the modified schedulability test succeeds. The voltage, of course, is changed to match the operating frequency. This algorithm is called by Algorithm 1 when a periodic function is specified for execution.
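The frequency selection rule can be sketched as below. The modified schedulability test itself is not reproduced in this section, so the sketch substitutes an EDF-style utilization test as a stand-in: it assumes that execution times scale inversely with frequency and that the deferrable server, which serves aperiodic functions at maximum frequency, contributes an unscaled utilization passed in as the hypothetical `reserved` parameter. The function name and signature are illustrative.

```python
def select_frequency(periodic, frequencies, reserved=0.0):
    """Return the lowest frequency that passes the utilization test.

    periodic:    list of (wcet_at_max_frequency, period) pairs
    frequencies: available operating frequencies, in any order
    reserved:    utilization kept for the server at full speed (assumption)
    Returns None if even the maximum frequency fails the test.
    """
    f_max = max(frequencies)
    utilization = sum(c / p for c, p in periodic)
    for f in sorted(frequencies):
        # Running at f stretches every periodic execution time by f_max / f.
        if reserved + utilization * (f_max / f) <= 1.0:
            return f
    return None


freqs = [0.25, 0.5, 0.75, 1.0]  # GHz, as in the example of this paper
print(select_frequency([(1, 4)], freqs, reserved=0.4))  # 0.5
```

With a server of capacity 2 and period 5 (utilization 0.4) and one periodic function of utilization 0.25, the lowest frequency that keeps the total at or below 1 in this sketch is 0.5 GHz. A different schedulability test would change the selected value but not the search structure.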
Example
We apply the proposed algorithm to task T 1 presented in Figure 3. The task is composed of a set of periodic functions F= {F 1 , F 2 , .., F 7 } and a set of aperiodic functions A= {A 1 , A 2 , A 3 }. Applying the I-codesign methodology to T 1 yields two partitioning clusters, presented in Figure 4. The available operating frequencies are f= {0.25, 0.5, 0.75, 1} GHz. We study the scheduling of cluster C1. The real-time parameters of the periodic and aperiodic functions are presented in Table I and Table II. Table I describes the aperiodic task set with arrival times and execution times. Table II describes the properties of the periodic tasks, including the deferrable server. In order to schedule the functions associated with cluster C1, the periodic queue (PQ) and the aperiodic queue (AQ) are populated according to the proposed priority definition and the arrival times. For this example, a high-priority server is created with an execution time of 2 time units and a period of 5 time units. At time = 0, the server's execution time is brought to its full capacity. Since there is no pending aperiodic function, this capacity is preserved until the first aperiodic request occurs at time = 5. Hence, F 1 is executed first, with the frequency scaled to f= 0.75 GHz. At time = 5, an aperiodic request for A 1 arrives while F 2 is at the head of the periodic queue. A 1 is serviced at the maximum frequency f= 1 GHz until time = 7. F 2 and F 3 belong to the same DAG level, hence the probabilities on the edges connecting these functions with F 1 are compared in order to determine the next function to be executed at time = 7. F 2 is serviced at a scaled frequency equal to 0.75 GHz since EdgeProba (F 2 ) > EdgeProba (F 3 ). At time = 10, the server's execution time at priority 1 is brought to its full capacity and is used to provide immediate service for A 2 at the maximum frequency f= 1 GHz.
At time = 12, F 3 is serviced at the frequency f= 0.75 GHz, since there is no periodic function with a higher edge probability at its DAG level, followed by F 5 at time = 18 at a scaled frequency of 0.5 GHz. At time = 20, F 1 is serviced at a frequency f= 0.5 GHz until time = 21, when it is preempted in order to serve A 3 immediately at the maximum frequency f= 1 GHz. At time = 23, F 1 resumes its execution at f= 0.75 GHz. Figure 5 illustrates the time-line of the schedule described above.
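The ordering of the periodic queue used in this example can be sketched as a two-key sort. The formal priority definition is not reproduced here, so the rule below is an assumption inferred from the example: functions at a lower DAG level come first, and ties within a level are broken in favour of the higher edge probability. The function name and the probability values are illustrative and are not taken from Table II.

```python
def order_periodic_queue(functions):
    """Order periodic functions for the periodic queue.

    functions: list of (name, dag_level, edge_probability) tuples.
    Assumed rule: ascending DAG level, then descending edge probability.
    """
    ranked = sorted(functions, key=lambda f: (f[1], -f[2]))
    return [name for name, _level, _proba in ranked]


# Made-up probabilities, chosen only so that F2 outranks F3 as in the example.
queue = order_periodic_queue([
    ("F3", 1, 0.4),
    ("F1", 0, 1.0),
    ("F2", 1, 0.6),
])
print(queue)  # ['F1', 'F2', 'F3']
```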
SIMULATION RESULTS
In a previous work (Ghribi et al., 2016b), we developed a co-design execution environment called SPEX. It provides a toolbox that allows the creation of a hardware/software system description according to the proposed design models and that implements the I-codesign algorithms. It includes a flexible task set generator for different scenarios and purposes. The tool places the software specification according to several design constraints, such as inclusion/exclusion parameters, the probabilistic execution of the software tasks, the memory and energy available on the hardware units, and real-time parameters. To evaluate the new scheduling algorithm, several task sets of different dimensions are generated. The generated tasks are passed through SPEX to obtain a mapping scheme for the task set. We developed a new simulation module that implements our scheduling algorithm based on the deferrable server along with the DVFS technique. The simulator populates the periodic and aperiodic queues, runs the specification according to the function characteristics, and generates estimations of the total execution time and the consumed energy. The random task sets are generated according to the I-codesign modeling for probabilistic reconfigurable task sets. These tasks are decomposed into elementary functions and then characterized with the different co-design constraints (probability, communication costs, inclusion/exclusion). After applying the I-codesign algorithms, the resulting mapping is passed through the scheduling simulator. The scheduling results are compared to those of the Earliest Deadline First (EDF) algorithm and the Rate Monotonic (RM) algorithm. Figures 6 and 7 present the performance results of our scheduling algorithm applied to different task sets along with those of EDF and RM.
The comparison between the evaluated approaches shows that the new I-codesign scheduling algorithm offers better performance results, particularly for large utilization factors and a high number of nodes in the specification DAGs. These enhancements are due to the probabilistic estimation of the communicating functions/behaviors, which places dependent tasks that are likely to be executed successively on the same PEs. Simulation results show that this contribution has two main benefits: (i) the energy consumed during the system execution is noticeably reduced and (ii) the global execution time is reduced. Another advantage of I-codesign is that its validation tests include real-time feasibility, which avoids any system failure due to a lack of resources.
CONCLUSIONS
In this work, an energy-aware real-time scheduling algorithm with Dynamic Voltage and Frequency Scaling based on the Deferrable Server has been proposed and implemented for mixed task sets. This new scheduling algorithm is developed in order to enhance the I-codesign methodology. It considers the trade-offs between energy consumption and response time. It relies on simple constraints: the DAG hierarchy and the probability of execution for periodic functions. It makes use of the DVFS technique in order to reduce the energy consumption of the system. Extensive simulation was carried out on our task sets. The results show that the proposed energy-efficient algorithm succeeds in noticeably reducing the energy consumption with no degradation in the responsiveness of aperiodic tasks.
