The Hardware (HW)/Software (SW)
Introduction
Embedded systems are widely used in fields such as industry, telecommunications, and avionics. They are typically reactive and real-time. Designing such complex systems requires high-level design tools in order to rapidly select and synthesise promising architectures. Because of hard real-time constraints, critical functions must be implemented in hardware (e.g. IP-based). It is then necessary to use a software/hardware codesign approach that minimises both design cost and time to market. Such systems are increasingly control-dominated and data-dependent for optimization purposes. This means that dynamic scheduling should, in theory, be used to cope with load variability. However, this kind of scheduling technique has drawbacks that are incompatible with embedded systems: dynamic scheduling cannot guarantee real-time constraints and requires a complex implementation. For these reasons, embedded real-time operating systems (RTOS) use fixed scheduling policies that reserve a time slot a priori for each task. The worst-case execution time (WCET) of each task must then be used to guarantee real-time constraints. When execution delays are highly variable, this can degrade system performance through architecture oversizing. The solution that emerges is therefore no longer real-time management but rather QoS management [1].
For these reasons, we propose to use a notion of QoS instead of RT during the partitioning/scheduling step. Thus we add a new category of tasks for periodic and aperiodic "soft RT" tasks. Tasks of this kind respect the RT constraints with a given probability.
QoS has often been addressed in multimedia, where it means satisfying latency and synchronization requirements. Given many task sets and a processor with multiple voltages, they search all the feasible competitive schedules with the minimal energy consumption and memory requirement, assuming that two schedules are competitive if neither outperforms the other in both energy consumption and memory requirement. However, they do not consider the possibility of resource sharing between tasks and assume that all tasks run on the processor.
Compared with this previous work, the proposed approach differs in three respects. First, we address the domain of RT HW/SW co-synthesis. Second, we treat the problem of QoS in terms of application quality and RT constraint choices. Third, we consider the possibility of sharing hardware resources and coprocessors between tasks.
The rest of the paper is organized as follows. In the next section, we present the structure of the proposed co-synthesis framework, the target architecture model, the cost function, which includes area and power models, and the basic RT scheduling assumptions. Section 3 presents the QoS model. A football player robot application is studied in section 4. We draw conclusions in section 5.
Design Flow
In this section we present the structure of our CAD tool suite for SoC design, the architecture model, the cost function, and some basic RT scheduling assumptions.
Overview
Our objective is to take the QoS aspects of an embedded system into account from the requirements specification phase onwards. Our flow is described in Figure 1.
We have opted for the task graph defined in [7] in order to use the Radha/Ratan tool to obtain internal task constraints from Input/Output system constraints.
Each task is described in a C code file. The Design Trotter framework [8] first generates a hierarchical data flow graph from which different kinds of estimation can be produced, such as the delay/area of FPGA HW components [9] or software power estimation by hierarchical combination of Control/Data Flow Graphs from [10]. Another solution consists in using qualified SW/HW IP specifications.
By combining the estimation data, the initial task graph and the designer's choices, a new file is generated. This "File.cde" includes the selected task constraints and the description of all task implementations. The "File.arch" gives the architectural parameters, such as the Vdd/clock modes, the bus protocol and so on. The final solutions selected after the partitioning/scheduling step, which is based on Simulated Annealing or Branch & Bound algorithms, are stored in the "File.imp".
The new step, which is the subject of this paper, is identified in the figure by a dashed rectangle. This step will be described in detail in section 3.
Cost function
The cost function takes into account the global area of the SoC and its energy consumption. At a high level of abstraction only relative estimations can be considered for SW and HW IPs, so the cost function is used to guide the selection of a reduced set of solutions. In order to eliminate units, relative costs are used to evaluate the cost value for a given schedulable solution S:
With α + β = 1, and where MinArea is the schedulable solution with minimal area without any power consideration and MinPw the schedulable solution with minimal power without any area consideration. Note that the area cost influences the power consumption through the static power evaluation, so the α parameter also acts on the power optimization.
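As an illustration of the relative cost described above, the following Python sketch assumes one plausible form: a weighted sum of the area and power of S, each normalised by the corresponding value of the MinArea and MinPw solutions. The function name `relative_cost`, its parameters and the weighted-sum form itself are assumptions made for this example, not the tool's actual formula.

```python
def relative_cost(area, power, min_area, min_power, alpha):
    """Unitless cost of a schedulable solution S (assumed weighted-sum form).

    area, power         -- estimates for the candidate solution S
    min_area, min_power -- area of MinArea and power of MinPw
    alpha               -- weight of the area term; beta = 1 - alpha
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if min_area <= 0 or min_power <= 0:
        raise ValueError("reference area and power must be positive")
    beta = 1.0 - alpha  # enforces alpha + beta = 1
    # Dividing by the reference solutions removes the units of both terms.
    return alpha * area / min_area + beta * power / min_power
```

Under this assumed form, a solution that matches MinArea in area and MinPw in power would score 1, and any overhead in either dimension raises the score in proportion to its weight.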
Area cost
The area cost includes the data and code memory size for software implementations, the area of coprocessors that can be shared by various tasks, the area of hardware accelerators and finally the area of memories added for communication.
Power cost
In this subsection, we outline our model for power consumption. The model for power evaluation is much more complex. Firstly, the dynamic power consumption depends on the SoC activity, which is strongly related to task scheduling and switching.
Secondly, the evolution of VLSI technology shows that static power consumption [11], especially in FPGAs, can no longer be neglected. Finally, in mobile embedded systems the important metric is the system life span, which means that energy use must be optimized. However, in our context of periodic (or periodized) tasks, energy optimization is equivalent to minimising the average power over the hyperperiod. Our power model for an implementation S is given by:
Where: Pwd is the average dynamic power dissipated during a hyperperiod TG and Pws the average static power.
Dynamic power/energy metric
Where: Ed is the energy consumed during TG, Pwd(i) the average dynamic power of task i, Ci the execution delay, Pi the period, Pwd(switch) the average power during task switching, Cswitch the average task switching time, Pwd(idle) the average power when the processor is idle, and so on.
For flexibility and genericity, the average task dynamic power values Pwd(i) are normalised with respect to the supply voltage and clock frequency, and the average task static power is expressed per area unit (W/gate or W/µm² as indicated in [12]).
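The quantities listed above can be combined as in the following Python sketch, which averages the dynamic energy over one hyperperiod. It assumes a single processor, integer periods (so that TG is their least common multiple), a number of task switches supplied by the caller, and an energy sum made of task, switching and idle terms. The function name `average_dynamic_power` and all parameter names are hypothetical; the paper's exact expression may contain further terms.

```python
from math import lcm  # available from Python 3.9


def average_dynamic_power(tasks, n_switches, c_switch, pw_switch, pw_idle):
    """Average dynamic power over one hyperperiod (assumed model).

    tasks      -- list of (period, exec_delay, avg_dynamic_power) tuples,
                  with integer periods expressed in a common time unit
    n_switches -- number of task switches in the hyperperiod (assumed known)
    c_switch   -- average task switching time
    pw_switch  -- average power during task switching
    pw_idle    -- average power when the processor is idle
    """
    t_g = lcm(*(period for period, _, _ in tasks))  # hyperperiod TG
    busy_time = 0.0
    energy = 0.0
    for period, exec_delay, power in tasks:
        jobs = t_g // period  # activations of this task within TG
        busy_time += jobs * exec_delay
        energy += jobs * exec_delay * power
    switch_time = n_switches * c_switch
    energy += switch_time * pw_switch
    idle_time = t_g - busy_time - switch_time
    if idle_time < 0:
        raise ValueError("task set does not fit in the hyperperiod")
    energy += idle_time * pw_idle
    return energy / t_g
```

For two tasks with (period, delay, power) of (10, 2, 1.0) and (20, 5, 2.0), the hyperperiod is 20 time units, the tasks are busy for 9 of them, and the remaining time is split between switching and idling.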
Static power/energy metric
The available static power, usually given in mW per unit area, depends mainly on the leakage power, the supply voltage, the transistor count and a technology-dependent parameter:
Our model uses Pws(sw) and Pws(hw) for software and hardware respectively. A dynamic strategy can be 
RT scheduling
The basis for the static QoS management is a set of RT scheduling assumptions [13]. In this subsection, we summarize the proposed solution for data dependency between tasks and priority assignment. Informally, we present our approach as follows.
Given a real-life RT system, tasks communicate, so they are not independent. Usually, before applying the exact response time analysis, the dependencies that can be related to precedence constraints are eliminated by modifying absolute deadlines and release times. As demonstrated in our previous work 
Static QoS manager
In this section, we present the justifications and components of the QoS model. Then, we address the coherence checking for the static QoS manager.
Context definition
One of the major issues in real-time embedded systems is the execution time (ET), which can vary depending on data and on environment events. The second point concerns periodic and aperiodic tasks with highly variable inter-iteration delays. In such uncertain contexts, designing for the worst case can lead to very costly and oversized implementations. As systems grow in complexity, this overestimation becomes unacceptable and new approaches must be considered.
Model
We propose to insert a new step within the codesign flow. This step is based on a QoS model and produces the specification file according to the designer's choices. Thus, we add a new kind of task for periodic and aperiodic "soft real time" tasks. Tasks in this category respect the real-time constraints with a given probability. Each task can be represented by a QoS array of N parameters:
Where xi is a unitless ratio representing one aspect of QoS measurement. In this paper we consider two dimensions: TQoS[AQoS, RTQoS]
The first term, AQoS, represents a QoS specific to the task, namely the application quality. For instance, it can be a data rate for a network management task or a number of bits for pixel coding. The second term, RTQoS, is related to the real-time constraints and gives the minimum ratio of deadlines that must be met. The QoS choices made for different tasks are usually not independent, and the QoS specification step must check the relation that exists between task application qualities according to the designer's choices. Regarding the real-time issue, a task with an RTQoS equal to 1 should not be delayed by another task that is authorized to miss its deadline.
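The two-dimensional QoS array and the real-time coherence rule stated above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: both ratios are taken to lie in [0, 1], and "delayed by" is read narrowly as "consumes data produced by". The names `TaskQoS` and `check_coherence` are hypothetical and do not come from the tool described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskQoS:
    name: str
    aqos: float   # application quality ratio (assumed to lie in [0, 1])
    rtqos: float  # minimum ratio of deadlines that must be met


def check_coherence(tasks, precedences):
    """Return a list of human-readable coherence violations.

    precedences -- iterable of (producer_name, consumer_name) pairs
    """
    by_name = {t.name: t for t in tasks}
    issues = []
    # Each QoS dimension is a unitless ratio; reject values outside [0, 1].
    for t in tasks:
        for label, ratio in (("AQoS", t.aqos), ("RTQoS", t.rtqos)):
            if not 0.0 <= ratio <= 1.0:
                issues.append(f"{t.name}: {label} {ratio} outside [0, 1]")
    # A task with RTQoS = 1 must not wait on a task allowed to miss deadlines.
    for producer, consumer in precedences:
        if by_name[consumer].rtqos == 1.0 and by_name[producer].rtqos < 1.0:
            issues.append(
                f"{consumer} (RTQoS = 1) waits on soft task {producer}"
            )
    return issues
```

With a soft "video" task (RTQoS 0.75) feeding a hard "motor" task (RTQoS 1), the check reports one violation; reversing the dependency reports none. A fuller check would also cover interference through shared resources, which this sketch leaves out.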
Figure 3: Precedence constraints
Where Pi, Ci and Ri are respectively the period, execution time and release time of task Ti.
Our method is based on an HPF pre-emptive scheduling policy for SW tasks in which the priority of task i equals 1/Pi; thus three cases must be distinguished regarding the data dependency Ti-Tj from figure 3.
2.
If PiH'j then we impose the use of a critical resource with priority ceiling.
Else if HW-HW or SW-HW then Rj is shifted.
The release time shifting is used to guarantee data availability when Tj wakes up.
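The priority rule and the release time shifting can be sketched in Python as follows. The shifting rule used here, releasing the consumer no earlier than a bound on the producer's completion time, is an assumption made for illustration. For a producer running alone on dedicated hardware that bound could be taken as Ri + Ci, while a SW producer would need its worst-case response time. The function names `hpf_priority` and `shift_release` are hypothetical.

```python
def hpf_priority(period):
    """Priority of a SW task under the 1/Pi rule (shorter period, higher priority)."""
    if period <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period


def shift_release(producer_finish, consumer_release):
    """Return the shifted release time Rj of the consumer task Tj.

    producer_finish  -- upper bound on the completion time of the producer Ti
                        (assumed to be supplied by the caller)
    consumer_release -- original release time Rj
    """
    # The consumer is never released earlier than originally specified;
    # it is only delayed when its input data would not yet be available.
    return max(consumer_release, producer_finish)
```

For example, a HW producer released at time 0 with an execution time of 4 gives a completion bound of 4, so a consumer originally released at time 2 would be shifted to 4, while one released at time 6 would be left unchanged.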
The QoS management is inserted in the design flow by means of constraints specification. The QoS-aware codesign flow is illustrated in the following section.
Case study: a football player robot application
The case study described in figure 4 is a football player robot application with video tasks for object detection, HF communications for exchanging messages with other devices, motor control, sensor acquisition, image processing and decision computation. Various HW implementations with different granularities, SW implementations and HW implementations with coprocessors are considered for the set of tasks. Note that T19 is the server task with the lowest priority; it includes all aperiodic software tasks without real-time constraints.
Regarding the period values, the video tasks (power). We present relative values to show the influence of QoS choices. We observe in figure 7 that the power and area costs can be efficiently reduced when the QoS constraints are relaxed. For instance, reducing the video data rate to a medium quality yields a 40% power reduction. Another point is the cost of hard RT (HRT): if a soft RT (SRT) is used and tuned to 75% of the WCET, meaningful power and area savings are achieved.
Conclusions
The design space related to embedded systems is extremely large. It involves functional specification decisions, implementation choices (including SW, HW with various granularities, and coprocessors) and low-level alternatives for the clock frequency/Vdd couple. Moreover, it requires a complex real-time analysis. A tool is required to handle the complexity of the problem, but this tool must be controllable by the designer in an interactive way. In this paper we have presented the new step that has been added to the codesign flow by means of a QoS specification step within our CAD tool. The aim of this step is to guide the designer's implementation choices in terms of the trade-off between guaranteed QoS and power/area costs. This step addresses the static specification at design time.
However, given the uncertainty of many embedded systems, in which the computation load is data-dependent and driven by user choices, dynamic QoS management has to be explored; this is the second step of our work, currently under development.
