In embedded systems, a real-time operating system (RTOS) is often used to structure the application code and to ensure that deadlines are met by reacting to events in the environment and executing the corresponding functions within precise time bounds. Most embedded systems are bound by real-time constraints, with determinism and latency as critical metrics. RTOSes are generally implemented in software, which increases computational overhead, jitter and memory footprint. These costs can be reduced, even if not removed completely, by using current FPGA technology, which enables the implementation of a full-featured and flexible hardware-based RTOS. Scheduling algorithms play an important role in the design of real-time systems. This paper proposes a novel Fuzzy Inference System (FIS) based adaptive hardware task scheduler for multiprocessor systems that minimizes the processor time spent on scheduling activity. It uses fuzzy logic to model uncertainty in the first stage, together with an adaptive framework in which feedback allows the processor share of each task running on the multiprocessor to be controlled dynamically at runtime. This fuzzy-logic-based adaptive hardware scheduler overcomes the limit on the total number of tasks and thus improves the efficiency of the entire real-time system. The increased computational overhead of the proposed model can be compensated for by exploiting the parallelism of the hardware once the scheduler is migrated to an FPGA.
INTRODUCTION
Today's consumer market is driven by technology innovations. Many technologies that were not available a few years ago are quickly being adopted into common use.
Equipment for these services requires microprocessors and can be regarded as embedded systems. Embedded devices are usually designed to serve a single dedicated purpose and are included in a variety of products across different technical areas, such as industrial automation, consumer electronics, the automotive industry, and communication and multimedia systems. Embedded systems find application in almost all products, ranging from trains and airplanes to microwave ovens and washing machines. As semiconductor prices drop and their performance improves, the complexity of embedded applications increases rapidly. This increased complexity, together with intensified market pressure to rapidly develop cheaper products, has caused the industry to streamline software development. Using an embedded operating system or real-time operating system (RTOS) is one technique for reducing the development time of such systems, as it affects hardware abstraction, multitasking, code size, the learning curve and the initial investment. Unfortunately, operating systems also introduce several forms of overhead.
FPGAs have become the mainstream of reconfigurable computing in recent years. The gate-level reconfigurability of FPGAs reduces time to market and cost compared with ASICs, which can be exploited to develop a full-featured and flexible hardware-based RTOS.
Real-time systems are embedded systems in which the correctness of an application depends not only on the logical accuracy of its computations but also on its ability to meet its timing constraints [1]. Thus the design of an RTOS has the dual goal of minimizing overhead and maximizing determinism.
http://dx.doi.org/10.12785/ijcds/050606
http://journals.uob.edu.bh
This paper is organized as follows. Section 2 gives an overview of hardware/software co-design approaches. Section 3 describes related work from other research projects, the proposed model is discussed in Section 4, and Section 5 presents the summary and conclusions drawn mainly from previous and related work.
HARDWARE SOFTWARE CO-DESIGN ARCHITECTURE
An RTOS is often used in embedded systems to structure the application code and ensure that deadlines are met. The notions of best-effort and real-time processing have fractured into a spectrum of processing classes with different timeliness requirements, including desktop multimedia, soft real-time, firm real-time, adaptive soft real-time and traditional hard real-time [2] [3] [4]. Many real-time systems are hard, meaning that missing a deadline is catastrophic, whereas in a soft real-time system an occasional deadline violation does not render the execution of the application useless but decreases utilization [5].
Traditionally, RTOSes are implemented in software, but the major drawback of standard software-based RTOSes is that they suffer from computational overhead, indeterminism, jitter and often a large memory footprint. RTOS computational overhead is caused mainly by tick interrupt management, which gets worse with more tasks and higher tick frequencies. Task scheduling, resource allocation and de-allocation, deadlock detection and various other OS/API functions also take execution time away from the tasks running on the CPU.
An embedded system always consists of software and hardware components, and can no longer depend on independent hardware-only or software-only solutions to real-time problems because of cost, efficiency, flexibility, upgradability, scalability and development time.
Tasks implemented as software programs running on a microprocessor offer high flexibility but poor performance. On the other hand, tasks implemented as hardware modules offer high performance but low flexibility and high cost. FPGA technology, which can be reprogrammed many times (the exact number depends on the technology), has paved the way for enhanced flexibility and made it possible to implement established software algorithms in hardware, i.e. real-time kernel activities such as scheduling, inter-process communication, interrupt management, resource management, synchronization and time management. An algorithm implemented in hardware benefits from a high level of parallelism and improved determinism, which consequently decreases system overhead, improves predictability and shortens response time.
As a trade-off, reconfigurable and hardware/software co-design approaches, which offer real-time capabilities while maintaining the flexibility to support increasingly complex systems, have become a more feasible solution: software tasks run on a microprocessor alongside hardware tasks running in an FPGA device (Figure 1). The hardware/software co-design approach has reached a level of maturity that allows system designers to perform core operating system and housekeeping functionality, such as time management and task scheduling, in hardware. This harnesses the advantages of higher-level program development while achieving the performance potential offered by executing these functions in parallel hardware circuits.
RELATED WORK
The main sources of indeterminism in real-time systems are varying instruction cycle times caused by pipelines and caches, varying execution times of RTOS kernel functions, external asynchronous interrupts, etc. By migrating the real-time kernel from software to hardware, it is possible to remove jitter, lessen CPU overhead and reduce the indeterminism due to cache and pipeline effects. Various models and systems have been proposed [6] to overcome this problem, and some of them are discussed in the remainder of this section.
Lennart Lindh et al. [7] proposed FASTCHART, a RISC-based uniprocessor system that places task IDs into various queues. It includes a hardware-based real-time kernel capable of handling 64 tasks with 8 different priorities.
and ASICs, and co-simulation is provided using the Ptolemy environment [9]. The complexity of processors makes static estimation difficult, and the system does not support large designs because it generates customized C code for selected processors only.
Lennart Lindh et al. [10] also proposed FASTHARD, which supports features such as rendezvous, external interrupts, and periodic start and termination of tasks without CPU interference. However, the system offers limited support for customization and scalability. It is an extension of the earlier FASTCHART work and is based on general-purpose processors. The paper does not provide any benchmarks or test results.
The COSYMA system proposed in [11] uses simulated annealing for partitioning, which can be fine- or coarse-grained, to speed up software execution and meet timing constraints. It does not support burst-mode communication. List- and path-based techniques are used to estimate hardware execution time. J. Adomat et al. [12] proposed the RTU (Real-Time Unit), a multiprocessor system that uses a single interrupt input on each CPU for control and context switching. Lindh et al. [13] also proposed SARA, an extensible multiprocessor system that can be used together with the RTU to remove all scheduling and tick-processing overhead.
The STRON system, based on the µTRON project and proposed by T. Nakano et al. [14], provides a hardware kernel that implements system calls and other kernel functionality, increasing speed and reducing jitter. The hardware kernel is supported by a small micro-kernel that takes care of the features not implemented in hardware. The system has tick frequency limitations and no hardware support for preventing unbounded priority inversion.
To minimize hardware cost while maintaining timing constraints, R. Gupta et al. developed VULCAN [15], a hardware/software partitioning tool that uses a heuristic graph-partitioning algorithm running in polynomial time. The original description is written in Hardware-C [16], which is mapped to a fine-grained control-data flow graph.
CHINOOK, a hardware/software co-design framework for embedded systems proposed by P. Chou et al. [17, 18], performs automated interface synthesis and supports mapping an embedded system model to one or more processors and peripherals. Although it emphasizes distributed architectures while ensuring timing constraints, the system is inflexible and more complex.
CoWare, a heterogeneous hardware/software DSP system proposed by H. De Man et al. [19], is the basis of the commercial CoWare N2C [20]. The system supports re-use and encapsulation of hardware and software through a clear separation between the functional and communication behavior of system components. Although it allows co-specification using VHDL, DFL, Silage and C, it imposes increased demands on the generation of exhaustive library elements.
Bjorn B. Brandenburg et al. [21] discuss LITMUS RT, a soft real-time extension of the Linux kernel with a focus on multiprocessor real-time scheduling and synchronization. It supports the sporadic task model with both partitioned and global scheduling [22]. Its primary goal is to provide a useful experimental platform for applied real-time systems research, but LITMUS RT has not established stable interfaces.
The F-Timer framework suggested by A. Parisoto et al. [23] is an FPGA-based task scheduler, targeted at general-purpose processors, that is capable of managing 32 tasks with 64 different priorities. The system has no hardware support for task synchronization or resource handling. The paper does not discuss the scheduling algorithm employed.
The Spring kernel proposed by J. Stankovic et al. [24, 25] is designed for large and complex multiprocessor-based real-time systems and takes a radically different approach to task scheduling, based on dynamic and speculative planning implemented through a heuristic algorithm and tree search. Fine granularity of task deadlines is possible at the cost of a large precalculation overhead, which affects performance.
A hardware scheduling accelerator that can be configured for several different algorithms is proposed by J. Hildebrandt et al. in [26, 27]. This hardware implementation of a dynamic scheduling coprocessor also supports the Enhanced Least Laxity First (ELLF) algorithm. The system does not address thrashing of tasks, but it increases overall determinism at the cost of more complex logic.
The δ-Framework, a hardware/software co-design RTOS framework proposed by V. Mooney et al. in [28], supports 30 different processors. The system is cost-effective in terms of overall speedup and hardware area (number of gates). The framework generates all HDL code, which can be implemented in an FPGA. Further work on SoCs [29] integrated priority inheritance and deadlock avoidance mechanisms.
A configurable hardware scheduler with improved response time, interrupt latency and CPU utilization was designed and developed by V. Mooney et al. [30]; it also supports high tick frequencies. The model supports three different algorithms that can be changed dynamically at run time, and the interrupt controller in the scheduler supports 8 external interrupts, each of which can be configured to dispatch a specific task.
The issues of OS extension and flexibility that arise from moving the entire OS to hardware can be overcome by the model proposed by Z. M. Wirthlin et al. in [31]. The nanoprocessor provides upgradability and flexibility, and it also improves execution time by moving selected inefficient OS services to hardware, which saves power to a great extent, as shown in [32].
Paul Kohot et al. [33] developed the Real-Time Manager (RTM), which leverages the potential of hardware parallelism. In this system, routine housekeeping tasks are implemented in hardware, freeing the processor for critical functions and boosting overall performance. RTM supports static priority scheduling and handles task, time and event management. The authors claim that RTM decreases RTOS overhead by 90% and response latency by 81%.
The problem of low tick granularity, which can cause jitter and result in deadline misses, is addressed by M. Vetromille et al. [34] in their proposed system HaRTS. HaRTS supports a high tick frequency and thus reduces jitter without lowering the CPU time available for tasks. Although it is more complex to implement, it requires less chip area and uses less power than an additional processor.
HW-eCos, a hardware RTOS implemented to accelerate eCos, is interfaced to an ARM processor, requires fewer gates to implement and provides better speedup. The problem that communication overhead between the RTOS and the hardware overshadows the speed gained from the hardware scheduler is overcome by S. Chandra et al. in [35] through intelligent design. The paper does not discuss the number of tasks and resources supported by the system. SRTOS, proposed by Z. Murtaza, S. Khan et al. [36], is aimed at real-time DSP applications and targets the AVZ21 DSP processor. Although the paper does not provide any experimental results, the system supports additional instructions for fast resource allocation and context switching.
M. Song et al. [37] proposed H-Kernel, the outcome of using an FPGA with thoughtful HW/SW co-design for a specific application. Although the system becomes more complex and bulky as the number of tasks increases, a performance increase on the order of 50-60% is achievable for systems with a small number of tasks.
Sebastien Pillement et al. [38] proposed DART, an FPGA-based reconfigurable architecture that deals concurrently with high-performance, flexibility and low-energy constraints. The flexibility of FPGAs is achieved at a very high silicon cost, by interconnecting a huge number of processing primitives. These interconnection and configuration overheads waste energy. DART was designed as a platform-based architecture that defines a cluster-level interface for implementing user-dedicated logic, which allows the integration of application-specific operators that efficiently support bit-level parallelism. The main concern with this class of architectures is the high reconfiguration overhead.
ARPA-MT, a multi-threading processor with a five-stage pipeline, is proposed by A. S. R. Oliveira et al. [39]. The system supports heterogeneous tasks and context switches without hampering processor performance.
The latency introduced by the PLB bus interface can be removed by better and more direct connections between the CPU and the coprocessor, as proposed by Luis Almeida et al. in [40, 41] for OReK_CoP, a hardware implementation of the OReK real-time kernel. All kernel functions execute in absolute time and almost in parallel, without interfering with the CPU, which improves determinism and resource utilization.
Xiangrong Zhou, Peter Petrov et al. [42] presented a model that brings together the compiler, the micro-architecture and the OS kernel to reduce context-switching cost, which is the main source of performance degradation in most HW/SW-based solutions, and to improve overall responsiveness. In the proposed model, a context switch may be deferred until the next switch point to limit the number of context registers required to hold state. This arrangement results in more deadline misses, which can be avoided by a more elaborate RTOS kernel design.
The ARTESSO architecture proposed by N. Maruyama et al. in [43] ports the RTOS, checksum calculation, memory copying and TCP header rearrangement to hardware. It uses a novel virtual queue instead of the FIFO-based queues used in RTU and STRON, which are expensive in logic. The authors claim that this system is 6-9 times faster than STRON and 7 times more energy efficient than its software counterpart.
A number of research projects have approached the task of designing an OS for FPGA-based reconfigurable computers (RC). By providing native kernel support for FPGA hardware, Hayden Kwok-Hay et al. [44] [45] [46] proposed BORPH, an operating system designed for FPGA-based RCs. BORPH offers a homogeneous UNIX interface for both software and hardware processes. Hardware processes inherit the same level of service from the kernel.
Static scheduling of DAGs (Directed Acyclic Graphs) on multi-reconfigurable-unit systems under strict real-time constraints, from a parallel processing perspective, is proposed by Ikbel Belaid et al. [47]. By clustering the tasks, mapping the tasks into these clusters and placing the clusters on reconfigurable devices, dynamic partial reconfiguration and efficient placement are achieved. However, this approach has difficulty dealing with nondeterministic systems whose run-time characteristics are not well known before the DAG runs, and it works only for small DAGs.
HartOS, a hardware-implemented real-time operating system proposed by Lange A.B. et al. [48, 49], is designed to be very flexible and to support most of the features normally found in a standard software RTOS directly in hardware. HartOS's ability to run the kernel at a higher clock frequency than the microprocessor enables more tasks to be processed serially at the same tick frequency and thus speeds up the part of the API functions executed in the kernel.
A comparative study of the methodologies and models reviewed in the literature is given in Table 1 [50].

| Methodology/Model | Type [Ref.] | Architecture Used | Claims by Authors |
|---|---|---|---|
| FASTCHART (1991) | Hybrid [7] | RISC-based processor with load/store architecture | Migrated the full kernel to hardware to improve determinism and remove jitter. |
| POLIS (1991) | Hybrid [8] | Co-design Finite State Machine (CFSM) design | Flexibility to evaluate HW/SW partitioning, architecture and scheduler through mixed implementation of SW and ASICs. |
| FASTHARD (1992) | Hybrid [10] | Memory-mapped design (address/data bus) | HW-based RT kernel supporting external interrupts and rendezvous. |
| RTU (1994) | H/W based [12] | Memory-mapped design (VME bus) | Supports multiple tasks, binary semaphores, event flags and watchdogs with minimum overhead and improved predictability. |
| Silicon TRON (1995) | Hybrid [14] | Memory-mapped design (address/data bus) | Improves determinism and supports task management, flags, semaphores, timers and external interrupts. |
| VULCAN (1995) | Hybrid [15] | CDFG-based fine-grained mapping design | Hardware/software partitioning reduces the overall cost. |
| CHINOOK (1996) | Hybrid [17] | Distributed architecture | Supports mapping of processors and peripherals under strict timing constraints with automated interface synthesis. |
| COWARE (1996) | Hybrid [19] | Memory-mapped design (address/data bus) | Supports re-use and encapsulation of HW and SW by separation of functional behavior, to support heterogeneous HW/SW DSP systems. |
| COSYMA (1997) | Hybrid [11] | Memory-mapped design (address/data bus) | Uses novel list- and path-based scheduling to estimate HW execution time and speed up SW execution to meet timing constraints. |
| F-Timer (1997) | Hybrid [23] | Memory-mapped design (address/data bus) | Supports external interrupts and reduces overall RTOS overhead with improved determinism. |
| Spring Coproc (1999) | Hybrid [25] | Memory-mapped design (address/data bus) | Supports fine granularity of task deadlines and multiprocessors, with guaranteed scheduling without blocking resources. |
| ELLF Sched. Coproc. (2000) | Hybrid [26] | Memory-mapped design (address/data bus) | Supports the ELLF algorithm with dynamic priority calculation by exploiting parallelism in HW. |
| The δ-Framework (2002) | Hybrid [28] | Memory-mapped design (address/data bus) | Uses fewer gates for equivalent HW area; targeted at HW/SW co-design. |
| Mooney (2003) | Hybrid [29] | Memory-mapped and instruction-set-acceleration-based design | Configurable scheduler supporting priority-based, rate-monotonic and EDF algorithms and a high tick rate. |
| Nano-processor (2003) | Hybrid [31] | Memory-mapped design (address/data bus) | Provides flexibility in choosing the services to perform in HW, with faster execution and compatibility with a range of hardware. |
| RT Task Manager (2003) | Hybrid [33] | Memory-mapped design (address/data bus) | Supports static priority and handles task, time and event management with the same tree by migrating routine tasks to HW. |
| HaRTS (2006) | Hybrid [34] | OPB bus scheme based design | Requires less power and less chip area; supports high tick frequency and granularity while lowering jitter. |
| LITMUS RT (2006) | S/W based [21] | Push/pull approach | Effective testbed for evaluating different RT schedulers; also supports G-EDF-based scheduling with a private queue for each processor. |
| HW-eCos (2006) | Hybrid [35] | Memory-mapped design (address/data bus) | Removes context-switching overhead through an interrupt line to the CPU, reduces code size and thus improves performance. |
| Silicon RTOS (2006) | Hybrid [36] | Memory-mapped design (address/data bus) | Supports external interrupt management and uses priority-based scheduling to make RT DSP applications efficient. |
| H-Kernel (2007) | Hybrid [37] | Memory-mapped design (address/data bus) | Supports priority-based task, interrupt, event and time management through the H-kernel, and performance through thoughtful HW/SW co-design. |
| OReK_CoP (2009) | Hybrid [41] | PLB bus interface with stack-based priority ceiling design | Ported the OReK kernel to HW to improve performance; supports asynchronous interrupt handling, which improves determinism. |
| Xiangrong et al. (2010) | S/W based [42] | Micro-architecture and OS kernel | Uses the micro-architecture to lower context-switching cost and improve responsiveness. |
| ARTESSO (2010) | Hybrid [43] | TCP/IP protocol | Improves throughput by moving TCP header calculations to HW; supports a priority-based FCFS scheduler using a novel virtual queue structure. |
| BORPH (2011) | S/W based [45] | OS uses a virtual file system | Reduces context switching drastically by exploiting parallelism and FPGA reconfigurability. |
| ARPA-MT (2011) | Hybrid [38] | Stack-based priority ceiling design | Specialized, predictable and customized processor design that supports heterogeneous tasks and schedules using the RM or EDF protocol. |
| HartOS (2012) | Hybrid [48] | FSL-AXI stream interface | Interrupts are handled as tasks and mutexes are protected by a stack-based priority ceiling, which reduces jitter and memory footprint. |

Scheduling algorithms play an important role in the design of real-time systems; scheduling involves allocating resources and time to jobs in such a way that certain performance requirements are met.
Most of the models reviewed focus mainly on improving performance by migrating some routine housekeeping jobs from software to hardware, with the aim of leveraging the parallel processing potential of hardware. Performance can be improved further if a more realistic scheduling algorithm is devised and migrated to hardware to assist the processor and the RTOS, so as to increase overall performance without increasing memory footprint or power consumption.
HARDWARE TASK SCHEDULER
Most researchers dealing with real-time system scheduling assume that scheduling constraints are precise. In practice, however, the values of these parameters are vague in most cases. To overcome this vagueness in job scheduling parameters [51], fuzzy logic plays an important role: by treating the vague scheduling parameters as fuzzy variables, it can generate a schedule that enhances resource utilization and thus increases the overall schedulability of the system. In this paper, a two-phase adaptive scheduling algorithm is developed and migrated to an FPGA to harness the potential of parallel processing, which compensates for the added computational cost of executing complex fuzzy algorithms.
Architecture
We propose a Fuzzy Inference System (FIS) based adaptive hardware task scheduler framework, discussed in the following paragraphs, which consists of:
1. Global fuzzy scheduler (FIS I): long-term scheduler.
2. Local adaptive scheduler (FIS II): short-term scheduler.
Both schedulers work in cascade and are migrated to hardware, where they work in synchrony with the processor and the RTOS to fulfill the overall system objectives, as illustrated in Figure 2.
To build a fuzzy system, its inputs and output(s) must first be selected and partitioned into appropriate conceptual categories, each of which represents a fuzzy set on a given input or output domain. The parameters that affect the scheduler's performance are selected as inputs to the Fuzzy Inference System (FIS) [52, 53], which consists of five stages:
1. Fuzzifying inputs
2. Applying fuzzy operators
3. Applying implication methods
4. Aggregating outputs
5. De-fuzzifying outputs
Here, either Mamdani's fuzzy inference method or the TSK (or simply Sugeno) method of fuzzy inference may be used [54] [55] [56].
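The five stages listed above can be illustrated with a minimal zero-order Sugeno sketch in Python. Everything specific here is an assumption made for illustration and is not taken from the paper: the two normalized inputs (`urgency`, `importance`), the three triangular fuzzy sets per input, the rule table and its constant outputs. The actual inputs and outputs of FIS I and FIS II are those shown in Figure 3.

```python
def tri(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b.
    a == b or b == c gives a left or right shoulder."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets on a normalized [0, 1] input domain.
LEVELS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}

# Hypothetical rule table: (urgency set, importance set, constant output).
RULES = [
    ("high", "high", 1.0), ("high", "medium", 0.8), ("high", "low", 0.6),
    ("medium", "high", 0.7), ("medium", "medium", 0.5), ("medium", "low", 0.3),
    ("low", "high", 0.4), ("low", "medium", 0.2), ("low", "low", 0.0),
]

def fuzzify(x):
    """Stage 1: degree of membership of x in every fuzzy set."""
    return {name: tri(x, *abc) for name, abc in LEVELS.items()}

def infer_priority(urgency, importance):
    mu_u = fuzzify(urgency)
    mu_i = fuzzify(importance)
    strengths, weighted = [], []
    for u_set, i_set, out in RULES:
        w = min(mu_u[u_set], mu_i[i_set])  # Stage 2: fuzzy AND as minimum
        strengths.append(w)
        weighted.append(w * out)           # Stage 3: scale the rule output
    total = sum(strengths)                 # Stage 4: aggregate all rules
    if total == 0.0:
        return 0.0
    return sum(weighted) / total           # Stage 5: weighted-average defuzzification

if __name__ == "__main__":
    print(infer_priority(0.9, 0.5))
```

A Mamdani system would differ in stages 3 to 5, where clipped output fuzzy sets are aggregated and then defuzzified (for example by centroid), which costs more computation than the weighted average used in this sketch.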
Block diagrams of FIS I and FIS II, together with the parameters selected as inputs and outputs and the surface viewer, are shown in Figure 3.
The proposed two-phase Fuzzy Inference System based hardware task scheduler, which uses fuzzy logic to model uncertainty, works as follows. The arrival of a new task in the system initiates the application. New tasks are stored in the Arrival Queue in first-in-first-out (FIFO) order, waiting to be processed by the Fuzzy Inference System (Phase I). Tasks entering the system are tagged with some basic parameters that play an important role in scheduling them. The jobs are then stored in the Global Queue, sorted by the newly calculated Job Processing Priority (JPP). Tasks queued in the Global Queue are fed to the Fuzzy Inference System (Phase II). The Local Queue holds the tasks sorted by the Job Final Priority (JFP) calculated by FIS II. The master controller keeps track of the actual execution time (AET) of each task being processed; if the difference between the Worst-Case Execution Time (WCET) and the AET of a task exceeds a certain threshold value δ(t), FIS II is notified, updates the value of WCET to the AET and uses this updated WCET during the next scheduling cycle. Tasks that block on shared resources are stored in the Block Queue, where a semaphore is used to resolve the deadlock, and tasks are moved from the Block Queue to the Waiting Queue if they have yet to complete. These tasks are then added back to the Arrival Queue, along with newly entered tasks, in FIFO order.
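The flow from the Arrival Queue through the Global Queue to the Local Queue can be sketched in software as follows. This is only a behavioral sketch: `fis_phase1` and `fis_phase2` are hypothetical stand-ins for FIS I and FIS II (simple arithmetic instead of fuzzy inference), and the `Task` fields are assumed for illustration.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    wcet: float       # worst-case execution time estimate
    deadline: float   # relative deadline
    jpp: float = 0.0  # Job Processing Priority, set in phase I
    jfp: float = 0.0  # Job Final Priority, set in phase II

def fis_phase1(task):
    """Stand-in for FIS I (assumption): nearer deadline gives higher JPP."""
    return 1.0 / task.deadline

def fis_phase2(task):
    """Stand-in for FIS II (assumption): higher WCET/deadline density gives higher JFP."""
    return task.wcet / task.deadline

def schedule(arrivals):
    # Arrival Queue: tasks are consumed in FIFO order by phase I.
    arrival_queue = deque(arrivals)
    global_queue = []
    while arrival_queue:
        task = arrival_queue.popleft()
        task.jpp = fis_phase1(task)
        global_queue.append(task)
    # Global Queue: sorted by JPP, highest first.
    global_queue.sort(key=lambda t: t.jpp, reverse=True)

    # Phase II consumes the Global Queue and fills the Local Queue.
    local_queue = []
    for task in global_queue:
        task.jfp = fis_phase2(task)
        local_queue.append(task)
    # Local Queue: sorted by JFP, highest first; the dispatcher reads from its head.
    local_queue.sort(key=lambda t: t.jfp, reverse=True)
    return local_queue

if __name__ == "__main__":
    tasks = [Task("A", 2.0, 10.0), Task("B", 3.0, 5.0), Task("C", 9.0, 20.0)]
    print([t.name for t in schedule(tasks)])
```

With these stand-ins the Global Queue order is B, A, C while the Local Queue order is B, C, A, which shows that the second phase can reorder what the first phase produced.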
4.1.1 Adaptive Fuzzy Scheduling
Under traditional task models (periodic, sporadic, etc.), the schedulability of a system is based on each task's worst-case execution time (WCET), which defines the maximum amount of time each of its jobs can execute. The disadvantage of using WCETs is that a system may be deemed unschedulable even if it would function correctly most of the time when deployed. This drawback can be overcome by making the scheduler adaptive to varying runtime conditions, so that it allocates per-task processor time shares and readjusts task priorities instead of always using a constant share allocation based on a constant WCET. When the difference between the WCET and the actual execution time of a particular job exceeds some predetermined threshold value, the adaptive task scheduler is invoked with the actual execution time; it reschedules the task and refreshes and reorders the tasks in the Local Queue accordingly. This adjusts the per-task processor time share based on runtime conditions, which effectively increases overall schedulability and processor utilization. Overall quality of service (QoS) can be improved by ignoring transient overload conditions. The dispatcher dispatches tasks from the Local Queue to the processor bank to be served.
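The feedback rule described above can be sketched as follows. The `Task` fields, the share formula (execution-time estimate divided by period) and the ordering key used in `reorder` are assumptions made for illustration; in the proposed scheduler the reordering would be driven by the JFP computed by FIS II.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    period: float
    wcet: float  # execution-time estimate currently used by the scheduler

    @property
    def share(self):
        """Processor share implied by the current estimate (assumed formula)."""
        return self.wcet / self.period

def feedback_update(task, aet, threshold):
    """Replace the WCET estimate by the observed AET when they differ by
    more than the threshold. Returns True if the estimate was changed."""
    if abs(task.wcet - aet) > threshold:
        task.wcet = aet
        return True
    return False

def reorder(local_queue):
    """Stand-in for re-running FIS II: largest share first (assumption)."""
    local_queue.sort(key=lambda t: t.share, reverse=True)

if __name__ == "__main__":
    queue = [Task("video", 10.0, 6.0), Task("audio", 10.0, 4.0)]
    # The video task was observed to need only 3.0 time units.
    if feedback_update(queue[0], aet=3.0, threshold=1.0):
        reorder(queue)
    print([(t.name, t.share) for t in queue])
```

In this example the share reserved for the hypothetical `video` task drops from 0.6 to 0.3 after one feedback step, so capacity that a static WCET would have kept reserved becomes available to other tasks.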
Further, resource synchronization is used to optimize the scheduling of tasks that are blocked on shared resources and parked in the Block Queue or Waiting Queue. Tasks that block on shared resources are stored in the Block Queue and are moved to the Waiting Queue if they have yet to complete. These tasks are then added back to the Arrival Queue, along with newly entered tasks, in FIFO order. The resource synchronization module implements a priority queue with aging to avoid task starvation and thus improve the chance of fair treatment for all tasks in the queue. It is used to remove deadlocks on resources among tasks from the Block Queue, which increases the overall performance of the RTOS. Processor share allocations are adjusted using feedback and resource synchronization techniques [58].
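A priority queue with aging can be sketched as below. The linear aging formula (base priority plus `aging_rate` times waiting time) and all names are assumptions for illustration; the paper does not specify the aging function used by the resource synchronization module.

```python
class AgingQueue:
    """Priority queue whose entries gain priority the longer they wait."""

    def __init__(self, aging_rate):
        self.aging_rate = aging_rate
        self._entries = []  # (name, base_priority, enqueue_time)

    def push(self, name, priority, now):
        self._entries.append((name, priority, now))

    def effective_priority(self, entry, now):
        _, priority, enqueued = entry
        return priority + self.aging_rate * (now - enqueued)

    def pop(self, now):
        """Remove and return the task with the highest effective priority."""
        if not self._entries:
            raise IndexError("pop from empty queue")
        best = max(self._entries, key=lambda e: self.effective_priority(e, now))
        self._entries.remove(best)
        return best[0]

if __name__ == "__main__":
    q = AgingQueue(aging_rate=1.0)
    q.push("low", priority=1, now=0)
    q.push("high", priority=5, now=8)
    print(q.pop(now=10))
```

At time 10 the hypothetical task `low` has waited long enough (effective priority 11) to overtake `high` (effective priority 7), which is how aging bounds the waiting time of low-priority tasks. With `aging_rate=0` the queue degenerates to a plain static-priority queue.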
Fine-grained time management and frequent sorting and rearrangement of tasks in the Local Queue and Waiting Queue increase CPU overhead and thus affect processor utilization. This can be overcome by implementing these queues as hardware priority queues (Figure 4). A hardware Intellectual Property (IP) block can be used to move routine, frequently used housekeeping activities such as scheduling, inter-process communication and time management from the software OS kernel to a hardware unit. Migrating kernel services to hardware significantly reduces overhead, which improves response time and increases the CPU time available to applications. A hardware kernel executes in parallel with the CPU, minimizes the processor time spent on scheduling activity and thus relieves pressure on the CPU, which gets almost its full execution time for the application tasks. There is also less software code in memory, since the functionality is implemented in hardware instead [23].
With a software OS, each clock tick interrupts the CPU so that the kernel can run, process the task lists (queues) and calculate new periodic delay times for the tasks. A hardware kernel instead checks all queues concurrently and generates an interrupt to the CPU only when a task switch is required [59, 60]. Another advantage of having the kernel in hardware is the possibility of using complex scheduling algorithms and an unlimited number of different queue types without any performance loss.
When a real-time kernel is implemented in software, one disadvantage is that the execution time of service calls varies between a minimum and a maximum [61, 62]. The gap can be large, and the worst-case time is one of the factors that determine the utilization factor of the system. The scheduling time varies with the number of tasks and the scheduling algorithm and must be bounded by a pessimistic worst-case execution time, which decreases determinism.
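To show why a hardware priority queue avoids software sorting, the following Python model mimics a shift-register style queue, one common hardware organization. It is an assumption for illustration and not necessarily the design of Figure 4 or of [63]. Each cell decides from purely local information (its own entry and its left neighbor's) whether to keep its entry, latch the new one, or take over the neighbor's entry; in hardware all cells make this decision in the same clock cycle, whereas the loop below evaluates them one after another.

```python
class ShiftRegisterPQ:
    """Behavioral model of a shift-register priority queue.
    Cell 0 is the head and always holds the highest-priority entry."""

    def __init__(self, capacity):
        self.cells = [None] * capacity

    def insert(self, priority, task):
        old = self.cells
        if old[-1] is not None:
            raise OverflowError("priority queue is full")
        new = (priority, task)

        def keeps(i):
            # A cell keeps its entry if that entry is at least as urgent as the new one.
            return old[i] is not None and old[i][0] >= priority

        nxt = []
        for i in range(len(old)):      # in hardware: all cells in parallel
            if keeps(i):
                nxt.append(old[i])     # entry stays in place
            elif i == 0 or keeps(i - 1):
                nxt.append(new)        # first cell that yields latches the new entry
            else:
                nxt.append(old[i - 1]) # remaining cells shift right by one
        self.cells = nxt

    def pop(self):
        head = self.cells[0]
        if head is None:
            raise IndexError("pop from empty queue")
        self.cells = self.cells[1:] + [None]  # every cell shifts left by one
        return head[1]

if __name__ == "__main__":
    pq = ShiftRegisterPQ(4)
    pq.insert(3, "b")
    pq.insert(5, "a")
    pq.insert(4, "c")
    print(pq.pop(), pq.pop(), pq.pop())
```

Because the queue is kept sorted on every insertion, the dispatcher only ever reads the head cell, and insertion time does not depend on the number of queued tasks in a hardware realization.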
We have proposed a two-phase FIS-based hardware task scheduler that uses fuzzy logic to model uncertainty in the first stage, together with an adaptive framework that uses feedback in the second stage. Scheduling based on a static WCET results in lower processor utilization. This can be overcome by an adaptive feedback mechanism that updates the WCET parameter of a task with its AET when the difference between WCET and AET exceeds the predefined threshold value τ. The mechanism allows the processor share of each task running on the multiprocessor to be controlled dynamically at runtime and thus increases overall processor utilization and schedulability. Further, starvation of low-priority tasks is overcome by the resource synchronization module, which applies aging to the waiting tasks. Because of the fine time granularity, frequent sorting and updating of the tasks in the queues increases overhead; this can be reduced to a great extent by using a hardware priority queue [63] to store the tasks, which increases sorting speed and thus lessens the burden on the CPU. This increases the overall utilization of the CPU and the schedulability of the tasks.
Our future work is to map the proposed model onto the MicroBlaze soft processor core, as MicroBlaze FPGA designs are readily available and can be implemented with little effort. The FreeRTOS port for MicroBlaze will be modified to run tasks concurrently on multiple processors, as FreeRTOS provides a simple, easy-to-use and highly portable kernel. The aim is to produce a version of FreeRTOS that supports multi-core hardware and an efficient hardware-based task scheduler.
