10 research outputs found

    A fuzzy logic based dynamic reconfiguration scheme for optimal energy and throughput in symmetric chip multiprocessors

    Embedded system architectures have traditionally been designed to achieve greater throughput at minimum energy consumption. With the advent of reconfigurable architectures, it is now possible to support algorithms that find optimal solutions for an improved balance between energy and throughput. Ongoing research has produced several online and offline techniques and algorithms for hardware adaptation. This paper presents a novel coarse-grained reconfigurable symmetric chip multiprocessor (SCMP) architecture managed by a fuzzy logic engine that balances performance and energy consumption. The architecture incorporates reconfigurable level 1 (L1) caches, power-gated cores, and adaptive on-chip network routers to minimize leakage energy in inactive components. A coarse-grained architecture was selected as the focus of this study because it typically reconfigures faster than fine-grained architectures, making it more feasible for runtime adaptation schemes. The presented architecture is analyzed using a set of OpenMP-based parallel benchmarks, and the results show significant improvements in performance while maintaining minimum energy consumption.

    A Fuzzy Logic Reconfiguration Engine for Symmetric Chip Multiprocessors

    Recent developments in reconfigurable multiprocessor systems on chip (MPSoC) have given system designers a great deal of flexibility to exploit task concurrency with higher throughput and lower energy consumption. This paper presents a novel fuzzy logic reconfiguration engine (FLRE) for coarse-grained MPSoC reconfiguration that helps identify an optimum balance between the power and performance of the system. The FLRE is composed of two levels of abstraction. The system selects an optimal configuration of Level 1 / Level 2 cache size and associativity, processor operating frequency and voltage, and the number of cores, based on miss rate, energy, and throughput information at both the core and SoC level. An 8-core symmetric chip multiprocessor has been used to evaluate the proposed scheme. The results show an overall decrease in energy consumption with no more than a 30% decrease in throughput.

    Load Balancing and Efficient Memory Usage for Homogeneous Distributed Real-Time Embedded Systems

    This paper deals with load balancing and efficient memory usage for homogeneous distributed real-time embedded applications with dependence and strict periodicity constraints. Most load balancing heuristics tend to minimize the total execution time of distributed applications by equalizing the workloads of the processors. In addition, our heuristic satisfies dependence and strict periodicity constraints, which are of great importance in embedded systems. However, since resources are limited, some tasks distributed onto a processor may require more data memory than is available. Thus, we propose a fast heuristic that achieves both load balancing and efficient memory usage under dependence and strict periodicity constraints. Complexity and theoretical performance studies have shown that the proposed heuristic is fast and efficient, respectively. Efficient memory usage is also necessary, especially in embedded systems where memory is limited: although the total execution time of the tasks is minimized, some tasks may not be executable because the processors to which they were distributed do not have enough memory to store the data those tasks use. Memory usage therefore plays a significant role in determining application performance.

    Schedulability conditions for non-preemptive hard real-time tasks with strict period

    Partial answers have been provided in the real-time literature to the question of whether preemptive systems are better than non-preemptive systems. This question has been investigated by many authors from several points of view, and it still remains open. Compared to preemptive real-time scheduling, non-preemptive real-time scheduling and the corresponding schedulability analyses have received considerably less attention in the research community. However, non-preemptive scheduling is widely used in industry, and it may be preferable to preemptive scheduling for numerous reasons. This approach is especially well suited to hard real-time systems, where on the one hand missing deadlines leads to catastrophic situations, and on the other hand resources must not be wasted. In this paper, we first present the non-preemptive model of tasks with strict periods, then propose a schedulability condition for a set of such tasks, and finally give a scheduling heuristic based on this condition.
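
To make the idea of a schedulability condition for strictly periodic, non-preemptive tasks concrete, here is a minimal sketch. It does not reproduce the condition proposed in the paper, which the abstract does not state. It uses a pairwise test that is well known for this task model: two tasks with start times s1 and s2, execution times c1 and c2, and periods t1 and t2 never overlap on one processor exactly when c1 <= (s2 - s1) mod g <= g - c2, where g = gcd(t1, t2). The function names, the tuple layout, and the use of integer time units are assumptions made for illustration.

```python
from math import gcd


def pair_compatible(s1, c1, t1, s2, c2, t2):
    """True if two strictly periodic non-preemptive tasks never overlap.

    Assumed model: task i runs during [s_i + k*t_i, s_i + k*t_i + c_i)
    for every k >= 0, with all values given as integers.
    """
    g = gcd(t1, t2)
    offset = (s2 - s1) % g  # start-time gap between the tasks, modulo g
    return c1 <= offset <= g - c2


def schedulable(tasks):
    """tasks: list of (start, wcet, period) tuples sharing one processor."""
    return all(
        pair_compatible(*tasks[i], *tasks[j])
        for i in range(len(tasks))
        for j in range(i + 1, len(tasks))
    )


if __name__ == "__main__":
    print(schedulable([(0, 1, 4), (1, 1, 6)]))  # the two tasks interleave
    print(schedulable([(0, 1, 4), (1, 2, 6)]))  # they collide at time 8
```

Because the test depends only on the gap between start times modulo g, a heuristic could place tasks one at a time and search for a start time that passes the pairwise check against every task already placed.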

    An Approach to Manage Reconfigurations and Reduce Area Cost in Hard Real-Time Reconfigurable Systems

    This article presents a methodology for building real-time reconfigurable systems that ensures all the temporal constraints of a set of applications are met while optimizing the utilization of the available reconfigurable resources. Starting from a static platform that meets all the real-time deadlines, our approach takes advantage of run-time reconfiguration to reduce the area needed while guaranteeing that all the deadlines are still met. This goal is achieved by identifying which tasks must always be ready for execution in order to meet the deadlines, and by means of a methodology that also allows the area requirements to be reduced.

    A Heuristic Approach to Schedule Periodic Real-Time Tasks on Reconfigurable Hardware

    This paper deals with scheduling periodic real-time tasks on reconfigurable hardware devices, such as FPGAs. Reconfigurable hardware devices are increasingly used in embedded systems. To use these devices in systems with real-time constraints as well, predictable task scheduling is required. We formalize the periodic task scheduling problem and propose two preemptive scheduling algorithms. The first is an adaptation of the well-known Earliest Deadline First (EDF) technique to the FPGA execution model. Although the algorithm shows good scheduling performance, it lacks an efficient schedulability test and requires a high number of FPGA configurations. The second algorithm uses the concept of servers that reserve area and execution time for other tasks. Tasks are successively merged into servers, which are then scheduled sequentially. While this method is inferior to the EDF-based technique regarding schedulability, it comes with a fast schedulability test and greatly reduces the number of required FPGA configurations.
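
As a rough illustration of what adapting EDF to an FPGA execution model can involve, the sketch below picks which ready jobs to run at one scheduling point when several jobs can share the device as long as their areas fit. The abstract does not say how the paper's algorithm handles area. The rule used here (walk the jobs in deadline order and skip any job that no longer fits) is an assumption, as are the `Job` fields and the `edf_select` name. Placement, fragmentation, and reconfiguration time are ignored.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    deadline: int  # absolute deadline of the job
    area: int      # area units the job occupies on the device


def edf_select(ready, capacity):
    """Choose jobs to run now, earliest deadline first.

    A job that does not fit in the remaining area is skipped and the
    scan continues with later deadlines (an illustrative assumption).
    """
    chosen, free = [], capacity
    for job in sorted(ready, key=lambda j: j.deadline):
        if job.area <= free:
            chosen.append(job)
            free -= job.area
    return chosen


if __name__ == "__main__":
    ready = [Job("a", 10, 6), Job("b", 5, 5), Job("c", 7, 6), Job("d", 12, 4)]
    print([j.name for j in edf_select(ready, capacity=10)])
```

Re-running the selection at every release or completion changes the set of configured jobs, which hints at why an EDF-style approach can require many FPGA configurations.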

    Task scheduling and placement for reconfigurable devices

    Partially reconfigurable devices allow the execution of different tasks at the same time, removing tasks when they finish and inserting new tasks when they arrive. This dissertation investigates scheduling and placing real-time tasks (tasks with deadlines) on reconfigurable devices. One basic scheduler is the First-Fit scheduler. By allowing the First-Fit scheduler to retry tasks for as long as they can still satisfy their deadlines, we found that its performance can be made better than that of other schedulers. We also proposed a placement idea based on partitioning the reconfigurable area into regions of various widths and assigning a task to a region based on its width. This idea has a rejection rate similar to that of a First-Fit scheduler that retries placing tasks, and it performs better than a First-Fit scheduler that does not retry tasks. This region-based scheduling method also has a better running time. Managing how the space is shared among tasks is a problem of interest. The main function of the free-space manager is to maintain information about the free space (areas not used by active tasks) after any placement or deletion of a task. The speed and efficiency of the free-space data structure are important, as is its effect on scheduler performance. We introduce the use of maximal horizontal strips and maximal vertical strips to represent free space. This resulted in a faster free-space manager compared to what has been used in the area. Most researchers in the area of scheduling on reconfigurable devices have assumed a homogeneous FPGA with only CLBs in the reconfigurable area. Most reconfigurable devices on the market, however, are not homogeneous but heterogeneous, with other components between the CLBs. We studied the effect of heterogeneity on the performance of schedulers designed for a homogeneous structure. We found that current schedulers perform worse when applied to a heterogeneous structure, but with simple modifications they can be applied to a heterogeneous structure and achieve good performance. Consequently, the approach of studying homogeneous FPGAs is a valid one, as the scheduling ideas discovered there do carry over to heterogeneous FPGAs.
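
The retry idea described above can be sketched on a simplified model. The code below is not the dissertation's scheduler. It assumes a one-dimensional device made of equal columns, integer time steps, and tasks given as `(arrival, width, exec_time, deadline)` tuples; all of these, along with the function names, are hypothetical simplifications. A task that cannot be placed is retried at each later step for as long as it could still finish by its deadline.

```python
def first_fit(free, width):
    """Return the leftmost start column of `width` consecutive free
    columns, or None if no such run exists."""
    run = 0
    for col, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == width:
            return col - width + 1
    return None


def schedule(tasks, columns):
    """First-Fit placement with retries on a 1D column model.

    tasks: list of (arrival, width, exec_time, deadline), integer time.
    Returns {task index: (start_time, start_column)}; tasks missing from
    the result were rejected.
    """
    busy_until = [0] * columns  # time at which each column becomes free
    placed, waiting = {}, []
    pending = sorted(range(len(tasks)), key=lambda i: tasks[i][0])
    t = 0
    while pending or waiting:
        # Admit every task that has arrived by time t.
        while pending and tasks[pending[0]][0] <= t:
            waiting.append(pending.pop(0))
        still_waiting = []
        for i in waiting:
            _, width, exec_time, deadline = tasks[i]
            if t + exec_time > deadline:
                continue  # can no longer meet its deadline: reject
            free = [b <= t for b in busy_until]
            col = first_fit(free, width)
            if col is None:
                still_waiting.append(i)  # retry at the next time step
            else:
                placed[i] = (t, col)
                for c in range(col, col + width):
                    busy_until[c] = t + exec_time
        waiting = still_waiting
        t += 1
    return placed


if __name__ == "__main__":
    demo = [(0, 3, 2, 5), (0, 2, 2, 5), (0, 2, 1, 1)]
    print(schedule(demo, columns=4))
```

In the demo, the second task does not fit at time 0 but is placed at time 2 once the first task finishes, so a scheduler without retries would have rejected it. The third task runs out of slack while waiting and is rejected.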

    Accelerating the execution of time consuming software applications by configuring special hardware during the program execution on multiprocessor computers

    In contrast to control-flow computer architectures, whose processors can execute every instruction defined by the architecture but each execute at most a few instructions at any one time, hardware dataflow architectures configure the hardware by laying out components spatially, each capable of executing only the one instruction it is intended for. Computation then amounts to data flowing through that hardware. The main characteristics of this architecture are higher data throughput and reduced power consumption. Although hardware dataflow architectures have existed for decades, technology has only recently allowed them to be used on an equal footing with control-flow computers, so the problem of scheduling jobs across dataflow hardware and conventional processors is becoming increasingly important. Some computationally demanding applications spend most of their execution time iterating over the same set of operations, typically in for loops. If the iterations are mutually independent, or can be transformed into such a form, these applications are suitable for execution on reconfigurable dataflow hardware. This thesis presents available methods for creating schedules for this kind of architecture in order to reduce total execution time, and proposes new ones, bearing in mind that only some applications are suitable for such acceleration. Sharing the dataflow hardware among conventional processors in both time and space is proposed. Scheduling jobs on this architecture belongs to the NP problem class and scheduling time counts as overhead, so the algorithms use heuristics and search possible combinations of jobs only up to an appropriate depth. Results confirm that this architecture can reduce total execution time and reveal the conditions under which the acceleration is possible.