7 research outputs found

    Real-Time Scheduling on Heterogeneous SoC Architectures Using A Neural Network

    Introduction Several scheduling algorithms have been developed for constraint satisfaction in real-time systems. Optimality is difficult to reach, and the problem becomes NP-hard when a large set of constraints must be satisfied. To solve this type of problem, approximate methods are used, such as Artificial Neural Networks (ANNs). Neural networks have demonstrated their efficiency in optimization problems. They converge in a reasonable time if the number of neurons and connections between neurons can be limited. Another limitation concerns the need to regularly re-initialize the network when it converges towards a stable state that does not belong to the set of valid solutions. On the other hand, embedded applications are usually implemented on complex Systems-on-Chip (SoC), which are built around heterogeneous processing units. On such platforms, task instantiation on execution resources is realized by using the scheduling service of an OS. As each task can be defined for several targets, this service must decide, on-line, on which resource the task should be instantiated. In this work, we propose an on-line scheduling approach based on a neural network for heterogeneous System-on-Chip (SoC) architectures with a limited number of neurons... Integrated-circuit design technologies now make it possible to build complete, complex systems on a single chip, known as Systems-on-Chip (SoC). These systems run complex applications composed of many tasks, orchestrated by an operating system whose main roles include scheduling the tasks and allocating them to computing resources. A distinguishing feature of these architectures is the heterogeneity of the execution targets, which makes the scheduling problem particularly delicate and complex.
    Note, moreover, that the real-time requirements of applications running on this type of platform call for scheduling solutions that are efficient, notably in terms of computation time. In this paper, we present our work on modeling the scheduling problem for heterogeneous multiprocessor architectures using neural networks. Previous work has shown that a neural-network structure following the Hopfield model can be defined to schedule tasks on a homogeneous architecture. An extension of that work showed that the heterogeneity of the architecture can be taken into account, but at the cost of a large number of additional neurons. Moreover, these solutions suffer from a serious convergence problem, reflected in fairly long convergence times and the need to re-initialize the network whenever it stabilizes in a state that is not a valid solution. To counter these main drawbacks, we propose a new structure based on inhibitory neurons. These special neurons limit the number of neurons needed by the model and, above all, eliminate the re-initializations otherwise required to reach convergence. We illustrate the contribution of our proposal by comparing classical Hopfield-network solutions with ours. We show that the number of neurons is substantially reduced and, most importantly, that the network no longer needs to be re-initialized to guarantee convergence, which makes an efficient implementation of this kind of structure conceivable.
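The Hopfield-style formulation the abstract refers to can be sketched minimally: binary neurons encode task-to-resource assignments, and asynchronous updates descend an energy function that penalizes invalid assignments. The following toy is an assumption-laden illustration (the function name, penalty weight, and random update schedule are mine, and it omits the inhibitory neurons the authors propose):

```python
import numpy as np

# Illustrative sketch, NOT the paper's model: binary neuron x[t][r] = 1 means
# "task t is assigned to resource r". The energy
#   E = (A/2) * sum_t (sum_r x[t][r] - 1)^2
# penalizes rows with more or fewer than one assignment; asynchronous neuron
# updates descend E until the network reaches a stable (valid) state.

def hopfield_schedule(n_tasks, n_res, iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n_tasks, n_res))
    A = 2.0  # penalty weight for the one-resource-per-task constraint
    for _ in range(iters):
        t = rng.integers(n_tasks)
        r = rng.integers(n_res)
        # Input potential derived from the energy gradient: activate the
        # neuron only if task t currently has no other assignment.
        u = -A * (x[t].sum() - x[t, r]) + A / 2
        x[t, r] = 1 if u > 0 else 0
    return x

assignment = hopfield_schedule(4, 3)
print(assignment)  # each row ends up with exactly one 1 once stable
```

States with exactly one active neuron per row are absorbing under this update rule, which is why the toy converges without re-initialization; handling heterogeneity and timing constraints is precisely where the paper's extra structure comes in.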

    System-Wide Time vs. Density Tradeoff in Real-Time Multicore Fluid Scheduling

    Thesis (Ph.D.) -- Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2017. Chang-Gun Lee. Recent parallel programming frameworks such as OpenCL and OpenMP allow us to enjoy the parallelization freedom for real-time tasks. The parallelization freedom creates the time vs. density tradeoff problem in fluid scheduling, i.e., more parallelization reduces thread execution times but increases the density. By exercising this tradeoff system-wide, this dissertation proposes a parameter tuning of real-time tasks aiming at maximizing the schedulability of multicore fluid scheduling. The experimental study, by both simulation and actual implementation, shows that the proposed approach balances the time and the density well, and results in up to 80% improvement in schedulability.
    Contents:
    1 Introduction
      1.1 Motivation and Objective
      1.2 Approach
      1.3 Organization
    2 Related Work
      2.1 Real-Time Scheduling
        2.1.1 Workload Model
        2.1.2 Scheduling on Multicore Systems
        2.1.3 Period Control
        2.1.4 Real-Time Operating System
      2.2 Parallel Computing
        2.2.1 Parallel Computing Framework
        2.2.2 Shared Resource Management
    3 System-wide Time vs. Density Tradeoff with Parallelizable Periodic Single Segment Tasks
      3.1 Introduction
      3.2 Problem Description
      3.3 Motivating Example
      3.4 Proposed Approach
        3.4.1 Per-task Optimal Tradeoff of Time and Density
        3.4.2 Peak Density Minimization for a Task Group with the Same Period
        3.4.3 Heuristic Algorithm for System-wide Time vs. Density Tradeoff
      3.5 Experimental Results
        3.5.1 Simulation Study
        3.5.2 Actual Implementation Results
    4 System-wide Time vs. Density Tradeoff with Parallelizable Periodic Multi-segment Tasks
      4.1 Introduction
      4.2 Problem Description
      4.3 Extension to Parallelizable Periodic Multi-segment Task Model
        4.3.1 Peak Density Minimization for a Task Group of Multi-segment Tasks with Same Period
        4.3.2 Heuristic Algorithm for System-wide Time vs. Density Tradeoff
    5 Conclusion
      5.1 Summary
      5.2 Future Work
    References
    Appendices
      A Period Harmonization
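The time vs. density tradeoff the abstract describes can be made concrete with a toy model. Assuming (my assumption, not the dissertation's exact formulation) perfect speedup plus a fixed per-thread overhead o, splitting work C into m threads shortens each thread to C/m + o but raises the task's total density over a deadline D:

```python
# Toy linear-overhead model of the time vs. density tradeoff in fluid
# scheduling. All parameter values below are illustrative assumptions.

def thread_time(C, m, o):
    """Execution time of each of m parallel threads of a task with work C."""
    return C / m + o  # per-thread overhead o is the assumed cost of splitting

def total_density(C, m, o, D):
    """Sum of the m threads' densities over a common deadline D."""
    return m * thread_time(C, m, o) / D

C, o, D = 12.0, 0.5, 10.0
for m in (1, 2, 4, 8):
    print(m, thread_time(C, m, o), round(total_density(C, m, o, D), 2))
# thread time falls (12.5, 6.5, 3.5, 2.0) while density rises
# (1.25, 1.3, 1.4, 1.6) as m grows
```

Since density, not thread time, is what a fluid-schedulability test consumes, picking the parallelization degree m per task is exactly the system-wide tuning problem the dissertation addresses.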

    An Efficient Online Benefit-aware Multiprocessor Scheduling Technique for Soft Real-Time Tasks Using Online Choice of Approximation Algorithms

    Maximizing the benefit gained by soft real-time tasks in many applications and embedded systems is highly needed to provide an acceptable QoS (Quality of Service). Examples of such applications and embedded systems include real-time medical monitoring systems, video-streaming servers, multiplayer video games, and mobile multimedia devices. In these systems, tasks are not equally critical (or beneficial). Each task comes with its own benefit-density function, which can be different from the others'. The sooner a task completes, the more benefit it gains. In this work, a novel online benefit-aware preemptive approach is presented in order to enhance scheduling of soft real-time aperiodic and periodic tasks in multiprocessor systems. The objective of this work is enhancing the QoS by increasing the total benefit, while reducing flow times and deadline misses. This method prioritizes the tasks using their benefit-density functions, which imply their importance to the system, and schedules them on a real-time basis. The first model I propose is for scheduling soft real-time aperiodic tasks. An online choice of two approximation algorithms, greedy and load-balancing, is used in order to distribute the low-priority tasks among identical processors at the time of their arrival, without using any statistics. The results of theoretical analysis and simulation experiments show that this method is able to maximize the gained benefit and decrease the computational complexity (compared to existing algorithms) while minimizing makespan with fewer missed deadlines and more balanced usage of processors. I also propose two more versions of this algorithm for scheduling SRT periodic tasks, with implicit and non-implicit deadlines, in addition to another version with a modified load-balancing factor.
    The extensive simulation experiments and empirical comparison of these algorithms with the state of the art, using different utilization levels and various benefit-density functions, show that these new techniques outperform the existing ones. A general framework for benefit-aware multiprocessor scheduling in applications with periodic, aperiodic, or mixed real-time tasks is also provided in this work.
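The two dispatch policies named in the abstract can be sketched as follows. The concrete rules here are my assumptions for illustration (the thesis's actual policies differ in detail): "greedy" places a task on the first processor whose current load still lets it finish by its deadline, while "load-balancing" always picks the least-loaded processor.

```python
# Hedged sketch of two online dispatch policies for identical processors.
# `loads` holds each processor's accumulated work; both policies run at task
# arrival time with no statistics about future arrivals.

def greedy_assign(loads, exec_time, deadline):
    """First processor that can still meet the deadline (assumed rule)."""
    for i, load in enumerate(loads):
        if load + exec_time <= deadline:
            loads[i] += exec_time
            return i
    return None  # no processor can meet the deadline

def balance_assign(loads, exec_time):
    """Classic least-loaded (load-balancing) placement."""
    i = min(range(len(loads)), key=loads.__getitem__)
    loads[i] += exec_time
    return i
```

Choosing online between two such heuristics per arriving task, rather than committing to one, is what lets the proposed method trade makespan against deadline misses on a per-task basis.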

    Real-time operating system support for multicore applications

    Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro TecnolĂłgico, Graduate Program in Automation and Systems Engineering, FlorianĂłpolis, 2014. Abstract: Modern multicore platforms feature multiple levels of cache memory placed between the processor and main memory to hide the latency of ordinary memory systems. The primary goal of this cache hierarchy is to improve average execution time (at the cost of predictability). The uncontrolled use of the cache hierarchy by real-time tasks may impact the estimation of their worst-case execution times (WCET), especially when real-time tasks access a shared cache level, causing contention for shared cache lines and increasing the application execution time. This contention in the shared cache may lead to deadline losses, which is intolerable, particularly for hard real-time (HRT) systems.
    Shared cache partitioning is a well-known technique used in multicore real-time systems to isolate task workloads and to improve system predictability. Presently, the state-of-the-art studies that evaluate shared cache partitioning on multicore processors leave two key issues unaddressed. First, the cache partitioning mechanism is typically implemented either in a simulated environment or in a general-purpose OS (GPOS), and so the impact of kernel activities, such as interrupt handlers and context switching, on the task partitions tends to be overlooked. Second, the evaluation is typically restricted to either a global or partitioned scheduler, thereby failing to compare the performance of cache partitioning when tasks are scheduled by different schedulers. Furthermore, recent works have confirmed that OS implementation aspects, such as the choice of scheduling data structures and interrupt handling mechanisms, impact real-time schedulability as much as scheduling-theoretic aspects. However, these studies also used real-time patches applied to GPOSes, which affects the run-time overhead observed in these works and consequently the schedulability of real-time tasks. Additionally, current multicore scheduling algorithms do not consider scenarios where real-time tasks access the same cache lines due to true or false sharing, which also impacts the WCET. This thesis addresses the aforementioned problems with cache partitioning techniques and multicore real-time scheduling algorithms as follows. First, real-time multicore support is designed and implemented on top of an embedded operating system designed from scratch. This support consists of several multicore real-time scheduling algorithms, such as global and partitioned EDF, and a cache partitioning mechanism based on page coloring. Second, a comparison is presented in terms of schedulability ratio considering the run-time overhead of the implemented RTOS and a GPOS patched with real-time extensions.
    In some cases, Global-EDF considering the overhead of the RTOS is superior to Partitioned-EDF considering the overhead of the patched GPOS, which clearly shows how different OSs impact hard real-time schedulers. Third, an evaluation of the cache partitioning impact on partitioned, clustered, and global real-time schedulers is performed. The results indicate that a lightweight RTOS does not compromise real-time guarantees, and that shared cache partitioning behaves differently depending on the scheduler and the tasks' working set sizes. Fourth, a task partitioning algorithm that assigns tasks to cores respecting their usage of cache partitions is proposed. The results show that by simply assigning tasks that share cache partitions to the same processor, it is possible to reduce the contention for shared cache lines and to provide HRT guarantees. Finally, a two-phase multicore scheduler that provides HRT and soft real-time (SRT) guarantees is proposed. It is shown that by using information from hardware performance counters at run-time, the RTOS can detect when best-effort tasks interfere with real-time tasks in the shared cache. Then, the RTOS can prevent best-effort tasks from interfering with real-time tasks. The results also show that the assignment of exclusive partitions to HRT tasks, together with the two-phase multicore scheduler, provides HRT and SRT guarantees even when best-effort tasks share partitions with real-time tasks.
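The page-coloring mechanism the thesis builds on fits in one formula: a physical page's "color" is the group of cache sets it maps to in a physically indexed cache, so pages of different colors can never evict each other, and giving tasks disjoint colors partitions the cache. The cache parameters below are illustrative assumptions, not the thesis's platform:

```python
# Page coloring sketch. A physically indexed, set-associative cache of size
# CACHE_SIZE with WAYS ways has CACHE_SIZE / WAYS bytes per way; dividing
# that by the page size gives the number of distinct page colors.
# All sizes here are assumed example values.

PAGE_SIZE = 4096           # bytes per page
CACHE_SIZE = 2 * 1024**2   # 2 MiB shared cache (assumed)
WAYS = 8                   # associativity (assumed)

NUM_COLORS = CACHE_SIZE // (WAYS * PAGE_SIZE)  # 64 colors in this example

def page_color(phys_addr):
    """Color of the physical page containing phys_addr."""
    return (phys_addr // PAGE_SIZE) % NUM_COLORS

print(NUM_COLORS, page_color(0), page_color(4096))
```

An OS allocator that hands a task only frames of its assigned colors therefore gets cache isolation purely in software, which is why the technique is attractive for an RTOS without hardware partitioning support.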

    The Case for Fair Multiprocessor Scheduling

    No full text
    Partitioning and global scheduling are two approaches for scheduling real-time tasks on multiprocessors. Though partitioning is sub-optimal, it has traditionally been preferred; this is mainly due to the fact that well-understood uniprocessor scheduling algorithms can be used on each processor. In recent years, global scheduling algorithms based on the concept of "proportionate fairness" (Pfairness) have received considerable attention. Pfair algorithms are of interest because they are currently the only known method for optimally scheduling periodic, sporadic, and "rate-based" task systems on multiprocessors. In addition, there has been growing practical interest in scheduling with fairness guarantees. However, the frequency of context switching and migration in Pfair-scheduled systems has led to some questions concerning the practicality of Pfair scheduling. In this paper, we investigate this issue by comparing the PD² Pfair algorithm to the EDF-FF partitioning scheme, which uses "first fit" (FF) as a partitioning heuristic and the earliest-deadline-first (EDF) algorithm for per-processor scheduling. We present experimental results that show that PD² is competitive with, and in some cases outperforms, EDF-FF. These results suggest that Pfair scheduling is a viable alternative to partitioning. Furthermore, as discussed herein, Pfair scheduling provides many additional benefits, such as simple and efficient synchronization, temporal isolation, fault tolerance, and support for dynamic tasks.
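The window structure underlying Pfair algorithms such as PD² is compact enough to compute directly: a task of weight w = C/T <= 1 is divided into unit-length subtasks, and subtask i may only execute inside its window, which keeps the task's cumulative allocation within one quantum of the ideal fluid rate w. A minimal sketch of the standard window formulas:

```python
from math import floor, ceil
from fractions import Fraction

# Standard Pfair subtask windows: subtask i (i = 1, 2, ...) of a task with
# weight w = C/T has pseudo-release floor((i-1)/w) and pseudo-deadline
# ceil(i/w). Exact rational arithmetic avoids floating-point rounding.

def pfair_window(i, C, T):
    w = Fraction(C, T)
    r = floor((i - 1) / w)  # earliest slot the subtask may run in
    d = ceil(i / w)         # slot by which it must have run
    return r, d

# e.g. weight 3/7: windows of the first three subtasks
print([pfair_window(i, 3, 7) for i in (1, 2, 3)])
# → [(0, 3), (2, 5), (4, 7)]
```

The short, overlapping windows are both the source of Pfair's optimality and of the frequent preemptions and migrations that the paper's comparison against EDF-FF puts into perspective.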