Search CORE

603 research outputs found

An Efficient Thread Mapping Strategy for Multiprogramming on Manycore Processors

Author: Tousimojarad Ashkan
Vanderbauwhede Wim
Publication venue: 'IOS Press'
Publication date: 01/03/2014
Field of study

The emergence of multicore and manycore processors is set to change the parallel computing world. Applications are shifting towards increased parallelism in order to utilise these architectures efficiently. This leads to a situation where every application creates its desirable number of threads, based on its parallel nature and the system resources allowance. Task scheduling in such a multithreaded multiprogramming environment is a significant challenge. In task scheduling, not only the order of the execution, but also the mapping of threads to the execution resources is of a great importance. In this paper we state and discuss some fundamental rules based on results obtained from selected applications of the BOTS benchmarks on the 64-core TILEPro64 processor. We demonstrate how previously efficient mapping policies such as those of the SMP Linux scheduler become inefficient when the number of threads and cores grows. We propose a novel, low-overhead technique, a heuristic based on the amount of time spent by each CPU doing some useful work, to fairly distribute the workloads amongst the cores in a multiprogramming environment. Our novel approach could be implemented as a pragma similar to those in the new task-based OpenMP versions, or can be incorporated as a distributed thread mapping mechanism in future manycore programming frameworks. We show that our thread mapping scheme can outperform the native GNU/Linux thread scheduler in both single-programming and multiprogramming environments.Comment: ParCo Conference, Munich, Germany, 201

arXiv.org e-Print Archive

Enlighten

"Virtual malleability" applied to MPI jobs to improve their execution in a multiprogrammed environment"

Author: Utrera Iglesias Gladys Miriam
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2007
Field of study

This work focuses on scheduling of MPI jobs when executing in shared-memory multiprocessors (SMPs). The objective was to obtain the best performance in response time in multiprogrammed multiprocessors systems using batch systems, assuming all the jobs have the same priority. To achieve that purpose, the benefits of supporting malleability on MPI jobs to reduce fragmentation and consequently improve the performance of the system were studied. The contributions made in this work can be summarized as follows:· Virtual malleability: A mechanism where a job is assigned a dynamic processor partition, where the number of processes is greater than the number of processors. The partition size is modified at runtime, according to external requirements such as the load of the system, by varying the multiprogramming level, making the job contend for resources with itself. In addition to this, a mechanism which decides at runtime if applying local or global process queues to an application depending on the load balancing between processes of it. · A job scheduling policy, that takes decisions such as how many processes to start with and the maximum multiprogramming degree based on the type and number of applications running and queued. Moreover, as soon as a job finishes execution and where there are queued jobs, this algorithm analyzes whether it is better to start execution of another job immediately or just wait until there are more resources available. · A new alternative to backfilling strategies for the problema of window execution time expiring. Virtual malleability is applied to the backfilled job, reducing its partition size but without aborting or suspending it as in traditional backfilling. The evaluation of this thesis has been done using a practical approach. All the proposals were implemented, modifying the three scheduling levels: queuing system, processor scheduler and runtime library. The impact of the contributions were studied under several types of workloads, varying machine utilization, communication and, balance degree of the applications, multiprogramming level, and job size. Results showed that it is possible to offer malleability over MPI jobs. An application obtained better performance when contending for the resources with itself than with other applications, especially in workloads with high machine utilization. Load imbalance was taken into account obtaining better performance if applying the right queue type to each application independently.The job scheduling policy proposed exploited virtual malleability by choosing at the beginning of execution some parameters like the number of processes and maximum multiprogramming level. It performed well under bursty workloads with low to medium machine utilizations. However as the load increases, virtual malleability was not enough. That is because, when the machine is heavily loaded, the jobs, once shrunk are not able to expand, so they must be executed all the time with a partition smaller than the job size, thus degrading performance. Thus, at this point the job scheduling policy concentrated just in moldability.Fragmentation was alleviated also by applying backfilling techniques to the job scheduling algorithm. Virtual malleability showed to be an interesting improvement in the window expiring problem. Backfilled jobs even on a smaller partition, can continue execution reducing memory swapping generated by aborts/suspensions In this way the queueing system is prevented from reinserting the backfilled job in the queue and re-executing it in the future.Postprint (published version

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Secretaría de Estado de Cultura

Designing a fuzzy scheduler for hard real-time systems

Author: Lee Jonathan
Natarajan Swami
Pfluger Nathan
Yen John
Publication venue
Publication date
Field of study

In hard real-time systems, tasks have to be performed not only correctly, but also in a timely fashion. If timing constraints are not met, there might be severe consequences. Task scheduling is the most important problem in designing a hard real-time system, because the scheduling algorithm ensures that tasks meet their deadlines. However, the inherent nature of uncertainty in dynamic hard real-time systems increases the problems inherent in scheduling. In an effort to alleviate these problems, we have developed a fuzzy scheduler to facilitate searching for a feasible schedule. A set of fuzzy rules are proposed to guide the search. The situation we are trying to address is the performance of the system when no feasible solution can be found, and therefore, certain tasks will not be executed. We wish to limit the number of important tasks that are not scheduled

NASA Technical Reports Server

OS Scheduling Algorithms for Memory Intensive Workloads in Multi-socket Multi-core servers

Author: Durbhakula Murthy
Publication venue
Publication date: 23/09/2018
Field of study

Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are routinely used for running various server applications. Depending on the application that is run on the system, remote memory accesses can impact overall performance. This paper presents a new operating system (OS) scheduling optimization to reduce the impact of such remote memory accesses. By observing the pattern of local and remote DRAM accesses for every thread in each scheduling quantum and applying different algorithms, we come up with a new schedule of threads for the next quantum. This new schedule potentially cuts down remote DRAM accesses for the next scheduling quantum and improves overall performance. We present three such new algorithms of varying complexity followed by an algorithm which is an adaptation of Hungarian algorithm. We used three different synthetic workloads to evaluate the algorithm. We also performed sensitivity analysis with respect to varying DRAM latency. We show that these algorithms can cut down DRAM access latency by up to 55% depending on the algorithm used. The benefit gained from the algorithms is dependent upon their complexity. In general higher the complexity higher is the benefit. Hungarian algorithm results in an optimal solution. We find that two out of four algorithms provide a good trade-off between performance and complexity for the workloads we studied

arXiv.org e-Print Archive

Crossref

Research Archive of Indian Institute of Technology Hyderabad

Operations research and computers

Author: Kleijnen J.P.C.
Publication venue
Publication date
Field of study

operational research

Research Papers in Economics

Performance Evaluation of Real-Time Scheduling Heuristics for Energy Harvesting Systems

Author: Chetto Maryline
Zhang Hui
Publication venue: HAL CCSD
Publication date: 18/12/2010
Field of study

International audienceEnergy constrained systems can increase their usable lifetimes by extracting energy from their environment. This is known as energy harvesting. This paper investigates scheduling issues in uni-processor real time embedded systems using regenerative energy. Task scheduling should account for the properties of the regenerative energy source which fluctuates, capacity of the energy storage as well as deadlines of the time critical tasks that characterize most of real time embedded systems. In this context, designing efficient scheduling strategies is significantly more complex compared to conventional real-time scheduling. In this paper we compare several scheduling heuristics with the optimal algorithm known as LSA (Lazy Scheduling Algorithm). We report results of an experiment study in terms of percentage of deadlines satisfied

"Virtual malleability" applied to MPI jobs to improve their execution in a multiprogrammed environment"

Author: Utrera Iglesias Gladys Miriam
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/03/2010
Field of study

Tesis Doctorals en Xarxa

Model Checking Real Time Java Using Java PathFinder

Author: Lindstrom Gary
Mehlitz Peter C.
Visser Willem
Publication venue
Publication date: 25/05/2005
Field of study

The Real Time Specification for Java (RTSJ) is an augmentation of Java for real time applications of various degrees of hardness. The central features of RTSJ are real time threads; user defined schedulers; asynchronous events, handlers, and control transfers; a priority inheritance based default scheduler; non-heap memory areas such as immortal and scoped, and non-heap real time threads whose execution is not impeded by garbage collection. The Robust Software Systems group at NASA Ames Research Center has JAVA PATHFINDER (JPF) under development, a Java model checker. JPF at its core is a state exploring JVM which can examine alternative paths in a Java program (e.g., via backtracking) by trying all nondeterministic choices, including thread scheduling order. This paper describes our implementation of an RTSJ profile (subset) in JPF, including requirements, design decisions, and current implementation status. Two examples are analyzed: jobs on a multiprogramming operating system, and a complex resource contention example involving autonomous vehicles crossing an intersection. The utility of JPF in finding logic and timing errors is illustrated, and the remaining challenges in supporting all of RTSJ are assessed

NASA Technical Reports Server