1,548 research outputs found

    Analysing multiprogramming queues by generating functions

    Get PDF
    The generating function approach for analysing queueing systems has a longstanding tradition. One of the highlights is the seminal paper by Kingman on the shortest queue problem, where the author shows that the equilibrium probabilities P_{m,n} of the queue lengths can be written as an infinite sum of products of powers. The same approach is used by Hofri to prove that for a multiprogramming model with two queues the boundary probability P_{0,n} can be expressed as an infinite sum of powers. The present paper shows that the latter representation does not always hold, which implies that the multiprogramming problem is essentially more complicated than the shortest queue problem. However, it appears that the generating function approach is very well suited to show when such a representation is available and when it is not.
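    For orientation, the representation at stake can be written out in generic notation (a sketch only; the coefficient and factor names below are illustrative, not taken from the paper):

        \[
          P_{m,n} \;=\; \sum_{i \ge 0} c_i\, \alpha_i^{m}\, \beta_i^{n},
          \qquad
          P_{0,n} \;=\; \sum_{i \ge 0} d_i\, \gamma_i^{n}
        \]
        % Kingman-style infinite sum of products of powers for the interior
        % probabilities, and the Hofri-style boundary sum whose general
        % validity the paper disputes; convergence requires
        % |\alpha_i|, |\beta_i|, |\gamma_i| < 1.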

    An Efficient Thread Mapping Strategy for Multiprogramming on Manycore Processors

    Full text link
    The emergence of multicore and manycore processors is set to change the parallel computing world. Applications are shifting towards increased parallelism in order to utilise these architectures efficiently. This leads to a situation where every application creates its desired number of threads, based on its parallel nature and the system's resource allowance. Task scheduling in such a multithreaded multiprogramming environment is a significant challenge. In task scheduling, not only the order of execution but also the mapping of threads to the execution resources is of great importance. In this paper we state and discuss some fundamental rules based on results obtained from selected applications of the BOTS benchmarks on the 64-core TILEPro64 processor. We demonstrate how previously efficient mapping policies such as those of the SMP Linux scheduler become inefficient when the number of threads and cores grows. We propose a novel, low-overhead technique, a heuristic based on the amount of time spent by each CPU doing useful work, to fairly distribute the workloads amongst the cores in a multiprogramming environment. Our approach could be implemented as a pragma similar to those in the new task-based OpenMP versions, or could be incorporated as a distributed thread mapping mechanism in future manycore programming frameworks. We show that our thread mapping scheme can outperform the native GNU/Linux thread scheduler in both single-programming and multiprogramming environments.
    Comment: ParCo Conference, Munich, Germany, 201
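    The abstract does not spell out the heuristic's exact form; as a minimal, hypothetical sketch, a busy-time-based mapper might track how long each core has spent doing useful work and pin the next thread to the least-busy core (all names below are illustrative, not the paper's):

        import heapq

        def map_threads(thread_ids, num_cores, busy_time):
            """Assign each thread to the currently least-busy core.
            busy_time[c] = seconds core c has spent doing useful work."""
            heap = [(busy_time[c], c) for c in range(num_cores)]
            heapq.heapify(heap)
            mapping = {}
            for tid in thread_ids:
                t, core = heapq.heappop(heap)
                mapping[tid] = core
                # Without a runtime estimate, charge a unit cost so that
                # placements rotate over equally busy cores.
                heapq.heappush(heap, (t + 1.0, core))
            return mapping

        # Example: spread 8 threads over 4 initially idle cores.
        print(map_threads(range(8), num_cores=4, busy_time=[0.0] * 4))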

    Enabling preemptive multiprogramming on GPUs

    Get PDF
    GPUs are being increasingly adopted as compute accelerators in many domains, spanning environments from mobile systems to cloud computing. These systems usually run multiple applications, from one or several users. However, GPUs do not provide the support for resource sharing traditionally expected in these scenarios. Thus, such systems are unable to meet key multiprogrammed workload requirements, such as responsiveness, fairness, or quality of service. In this paper, we propose a set of hardware extensions that allow GPUs to efficiently support multiprogrammed GPU workloads. We argue for preemptive multitasking and design two preemption mechanisms that can be used to implement GPU scheduling policies. We extend the architecture to allow concurrent execution of GPU kernels from different user processes and implement a scheduling policy that dynamically distributes the GPU cores among concurrently running kernels, according to their priorities. We extend an NVIDIA GK110 (Kepler)-like GPU architecture with our proposals and evaluate them on a set of multiprogrammed workloads with up to eight concurrent processes. Our proposals improve the execution time of high-priority processes by 15.6x, the average application turnaround time by 1.5x to 2x, and system fairness by up to 3.4x.
    We would like to thank the anonymous reviewers, Alexander Veidenbaum, Carlos Villavieja, Lluis Vilanova, Lluc Alvarez, and Marc Jorda for their comments and help in improving our work and this paper. This work is supported by the European Commission through the TERAFLUX (FP7-249013), Mont-Blanc (FP7-288777), and RoMoL (GA-321253) projects, by NVIDIA through the CUDA Center of Excellence program, by the Spanish Government through Programa Severo Ochoa (SEV-2011-0067), and by the Spanish Ministry of Science and Technology through the TIN2007-60625 and TIN2012-34557 projects.
    Peer Reviewed. Postprint (author’s final draft).
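    The policy's internals are not given in the abstract; a hedged sketch of one plausible reading, distributing core counts in proportion to kernel priority (function and parameter names are hypothetical):

        def partition_cores(kernel_priorities, total_cores):
            """Split total_cores among kernels proportionally to priority.
            kernel_priorities: dict kernel_id -> positive integer priority."""
            total = sum(kernel_priorities.values())
            shares = {k: (p * total_cores) // total
                      for k, p in kernel_priorities.items()}
            # Hand cores lost to integer rounding to the highest priorities.
            leftover = total_cores - sum(shares.values())
            ranked = sorted(kernel_priorities, key=kernel_priorities.get,
                            reverse=True)
            for k in ranked[:leftover]:
                shares[k] += 1
            return shares

        # Example: a high-priority kernel gets most of 13 cores.
        print(partition_cores({"A": 4, "B": 1}, total_cores=13))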

    Accounting of computer system use in EXEC 8

    Get PDF
    EXEC 8 modified multiprogramming system to log core time and central processing unit time.

    The PISCES 2 parallel programming environment

    Get PDF
    PISCES 2 is a programming environment for scientific and engineering computations on MIMD parallel computers. It is currently implemented on a Flexible FLEX/32 at NASA Langley, a 20-processor machine with both shared and local memories. The environment provides an extended Fortran for applications programming, a configuration environment for setting up a run on the parallel machine, and a run-time environment for monitoring and controlling program execution. This paper describes the overall design of the system and its implementation on the FLEX/32. Emphasis is placed on several novel aspects of the design: the use of a carefully defined virtual machine, programmer control of the mapping of the virtual machine to the actual hardware, forces for medium-granularity parallelism, and windows for parallel distribution of data. Some preliminary measurements of storage use are included.
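    PISCES 2 itself is programmed in extended Fortran; purely to illustrate the idea of programmer-controlled placement of a virtual machine onto hardware, here is a language-neutral Python sketch (all names hypothetical, not the PISCES 2 API):

        def bind(virtual_procs, physical_procs, mapping=None):
            """Place each virtual processor on a physical processor.
            The programmer may pass an explicit mapping; otherwise the
            virtual machine is spread round-robin over the hardware."""
            if mapping is None:
                mapping = {v: physical_procs[i % len(physical_procs)]
                           for i, v in enumerate(virtual_procs)}
            return {v: mapping[v] for v in virtual_procs}

        # Example: pin the solver to processor 0 and I/O to processor 3.
        print(bind(["solver", "io"], [0, 1, 2, 3],
                   mapping={"solver": 0, "io": 3}))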

    Optimized Round Robin CPU Scheduling Algorithm

    Get PDF
    One of the fundamental functions of an operating system is scheduling. Uni-processor operating systems generally fall into two types: uni-programming and multi-programming. A uni-programming operating system executes only a single job at a time, while a multiprogramming operating system is capable of executing multiple jobs concurrently. Resource utilization is the basic aim of a multiprogramming operating system. Many scheduling algorithms are available for multi-programming operating systems, but our work focuses on the design and development of a new and novel scheduling algorithm for multi-programming operating systems with optimization in view. We developed a tool which produces experimental results for some standard and new scheduling algorithms, e.g. first come first serve, shortest job first, round robin, optimal, and a novel CPU scheduling algorithm.
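    For reference, the round-robin baseline named above works as follows (a minimal simulation of the textbook algorithm, not the paper's optimized variant, whose details the abstract does not give):

        from collections import deque

        def round_robin(burst_times, quantum):
            """Simulate round-robin scheduling on one CPU.
            burst_times: dict job_id -> CPU burst; returns completion times."""
            remaining = dict(burst_times)
            ready = deque(remaining)        # FIFO ready queue
            clock = 0
            finish = {}
            while ready:
                job = ready.popleft()
                run = min(quantum, remaining[job])
                clock += run                # run for at most one quantum
                remaining[job] -= run
                if remaining[job] == 0:
                    finish[job] = clock     # done: record completion time
                else:
                    ready.append(job)       # otherwise requeue at the tail
            return finish

        # Example: three jobs, time quantum of 2.
        print(round_robin({"P1": 5, "P2": 3, "P3": 8}, quantum=2))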

    Adaptive space-time sharing with SCOJO.

    Get PDF
    Coscheduling is a technique used to improve the performance of parallel computer applications under time sharing, i.e., to provide better response times than standard time sharing or space sharing. Dynamic coscheduling and gang scheduling are two main forms of coscheduling. In SCOJO (Share-based Job Coscheduling), we have introduced our own original framework to employ loosely coordinated dynamic coscheduling and a dynamic directory service in support of scheduling cross-site jobs in grid scheduling. SCOJO guarantees effective CPU shares by taking coscheduling effects into consideration and supports both time and CPU share reservation for cross-site jobs. However, coscheduling leads to high memory pressure and still involves problems like fragmentation and context-switch overhead, especially at higher multiprogramming levels. As the main part of this thesis, we employ gang scheduling as the more directly suitable approach for combined space-time sharing and extend SCOJO for clusters to incorporate adaptive space sharing into gang scheduling. We focus on taking advantage of the moldable and malleable characteristics of realistic job mixes to dynamically adapt to varying system workloads and flexibly reduce fragmentation. In addition, our adaptive scheduling approach applies standard job-scheduling techniques like a priority and aging system and backfilling or easy backfilling. We demonstrate by the results of a discrete-event simulation that this dynamic adaptive space-time sharing approach can deliver better response times and bounded relative response times even with a lower multiprogramming level than traditional gang scheduling.
    Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .H825. Source: Masters Abstracts International, Volume: 43-01, page: 0237. Adviser: A. Sodan. Thesis (M.Sc.)--University of Windsor (Canada), 2004.
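    Of the standard techniques named above, easy backfilling is the least self-explanatory; a condensed sketch under simplifying assumptions (one resource dimension, trusted runtime estimates; all names hypothetical, not SCOJO's code):

        def easy_backfill(queue, free_nodes, now, running):
            """EASY backfilling: queue = [(job, nodes, est_runtime)] in
            priority order, running = [(finish_time, nodes_held)].
            Returns the jobs to start now."""
            job, need, _ = queue[0]
            if need <= free_nodes:
                return [job]                 # head of the queue fits: start it
            # Reservation: when will enough nodes be free for the head job?
            avail, shadow = free_nodes, None
            for finish, nodes in sorted(running):
                avail += nodes
                if avail >= need:
                    shadow = finish          # earliest start for the head job
                    break
            started = []
            for j, n, est in queue[1:]:
                # A later job may jump ahead only if it fits now and is
                # predicted to finish before the head job's reserved start.
                if n <= free_nodes and (shadow is None or now + est <= shadow):
                    started.append(j)
                    free_nodes -= n
            return started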

    Scheduling issues on IBM p690: Performance Analysis with the PARbench Environment

    Get PDF