36,960 research outputs found
Task scheduling techniques for asymmetric multi-core systems
As performance and energy efficiency have become the main challenges for next-generation high-performance computing, asymmetric multi-core architectures can provide solutions to tackle these issues. Parallel programming models need to be able to suit the needs of such systems and keep on increasing the application’s portability and efficiency. This paper proposes two task scheduling approaches that target asymmetric systems. These dynamic scheduling policies reduce total execution time either by detecting the longest or the critical path of the dynamic task dependency graph of the application, or by finding the earliest executor of a task. They use dynamic scheduling and information discoverable during execution, fact that makes them implementable and functional without the need of off-line profiling. In our evaluation we compare these scheduling approaches with two existing state-of the art heterogeneous schedulers and we track their improvement over a FIFO baseline scheduler. We show that the heterogeneous schedulers improve the baseline by up to 1.45 in a real 8-core asymmetric system and up to 2.1 in a simulated 32-core asymmetric chip.This work has been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de
Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), by the RoMoL ERC Advanced Grant (GA 321253) and the
European HiPEAC Network of Excellence. The Mont-Blanc project receives funding from the EU’s Seventh Framework Programme (FP7/2007-2013) under grant agreement
no 610402 and from the EU’s H2020 Framework Programme (H2020/2014-2020) under grant agreement no 671697. M.
MoretĂł has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship number JCI-2012-15047. M. Casas
is supported by the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie
Curie Actions of the 7th R&D Framework Programme of the European Union (Contract 2013 BP B 00243).Peer ReviewedPostprint (author's final draft
An Adaptive Scheduling Algorithm for Dynamic Jobs for Dealing with the Flexible Job Shop Scheduling Problem
Modern manufacturing systems build on an effective scheduling scheme that makes full use of the system resource to increase the production, in which an important aspect is how to minimize the makespan for a certain production task (i.e., the time that elapses from the start of work to the end) in order to achieve the economic profit. This can be a difficult problem, especially when the production flow is complicated and production tasks may suddenly change. As a consequence, exact approaches are not able to schedule the production in a short time. In this paper, an adaptive scheduling algorithm is proposed to address the makespan minimization in the dynamic job shop scheduling problem. Instead of a linear order, the directed acyclic graph is used to represent the complex precedence constraints among operations in jobs. Inspired by the heterogeneous earliest finish time (HEFT) algorithm, the adaptive scheduling algorithm can make some fast adaptations on the fly to accommodate new jobs which continuously arrive in a manufacturing system. The performance of the proposed adaptive HEFT algorithm is compared with other state-of-the-art algorithms and further heuristic methods for minimizing the makespan. Extensive experimental results demonstrate the high efficiency of the proposed approach
A fast, effective local search for scheduling independent jobs in heterogeneous computing environments
The efficient scheduling of independent computational jobs in a heterogeneous computing (HC) environment is an important problem in domains such as grid computing. Finding optimal schedules for such an environment is (in general) an NP-hard problem, and so heuristic approaches must be used. Work with other NP-hard problems has shown that solutions found by heuristic algorithms can often be improved by applying local search procedures to the solution found. This paper describes a simple but effective local search procedure for scheduling independent jobs in HC environments which, when combined with fast construction heuristics, can find shorter schedules on benchmark problems than other solution techniques found in the literature, and in significantly less time
Adaptive Dispatching of Tasks in the Cloud
The increasingly wide application of Cloud Computing enables the
consolidation of tens of thousands of applications in shared infrastructures.
Thus, meeting the quality of service requirements of so many diverse
applications in such shared resource environments has become a real challenge,
especially since the characteristics and workload of applications differ widely
and may change over time. This paper presents an experimental system that can
exploit a variety of online quality of service aware adaptive task allocation
schemes, and three such schemes are designed and compared. These are a
measurement driven algorithm that uses reinforcement learning, secondly a
"sensible" allocation algorithm that assigns jobs to sub-systems that are
observed to provide a lower response time, and then an algorithm that splits
the job arrival stream into sub-streams at rates computed from the hosts'
processing capabilities. All of these schemes are compared via measurements
among themselves and with a simple round-robin scheduler, on two experimental
test-beds with homogeneous and heterogeneous hosts having different processing
capacities.Comment: 10 pages, 9 figure
Pipelining the Fast Multipole Method over a Runtime System
Fast Multipole Methods (FMM) are a fundamental operation for the simulation
of many physical problems. The high performance design of such methods usually
requires to carefully tune the algorithm for both the targeted physics and the
hardware. In this paper, we propose a new approach that achieves high
performance across architectures. Our method consists of expressing the FMM
algorithm as a task flow and employing a state-of-the-art runtime system,
StarPU, in order to process the tasks on the different processing units. We
carefully design the task flow, the mathematical operators, their Central
Processing Unit (CPU) and Graphics Processing Unit (GPU) implementations, as
well as scheduling schemes. We compute potentials and forces of 200 million
particles in 48.7 seconds on a homogeneous 160 cores SGI Altix UV 100 and of 38
million particles in 13.34 seconds on a heterogeneous 12 cores Intel Nehalem
processor enhanced with 3 Nvidia M2090 Fermi GPUs.Comment: No. RR-7981 (2012
- …