Search CORE

22,633 research outputs found

Thermal aware task assignment for multicore processors using genetic algorithm

Author: Parwez Mohammed
Sulaiman Diary R.
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/10/2023
Field of study

Microprocessor power and thermal density are increasing exponentially. The reliability of the processor declined, cooling costs rose, and the processor's lifespan was shortened due to an overheated processor and poor thermal management like thermally unbalanced processors. Thus, the thermal management and balancing of multi-core processors are extremely crucial. This work mostly focuses on a compact temperature model of multicore processors. In this paper, a novel task assignment is proposed using a genetic algorithm to maintain the thermal balance of the cores, by considering the energy expended by each task that the core performs. And expecting the cores’ temperature using the hotspot simulator. The algorithm assigns tasks to the processors depending on the task parameters and current cores’ temperature in such a way that none of the tasks’ deadlines are lost for the earliest deadline first (EDF) scheduling algorithm. The mathematical model was derived, and the simulation results showed that the highest temperature difference between the cores is 8 °C for approximately 14 seconds of simulation. These results validate the effectiveness of the proposed algorithm in managing the hotspot and reducing both temperature and energy consumption in multicore processors

Institute of Advanced Engineering and Science

Cache Interference-aware Task Partitioning for Non-preemptive Real-time Multi-core Systems

Author: Pimentel A.D.
Shen Y.
Xiao J.
Publication venue
Publication date: 01/05/2022
Field of study

Shared caches in multi-core processors introduce serious difficulties in providing guarantees on the real-time properties of embedded software due to the interaction and the resulting contention in the shared caches. Prior work has studied the schedulability analysis of global scheduling for real-time multi-core systems with shared caches. This article considers another common scheduling paradigm: partitioned scheduling in the presence of shared cache interference. To achieve this, we propose CITTA, a cache interference-aware task partitioning algorithm. We first analyze the shared cache interference between two programs for set-associative instruction and data caches. Then, an integer programming formulation is constructed to calculate the upper bound on cache interference exhibited by a task, which is required by CITTA. We conduct schedulability analysis of CITTA and formally prove its correctness. A set of experiments is performed to evaluate the schedulability performance of CITTA against global EDF scheduling and other greedy partition approaches such as First-fit and Worst-fit over randomly generated tasksets and realistic workloads in embedded systems. Our empirical evaluations show that CITTA outperforms global EDF scheduling and greedy partition approaches in terms of task sets deemed schedulable

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Resource Scheduling Strategy for Performance Optimization Based on Heterogeneous CPU-GPU Platform

Author: Fang Juan
Xiang Wei
Zhang Mengyuan
Zhou Kuan
Publication venue: 'Computers, Materials and Continua (Tech Science Press)'
Publication date: 01/01/2022
Field of study

In recent years, with the development of processor architecture, heterogeneous processors including Center processing unit (CPU) and Graphics processing unit (GPU) have become the mainstream. However, due to the differences of heterogeneous core, the heterogeneous system is now facing many problems that need to be solved. In order to solve these problems, this paper try to focus on the utilization and efficiency of heterogeneous core and design some reasonable resource scheduling strategies. To improve the performance of the system, this paper proposes a combination strategy for a single task and a multi-task scheduling strategy for multiple tasks. The combination strategy consists of two sub-strategies, the first strategy improves the execution efficiency of tasks on the GPU by changing the thread organization structure. The second focuses on the working state of the efficient core and develops more reasonable workload balancing schemes to improve resource utilization of heterogeneous systems. The multi-task scheduling strategy obtains the execution efficiency of heterogeneous cores and global task information through the processing of task samples. Based on this information, an improved ant colony algorithm is used to quickly obtain a reasonable task allocation scheme, which fully utilizes the characteristics of heterogeneous cores. The experimental results show that the combination strategy reduces task execution time by 29.13% on average. In the case of processing multiple tasks, the multi-task scheduling strategy reduces the execution time by up to 23.38% based on the combined strategy. Both strategies can make better use of the resources of heterogeneous systems and significantly reduce the execution time of tasks on heterogeneous systems

ResearchOnline at James Cook University

Scheduling Independent Tasks on Multi-cores with GPU Accelerators

Author: Kedad-Sidhoum Safia
Monna Florence
Mounié Grégory
Trystram Denis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/08/2013
Field of study

Best PaperInternational audienceMore and more computers use hybrid architectures combin-ing multi-core processors and hardware accelerators like GPUs (Graphics Processing Units). We present in this paper a new method for scheduling efficiently parallel applications with

m

CPUs and

k

GPUs, where each task of the application can be processed either on a core (CPU) or on a GPU. The objective is to minimize the makespan. The corresponding scheduling problem is NP-hard, we propose an efficient approximation algorithm which achieves an approximation ratio of

\frac{4}{3} + \frac{1}{3k}

. We first detail and analyze the method, based on a dual approximation scheme, that uses a dynamic programming scheme to balance evenly the load between the heterogeneous resources. Finally, we run some simulations based on realistic benchmarks and compare the solution obtained by a relaxed version of this method to the one provided by a classical greedy algorithm and to lower bounds on the value of the optimal makespan

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Semi-automated parallel programming in heterogeneous intelligent reconfigurable environments (SAPPHIRE)

Author: Stanek Sean
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2012
Field of study

In recent years, as we come closer to approaching physical limits in making smaller (and faster) computer processors, focus has instead been turned toward including multiple processor cores in each device. While this technically allows for more computational power as compared with only one traditional processor core, conventional software can typically only make use of a single processor. Furthermore, we see an increasing number of stream programs that process streams of data such as a stream of images or audio. For stream programs to effectively utilize multi-core processors, multithreading is the key, but it may be difficult to implement in practice depending on the complexity of the programs. We present SAPPHIRE: Semi-Automated Parallel Programming in Heterogeneous Intelligent Reconfigurable Environment, a middleware and SDK for developing multithreaded stream programs. In this middleware, we implement our semi-automated program construction technique which is designed to aid in writing multithreaded software by reducing needed complexity and lines of code written by software developers. We also present a novel static task-scheduling algorithm for stream programs with heterogeneous implementation choices. Our algorithm is capable of scheduling stream programs with provably near-optimal results given a specific set of assumptions, without requiring the unrolling of the task graph. Unrolling the task graph greatly increases the size of the input to the NP-Complete part of the task-scheduling problem as in related work. Finally, we present two case study programs implemented using SAPPHIRE. One case study, EM-Capture, has analyzed over 50 billion frames of endoscopy video in real-time in a real hospital, discerning over 71,000 unique endoscopy procedures. The other case study, EM-Feedback-RT, is a collaborative extension to EM-Capture, and is an attempt to provide real-time quality analysis feedback to physicians during a colonoscopy exam

Digital Repository @ Iowa State University (ISU)

Scheduling and Tuning Kernels for High-performance on Heterogeneous Processor Systems

Author: Fang Ye
Publication venue: LSU Digital Commons
Publication date: 01/01/2016
Field of study

Accelerated parallel computing techniques using devices such as GPUs and Xeon Phis (along with CPUs) have proposed promising solutions of extending the cutting edge of high-performance computer systems. A significant performance improvement can be achieved when suitable workloads are handled by the accelerator. Traditional CPUs can handle those workloads not well suited for accelerators. Combination of multiple types of processors in a single computer system is referred to as a heterogeneous system. This dissertation addresses tuning and scheduling issues in heterogeneous systems. The first section presents work on tuning scientific workloads on three different types of processors: multi-core CPU, Xeon Phi massively parallel processor, and NVIDIA GPU; common tuning methods and platform-specific tuning techniques are presented. Then, analysis is done to demonstrate the performance characteristics of the heterogeneous system on different input data. This section of the dissertation is part of the GeauxDock project, which prototyped a few state-of-art bioinformatics algorithms, and delivered a fast molecular docking program. The second section of this work studies the performance model of the GeauxDock computing kernel. Specifically, the work presents an extraction of features from the input data set and the target systems, and then uses various regression models to calculate the perspective computation time. This helps understand why a certain processor is faster for certain sets of tasks. It also provides the essential information for scheduling on heterogeneous systems. In addition, this dissertation investigates a high-level task scheduling framework for heterogeneous processor systems in which, the pros and cons of using different heterogeneous processors can complement each other. Thus a higher performance can be achieve on heterogeneous computing systems. A new scheduling algorithm with four innovations is presented: Ranked Opportunistic Balancing (ROB), Multi-subject Ranking (MR), Multi-subject Relative Ranking (MRR), and Automatic Small Tasks Rearranging (ASTR). The new algorithm consistently outperforms previously proposed algorithms with better scheduling results, lower computational complexity, and more consistent results over a range of performance prediction errors. Finally, this work extends the heterogeneous task scheduling algorithm to handle power capping feature. It demonstrates that a power-aware scheduler significantly improves the power efficiencies and saves the energy consumption. This suggests that, in addition to performance benefits, heterogeneous systems may have certain advantages on overall power efficiency

Louisiana State University