807 research outputs found

    05101 Abstracts Collection -- Scheduling for Parallel Architectures: Theory, Applications, Challenges

    From 06.03.05 to 11.03.05, the Dagstuhl Seminar 05101 "Scheduling for Parallel Architectures: Theory, Applications, Challenges" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general.

    Optimal processor assignment for pipeline computations

    The availability of large-scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual response times for different processor counts, find an assignment of processors to tasks. Two objectives are of interest: minimal response time given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p-processor system and a series-parallel precedence graph with n constituent tasks, an O(np^2) algorithm is provided that finds the optimal assignment for the response time optimization problem; the assignment optimizing the constrained throughput is found in O(np^2 log p) time. Special cases of linear, independent, and tree graphs are also considered.
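The abstract does not give the algorithm itself, so the following is only a minimal sketch of the simplest case it mentions, a linear chain of tasks. It assumes that the total response time is the sum of the per-task response times and it ignores the throughput constraint. The function name `assign_processors` and the table layout `times[i][k-1]` are hypothetical. The three nested loops give the same O(np^2) order of growth quoted above, but this is not the paper's series-parallel algorithm.

```python
from typing import List, Tuple


def assign_processors(times: List[List[float]], p: int) -> Tuple[float, List[int]]:
    """Dynamic program for a linear chain of tasks (illustrative assumption).

    times[i][k-1] is the measured response time of task i on k processors,
    for k = 1..p. Returns (minimal summed response time, processors per task).
    """
    n = len(times)
    inf = float("inf")
    # best[i][q]: minimal summed response time of the first i tasks
    # when at most q processors are available to them.
    best = [[0.0] * (p + 1)] + [[inf] * (p + 1) for _ in range(n)]
    choice = [[0] * (p + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for q in range(1, p + 1):
            for k in range(1, q + 1):  # task i receives k processors
                cand = best[i - 1][q - k] + times[i - 1][k - 1]
                if cand < best[i][q]:
                    best[i][q] = cand
                    choice[i][q] = k

    if best[n][p] == inf:
        raise ValueError("every task needs at least one processor")

    # Walk the recorded choices backwards to recover the allocation.
    alloc: List[int] = []
    q = p
    for i in range(n, 0, -1):
        k = choice[i][q]
        alloc.append(k)
        q -= k
    alloc.reverse()
    return best[n][p], alloc


if __name__ == "__main__":
    # Two tasks, four processors; the timing numbers are made up.
    table = [[8, 4, 3, 2.5], [6, 3, 2, 1.5]]
    print(assign_processors(table, 4))
```

With the made-up table in the usage lines, splitting the four processors evenly gives the smallest summed response time.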

    Period and Computational Elasticity for Adaptive Real-Time Systems

    A wide range of real-world applications (including multimedia players, ad-hoc communication networks, online trading, radar tracking software, and other adaptive control algorithms) need adaptive adjustment to their resource utilizations at run-time, while still maintaining real-time guarantees. The elastic task model of soft real-time systems allows for the run-time manipulation of tasks’ processor utilizations in order to maintain a system-wide quality of service or accommodate needs of other tasks by assigning each task a period within a specified range. As originally presented, only sequential tasks executing on a single processor were considered. However, in the two decades since the elastic task model was first introduced, multiprocessor systems have become increasingly prevalent. This dissertation extends the elastic task model to include both multiprocessor scheduling of sequential adaptive tasks and scheduling of adaptive tasks with internal parallelism. It also introduces novel elastic concepts in which 1) tasks can vary their computational loads rather than their periods and 2) tasks are allowed to adapt among a discrete set of candidate processor utilizations, a more realistic scenario than adapting over a continuous range. A runtime system for parallel elastic tasks is also presented and used to demonstrate the benefit of discrete elastic scheduling by enabling adaptation in the application domain of real-time hybrid simulation (RTHS).
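To make the idea of "assigning each task a period within a specified range" concrete, here is a hedged sketch of period compression for sequential tasks on a single processor, the setting the abstract attributes to the original elastic model. It is not the dissertation's multiprocessor or discrete method. The class `ElasticTask`, its field names, and the proportional-to-elasticity rule are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ElasticTask:
    wcet: float        # worst-case execution time C
    t_min: float       # shortest (preferred) period
    t_max: float       # longest tolerable period
    elasticity: float  # larger value = more willing to stretch its period


def elastic_compress(tasks: List[ElasticTask], u_desired: float) -> List[float]:
    """Return one period per task so that total utilization <= u_desired."""
    n = len(tasks)
    u_nom = [t.wcet / t.t_min for t in tasks]  # utilization at preferred period
    u_min = [t.wcet / t.t_max for t in tasks]  # utilization at longest period
    if sum(u_nom) <= u_desired:
        return [t.t_min for t in tasks]        # nothing to compress

    u = list(u_nom)
    fixed = [t.elasticity == 0 for t in tasks]  # rigid tasks stay at nominal
    while True:
        free = [i for i in range(n) if not fixed[i]]
        if not free:
            break
        u_fixed = sum(u[i] for i in range(n) if fixed[i])
        e_sum = sum(tasks[i].elasticity for i in free)
        excess = u_fixed + sum(u_nom[i] for i in free) - u_desired
        clamped = False
        for i in free:
            # Share the excess utilization in proportion to elasticity.
            u[i] = u_nom[i] - excess * tasks[i].elasticity / e_sum
            if u[i] < u_min[i]:
                # Period range exhausted: pin at the limit and redistribute.
                u[i] = u_min[i]
                fixed[i] = True
                clamped = True
        if not clamped:
            break

    if sum(u) > u_desired + 1e-9:
        raise ValueError("task set cannot be compressed to the desired utilization")
    return [t.wcet / ui for t, ui in zip(tasks, u)]


if __name__ == "__main__":
    demo = [ElasticTask(2, 4, 10, 1.0), ElasticTask(3, 6, 12, 1.0)]
    print(elastic_compress(demo, 0.8))
```

In the usage lines both tasks start at utilization 0.5, so each gives up half of the 0.2 excess and its period grows accordingly. A task that hits its longest period is pinned there and the remaining excess is redistributed among the others.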

    Task Scheduling for Multiprocessor Systems Using Queuing Theory

    This research compares different multiprocessor task scheduling algorithms. Each algorithm was simulated using one of the queuing theory models from Operations Research (OR) to evaluate its behavior and efficiency. The comparison includes an analysis of the behavior of the central processing unit (CPU) when it receives a number of jobs under four job duration patterns: random, ascending, descending, and volatile low-high. Microsoft Excel 2010 was used to generate the data for each case, and the results show convergence and divergence among the studied algorithms under the different patterns. It was also found that the Fleischer algorithm is very efficient in minimizing the waiting duration of each job in the total job queue of the CPU. Keywords: Operations Research, Queuing Theory, Multiprocessor, Scheduling Algorithms, Simulation

    CoreTSAR: Task Scheduling for Accelerator-aware Runtimes

    Heterogeneous supercomputers that incorporate computational accelerators such as GPUs are increasingly popular due to their high peak performance, energy efficiency and comparatively low cost. Unfortunately, the programming models and frameworks designed to extract performance from all computational units still lack the flexibility of their CPU-only counterparts. Accelerated OpenMP improves this situation by supporting natural migration of OpenMP code from CPUs to a GPU. However, these implementations currently lose one of OpenMP’s best features, its flexibility: typical OpenMP applications can run on any number of CPUs. GPU implementations do not transparently employ multiple GPUs on a node or a mix of GPUs and CPUs. To address these shortcomings, we present CoreTSAR, our runtime library for dynamically scheduling tasks across heterogeneous resources, and propose straightforward extensions that incorporate this functionality into Accelerated OpenMP. We show that our approach can provide nearly linear speedup to four GPUs over only using CPUs or one GPU while increasing the overall flexibility of Accelerated OpenMP

    Towards Efficient Explainability of Schedulability Properties in Real-Time Systems

    The notion of efficient explainability was recently introduced in the context of hard-real-time scheduling: a claim that a real-time system is schedulable (i.e., that it will always meet all deadlines during run-time) is defined to be efficiently explainable if there is a proof of such schedulability that can be verified by a polynomial-time algorithm. We further explore this notion by (i) classifying a variety of common schedulability analysis problems according to whether they are efficiently explainable or not; and (ii) developing strategies for dealing with those determined to not be efficiently explainable, primarily by identifying practically meaningful sub-problems that are efficiently explainable.

    Summary of research in applied mathematics, numerical analysis, and computer sciences

    The major categories of current ICASE research programs addressed include: numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; control and parameter identification problems, with emphasis on effective numerical methods; computational problems in engineering and physical sciences, particularly fluid dynamics, acoustics, and structural analysis; and computer systems and software, especially vector and parallel computers

    Preliminary Report on High-Performance Computational Structures for Robot Control

    In this report we present some initial results of our work completed thus far on Computational Structures for Robot Control. A SIMD architecture with a crossbar interprocessor network, which achieves the parallel processing execution time lower bound of o([a1 n]) for the computation of the inverse dynamics problem, where a1 is a constant and n is the number of manipulator joints, is discussed. A novel SIMD task scheduling algorithm that optimizes the parallel processing performance on the indicated architecture is also delineated. Simulations performed on this architecture show that a speedup factor of 3.4 over previous related work on the specified problem is achieved. Parallel processing of PUMA forward and inverse kinematics solutions is next investigated using a particular scheduling algorithm. In addition, a custom bit-serial array architecture is designed for the computation of the inverse dynamics problem within the bit-serial execution time lower bound of o(c1 k + c2 k n), where c1 and c2 are specified constants, k is the word length, and n is the number of manipulator joints. Finally, mapping of the Newton-Euler equations onto a fixed systolic array is investigated. A balanced architecture for the inverse dynamics problem which achieves the systolic execution time lower bound for the specified problem is depicted. Please note again that these results are only preliminary and that improvements to our algorithms and architectures are still being made.