256 research outputs found

    Economical Task Scheduling Algorithm for Grid Computing Systems

    Get PDF
    Task duplication is an effective scheduling technique for reducing the response time of workflow applications in dynamic grid computing systems. Duplication-based scheduling algorithms generate shorter schedules without sacrificing efficiency, but they leave computing resources over-consumed because of heavy duplication. In this paper, we try to minimize the number of task duplications in the schedule obtained by an effective duplication-based scheduling heuristic, without affecting the overall schedule length (makespan) of the grid application. We propose an economical duplication-based intelligent scheduling heuristic called economical duplication scheduling in grid (EDS-G). The simulation results show that the EDS-G algorithm generates better schedules with fewer duplications and markedly lower resource consumption than HLD and LDBS in the simulated heterogeneous grid computing environments.
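    The core idea can be illustrated with a short sketch: starting from a schedule produced by a duplication-based heuristic, redundant task copies are pruned whenever no successor would be delayed by their removal, so the makespan is preserved. The schedule encoding, the prune_duplicates helper, and the zero-communication-cost simplification below are illustrative assumptions, not the published EDS-G procedure.

        from collections import defaultdict

        def prune_duplicates(schedule, successors):
            # schedule   : list of dicts {"task", "proc", "start", "finish"},
            #              one entry per placed copy (duplication heuristics such
            #              as HLD or LDBS place the same task on several processors)
            # successors : dict task -> list of successor tasks in the workflow DAG
            # A copy is redundant when, for every scheduled successor, some *other*
            # remaining copy of the same task finishes no later than that successor
            # starts, so deleting it cannot stretch the makespan (communication
            # delays between processors are ignored here for brevity).
            kept = list(schedule)
            by_task = defaultdict(list)
            for entry in schedule:
                by_task[entry["task"]].append(entry)

            for task, copies in by_task.items():
                # try to drop the latest-finishing copies first
                for copy in sorted(copies, key=lambda c: -c["finish"]):
                    others = [c for c in kept if c["task"] == task and c is not copy]
                    if not others:
                        break  # always keep at least one copy of each task
                    succ_entries = [e for e in kept
                                    if e["task"] in successors.get(task, [])]
                    if all(any(o["finish"] <= s["start"] for o in others)
                           for s in succ_entries):
                        kept.remove(copy)
            return kept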

    CASCH: a tool for computer-aided scheduling

    Get PDF
    A software tool called Computer-Aided Scheduling (CASCH) for parallel processing on distributed-memory multiprocessors, within a complete parallel programming environment, is presented. A compiler automatically converts sequential applications into parallel codes to perform program parallelization. The parallel code that executes on a target machine is optimized by CASCH through proper scheduling and mapping.
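    The abstract does not name the specific heuristics CASCH applies, so the following sketch uses a generic earliest-finish-time list-scheduling pass as a stand-in for the scheduling-and-mapping step; the task-graph encoding, the cost dictionaries, and the list_schedule function are assumptions for illustration only.

        def list_schedule(tasks, deps, weights, comm, n_procs):
            # Generic earliest-finish-time list scheduling of a task graph onto
            # processors (a stand-in illustration, not CASCH's own algorithm).
            #   tasks   : task ids in topological order
            #   deps    : dict task -> list of predecessor tasks
            #   weights : dict task -> computation cost
            #   comm    : dict (pred, task) -> communication cost when the two
            #             tasks end up on different processors (0 if co-located)
            proc_free = [0.0] * n_procs      # next free time of each processor
            placed = {}                      # task -> (proc, start, finish)
            for t in tasks:
                best = None
                for p in range(n_procs):
                    # the task may start once the processor is free and all inputs
                    # have arrived (remote inputs pay the communication cost)
                    ready = proc_free[p]
                    for d in deps.get(t, []):
                        dp, _, dfin = placed[d]
                        arrival = dfin + (0 if dp == p else comm.get((d, t), 0))
                        ready = max(ready, arrival)
                    finish = ready + weights[t]
                    if best is None or finish < best[2]:
                        best = (p, ready, finish)
                placed[t] = best
                proc_free[best[0]] = best[2]
            return placed

        # Example: a three-task chain mapped onto two processors
        print(list_schedule(["a", "b", "c"], {"b": ["a"], "c": ["b"]},
                            {"a": 2, "b": 3, "c": 1},
                            {("a", "b"): 4, ("b", "c"): 4}, 2))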

    Energy-aware task scheduling on heterogeneous computing systems with time constraint

    Get PDF
    As a technique to help achieve high performance in parallel and distributed heterogeneous computing systems, task scheduling has attracted considerable interest. In this paper, we propose an effective Cuckoo Search algorithm based on a Gaussian random walk and an adaptive discovery probability, combined with a cost-to-time-ratio modification strategy (GACSM), to address task scheduling on heterogeneous multiprocessor systems using Dynamic Voltage and Frequency Scaling (DVFS). First, to overcome the poor exploitation performance of the cuckoo search algorithm, we use chaos variables to initialize populations and maintain population diversity, a Gaussian random walk strategy to balance the exploration and exploitation capabilities of the algorithm, and an adaptive discovery probability strategy to further improve population diversity. Then, we apply the improved Cuckoo Search (CS) algorithm to assign tasks to resources, and a widely used downward-rank heuristic to find the corresponding scheduling sequence. Finally, we apply a cost-to-time-ratio improvement strategy to further improve the performance of the improved CS algorithm. Extensive experiments are conducted to evaluate the effectiveness and efficiency of our method. The results validate our approach and show its superiority in comparison with state-of-the-art methods.
    Zexi Deng, Zihan Yan, Huimin Huang, Hong Shen ... et al
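    A compact sketch of the underlying cuckoo-search scheme is given below: candidate nests encode task-to-processor assignments, a Gaussian random walk replaces the usual Levy flights, and a fraction of the worst nests is abandoned each round. The chaos-based initialization, adaptive discovery probability, downward-rank ordering, and cost-to-time refinement of GACSM are omitted, and the parameter values and the user-supplied makespan function are assumptions.

        import random

        def cuckoo_assign(n_tasks, n_procs, makespan,
                          n_nests=15, iters=200, pa=0.25, step=1.0):
            # makespan(assignment) must return the schedule length of an
            # assignment, i.e. a list mapping task index -> processor index.
            def clip(x):
                return min(n_procs - 1, max(0, int(round(x))))

            nests = [[random.randrange(n_procs) for _ in range(n_tasks)]
                     for _ in range(n_nests)]
            best = min(nests, key=makespan)

            for _ in range(iters):
                for i, nest in enumerate(nests):
                    # Gaussian random walk from the nest towards the best solution
                    new = [clip(g + random.gauss(0.0, step) * (b - g))
                           for g, b in zip(nest, best)]
                    if makespan(new) < makespan(nest):
                        nests[i] = new
                # abandon a fraction pa of the worst nests and rebuild them randomly
                nests.sort(key=makespan)
                for i in range(int((1 - pa) * n_nests), n_nests):
                    nests[i] = [random.randrange(n_procs) for _ in range(n_tasks)]
                best = min(nests + [best], key=makespan)
            return best

        # Toy usage: makespan of an assignment on two processors with given speeds
        work, speed = [4, 3, 2, 6, 1], [1.0, 2.0]
        cost = lambda a: max(sum(w / speed[p] for w, q in zip(work, a) if q == p)
                             for p in range(len(speed)))
        print(cuckoo_assign(len(work), len(speed), cost))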

    FPGA Implementation of Data Flow Graphs for Digital Signal Processing Applications

    Get PDF
    A rapid growth in digital signal processing applications has increased the requirement for high-speed digital systems, and multiprocessor systems are the best choice for these applications. Before a hardware implementation is produced, the operations that describe the nature of these applications must be scheduled and hardware allocated. This paper proposes a new scheduling technique for digital signal processing (DSP) applications represented by data flow graphs (DFGs); in addition, hardware allocation is implemented in the form of an embedded system. The proposed scheduling technique achieves optimal scheduling of a DFG at design time, where the optimality criterion is maximum throughput within the available hardware resources. The maximum throughput is achieved by arranging the DFG nodes according to their inter-related data dependencies. Two nodes can then be clustered into one compound task to reduce the overall execution time, by minimizing the number of tasks to be executed and hence the number of cycles needed to execute them. Each task is then presented as an instruction to be executed in the hardware system.

    The hardware system is composed of one or more homogeneous pipelined processing elements and is designed to meet the maximum-rate schedule. Two implementations of the system architecture are proposed according to the number of processing elements: a serial system and a parallel system. The serial system comprises one processing element where all tasks are processed sequentially, whilst the parallel system has four processing elements to execute tasks concurrently. These systems consist mainly of seven units: a central shared memory, a state table, a multiway function unit buffer, an execution array, the processing element(s), an instruction buffer, and the address generation unit. The hardware components were built on an FPGA chip using Verilog HDL.

    In synthesis results, the parallel system delivers 25.5% better performance than the serial system, while the serial system requires a smaller area, as measured by the number of slice registers and slice lookup tables (LUTs). The relationship between the number of instructions executed in both systems, the system area, and the system performance (expressed as system frequency) is studied. As memory size increases in both systems, the performance of the serial system is unaffected, while that of the parallel system decreases slightly, by 1.5% to 4.5%; in terms of area, both the serial and parallel systems grow, and in some cases the area doubles. The proposed scheduling technique is shown to outperform the retiming technique chosen for comparison: the serial system achieves a 19.3% higher system frequency than the retiming technique, and the parallel system achieves a 51.2% higher system frequency in synthesis results.
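    The clustering step described above can be sketched as follows: two data-dependent nodes are fused into one compound task when the producer has a single consumer and the consumer a single producer, which shrinks the task count and hence the cycle count. The graph encoding and the cluster_chains function are illustrative assumptions, not the thesis algorithm itself.

        def cluster_chains(nodes, edges):
            # nodes : iterable of DFG node ids
            # edges : list of (producer, consumer) data dependencies
            # Returns a list of compound tasks, each a list of fused nodes.
            out_deg, in_deg = {}, {}
            for u, v in edges:
                out_deg[u] = out_deg.get(u, 0) + 1
                in_deg[v] = in_deg.get(v, 0) + 1

            cluster_of = {n: [n] for n in nodes}
            for u, v in edges:
                # fuse only strictly linear producer/consumer pairs
                if out_deg.get(u) == 1 and in_deg.get(v) == 1 \
                        and cluster_of[u] is not cluster_of[v]:
                    absorbed = cluster_of[v]
                    cluster_of[u].extend(absorbed)
                    for n in absorbed:
                        cluster_of[n] = cluster_of[u]

            # each distinct list object is one compound task
            seen, clusters = set(), []
            for c in cluster_of.values():
                if id(c) not in seen:
                    seen.add(id(c))
                    clusters.append(c)
            return clusters

        # Example: the fan-out node a stays alone, each linear branch fuses
        print(cluster_chains(["a", "b", "c", "d", "e"],
                             [("a", "b"), ("b", "c"), ("a", "d"), ("d", "e")]))
        # -> [['a'], ['b', 'c'], ['d', 'e']]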

    QoS-aware predictive workflow scheduling

    Full text link
    This research lays the basis of QoS-aware predictive workflow scheduling. Its novel contributions open up prospects for future research in handling complex big workflow applications with high uncertainty and dynamism. The results from the proposed workflow scheduling algorithm show significant improvement in the performance and reliability of workflow applications.

    HSIP: A Novel Task Scheduling Algorithm for Heterogeneous Computing

    Get PDF

    Contention energy-aware real-time task mapping on NoC based heterogeneous MPSoCs

    Get PDF
    © 2018 IEEE. Network-on-Chip (NoC)-based multiprocessor systems-on-chip (MPSoCs) are becoming the de facto computing platform for computationally intensive real-time applications in embedded systems due to their high performance, exceptional quality of service (QoS), and energy efficiency over superscalar uniprocessor architectures. Energy saving is important in embedded systems because it reduces operating cost while prolonging lifetime and improving the reliability of the system. In this paper, contention-aware, energy-efficient static mapping of real-time tasks with individual deadlines and precedence constraints onto NoC-based heterogeneous MPSoCs is investigated. Unlike other schemes, task ordering, mapping, and voltage assignment are performed in an integrated manner to minimize processing energy while explicitly reducing contention between communications and the communication energy. Furthermore, both dynamic voltage and frequency scaling and dynamic power management are used to optimize energy consumption. The developed contention-aware integrated task mapping and voltage assignment (CITM-VA) static energy-management scheme performs task ordering using an earliest-latest-finish-time-first (ELFTF) strategy that assigns higher priority to tasks with a shorter latest finish time (LFT) than to tasks with a longer LFT. It remaps every task to a processor and/or discrete voltage level that reduces processing energy consumption. Similarly, communication energy is minimized by assigning discrete voltage levels to the NoC links. Further, total energy efficiency is achieved by putting the processor into a low-power state when feasible. Moreover, the approach resolves contention between communications that traverse the same link by allocating links to communications with higher priority. The results obtained through extensive simulations of real-world benchmarks demonstrate that the CITM-VA approach outperforms a state-of-the-art technique and achieves an average 30% total energy improvement. Additionally, it maintains high QoS and robustness for real-time applications.
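    The ELFTF priority rule mentioned above can be sketched compactly: each task's latest finish time is propagated backwards from its deadline through its successors, and tasks are ordered by increasing LFT. Ignoring communication delays and taking worst-case execution times at a single frequency are simplifying assumptions of this sketch, as is the elftf_order function itself.

        def elftf_order(tasks, succs, wcet, deadline):
            # tasks    : task ids in any order
            # succs    : dict task -> list of successor tasks (precedence DAG)
            # wcet     : dict task -> worst-case execution time
            # deadline : dict task -> individual absolute deadline
            lft = {}

            def compute(t):
                if t in lft:
                    return lft[t]
                bound = deadline[t]
                for s in succs.get(t, []):
                    # t must finish early enough for s to still meet its own LFT
                    bound = min(bound, compute(s) - wcet[s])
                lft[t] = bound
                return bound

            for t in tasks:
                compute(t)
            # earliest latest-finish-time first: shorter LFT gets higher priority
            return sorted(tasks, key=lambda t: lft[t])

        # Example: the chain with the tighter deadline is ordered first
        succs = {"a": ["b"], "c": ["d"]}
        wcet = {"a": 2, "b": 5, "c": 2, "d": 1}
        deadline = {"a": 20, "b": 10, "c": 20, "d": 18}
        print(elftf_order(["a", "b", "c", "d"], succs, wcet, deadline))
        # -> ['a', 'b', 'c', 'd']   (LFTs: a=5, b=10, c=17, d=18)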