6 research outputs found

    Optimal assignment of task modules with precedence for distributed processing by graph matching and state-space search

    A graph matching approach is proposed for the optimal assignment of task modules with varying lengths and precedence relationships in a distributed computing system. Topological module orderings make it possible to include module precedence in the optimal solution. Two graphs are defined to represent the processor structure and the module precedence relationships, respectively. The assignment of task modules to system processors is transformed into a type of graph matching. The search for the optimal graph matching, which corresponds to the optimal task assignment, is formulated as a state-space search problem and solved with the A* algorithm from artificial intelligence. Illustrative examples and experimental results are included to show the effectiveness of the proposed approach.
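
The abstract does not give the cost function or the state encoding, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single fixed topological order, identical processors, no communication cost, and makespan as the objective; the function name `astar_assign` and all parameters are hypothetical. Each state assigns the first k modules of the order to processors, and the admissible lower bound is the total work divided by the number of processors.

```python
import heapq

def astar_assign(lengths, preds, order, num_procs):
    """Best-first (A*-style) search over module-to-processor assignments.

    lengths: module -> execution length
    preds:   module -> iterable of predecessor modules
    order:   one topological ordering of the modules (assumed valid)
    Returns (makespan, {module: processor index}).
    """
    # Admissible bound: no schedule can finish before total work / processors.
    lower = sum(lengths.values()) / num_procs
    # Heap entries: (f, g, assignment, processor ready times, finish times)
    heap = [(lower, 0.0, (), (0.0,) * num_procs, ())]
    while heap:
        f, g, assign, ready, finish = heapq.heappop(heap)
        k = len(assign)
        if k == len(order):
            # f never overestimates, so the first complete state is optimal
            # within this simplified model.
            return g, dict(zip(order, assign))
        module = order[k]
        done = dict(zip(order, finish))
        # A module may start only after all of its predecessors finish.
        earliest = max((done[p] for p in preds.get(module, ())), default=0.0)
        for proc in range(num_procs):
            end = max(ready[proc], earliest) + lengths[module]
            new_ready = ready[:proc] + (end,) + ready[proc + 1:]
            new_g = max(g, end)  # makespan so far
            heapq.heappush(
                heap,
                (max(new_g, lower), new_g, assign + (proc,), new_ready, finish + (end,)),
            )

# Hypothetical example: four modules, c depends on a, d depends on b and c.
lengths = {"a": 2, "b": 3, "c": 2, "d": 1}
preds = {"c": ["a"], "d": ["b", "c"]}
makespan, assignment = astar_assign(lengths, preds, ["a", "b", "c", "d"], 2)
```

Because the sketch fixes one topological order and appends each module at the end of its processor's queue, it explores fewer schedules than a search over all topological orderings would.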

    Data-aware workflow scheduling in heterogeneous distributed systems

    Data transfer in scientific workflows is attracting growing attention because the large amounts of data generated by complex scientific workflows significantly increase the turnaround time of the whole workflow. It is almost impossible to produce an optimal or near-optimal schedule for the end-to-end workflow without considering intermediate data movement. To reduce the complexity of the workflow-scheduling problem, most research so far has relied on many unrealistic assumptions, which result in non-optimal scheduling in practice. One constraint imposed by most researchers in their algorithms is that a computation site can start executing other tasks only after it has completed the current task and delivered the data generated by that task. We relax this constraint and allow execution and data movement to overlap in order to improve the parallelism of the tasks in the workflow. Furthermore, we generalize the conventional workflow to allow data to be staged in from and out to remote data centers, and we design and implement an efficient data-aware scheduling strategy. The experimental results show that applying our scheduling strategy significantly reduces the turnaround time in heterogeneous distributed systems. To reduce the end-to-end workflow turnaround time, it is crucial to deliver the input, output, and intermediate data as fast as possible. However, the throughput of a single TCP stream is often much lower than expected, leaving the network bandwidth not fully utilized. Multiple TCP streams improve the throughput, but the throughput does not increase monotonically with the number of parallel streams. Based on this observation, we propose improvements to the existing throughput prediction models, and we design and implement a TCP throughput estimation and optimization service for distributed systems that determines the optimal configuration of parallel TCP streams. Experimental results show that the proposed service can predict the throughput dynamically with high accuracy and that the throughput can be increased significantly. Throughput optimization together with data-aware workflow scheduling allows us to minimize the end-to-end workflow turnaround time.

    Modelling and scheduling of heterogeneous computing systems

    Ph.D. thesis (Doctor of Philosophy)