Optimization and Management of Large-scale Scientific Workflows in Heterogeneous Network Environments: From Theory to Practice

Abstract

Next-generation computation-intensive scientific applications feature large-scale computing workflows of various structures, ranging from simple linear pipelines to complex Directed Acyclic Graphs (DAGs). Supporting such computing workflows and optimizing their end-to-end network performance are crucial to the success of scientific collaborations that require fast system response, smooth data flow, and reliable distributed operation.

We construct analytical cost models and formulate a class of workflow mapping problems with different mapping objectives and network constraints. The difficulty of these mapping problems arises essentially from the need to match workflow topology to network topology in the spatial domain, which is further compounded by the complexity of resource sharing in the temporal dimension. We provide a detailed computational complexity analysis and design optimal or heuristic algorithms with rigorous correctness proofs or performance analyses. We decentralize the proposed mapping algorithms and also investigate these optimization problems in unreliable network environments for fault tolerance.

To examine and evaluate the performance of the workflow mapping algorithms before actual deployment and implementation, we implement a simulation program that simulates the execution dynamics of distributed computing workflows. We also develop a scientific workflow automation and management platform based on an existing workflow engine for experimentation in real environments. The performance superiority of the proposed mapping solutions is illustrated by extensive simulation-based comparisons with existing algorithms and further verified by large-scale experiments on real-life scientific workflow applications through effective system implementation and deployment in real networks.
