Scientific workflows bridge scientific challenges with computational
resources. dispel4py, a stream-based workflow system, offers mappings to
parallel enactment engines such as MPI and Multiprocessing, but its optimization
focuses primarily on dynamic process-to-task allocation for improved
performance. An efficiency gap persists, particularly given the growing emphasis
on conserving computing resources. Moreover, the existing dynamic optimization
lacks support for stateful applications and grouping operations. To address
these issues, our work introduces a novel hybrid approach for handling stateful
operations and groupings within workflows, leveraging a new Redis mapping. We
also propose an auto-scaling mechanism integrated into dispel4py's dynamic
optimization. Our experiments show that auto-scaling improves resource
efficiency while maintaining performance. In the best
case, auto-scaling reduces dispel4py's runtime to 87% of the baseline's
while using only 76% of the process resources. Importantly, our optimized stateful
dispel4py achieves a substantial speedup, running in just 32% of the
contender's runtime.

Comment: 13 pages, 13 figures