An architecture for automatic ML/AI workflow management and supervision

Abstract

Scientific computing applications increasingly need to analyze growing volumes of data as part of their workflows, and science-based models are being combined with big data and machine learning models to address complex problems and phenomena [1][2]. A machine learning workflow consists of reproducible steps that can be executed as a pipeline, which makes model building more efficient by reducing iteration time and helping with debugging and detecting problems [3]. Businesses and researchers are currently investigating and improving methodologies for developing and deploying machine learning workflows in both the training and inference phases, so that data science teams can focus on their requirements while data engineering teams deploy and operate the workflows efficiently and automatically [4]. This work presents an architecture for automatic machine learning workflows that provides monitoring and automatic management across the end-to-end life cycle of a workflow: tracking and observation at the training stage, and release, monitoring, deployment, automatic detection, and infrastructure management at the inference stage. To validate feasibility, we conducted a case study based on our architecture, deployed it in the cloud, and showed its automation.
