A Taxonomy of Tools for Reproducible Machine Learning Experiments

Abstract

The broad availability of machine learning (ML) libraries and frameworks makes the rapid prototyping of ML models a relatively easy task. However, the quality of such prototypes is challenged by their limited reproducibility. Reproducing an ML experiment typically entails repeating the whole process, from data collection to model building, as well as the multiple optimization steps that must be carefully tracked. In this paper, we define a comprehensive taxonomy to characterize tools for ML experiment tracking and review some of the most popular solutions under the lens of the taxonomy. The taxonomy and related recommendations may help data scientists orient themselves and make informed choices when selecting appropriate tools to shape the workflow of their ML experiments.
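To make concrete what "tracking" an ML experiment involves, the sketch below logs the hyperparameter and evaluation metric of a single training run with MLflow, one widely used tracking tool. This is a minimal illustration, not taken from the paper: the experiment name, parameter, and metric are assumed for the example.

    # Minimal sketch of experiment tracking with MLflow.
    # Experiment name, parameter, and metric are illustrative assumptions.
    import mlflow
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    mlflow.set_experiment("iris-demo")  # hypothetical experiment name
    with mlflow.start_run():
        C = 1.0
        model = LogisticRegression(C=C, max_iter=200).fit(X_train, y_train)
        # Record the hyperparameter and the resulting metric so this run
        # can later be compared against other runs and reproduced.
        mlflow.log_param("C", C)
        mlflow.log_metric("test_accuracy", model.score(X_test, y_test))

Logging each run's configuration and results in this way is what allows an experiment to be replayed and compared later, which is the core capability the taxonomy characterizes.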