PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models
The ubiquitous use of machine learning algorithms brings new challenges to
traditional database problems such as incremental view update. Much effort is
being put into better understanding and debugging machine learning models, as
well as in identifying and repairing errors in training datasets. Our focus is
on how to assist these activities when the machine learning model has to be
retrained after removing problematic training samples during cleaning or
selecting different subsets of training data for interpretability. This paper
presents an efficient provenance-based approach, PrIU, and its optimized
version, PrIU-opt, for incrementally updating model parameters without
sacrificing prediction accuracy. We prove the correctness and convergence of
the incrementally updated model parameters, and validate them experimentally.
Experimental results show that up to two orders of magnitude speed-ups can be
achieved by PrIU-opt compared to simply retraining the model from scratch, while
still obtaining highly similar models.
Comment: 28 pages, published in the 2020 ACM SIGMOD International Conference on
Management of Data (SIGMOD 2020)