Time series anomaly detection is a prevalent problem in many application
domains such as patient monitoring in healthcare, forecasting in finance, or
predictive maintenance in energy. This has led to the emergence of a plethora
of anomaly detection methods, including, more recently, deep learning-based
methods. Although several benchmarks have been proposed to compare newly
developed models, they usually rely on one-time execution over a limited set of
datasets, and the comparison is restricted to a few models. We propose
OrionBench -- a user-centric, continuously maintained benchmark for unsupervised
time series anomaly detection. The framework provides universal abstractions to
represent models, extensibility to add new pipelines and datasets,
hyperparameter standardization, pipeline verification, and frequent releases
with published benchmarks. We demonstrate the usage of OrionBench and the
progression of pipelines across 15 releases published over the course of three
years. Moreover, we walk through two real scenarios we experienced with
OrionBench that highlight the importance of continuous benchmarks in
unsupervised time series anomaly detection.