Although the pre-training followed by fine-tuning paradigm is used
extensively in many fields, the impact of pre-training on the fine-tuning
process remains controversial. Currently, experimental
findings based on text and image data lack consensus. To delve deeper into the
unsupervised pre-training followed by fine-tuning paradigm, we have extended
previous research to a new modality: time series. In this study, we conducted a
thorough examination of 150 classification datasets derived from the Univariate
Time Series (UTS) and Multivariate Time Series (MTS) benchmarks. Our analysis
reveals several key conclusions. (i) Pre-training improves the optimization
process only for models that fit the data poorly, not for those that fit it
well. (ii) Pre-training does not act as a regularizer when the model is
given sufficient training time. (iii) Pre-training can only
speed up convergence if the model has sufficient ability to fit the data. (iv)
Adding more pre-training data does not improve generalization, but it can
strengthen the advantages that pre-training already shows at the original
data volume, such as faster convergence. (v) While both the pre-training task and the model
structure determine the effectiveness of the paradigm on a given dataset, the
model structure plays a more significant role.