Optimization in semi-supervised classification of multivariate time series

Abstract

In this thesis, I study methods that classify time series in a semi-supervised manner. I compare the performance of models that assume independent and identically distributed observations against models that assume nearby observations depend on each other. These models are evaluated on three real-world time series data sets. In addition, I carefully go through the mathematical optimization theory behind two successful algorithms used in this thesis: Support Vector Data Description and Dynamic Time Warping. For Dynamic Time Warping, I provide a novel proof based on dynamic optimization. The experiments in this thesis suggest that assuming the observations in a time series to be independent and identically distributed may degrade the results of semi-supervised classification. The novel self-training method presented in this thesis, called Peak Evaluation using Perceptually Important Points, performs well on multivariate time series compared with the methods that currently exist in the literature. Feature subset selection for multivariate time series may improve classification performance, but finding a reliable unsupervised feature subset selection method remains an open question.
