Back to Basics: A Sanity Check on Modern Time Series Classification Algorithms
The state-of-the-art in time series classification has come a long way, from
the 1NN-DTW algorithm to the ROCKET family of classifiers. However, in the
current fast-paced development of new classifiers, taking a step back and
performing simple baseline checks is essential. These checks are often
overlooked, as researchers are focused on establishing new state-of-the-art
results, developing scalable algorithms, and making models explainable.
Nevertheless, many datasets look like time series at first glance, yet classic
tabular methods that ignore time ordering may perform better on them. For
example, on spectroscopy datasets, tabular methods tend to significantly
outperform recent time series methods. In
this study, we compare the performance of tabular models using classic machine
learning approaches (e.g., Ridge, LDA, RandomForest) with the ROCKET family of
classifiers (e.g., Rocket, MiniRocket, MultiRocket). Tabular models are simple
and very efficient, while the ROCKET family of classifiers is more complex and
has state-of-the-art accuracy and efficiency among recent time series
classifiers. We find that tabular models outperform the ROCKET family of
classifiers on approximately 19% of univariate and 28% of multivariate datasets
in the UCR/UEA benchmark and achieve accuracy within 10 percentage points on
about 50% of datasets. Our results suggest that it is important to consider
simple tabular models as baselines when developing time series classifiers.
These models are very fast, can be as effective as more complex methods, and may
be easier to understand and deploy.
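To illustrate what "tabular methods with no time ordering" means in practice, here is a minimal sketch. It is not the study's code: it uses a hand-written nearest-centroid classifier in place of the Ridge, LDA, or RandomForest models named in the abstract, and the synthetic "spectra" (a fixed-position bump added for one class) are an assumption made purely for illustration. It shows that a tabular model treats each time point as an independent column, so applying the same column permutation to the training and test sets leaves its accuracy unchanged.

```python
import random


def fit_centroids(X, y):
    """Return the mean feature vector per class; each time point is one column."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict(centroids, X):
    """Assign each row to the class with the nearest centroid (squared Euclidean)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda c: dist2(centroids[c], row)) for row in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def make_spectra(n, rng, length=50):
    """Hypothetical data: Gaussian noise, with a bump at positions 20..24 for class 1."""
    X, y = [], []
    for k in range(n):
        label = k % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(length)]
        if label == 1:
            for i in range(20, 25):
                row[i] += 3.0
        X.append(row)
        y.append(label)
    return X, y


def permute_columns(X, perm):
    """Reorder the columns of every row with the same permutation."""
    return [[row[j] for j in perm] for row in X]


rng = random.Random(0)
X_train, y_train = make_spectra(100, rng)
X_test, y_test = make_spectra(100, rng)

# Accuracy with the original time ordering.
model = fit_centroids(X_train, y_train)
acc_original = accuracy(y_test, predict(model, X_test))

# Accuracy after destroying the time ordering identically in train and test.
perm = list(range(50))
rng.shuffle(perm)
model_shuffled = fit_centroids(permute_columns(X_train, perm), y_train)
acc_shuffled = accuracy(y_test, predict(model_shuffled, permute_columns(X_test, perm)))

print(acc_original, acc_shuffled)
```

A convolution-based classifier such as those in the ROCKET family would not be invariant to this shuffling, which is why data whose class signal sits at fixed positions (as is common for spectra) can favour tabular baselines.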
Classification of cow diet based on milk mid infrared spectra: a data analysis competition at the "International workshop of spectroscopy and chemometrics 2022"
In April 2022, the Vistamilk SFI Research Centre organized the second edition
of the "International Workshop on Spectroscopy and Chemometrics - Applications
in Food and Agriculture". Within this event, a data challenge was organized
among participants of the workshop. The competition aimed to develop a
prediction model to discriminate dairy cows' diet based on milk spectral
information collected in the mid-infrared region. An accurate and reliable
discriminant model for dairy cows' diet can provide important authentication
tools for dairy processors to guarantee the grass-fed origin of products for
dairy food manufacturers. Different statistical and machine learning modelling
approaches were employed during the workshop, with different pre-processing
steps and different degrees of complexity. The present paper describes the
statistical methods adopted by participants to develop such classification
models.
Comment: 27 pages, 9 figures
Mid infrared spectroscopy and milk quality traits: a data analysis competition at the “International Workshop on Spectroscopy and Chemometrics 2021”
A chemometric data analysis challenge was arranged during the first edition of the
“International Workshop on Spectroscopy and Chemometrics”, organized by the Vistamilk
SFI Research Centre and held online in April 2021. The aim of the competition was to build a
calibration model in order to predict milk quality traits exploiting the information contained
in mid-infrared spectra only. Three different traits were provided, presenting heterogeneous
degrees of prediction complexity and thus possibly requiring trait-specific modelling
choices. In this paper, the different approaches adopted by the participants are outlined and
the insights obtained from the analyses are critically discussed.