Invariant Factorization Of Time-Series
Time-series classification is an important domain of machine learning and a
plethora of methods have been developed for the task. In comparison to existing
approaches, this study presents a novel method which decomposes a time-series
dataset into latent patterns and membership weights of local segments to those
patterns. The process is formalized as a constrained objective function and a
tailored stochastic coordinate descent optimization is applied. The time-series
are projected to a new feature representation consisting of the sums of the
membership weights, which captures frequencies of local patterns. Features from
various sliding window sizes are concatenated in order to encapsulate the
interaction of patterns from different sizes. Finally, a large-scale
experimental comparison against six state-of-the-art baselines on 43 real-life
datasets is conducted. The proposed method outperforms all the baselines by
statistically significant margins in terms of prediction accuracy.
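The projection step described above can be sketched in a few lines of numpy. This is a hypothetical simplification: it substitutes a softmax soft-assignment for the paper's constrained stochastic coordinate descent, and all function and variable names (`pattern_frequency_features`, `patterns_by_size`) are illustrative, not from the paper.

```python
import numpy as np

def segment(series, w):
    """Extract all sliding-window segments of length w."""
    return np.array([series[i:i + w] for i in range(len(series) - w + 1)])

def pattern_frequency_features(series, patterns_by_size):
    """Project a series onto the summed membership weights of its segments
    to each latent pattern, concatenating one block per window size."""
    feats = []
    for patterns in patterns_by_size:            # patterns: (K, w)
        w = patterns.shape[1]
        segs = segment(series, w)                # (n_segments, w)
        # soft membership: softmax over negative squared distances
        d2 = ((segs[:, None, :] - patterns[None, :, :]) ** 2).sum(-1)
        m = np.exp(-(d2 - d2.min(axis=1, keepdims=True)))
        m /= m.sum(axis=1, keepdims=True)
        feats.append(m.sum(axis=0))              # summed weights ~ pattern frequency
    return np.concatenate(feats)
```

Because each segment's memberships sum to one, each feature block sums to the number of segments at that window size, so the features behave like (soft) pattern frequencies.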
Optimal Time-Series Motifs
Motifs are the most repetitive/frequent patterns of a time-series. The
discovery of motifs is crucial for practitioners in order to understand and
interpret the phenomena occurring in sequential data. Currently, motifs are
searched among series sub-sequences, aiming at selecting the most frequently
occurring ones. Search-based methods, which try out series sub-sequences as
motif candidates, are currently believed to be the best at finding the most
frequent patterns.
However, this paper proposes an entirely new perspective in finding motifs.
We demonstrate that searching is non-optimal since the domain of motifs is
restricted, and instead we propose a principled optimization approach able to
find optimal motifs. We treat the occurrence frequency as a function and
time-series motifs as its parameters, therefore we \textit{learn} the optimal
motifs that maximize the frequency function. In contrast to searching, our
method is able to discover the most repetitive patterns (hence optimal), even
in cases where they do not explicitly occur as sub-sequences. Experiments on
several real-life time-series datasets show that the motifs found by our method
are substantially more frequent than those found through searching, for exactly
the same distance threshold.
Comment: Submitted to KDD201
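The learn-rather-than-search idea can be sketched by smoothing the occurrence count with a sigmoid and running gradient ascent on the motif vector itself. The sigmoid relaxation, step sizes, and initialization below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def smooth_frequency(motif, segments, t, alpha=1.0):
    """Differentiable occurrence count: sigmoid of (threshold - distance)."""
    d = np.sqrt(((segments - motif) ** 2).sum(axis=1))
    return 1.0 / (1.0 + np.exp(-alpha * (t - d)))

def learn_motif(segments, t, steps=300, lr=0.05, alpha=1.0):
    """Gradient ascent on the smoothed frequency w.r.t. the motif, so the
    motif is a learned parameter rather than a searched sub-sequence."""
    motif = segments.mean(axis=0).copy()
    for _ in range(steps):
        diff = segments - motif
        d = np.sqrt((diff ** 2).sum(axis=1)) + 1e-12
        s = 1.0 / (1.0 + np.exp(-alpha * (t - d)))
        grad = (alpha * s * (1.0 - s) / d)[:, None] * diff  # d(sum s)/d(motif)
        motif += lr * grad.sum(axis=0)
    return motif
```

Because the motif is unconstrained, the optimum can land between sub-sequences, matching the paper's point that the most repetitive pattern need not occur verbatim in the series.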
Scalable Discovery of Time-Series Shapelets
Time-series classification is an important problem for the data mining
community due to the wide range of application domains involving time-series
data. A recent paradigm, called shapelets, represents patterns that are highly
predictive for the target variable. Shapelets are discovered by measuring the
prediction accuracy of a set of potential (shapelet) candidates. The candidates
typically consist of all the segments of a dataset, therefore, the discovery of
shapelets is computationally expensive. This paper proposes a novel method that
avoids measuring the prediction accuracy of similar candidates in Euclidean
distance space, through an online clustering pruning technique. In addition,
our algorithm incorporates a supervised shapelet selection that filters out
only those candidates that improve classification accuracy. Empirical evidence
on 45 datasets from the UCR collection demonstrates that our method is three to
four orders of magnitude faster than the fastest existing shapelet-discovery
method, while providing better prediction accuracy.
Comment: Under review in the journal "Knowledge and Information Systems" (KAIS
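The online clustering pruning idea can be sketched as keeping only one representative per neighborhood of candidates, so near-duplicate candidates are never evaluated. The radius-based rule below is an illustrative simplification of the technique:

```python
import numpy as np

def prune_candidates(candidates, radius):
    """Online clustering pruning: keep a candidate only if it is farther
    than `radius` (Euclidean) from every representative kept so far, so
    similar candidates are skipped instead of re-evaluated."""
    kept = []
    for c in candidates:
        if all(np.linalg.norm(c - k) > radius for k in kept):
            kept.append(c)
    return kept
```

Only the kept representatives would then have their prediction accuracy measured, which is where the speed-up comes from.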
Scalable Pareto Front Approximation for Deep Multi-Objective Learning
Multi-objective optimization (MOO) is a prevalent challenge for Deep
Learning, however, there exists no scalable MOO solution for truly deep neural
networks. Prior work either demand optimizing a new network for every point on
the Pareto front, or induce a large overhead to the number of trainable
parameters by using hyper-networks conditioned on modifiable preferences. In
this paper, we propose to condition the network directly on these preferences
by augmenting them to the feature space. Furthermore, we ensure a well-spread
Pareto front by penalizing the solutions to maintain a small angle to the
preference vector. In a series of experiments, we demonstrate that our Pareto
fronts achieve state-of-the-art quality despite being computed significantly
faster. Furthermore, we showcase the scalability as our method approximates the
full Pareto front on the CelebA dataset with an EfficientNet network at a tiny
training time overhead of 7% compared to a simple single-objective
optimization. We make our code publicly available at
https://github.com/ruchtem/cosmos.
Comment: Accepted at ICDM 2021 as short paper. Adapt title to match published versio
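A rough sketch of the two ingredients described above, assuming a cosine-based angle penalty and simple input augmentation; the paper's exact penalty and training loop may differ, and both function names are hypothetical:

```python
import numpy as np

def condition_input(x, preference):
    """Condition the network on a preference vector by augmenting the
    feature space, instead of using a separate hyper-network."""
    return np.concatenate([x, preference])

def scalarized_loss(losses, preference, lam=1.0):
    """Preference-weighted sum of objectives plus a penalty on the angle
    between the loss vector and the preference ray, which encourages a
    well-spread Pareto front."""
    weighted = float(np.dot(preference, losses))
    cos = np.dot(losses, preference) / (
        np.linalg.norm(losses) * np.linalg.norm(preference) + 1e-12)
    return weighted + lam * (1.0 - cos)
```

When the loss vector is parallel to the preference, the angle penalty vanishes and only the weighted sum remains.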
Ultra-Fast Shapelets for Time Series Classification
Time series shapelets are discriminative subsequences and their similarity to
a time series can be used for time series classification. Since the discovery
of time series shapelets is costly in terms of time, the applicability on long
or multivariate time series is difficult. In this work we propose Ultra-Fast
Shapelets, which use a number of random shapelets. It is shown that Ultra-Fast
Shapelets yield the same prediction quality as current state-of-the-art
shapelet-based time series classifiers that carefully select the shapelets,
while being up to three orders of magnitude faster. Since this method allows
ultra-fast shapelet discovery, using shapelets for long multivariate time
series classification becomes feasible.
A method for using shapelets for multivariate time series is proposed and
Ultra-Fast Shapelets is shown to be successful in comparison to
state-of-the-art multivariate time series classifiers on 15 multivariate time
series datasets from various domains. Finally, time series derivatives that
have proven to be useful for other time series classifiers are investigated for
the shapelet-based classifiers. It is shown that they have a positive impact
and that they are easy to integrate with a simple preprocessing step, without
the need to adapt the shapelet discovery algorithm.
Comment: Preprint submitted to Journal of Data & Knowledge Engineering January
24, 201
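The random-shapelet idea reduces discovery to sampling plus a distance transform; a minimal numpy sketch of the univariate case follows (function names are hypothetical, and the paper's sampling scheme may differ in detail):

```python
import numpy as np

def random_shapelets(X, n_shapelets, length, rng):
    """Sample random subsequences of the dataset as shapelet candidates,
    skipping any accuracy-driven selection entirely."""
    n, m = X.shape
    rows = rng.integers(0, n, n_shapelets)
    starts = rng.integers(0, m - length + 1, n_shapelets)
    return np.array([X[r, s:s + length] for r, s in zip(rows, starts)])

def shapelet_transform(X, shapelets):
    """Feature = minimum Euclidean distance of each series to each shapelet;
    the result feeds any standard classifier."""
    L = shapelets.shape[1]
    feats = np.empty((X.shape[0], shapelets.shape[0]))
    for i, x in enumerate(X):
        segs = np.array([x[j:j + L] for j in range(len(x) - L + 1)])
        d = np.sqrt(((segs[:, None, :] - shapelets[None, :, :]) ** 2).sum(-1))
        feats[i] = d.min(axis=0)
    return feats
```

A shapelet sampled from a series has distance zero to that series, which is a handy sanity check for the transform.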
Time-Series Classification Through Histograms of Symbolic Polynomials
Time-series classification has attracted considerable research attention due
to the various domains where time-series data are observed, ranging from
medicine to econometrics. Traditionally, the focus of time-series
classification has been on short time-series data composed of a unique pattern
with intraclass pattern distortions and variations, while recently there have
been attempts to focus on longer series composed of various local patterns.
This study presents a novel method which can detect local patterns in long
time-series via fitting local polynomial functions of arbitrary degrees. The
coefficients of the polynomial functions are converted to symbolic words via
equivolume discretizations of the coefficients' distributions. The symbolic
polynomial words enable the detection of similar local patterns by assigning
the same words to similar polynomials. Moreover, a histogram of the frequencies
of the words is constructed from each time-series' bag of words. Each row of
the histogram provides a new representation for the series and symbolizes the
existence of local patterns and their frequencies. Experimental evidence
demonstrates outstanding results of our method compared to the state-of-the-art
baselines, exhibiting the best classification accuracies on all the datasets
and statistically significant improvements in the absolute majority of
experiments.
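The pipeline of local polynomial fitting, equivolume discretization of coefficients, and word histograms can be sketched as follows; the window size, degree, and bin counts are illustrative choices, not the paper's:

```python
import numpy as np
from collections import Counter

def polynomial_words(series, window, degree, n_bins):
    """Fit a polynomial to each sliding window and convert its coefficients
    into a symbolic word via equivolume (quantile) discretization, then
    count word frequencies into a bag-of-words histogram."""
    xs = np.arange(window)
    coefs = np.array([np.polyfit(xs, series[i:i + window], degree)
                      for i in range(len(series) - window + 1)])
    # equivolume bins: interior quantile edges per coefficient position
    edges = [np.quantile(coefs[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
             for j in range(coefs.shape[1])]
    words = []
    for row in coefs:
        letters = [chr(ord('a') + int(np.searchsorted(edges[j], row[j])))
                   for j in range(len(row))]
        words.append(''.join(letters))
    return Counter(words)
```

Similar local shapes yield similar coefficients and hence the same word, so the histogram counts how often each local pattern occurs.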
Learning Surrogate Losses
The minimization of loss functions is the heart and soul of Machine Learning.
In this paper, we propose an off-the-shelf optimization approach that can
minimize virtually any non-differentiable and non-decomposable loss function
(e.g., Misclassification Rate, AUC, F1, Jaccard Index, Matthews Correlation
Coefficient, etc.) seamlessly. Our strategy learns smooth relaxation versions
of the true losses by approximating them through a surrogate neural network.
The proposed loss networks are set-wise models which are invariant to the order
of mini-batch instances. Ultimately, the surrogate losses are learned jointly
with the prediction model via bilevel optimization. Empirical results on
multiple datasets with diverse real-life loss functions compared with
state-of-the-art baselines demonstrate the efficiency of learning surrogate
losses.
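A toy sketch of the surrogate idea, substituting a least-squares linear model on mean-pooled (order-invariant) batch features for the paper's neural surrogate and bilevel training; all names here are illustrative:

```python
import numpy as np

def set_features(y_pred, y_true):
    """Order-invariant batch summary: mean-pooled per-instance features,
    mimicking the set-wise property of the paper's loss networks."""
    per = np.stack([y_pred, y_true, y_pred * y_true,
                    (y_pred - y_true) ** 2], axis=1)
    return per.mean(axis=0)

def fit_surrogate(batches, true_loss):
    """Least-squares fit of a differentiable surrogate to a
    non-differentiable loss evaluated on sampled batches."""
    F = np.array([set_features(p, t) for p, t in batches])
    F = np.hstack([F, np.ones((len(F), 1))])           # intercept column
    y = np.array([true_loss(p, t) for p, t in batches])
    w, *_ = np.linalg.lstsq(F, y, rcond=None)
    return w

def surrogate_loss(w, y_pred, y_true):
    """Smooth stand-in for the true loss, usable as a training objective."""
    f = np.append(set_features(y_pred, y_true), 1.0)
    return float(f @ w)
```

In the paper the surrogate is a neural network trained jointly with the prediction model; the linear fit here only illustrates the approximate-then-optimize principle.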
Dataset2Vec: Learning Dataset Meta-Features
Meta-learning, or learning to learn, is a machine learning approach that
utilizes prior learning experiences to expedite the learning process on unseen
tasks. As a data-driven approach, meta-learning requires meta-features that
represent the primary learning tasks or datasets, and are estimated
traditionally as engineered dataset statistics that require expert domain
knowledge tailored for every meta-task. In this paper, first, we propose a
meta-feature extractor called Dataset2Vec that combines the versatility of
engineered dataset meta-features with the expressivity of meta-features learned
by deep neural networks. Primary learning tasks or datasets are represented as
hierarchical sets, i.e., as a set of sets, esp. as a set of predictor/target
pairs, and then a DeepSet architecture is employed to regress meta-features on
them. Second, we propose a novel auxiliary meta-learning task with abundant
data called dataset similarity learning that aims to predict if two batches
stem from the same dataset or different ones. In an experiment on a large-scale
hyperparameter optimization task for 120 UCI datasets with varying schemas as a
meta-learning task, we show that the meta-features of Dataset2Vec outperform
the expert engineered meta-features and thus demonstrate the usefulness of
learned meta-features for datasets with varying schemas for the first time.
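The hierarchical-set encoding can be sketched with a tiny numpy DeepSet: each dataset becomes a set of per-feature sets of (predictor, target) pairs, pooled twice so the output is invariant to both row order and schema width. Weights and layer sizes below are arbitrary assumptions:

```python
import numpy as np

def deepset(X, W1, W2):
    """phi -> mean-pool -> rho; mean pooling makes the output invariant
    to the order of set elements."""
    h = np.maximum(X @ W1, 0.0)          # phi, applied per element
    pooled = h.mean(axis=0)              # permutation-invariant pooling
    return np.maximum(pooled @ W2, 0.0)  # rho

def dataset2vec(X, y, params):
    """Represent a dataset as a set of sets of (predictor, target) pairs
    and regress a meta-feature vector on it with a DeepSet."""
    W1, W2 = params
    pairs = [np.stack([X[:, j], y], axis=1)  # one (n, 2) set per feature
             for j in range(X.shape[1])]
    per_feature = np.array([deepset(p, W1, W2) for p in pairs])
    return per_feature.mean(axis=0)          # pool over the set of feature sets
```

Shuffling the rows of the dataset leaves the meta-features unchanged, which is the property that lets one extractor handle datasets with varying schemas.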
Channel masking for multivariate time series shapelets
Time series shapelets are discriminative sub-sequences and their similarity
to time series can be used for time series classification. Initial shapelet
extraction algorithms searched shapelets by complete enumeration of all
possible data sub-sequences. Research on shapelets for univariate time series
proposed a mechanism called shapelet learning which parameterizes the shapelets
and learns them jointly with a prediction model in an optimization procedure.
Trivial extension of this method to multivariate time series does not yield
very good results due to the presence of noisy channels which lead to
overfitting. In this paper we propose a shapelet learning scheme for
multivariate time series in which we introduce channel masks to discount noisy
channels and serve as an implicit regularization.
Comment: 12 page
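The channel-mask idea can be sketched as a sigmoid-gated, per-channel weighting inside the shapelet distance; the learning loop and the exact regularization are omitted, and the normalization choice below is an assumption:

```python
import numpy as np

def masked_distance(segment, shapelet, mask_logits):
    """Per-channel squared distances weighted by a soft channel mask
    (sigmoid of learned logits), so noisy channels can be discounted.
    `segment` and `shapelet` are (channels, length) arrays."""
    mask = 1.0 / (1.0 + np.exp(-mask_logits))        # soft mask in (0, 1)
    d2 = ((segment - shapelet) ** 2).mean(axis=1)    # per-channel distance
    return float((mask * d2).sum() / (mask.sum() + 1e-12))
```

With a strongly negative logit on a noisy channel, that channel's distance contributes almost nothing, which is the intended implicit regularization.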
Data-Driven Vehicle Trajectory Forecasting
An active area of research is to increase the safety of self-driving
vehicles. Although safety cannot be guaranteed completely, the capability of a
vehicle to predict the future trajectories of its surrounding vehicles could
help ensure this notion of safety to a greater degree. We cast the trajectory
forecast problem as a multi-time-step forecasting problem and develop a
Convolutional Neural Network based approach to learn from trajectory sequences
generated from a completely raw dataset in real time. Results show improvement
over baselines.
Comment: Published in ECML KNOWMe: 2nd International Workshop on Knowledge
Discovery from Mobility and Transportation Systems 201
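A minimal numpy sketch of a multi-time-step convolutional forecaster over (x, y) trajectory channels; the architecture, layer sizes, and parameter layout are arbitrary assumptions, not the paper's model:

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid 1-D convolution over the time axis, channels-first layout.
    x: (c_in, T), kernels: (c_out, c_in, k)."""
    c_out, c_in, k = kernels.shape
    t_out = x.shape[1] - k + 1
    out = np.zeros((c_out, t_out))
    for o in range(c_out):
        for t in range(t_out):
            out[o, t] = (kernels[o] * x[:, t:t + k]).sum() + bias[o]
    return out

def forecast(history, params, horizon):
    """Multi-time-step forecast: conv features -> flatten -> linear head
    that emits all `horizon` future (x, y) positions at once."""
    K, b, W, c = params
    h = np.maximum(conv1d(history, K, b), 0.0)   # ReLU conv features
    out = h.ravel() @ W + c
    return out.reshape(horizon, 2)
```

Emitting the whole horizon in one forward pass, rather than recursing one step at a time, is one common way to set up multi-time-step forecasting.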