On Recursive Edit Distance Kernels with Application to Time Series Classification
This paper proposes some extensions to the work on kernels dedicated to
string or time series global alignment based on the aggregation of scores
obtained by local alignments. The extensions we propose make it possible to
construct, from the classical recursive definition of elastic distances,
recursive edit distance (or time-warp) kernels that are positive definite if
some sufficient conditions are satisfied. The sufficient conditions we end up
with are original and weaker than those proposed in earlier works, although a
recursive regularizing term is required to obtain the proof of positive
definiteness as a direct consequence of Haussler's convolution theorem. The
classification experiment we conducted on three classical time-warp distances
(two of which are metrics), using a Support Vector Machine classifier, leads us
to conclude that, when the pairwise distance matrix obtained from the training
data is far from definiteness, the positive definite recursive elastic kernels
generally outperform the distance-substituting kernels for the classical
elastic distances we have tested. Comment: 14 pages
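To make the notion of a distance-substituting kernel and its possible lack of definiteness concrete, here is a minimal Python sketch. It is not the paper's recursive kernel construction: it assumes plain dynamic time warping (DTW) as the elastic distance and a Gaussian substitution exp(-d^2 / (2 sigma^2)), and the function names and the `sigma` parameter are illustrative choices. The smallest eigenvalue of the resulting Gram matrix gives a rough indication of how far the matrix is from positive semi-definiteness.

```python
import numpy as np

def dtw_distance(a, b):
    """Classical DTW with absolute-difference local cost (assumed here)."""
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])

def distance_substitution_gram(series, sigma=1.0):
    """Gram matrix obtained by substituting DTW into a Gaussian kernel."""
    k = len(series)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
    return np.exp(-dist ** 2 / (2.0 * sigma ** 2))

def min_eigenvalue(gram):
    """Smallest eigenvalue of a symmetric matrix; negative means not PSD."""
    return float(np.linalg.eigvalsh(gram).min())
```

A Gram matrix whose smallest eigenvalue is clearly negative is the situation the abstract describes as far from definiteness, where a kernel with a positive definiteness guarantee may be preferable.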
Supervised Learning with Indefinite Topological Kernels
Topological Data Analysis (TDA) is a recent and growing branch of statistics
devoted to the study of the shape of the data. In this work we investigate the
predictive power of TDA in the context of supervised learning. Since
topological summaries, most notably the Persistence Diagram, are typically
defined in complex spaces, we adopt a kernel approach to translate them into
more familiar vector spaces. We define a topological exponential kernel, we
characterize it, and we show that, despite not being positive semi-definite, it
can be successfully used in regression and classification tasks.
Identifying networks with common organizational principles
Many complex systems can be represented as networks, and the problem of
network comparison is becoming increasingly relevant. There are many techniques
for network comparison, from simply comparing network summary statistics to
sophisticated but computationally costly alignment-based approaches. Yet it
remains challenging to accurately cluster networks that differ in size
and density but are hypothesized to be structurally similar. In this paper, we
address this problem by introducing a new network comparison methodology that
is aimed at identifying common organizational principles in networks. The
methodology is simple, intuitive and applicable in a wide variety of settings
ranging from the functional classification of proteins to tracking the
evolution of a world trade network. Comment: 26 pages, 7 figures
A review on distance based time series classification
Time series classification is a growing research topic due to the vast amount of time series data
that is being created across a wide variety of fields. The particular nature of the data makes it a challenging task
and different approaches have been taken, including the distance based approach. 1-NN has been a widely used
method within distance based time series classification due to its simplicity and nonetheless good performance. However,
its supremacy may be attributed to being able to use specific distances for time series within the classification
process and not to the classifier itself. With the aim of exploiting these distances within more complex classifiers,
new approaches have arisen in the past few years that are competitive with or outperform the 1-NN based
approaches. In some cases, these new methods use the distance measure to transform the series into feature
vectors, bridging the gap between time series and traditional classifiers. In other cases, the distances are employed
to obtain a time series kernel and enable the use of kernel methods for time series classification. One of the main
challenges is that a kernel function must be positive semi-definite, a matter that is also addressed within this
review. The presented review includes a taxonomy of all those methods that aim to classify time series using a
distance based approach, as well as a discussion of the strengths and weaknesses of each method. TIN2016-78365-
- …
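Two of the ideas summarized in the review above, 1-NN classification with a time series distance and the use of distances as feature vectors, can be sketched in a few lines of Python. This is an illustrative sketch only: DTW with an absolute-difference local cost is assumed as the distance, and the function names `dtw`, `predict_1nn` and `distance_features` are hypothetical, not taken from the review.

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])

def predict_1nn(train_series, train_labels, query, distance=dtw):
    """Return the label of the training series closest to the query."""
    dists = [distance(query, s) for s in train_series]
    return train_labels[int(np.argmin(dists))]

def distance_features(series, references, distance=dtw):
    """Represent each series by its vector of distances to reference series."""
    return np.array([[distance(s, r) for r in references] for s in series])

# Toy usage with made-up data: two training series of different lengths.
train = [[0, 0, 0, 0], [5, 5, 5, 5]]
labels = ["low", "high"]
print(predict_1nn(train, labels, [0, 1, 0]))
print(distance_features([[0, 1, 0]], train))
```

The feature matrix returned by `distance_features` can be passed to any classifier that expects fixed-length vectors, which is the sense in which such transformations bridge time series and traditional classifiers.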