471 research outputs found

    On Recursive Edit Distance Kernels with Application to Time Series Classification

    Get PDF
    This paper proposes some extensions to the work on kernels dedicated to string or time series global alignment based on the aggregation of scores obtained by local alignments. The extensions we propose allow to construct, from classical recursive definition of elastic distances, recursive edit distance (or time-warp) kernels that are positive definite if some sufficient conditions are satisfied. The sufficient conditions we end-up with are original and weaker than those proposed in earlier works, although a recursive regularizing term is required to get the proof of the positive definiteness as a direct consequence of the Haussler's convolution theorem. The classification experiment we conducted on three classical time warp distances (two of which being metrics), using Support Vector Machine classifier, leads to conclude that, when the pairwise distance matrix obtained from the training data is \textit{far} from definiteness, the positive definite recursive elastic kernels outperform in general the distance substituting kernels for the classical elastic distances we have tested.Comment: 14 page

    Supervised Learning with Indefinite Topological Kernels

    Full text link
    Topological Data Analysis (TDA) is a recent and growing branch of statistics devoted to the study of the shape of the data. In this work we investigate the predictive power of TDA in the context of supervised learning. Since topological summaries, most noticeably the Persistence Diagram, are typically defined in complex spaces, we adopt a kernel approach to translate them into more familiar vector spaces. We define a topological exponential kernel, we characterize it, and we show that, despite not being positive semi-definite, it can be successfully used in regression and classification tasks

    Identifying networks with common organizational principles

    Full text link
    Many complex systems can be represented as networks, and the problem of network comparison is becoming increasingly relevant. There are many techniques for network comparison, from simply comparing network summary statistics to sophisticated but computationally costly alignment-based approaches. Yet it remains challenging to accurately cluster networks that are of a different size and density, but hypothesized to be structurally similar. In this paper, we address this problem by introducing a new network comparison methodology that is aimed at identifying common organizational principles in networks. The methodology is simple, intuitive and applicable in a wide variety of settings ranging from the functional classification of proteins to tracking the evolution of a world trade network.Comment: 26 pages, 7 figure

    A review on distance based time series classification

    Get PDF
    Time series classification is an increasing research topic due to the vast amount of time series data that is being created over a wide variety of fields. The particularity of the data makes it a challenging task and different approaches have been taken, including the distance based approach. 1-NN has been a widely used method within distance based time series classification due to its simplicity but still good performance. However, its supremacy may be attributed to being able to use specific distances for time series within the classification process and not to the classifier itself. With the aim of exploiting these distances within more complex classifiers, new approaches have arisen in the past few years that are competitive or which outperform the 1-NN based approaches. In some cases, these new methods use the distance measure to transform the series into feature vectors, bridging the gap between time series and traditional classifiers. In other cases, the distances are employed to obtain a time series kernel and enable the use of kernel methods for time series classification. One of the main challenges is that a kernel function must be positive semi-definite, a matter that is also addressed within this review. The presented review includes a taxonomy of all those methods that aim to classify time series using a distance based approach, as well as a discussion of the strengths and weaknesses of each method.TIN2016-78365-
    • …
    corecore