24,987 research outputs found

    Distance-based kernels for real-valued data

    Get PDF
    We consider distance-based similarity measures for real-valued vectors of interest in kernel-based machine learning algorithms. In particular, a truncated Euclidean similarity measure and a self-normalized similarity measure related to the Canberra distance. It is proved that they are positive semi-definite (p.s.d.), thus facilitating their use in kernel-based methods, like the Support Vector Machine, a very popular machine learning tool. These kernels may be better suited than standard kernels (like the RBF) in certain situations, that are described in the paper. Some rather general results concerning positivity properties are presented in detail as well as some interesting ways of proving the p.s.d. property.Peer ReviewedPostprint (author's final draft

    Dynamic mode decomposition in vector-valued reproducing kernel Hilbert spaces for extracting dynamical structure among observables

    Full text link
    Understanding nonlinear dynamical systems (NLDSs) is challenging in a variety of engineering and scientific fields. Dynamic mode decomposition (DMD), which is a numerical algorithm for the spectral analysis of Koopman operators, has been attracting attention as a way of obtaining global modal descriptions of NLDSs without requiring explicit prior knowledge. However, since existing DMD algorithms are in principle formulated based on the concatenation of scalar observables, it is not directly applicable to data with dependent structures among observables, which take, for example, the form of a sequence of graphs. In this paper, we formulate Koopman spectral analysis for NLDSs with structures among observables and propose an estimation algorithm for this problem. This method can extract and visualize the underlying low-dimensional global dynamics of NLDSs with structures among observables from data, which can be useful in understanding the underlying dynamics of such NLDSs. To this end, we first formulate the problem of estimating spectra of the Koopman operator defined in vector-valued reproducing kernel Hilbert spaces, and then develop an estimation procedure for this problem by reformulating tensor-based DMD. As a special case of our method, we propose the method named as Graph DMD, which is a numerical algorithm for Koopman spectral analysis of graph dynamical systems, using a sequence of adjacency matrices. We investigate the empirical performance of our method by using synthetic and real-world data.Comment: 34 pages with 4 figures, Published in Neural Networks, 201

    A binary neural k-nearest neighbour technique

    Get PDF
    K-Nearest Neighbour (k-NN) is a widely used technique for classifying and clustering data. K-NN is effective but is often criticised for its polynomial run-time growth as k-NN calculates the distance to every other record in the data set for each record in turn. This paper evaluates a novel k-NN classifier with linear growth and faster run-time built from binary neural networks. The binary neural approach uses robust encoding to map standard ordinal, categorical and real-valued data sets onto a binary neural network. The binary neural network uses high speed pattern matching to recall the k-best matches. We compare various configurations of the binary approach to a conventional approach for memory overheads, training speed, retrieval speed and retrieval accuracy. We demonstrate the superior performance with respect to speed and memory requirements of the binary approach compared to the standard approach and we pinpoint the optimal configurations

    Positive Definite Kernels in Machine Learning

    Full text link
    This survey is an introduction to positive definite kernels and the set of methods they have inspired in the machine learning literature, namely kernel methods. We first discuss some properties of positive definite kernels as well as reproducing kernel Hibert spaces, the natural extension of the set of functions {k(x,),xX}\{k(x,\cdot),x\in\mathcal{X}\} associated with a kernel kk defined on a space X\mathcal{X}. We discuss at length the construction of kernel functions that take advantage of well-known statistical models. We provide an overview of numerous data-analysis methods which take advantage of reproducing kernel Hilbert spaces and discuss the idea of combining several kernels to improve the performance on certain tasks. We also provide a short cookbook of different kernels which are particularly useful for certain data-types such as images, graphs or speech segments.Comment: draft. corrected a typo in figure

    Learning Sets with Separating Kernels

    Full text link
    We consider the problem of learning a set from random samples. We show how relevant geometric and topological properties of a set can be studied analytically using concepts from the theory of reproducing kernel Hilbert spaces. A new kind of reproducing kernel, that we call separating kernel, plays a crucial role in our study and is analyzed in detail. We prove a new analytic characterization of the support of a distribution, that naturally leads to a family of provably consistent regularized learning algorithms and we discuss the stability of these methods with respect to random sampling. Numerical experiments show that the approach is competitive, and often better, than other state of the art techniques.Comment: final versio
    corecore