512 research outputs found

    Nonlinear Supervised Dimensionality Reduction via Smooth Regular Embeddings

    The recovery of the intrinsic geometric structure of data collections is an important problem in data analysis. Supervised extensions of several manifold learning approaches have been proposed in recent years. However, existing methods focus primarily on embedding the training data, while the generalization of the embedding to initially unseen test data is largely ignored. In this work, we build on recent theoretical results on the generalization performance of supervised manifold learning algorithms. Motivated by these performance bounds, we propose a supervised manifold learning method that computes a nonlinear embedding while constructing a smooth and regular interpolation function that extends the embedding to the whole data space in order to achieve satisfactory generalization. The embedding and the interpolator are learnt jointly so that Lipschitz regularity of the interpolator is imposed while the separation between different classes is preserved. Experimental results on several image data sets show that, in most settings, the proposed method outperforms traditional classifiers and the compared supervised dimensionality reduction algorithms in terms of classification accuracy.
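
    The abstract does not spell out the optimization, but the out-of-sample idea it describes can be illustrated with a minimal sketch: fit a smooth, regularized RBF interpolator from the input space to an already-computed supervised embedding, so that unseen test points can be mapped into the same space. The function name, the kernel width sigma, and the regularization weight reg below are illustrative assumptions, not the paper's joint embedding/interpolator optimization.

```python
import numpy as np

def rbf_interpolator(X_train, Y_train, sigma=1.0, reg=1e-3):
    """Smooth map from input space to a precomputed embedding Y_train.
    A larger `reg` gives a smoother, more regular interpolator."""
    D = np.linalg.norm(X_train[:, None, :] - X_train[None, :, :], axis=-1)
    K = np.exp(-D**2 / (2 * sigma**2))
    coef = np.linalg.solve(K + reg * np.eye(len(X_train)), Y_train)

    def extend(X_new):
        # Map previously unseen points into the embedding space.
        D_new = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=-1)
        return np.exp(-D_new**2 / (2 * sigma**2)) @ coef

    return extend

# Usage with hypothetical data:
#   f = rbf_interpolator(X_train, Y_train)
#   Z_test = f(X_test)
```

    Increasing reg trades fidelity to the training embedding for a smoother map, which is the kind of regularity the generalization bounds motivate.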

    Multi-view Subspace Learning for Large-Scale Multi-Modal Data Analysis

    Dimensionality reduction methods play an important role in modern machine learning, and subspace learning is one of the common approaches to it. Although various methods have been proposed over the past years, many of them suffer from limitations related to unimodality assumptions on the data and from low speed on high-dimensional data (in linear formulations) or large datasets (in kernel-based formulations). In this thesis, the problem of large-scale multi-modal data analysis for single- and multi-view data is discussed, and several extensions of Subclass Discriminant Analysis (SDA) are proposed to overcome these limitations. First, a Spectral Regression Subclass Discriminant Analysis method relying on the Graph Embedding-based formulation of SDA is proposed as a way to reduce the training time, and it is shown how the solution can be obtained efficiently, thereby reducing the computational requirements. Second, a novel multi-view formulation of Subclass Discriminant Analysis is proposed, extending it to data coming from multiple views, together with a speed-up approach that reduces the computational requirements of the multi-view method. Linear and nonlinear kernel-based formulations are given for all the extensions. Experiments are performed on nine single-view and nine multi-view datasets, and the accuracy and speed of the proposed extensions are evaluated. It is shown experimentally that the proposed approaches significantly reduce the training time while providing competitive performance compared to other subspace-learning-based methods.
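
    As a rough illustration of the common-subspace idea described above (not the thesis's actual subclass discriminant criterion), each view can be given its own linear projection that maps it onto shared label-derived targets, so that all views end up in one low-dimensional space. The ridge-regression formulation and the reg parameter below are assumptions made only for this sketch.

```python
import numpy as np

def multiview_projections(views, labels, reg=1e-2):
    """Learn one projection per view by ridge-regressing each view onto
    shared one-hot class targets, so all views land in a common subspace.
    Illustrative only; the thesis uses a subclass-based discriminant criterion."""
    classes = np.unique(labels)
    T = (labels[:, None] == classes[None, :]).astype(float)   # n x c shared targets
    projections = []
    for X in views:                                            # X: n x d_v for view v
        A = X.T @ X + reg * np.eye(X.shape[1])
        projections.append(np.linalg.solve(A, X.T @ T))        # d_v x c projection
    return projections                                         # embed view v as views[v] @ projections[v]
```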

    Speed-up and multi-view extensions to subclass discriminant analysis

    Highlights
    • We present a speed-up extension to Subclass Discriminant Analysis.
    • We propose an extension to SDA for multi-view problems and a fast solution to it.
    • The proposed approaches result in lower training time and competitive performance.

    In this paper, we propose a speed-up approach for subclass discriminant analysis and formulate a novel efficient multi-view solution to it. The speed-up approach is developed based on the graph embedding and spectral regression approaches, which involve eigendecomposition of the corresponding Laplacian matrix and regression to its eigenvectors. We show that, by exploiting the structure of the between-class Laplacian matrix, the eigendecomposition step can be replaced with a much faster procedure. Furthermore, we formulate a novel criterion for multi-view subclass discriminant analysis and show that an efficient solution to it can be obtained in a similar manner to the single-view case. We evaluate the proposed methods on nine single-view and nine multi-view datasets and compare them with related existing approaches. Experimental results show that the proposed solutions achieve competitive performance, often outperforming the existing methods, while significantly decreasing the training time.
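
    A minimal sketch of the spectral-regression idea mentioned above: because the between-(sub)class Laplacian has a block structure induced by the subclass labels, target vectors can be built directly from orthonormalized subclass indicators instead of running an eigendecomposition, and the projection is then obtained by regularized regression onto those targets. The exact target construction and criterion in the paper differ; the helper functions and the reg parameter below are assumptions for illustration.

```python
import numpy as np

def indicator_targets(subclass_labels):
    """Closed-form targets built from subclass indicators, standing in for
    eigenvectors of the between-class Laplacian (illustrative sketch only)."""
    groups = np.unique(subclass_labels)
    Y = (subclass_labels[:, None] == groups[None, :]).astype(float)  # n x g indicators
    Y -= Y.mean(axis=0)                                              # centre the indicators
    Q, _ = np.linalg.qr(Y)                                           # orthonormalise
    return Q[:, :len(groups) - 1]                                    # drop the redundant direction

def regression_step(X, T, reg=1e-2):
    """Ridge regression of the data onto the targets yields the projection."""
    return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ T)
```

    The point of the sketch is the cost profile: building targets from indicators is linear in the number of samples, whereas a dense eigendecomposition is not.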

    Unified functional network and nonlinear time series analysis for complex systems science: The pyunicorn package

    We introduce the pyunicorn (Pythonic unified complex network and recurrence analysis toolbox) open-source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. pyunicorn is a fully object-oriented and easily parallelizable package written in Python. It allows for the construction of functional networks, such as climate networks in climatology or functional brain networks in neuroscience, which represent the structure of statistical interrelationships in large time series data sets, and for the subsequent investigation of this structure using advanced methods of complex network theory, such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics, and network surrogates. Additionally, pyunicorn provides insights into the nonlinear dynamics of complex systems, as recorded in uni- and multivariate time series, from a non-traditional perspective by means of recurrence quantification analysis (RQA), recurrence networks, visibility graphs, and the construction of surrogate time series. The range of possible applications of the library is outlined, drawing on several examples mainly from the field of climatology.
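
    The recurrence-based measures mentioned above can be illustrated without the package itself: the sketch below, in plain NumPy rather than pyunicorn's own classes, builds a thresholded recurrence matrix from a scalar time series and computes the recurrence rate, one of the basic RQA measures. The threshold value and the synthetic signal are illustrative assumptions.

```python
import numpy as np

def recurrence_matrix(x, threshold):
    # A recurrence occurs whenever two observations are closer than `threshold`.
    d = np.abs(x[:, None] - x[None, :])
    return (d <= threshold).astype(int)

def recurrence_rate(R):
    # Fraction of recurrent pairs, excluding the trivial main diagonal.
    n = R.shape[0]
    return (R.sum() - n) / (n * (n - 1))

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 8 * np.pi, 400)) + 0.1 * rng.standard_normal(400)
R = recurrence_matrix(x, threshold=0.2)
print("recurrence rate:", recurrence_rate(R))
```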

    Applications of Optimal Transportation in the Natural Sciences (online meeting)

    Concepts and methods from the mathematical theory of optimal transportation have reached significant importance in various fields of the natural sciences. The view on classical problems from a "transport perspective" has led to the development of powerful problem-adapted mathematical tools, and sometimes to a novel geometric understanding of the matter. The natural sciences, in turn, are the most important source of ideas for the further development of optimal transport theory, and are a driving force for the design of efficient and reliable numerical methods to approximate Wasserstein distances and the like. The presentations and discussions in this workshop centered on recent analytical results and numerical methods in the field of optimal transportation that have been motivated by specific applications in statistical physics, quantum mechanics, and chemistry.
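
    One numerical workhorse alluded to above is the entropic (Sinkhorn) approximation of the Wasserstein distance between discrete distributions. The minimal NumPy sketch below is a generic textbook version, not a method presented at the workshop; the regularization strength and iteration count are illustrative assumptions.

```python
import numpy as np

def sinkhorn_plan(a, b, C, reg=0.05, n_iter=500):
    """Entropic-regularized optimal transport between histograms a and b
    for cost matrix C; returns the approximate transport plan."""
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Two Gaussian-like histograms on a 1-D grid with squared-distance cost.
x = np.linspace(0.0, 1.0, 50)
a = np.exp(-((x - 0.2) ** 2) / 0.01); a /= a.sum()
b = np.exp(-((x - 0.7) ** 2) / 0.01); b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2
P = sinkhorn_plan(a, b, C)
print("approximate transport cost:", (P * C).sum())
```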