
    Riemannian kernel based Nyström method for approximate infinite-dimensional covariance descriptors with application to image set classification

    In the domain of pattern recognition, representing data with covariance descriptors (CovDs) and taking the metrics of the resulting Riemannian manifold into account has been widely adopted for the task of image set classification. Recently, it has been proven that infinite-dimensional CovDs are more discriminative than their low-dimensional counterparts. However, the form of infinite-dimensional CovDs is implicit and the computational load is high. We propose a novel framework for representing image sets by approximating infinite-dimensional CovDs in the paradigm of the Nyström method based on a Riemannian kernel. We start by modeling the images via CovDs, which lie on the Riemannian manifold spanned by SPD (Symmetric Positive Definite) matrices. We then extend the Nyström method to the SPD manifold and obtain the approximations of CovDs in RKHS (Reproducing Kernel Hilbert Space). Finally, we approximate infinite-dimensional CovDs via these approximations. Empirically, we apply our framework to the task of image set classification. The experimental results obtained on three benchmark datasets show that our proposed approximate infinite-dimensional CovDs outperform the original CovDs. Comment: 6 pages, 3 figures, International Conference on Pattern Recognition 201
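
As a rough illustration of the Nyström step described in this abstract, the sketch below maps SPD matrices to finite-dimensional feature vectors whose inner products approximate a kernel on the SPD manifold. It is not the authors' implementation: the log-Euclidean Gaussian kernel, the function names (`spd_log`, `le_gaussian_kernel`, `nystrom_features`), the landmark choice, and the toy data are all assumptions made for this example.

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def le_gaussian_kernel(logs_a, logs_b, sigma):
    """Assumed kernel: k(X, Y) = exp(-||log X - log Y||_F^2 / (2 sigma^2))."""
    A = logs_a.reshape(len(logs_a), -1)
    B = logs_b.reshape(len(logs_b), -1)
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def nystrom_features(logs, landmark_logs, sigma, eps=1e-10):
    """Return Z such that Z @ Z.T approximates the full kernel matrix."""
    K_mm = le_gaussian_kernel(landmark_logs, landmark_logs, sigma)
    K_nm = le_gaussian_kernel(logs, landmark_logs, sigma)
    w, V = np.linalg.eigh(K_mm)
    keep = w > eps  # drop numerically null directions
    return K_nm @ V[:, keep] / np.sqrt(w[keep])

# Toy data (hypothetical): 40 random 5x5 SPD matrices, first 10 used as landmarks.
rng = np.random.default_rng(0)

def random_spd(d):
    A = rng.standard_normal((d, 2 * d))
    return A @ A.T / (2 * d) + 1e-3 * np.eye(d)

spd_set = np.stack([random_spd(5) for _ in range(40)])
logs = np.stack([spd_log(S) for S in spd_set])
Z = nystrom_features(logs, logs[:10], sigma=2.0)
```

The Nyström approximation reproduces the kernel exactly on the landmark points, so the quality on the remaining points depends on how well the landmarks cover the data.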

    The Role of Riemannian Manifolds in Computer Vision: From Coding to Deep Metric Learning

    A diverse range of tasks in computer vision and machine learning benefit from representations of data that are compact yet discriminative, informative and robust to critical measurements. Two notable representations are offered by Region Covariance Descriptors (RCovD) and linear subspaces, which are naturally analyzed through the manifold of Symmetric Positive Definite (SPD) matrices and the Grassmann manifold, respectively, two widely used types of Riemannian manifolds in computer vision. As our first objective, we examine image and video-based recognition applications where the local descriptors have the aforementioned Riemannian structures, namely the SPD or linear subspace structure. Initially, we provide a solution to compute a Riemannian version of the conventional Vector of Locally Aggregated Descriptors (VLAD), using the geodesic distance of the underlying manifold as the nearness measure. Next, by having a closer look at the resulting codes, we formulate a new concept which we name Local Difference Vectors (LDV). LDVs enable us to elegantly expand our Riemannian coding techniques to any arbitrary metric as well as provide intrinsic solutions to Riemannian sparse coding and its variants when local structured descriptors are considered. We then turn our attention to two special types of covariance descriptors, namely infinite-dimensional RCovDs and rank-deficient covariance matrices, for which the underlying Riemannian structure, i.e. the manifold of SPD matrices, is largely out of reach. Generally speaking, infinite-dimensional RCovDs offer better discriminatory power than their low-dimensional counterparts. To overcome this difficulty, we propose to approximate the infinite-dimensional RCovDs by making use of two feature mappings, namely random Fourier features and the Nyström method. As for the rank-deficient covariance matrices, unlike most existing approaches that employ inference tools with predefined regularizers, we derive positive definite kernels that can be decomposed into the kernels on the cone of SPD matrices and kernels on the Grassmann manifolds and show their effectiveness for the image set classification task. Furthermore, inspired by attractive properties of Riemannian optimization techniques, we extend the recently introduced Keep It Simple and Straightforward MEtric learning (KISSME) method to scenarios where the input data is non-linearly distributed. To this end, we make use of the infinite-dimensional covariance matrices and propose techniques for projecting onto the positive cone in a Reproducing Kernel Hilbert Space (RKHS). We also address the sensitivity of KISSME to the input dimensionality. The KISSME algorithm depends heavily on Principal Component Analysis (PCA) as a preprocessing step, which can lead to difficulties, especially when the dimensionality is not meticulously set. To address this issue, based on the KISSME algorithm, we develop a Riemannian framework to jointly learn a mapping performing dimensionality reduction and a metric in the induced space. Lastly, in line with the recent trend in metric learning, we devise end-to-end learning of a generic deep network for metric learning using our derivation
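
For orientation, the baseline KISSME metric that this abstract builds on can be sketched in a few lines: the metric is the difference of the inverse covariances of similar-pair and dissimilar-pair difference vectors, projected onto the positive semidefinite cone by clipping negative eigenvalues. This is only the standard linear version, not the kernelized or Riemannian extensions the thesis proposes. The function names, the small ridge term, and the toy data are assumptions made for this example.

```python
import numpy as np

def kissme_metric(diffs_similar, diffs_dissimilar, ridge=1e-6):
    """M = inv(Cov_S) - inv(Cov_D), projected onto the PSD cone."""
    d = diffs_similar.shape[1]
    cov_s = diffs_similar.T @ diffs_similar / len(diffs_similar) + ridge * np.eye(d)
    cov_d = diffs_dissimilar.T @ diffs_dissimilar / len(diffs_dissimilar) + ridge * np.eye(d)
    M = np.linalg.inv(cov_s) - np.linalg.inv(cov_d)
    # PSD projection: symmetrize, then clip negative eigenvalues to zero.
    w, V = np.linalg.eigh((M + M.T) / 2.0)
    return (V * np.clip(w, 0.0, None)) @ V.T

def sq_dist(diffs, M):
    """Squared Mahalanobis-type distance d^T M d for each row d."""
    return np.einsum("ij,jk,ik->i", diffs, M, diffs)

# Toy pairwise differences (hypothetical): similar pairs differ little,
# dissimilar pairs differ a lot.
rng = np.random.default_rng(0)
diffs_sim = 0.1 * rng.standard_normal((500, 4))
diffs_dis = rng.standard_normal((500, 4))
M = kissme_metric(diffs_sim, diffs_dis)
```

Because the covariances are estimated in the input space, their inverses become unstable in high dimensions, which is why the original method relies on PCA beforehand.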

    Eigendecompositions of Transfer Operators in Reproducing Kernel Hilbert Spaces

    Transfer operators such as the Perron–Frobenius or Koopman operator play an important role in the global analysis of complex dynamical systems. The eigenfunctions of these operators can be used to detect metastable sets, to project the dynamics onto the dominant slow processes, or to separate superimposed signals. We extend transfer operator theory to reproducing kernel Hilbert spaces and show that these operators are related to Hilbert space representations of conditional distributions, known as conditional mean embeddings in the machine learning community. Moreover, numerical methods to compute empirical estimates of these embeddings are akin to data-driven methods for the approximation of transfer operators such as extended dynamic mode decomposition and its variants. One main benefit of the presented kernel-based approaches is that these methods can be applied to any domain where a similarity measure given by a kernel is available. We illustrate the results with the aid of guiding examples and highlight potential applications in molecular dynamics as well as video and text data analysis.
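
The link between conditional mean embeddings and the Koopman operator mentioned in this abstract can be made concrete with a small sketch. The Koopman operator acts on an observable f as (Kf)(x) = E[f(Y) | X = x], and the usual empirical conditional mean embedding estimates that expectation by kernel ridge regression on snapshot pairs. The code below is a sketch under stated assumptions, not the paper's algorithm: the Gaussian kernel, the function names, the regularization value, and the toy linear system are all chosen for this example.

```python
import numpy as np

def gaussian_gram(a, b, sigma):
    """Gaussian kernel matrix between two 1-D sample arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))

def koopman_apply(f_of_y, x_train, x_query, sigma=1.0, reg=1e-6):
    """Estimate (K f)(x) = E[f(Y) | X = x] at the query points.

    Uses the empirical conditional mean embedding:
    f(Y)^T (G_XX + n * reg * I)^{-1} k_X(x).
    """
    n = len(x_train)
    G = gaussian_gram(x_train, x_train, sigma)
    k_query = gaussian_gram(x_train, x_query, sigma)
    weights = np.linalg.solve(G + n * reg * np.eye(n), k_query)
    return f_of_y @ weights

# Toy dynamics (hypothetical): x_{t+1} = 0.9 * x_t, observable f(y) = y,
# so the exact Koopman image is (K f)(x) = 0.9 * x.
x = np.linspace(-2.0, 2.0, 200)
y = 0.9 * x
est = koopman_apply(y, x, np.array([1.0]))
```

Only kernel evaluations between samples enter the estimate, which is why the same recipe carries over to any domain where a kernel is available.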

    Non-Linear Temporal Subspace Representations for Activity Recognition

    Representations that can compactly and effectively capture the temporal evolution of semantic content are important to computer vision and machine learning algorithms that operate on multivariate time-series data. We investigate such representations motivated by the task of human action recognition. Here each data instance is encoded by a multivariate feature (such as via a deep CNN) where action dynamics are characterized by their variations in time. As these features are often non-linear, we propose a novel pooling method, kernelized rank pooling, that represents a given sequence compactly as the pre-image of the parameters of a hyperplane in a reproducing kernel Hilbert space, projections of data onto which capture their temporal order. We develop this idea further and show that such a pooling scheme can be cast as an order-constrained kernelized PCA objective. We then propose to use the parameters of a kernelized low-rank feature subspace as the representation of the sequences. We cast our formulation as an optimization problem on generalized Grassmann manifolds and then solve it efficiently using Riemannian optimization techniques. We present experiments on several action recognition datasets using diverse feature modalities and demonstrate state-of-the-art results. Comment: Accepted at the IEEE International Conference on Computer Vision and Pattern Recognition, CVPR, 2018. arXiv admin note: substantial text overlap with arXiv:1705.0858

    Learning by correlation for computer vision applications: from Kernel methods to deep learning

    Learning to spot analogies and differences within and across visual categories is arguably a powerful approach in machine learning and pattern recognition, directly inspired by human cognition. In this thesis, we investigate a variety of approaches which are primarily driven by correlation and tackle several computer vision applications.