
    Kernel Spectral Curvature Clustering (KSCC)

    Multi-manifold modeling is increasingly used in segmentation and data representation tasks in computer vision and related fields. While the general problem of modeling data by mixtures of manifolds is very challenging, several approaches exist for modeling data by mixtures of affine subspaces, often referred to as hybrid linear modeling. We translate some important instances of multi-manifold modeling into hybrid linear modeling in embedded spaces by applying the kernel trick, without explicitly performing the embedding. The resulting algorithm, Kernel Spectral Curvature Clustering, uses kernels at two levels: as an implicit embedding that linearizes non-flat manifolds, and as a principled way to convert a multi-way affinity problem into a spectral clustering one. We demonstrate the effectiveness of the method by comparing it with other state-of-the-art methods on both synthetic data and a real-world problem of segmenting multiple motions from two perspective camera views. Comment: accepted to the 2009 ICCV Workshop on Dynamical Vision.
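
    The sketch below illustrates only the second of those two levels, converting a kernel affinity into a spectral clustering problem, on a toy stand-in for a multi-manifold dataset: two concentric circles, i.e. two non-flat 1-D manifolds in R^2. It is a hedged simplification: the full KSCC method builds multi-way polar curvature affinities in the kernel-induced feature space, which this pairwise RBF affinity omits, and the kernel and its bandwidth are illustrative choices.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_circles

# Toy multi-manifold data: two concentric circles in R^2.
X, y_true = make_circles(n_samples=400, factor=0.5, noise=0.05,
                         random_state=0)

# Kernel level one: the RBF kernel implicitly embeds the data so that
# each circle becomes (approximately) linearly separable.
# Kernel level two: the resulting affinity feeds spectral clustering.
model = SpectralClustering(n_clusters=2, affinity="rbf", gamma=5.0,
                           random_state=0)
labels = model.fit_predict(X)
print("cluster sizes:", np.bincount(labels))
```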

    Neural Collaborative Subspace Clustering

    We introduce Neural Collaborative Subspace Clustering, a neural model that discovers clusters of data points drawn from a union of low-dimensional subspaces. In contrast to previous attempts, our model runs without the aid of spectral clustering, which makes it one of the few such algorithms able to scale gracefully to large datasets. At its heart is a classifier that determines whether a pair of points lies on the same subspace. Essential to our model is the construction of two affinity matrices, one from the classifier and the other from a notion of subspace self-expressiveness, to supervise training in a collaborative scheme. We thoroughly assess and contrast the performance of our model against various state-of-the-art clustering algorithms, including deep subspace-based ones. Comment: accepted to ICML 2019.
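
    The collaborative scheme admits a compact sketch. The helper functions below are a minimal, illustrative reading of the idea in PyTorch; the function names, thresholds, and masking rule are our assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def classifier_affinity(logits):
    # Soft cluster assignments: entry (i, j) is large when points i
    # and j receive similar class-probability vectors.
    p = F.softmax(logits, dim=1)
    return p @ p.t()

def self_expressive_affinity(C):
    # Symmetrized magnitude of self-expression coefficients (X ~ XC),
    # rescaled to [0, 1].
    A = (C.abs() + C.abs().t()) / 2
    return A / A.max()

def collaborative_loss(A_cls, A_se, pos=0.8, neg=0.1):
    # Confident entries of the self-expressive affinity serve as
    # pseudo-labels for the classifier affinity; uncertain entries
    # (between the two thresholds) are masked out.
    target = (A_se > pos).float()
    mask = ((A_se > pos) | (A_se < neg)).float()
    return F.binary_cross_entropy(A_cls.clamp(1e-6, 1 - 1e-6),
                                  target, weight=mask)
```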

    Nearness to Local Subspace Algorithm for Subspace and Motion Segmentation

    There is growing interest in computer science, engineering, and mathematics in modeling signals in terms of unions of subspaces and manifolds. Subspace segmentation and the clustering of high dimensional data drawn from a union of subspaces are especially important, with many practical applications in computer vision, image and signal processing, communications, and information theory. This paper presents a clustering algorithm for high dimensional data that comes from a union of lower dimensional subspaces of equal and known dimensions; such cases occur in many data clustering problems, such as motion segmentation and face recognition. The algorithm is reliable in the presence of noise and, applied to the Hopkins 155 Dataset, generates the best results to date for motion segmentation: the two-motion, three-motion, and overall segmentation rates for the video sequences are 99.43%, 98.69%, and 99.24%, respectively.
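
    A minimal numpy/scikit-learn sketch of this pipeline follows, under the assumption of linear subspaces through the origin (the natural setting for motion segmentation). The neighborhood size k and the median cut, standing in for the paper's data-driven threshold, are illustrative choices.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def nls_segment(X, n_clusters, d, k=8):
    """Cluster rows of X (N x D) drawn from d-dimensional subspaces."""
    N = len(X)
    # k nearest neighbors of each point (the point itself included).
    D2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    nbrs = np.argsort(D2, axis=1)[:, :k]

    dist = np.empty((N, N))
    for i in range(N):
        # Fit a local d-dimensional subspace to the neighborhood by SVD.
        _, _, Vt = np.linalg.svd(X[nbrs[i]], full_matrices=False)
        B = Vt[:d]                      # orthonormal basis, (d, D)
        # Distance from every point to this local subspace.
        dist[i] = np.linalg.norm(X - X @ B.T @ B, axis=1)

    dist = (dist + dist.T) / 2
    # Threshold into a binary similarity matrix (median cut here).
    S = (dist < np.median(dist)).astype(float)
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(S)
```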

    Graphs decomposition using modified spectral clustering method

    Among the many problems posed on graphs, those concerning the placement of objects so as to increase the information content of complex multi-parameter systems find wide practical application (for example, in transport and computer networks, piping systems, and image processing). Despite years of research, no exact and efficient algorithms for placement problems are known. We propose to treat the placement problem as a decomposition of the initial network into k regions, in each of which a vertex with a certain centrality property is sought. This article surveys approaches to the placement problem in graphs, as well as methods for decomposing graph structures. Following the main results of spectral clustering theory, the shortcomings of the commonly applied partitioning criteria Rcut and Ncut are pointed out, and it is shown that the distance-minimization criterion Dcut proposed in this paper yields high-quality graph decompositions. The results are demonstrated on the problem of finding sensor placement vertices in the well-known ZJ and D-Town networks of the EPANET hydraulic modeling system.
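
    A sketch of the decomposition-then-centrality pipeline is given below, assuming networkx and scikit-learn. Standard spectral clustering stands in for the Dcut criterion proposed in the paper, a random geometric graph replaces the ZJ and D-Town networks, and closeness centrality is one possible reading of the "centrality property".

```python
import networkx as nx
from sklearn.cluster import SpectralClustering

# Toy stand-in for a water/transport network.
G = nx.random_geometric_graph(100, 0.2, seed=1)
A = nx.to_numpy_array(G)

# Decompose the network into k regions (Ncut-style spectral clustering
# here; the paper's Dcut criterion would replace this step).
k = 4
labels = SpectralClustering(n_clusters=k, affinity="precomputed",
                            random_state=0).fit_predict(A)

# In each region, pick the vertex with the highest closeness centrality
# as the sensor placement candidate.
placements = []
for c in range(k):
    nodes = [v for v, lab in zip(G.nodes, labels) if lab == c]
    central = nx.closeness_centrality(G.subgraph(nodes))
    placements.append(max(central, key=central.get))
print("sensor placement vertices:", placements)
```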

    Subspace Segmentation And High-Dimensional Data Analysis

    This thesis develops theory and associated algorithms for the subspace segmentation problem. Given a set of data W = {w_1, ..., w_N} in R^D that comes from a union of subspaces, we focus on determining a nonlinear model of the form U = {S_i}_{i in I}, where the S_i are subspaces, that is nearest to W; the model is then used to classify W into clusters. Our first approach is based on the binary reduced row echelon form of the data matrix. We prove that, in the absence of noise, this approach can find the number of subspaces, their dimensions, and an orthonormal basis for each subspace S_i, and we provide a comprehensive analysis of the theory, determining its limitations and strengths in the presence of outliers and noise. Our second approach is based on nearness to local subspaces; it handles noise effectively but applies only to a special case of the general subspace segmentation problem, namely subspaces of equal and known dimensions. This approach computes a binary similarity matrix for the data points: a local subspace is first estimated for each data point, a distance matrix is generated by computing the distances between the local subspaces and the points, and the distance matrix is converted into a similarity matrix by applying a data-driven threshold. The problem is thereby transformed into the segmentation of subspaces of dimension 1 instead of dimension d. The algorithm was applied to the Hopkins 155 Dataset and generated the best results to date.
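
    The first (binary reduced row echelon form) approach can be illustrated on noise-free toy data. The sketch below is our reading of the idea rather than the thesis algorithm itself: the RREF expresses every point as a combination of pivot points, the binarized columns record which pivots participate, and points whose pivot supports intersect are grouped into the same subspace.

```python
import numpy as np
import sympy as sp

# Noise-free toy data: columns 0-3 span a 2-D subspace of R^5,
# columns 4-6 lie on a line (a 1-D subspace).
B1 = np.array([[1, 0], [0, 1], [1, 1], [0, 0], [2, 1]])
B2 = np.array([[1], [2], [0], [1], [3]])
W = np.hstack([B1 @ np.array([[1, 0, 1, 2], [0, 1, 1, -1]]),
               B2 @ np.array([[1, 2, -1]])])

# Exact RREF (integer data, so no numerical pivoting issues), then
# binarize: row i of column j is 1 iff pivot point i supports point j.
R, pivots = sp.Matrix(W).rref()
Rb = (np.array(R.tolist(), dtype=float) != 0).astype(int)
supports = [set(np.flatnonzero(Rb[:, j])) for j in range(W.shape[1])]

# Points with intersecting pivot supports share a subspace: take
# connected components with a tiny union-find.
parent = list(range(W.shape[1]))
def find(i):
    while parent[i] != i:
        i = parent[i]
    return i
for i in range(W.shape[1]):
    for j in range(i + 1, W.shape[1]):
        if supports[i] & supports[j]:
            parent[find(j)] = find(i)
print([find(j) for j in range(W.shape[1])])  # -> [1, 1, 1, 1, 4, 4, 4]
```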

    Novel methods for intrinsic dimension estimation and manifold learning

    One of the most challenging problems in modern science is how to deal with the huge amount of data that today's technologies provide. Several difficulties may arise: for instance, the number of samples may be too large, or the stream of incoming data may be faster than the algorithm that must process it. Another common problem is that as the data dimension grows, so does the volume of the space, leading to a sparsification of the available data. This causes problems in statistical analysis, since the amount of data needed to support a conclusion often grows exponentially with the dimension. This problem is commonly referred to as the curse of dimensionality, and it is one of the reasons why high dimensional data cannot be analyzed efficiently with traditional methods. Classical methods for dimensionality reduction, like principal component analysis and factor analysis, may fail due to a nonlinear structure in the data, and in recent years several methods for nonlinear dimensionality reduction have been proposed. A general way to model a high dimensional data set is to represent the observations as noisy samples drawn from a probability distribution mu in the real coordinate space of D dimensions. It has been observed that the essential support of mu can often be well approximated by low dimensional sets, which can be assumed to be low dimensional manifolds embedded in the ambient dimension D. A manifold is a topological space which globally may not be Euclidean, but which in a small neighborhood of each point behaves like a Euclidean space. In this setting we call the dimension of the manifold the intrinsic dimension, which is usually much lower than the ambient dimension D. Roughly speaking, the intrinsic dimension of a data set is the minimum number of variables needed to represent the data without significant loss of information. In this work we propose different methods aimed at estimating the intrinsic dimension. The first method models the neighbors of each point as stochastic processes, in such a way that a closed-form likelihood function can be written. This leads to a closed-form maximum likelihood estimator (MLE) for the intrinsic dimension, which enjoys all the good properties of an MLE. The second method is based on a multiscale singular value decomposition (MSVD) of the data: it performs singular value decomposition (SVD) on neighborhoods of increasing size and estimates the intrinsic dimension by studying the behavior of the singular values as the radius of the neighborhood increases. We also introduce an algorithm to estimate the model parameters when the data are assumed to be sampled around an unknown number of planes with different intrinsic dimensions, embedded in a high dimensional space. Models of this kind have many applications in computer vision and pattern recognition, where the data can be described by multiple linear structures or must be clustered into groups representable by low dimensional hyperplanes. The algorithm relies on both MSVD and spectral clustering, and it is able to estimate the number of planes, their dimensions, and their arrangement in the ambient space. Finally, we propose a novel method for manifold reconstruction based on a multiscale approach, which approximates the manifold from coarse to fine scales with increasing precision.

    The basic idea is to produce, at a generic scale j, a piecewise linear approximation of the manifold using a collection of low dimensional planes, and to use those planes to create clusters for the data. At scale j + 1, each cluster is independently approximated by another collection of low dimensional planes, and the process is iterated until the desired precision is achieved. The algorithm is fast because it is highly parallelizable, and its computational time is independent of the sample size. Moreover, the method automatically constructs a tree structure for the data, which can be particularly useful in applications that require an a priori tree data structure. The aim of the methods proposed in this work is to provide algorithms to learn and estimate the underlying structure of high dimensional datasets.
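
    The MSVD estimator admits a short numerical illustration. The sketch below is a crude surrogate for the method, assuming noisy samples from a 2-D surface embedded in R^10; the radii, the noise level, and the largest-relative-gap readout of the dimension are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# 500 noisy samples from a 2-D surface (a curved sheet) in R^10.
t = rng.uniform(-1, 1, (500, 2))
X = np.zeros((500, 10))
X[:, 0], X[:, 1] = t[:, 0], t[:, 1]
X[:, 2] = t[:, 0] ** 2 + t[:, 1] ** 2
X += 0.01 * rng.standard_normal(X.shape)

# Singular values of neighborhoods of increasing radius around a point:
# manifold directions grow with r, noise directions stay near the floor.
dists = np.linalg.norm(X - X[0], axis=1)
for r in [0.1, 0.2, 0.4, 0.8]:
    nbhd = X[dists < r]
    if len(nbhd) < 10:
        continue
    s = np.linalg.svd(nbhd - nbhd.mean(0), compute_uv=False)
    gaps = s[:-1] / np.maximum(s[1:], 1e-12)  # crude gap heuristic
    print(f"r={r:.1f}  singular values: {np.round(s[:4], 3)}  "
          f"estimated dim: {np.argmax(gaps) + 1}")
```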

    Unsupervised Learning from Shallow to Deep

    Machine learning plays a pivotal role in most state-of-the-art systems across many application domains. With the rise of deep learning, massive labeled datasets have become the basis of feature learning, enabling models to learn automatically. Unfortunately, a trained deep learning model is hard to adapt to other datasets without fine-tuning, and the applicability of machine learning methods is limited by the amount of available labeled data. The aim of this thesis is therefore to alleviate the limitations of supervised learning by exploring algorithms that learn good internal representations and invariant feature hierarchies from unlabelled data. Firstly, we extend traditional dictionary learning and sparse coding algorithms to hierarchical image representations in a principled way. So that dictionary atoms capture additional information from extended receptive fields and attain improved descriptive capacity, we present a two-pass multi-resolution cascade framework for dictionary learning and sparse coding. This cascade allows collaborative reconstructions at different resolutions using dictionary atoms of the same dimension. The jointly learned dictionary comprises atoms that adapt both to the information available at the coarsest layer, where the support of atoms reaches a maximum range, and to the residual images, whose supplementary details progressively refine the reconstruction objective. Our method generates flexible and accurate representations using only a small number of coefficients, and is computationally efficient. In the following work, we propose to incorporate the traditional self-expressiveness property into deep learning to find better representations for subspace clustering. This architecture is built upon deep auto-encoders, which non-linearly map the input data into a latent space. Our key idea is to introduce a novel self-expressive layer between the encoder and the decoder to mimic the "self-expressiveness" property that has proven effective in traditional subspace clustering. Being differentiable, our new self-expressive layer provides a simple but effective way to learn pairwise affinities between all data points through standard back-propagation. Being nonlinear, our neural-network-based method is able to cluster data points with complex (often nonlinear) structures. However, subspace clustering algorithms are notorious for their scalability issues, because building and processing large affinity matrices is demanding. We propose two methods to tackle this problem. The first is based on k-Subspace Clustering: we introduce a method that simultaneously learns an embedding space and the subspaces within it so as to minimize a notion of reconstruction error, thereby addressing subspace clustering in an end-to-end learning paradigm. This frees us from the need for an affinity matrix to perform clustering. The second replaces spectral clustering with a feed-forward network and learns the affinity of each data point from the "self-expressive" layer. We introduce Neural Collaborative Subspace Clustering, which benefits from a classifier that determines whether a pair of points lies on the same subspace, trained under the supervision of the "self-expressive" layer. Essential to our model is the construction of two affinity matrices, one from the classifier and the other from a notion of subspace self-expressiveness, to supervise training in a collaborative scheme.
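
    A minimal PyTorch sketch of the self-expressive layer follows; the encoder/decoder sizes and loss weights are illustrative stand-ins, not the exact networks used in the thesis.

```python
import torch
import torch.nn as nn

class SelfExpressiveAE(nn.Module):
    """Auto-encoder with a self-expressive layer: a learnable N x N
    matrix C reconstructs each latent code from the others (Z ~ CZ,
    diag(C) = 0); |C| + |C|^T then yields the clustering affinity."""
    def __init__(self, n_samples, in_dim=32, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 16), nn.ReLU(),
                                     nn.Linear(16, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 16), nn.ReLU(),
                                     nn.Linear(16, in_dim))
        self.C = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

    def forward(self, x):
        z = self.encoder(x)
        C = self.C - torch.diag(torch.diag(self.C))  # zero diagonal
        z_se = C @ z                                 # self-expression
        return self.decoder(z_se), z, z_se, C

def loss_fn(model, x, lam1=1.0, lam2=1.0):
    x_hat, z, z_se, C = model(x)
    return (nn.functional.mse_loss(x_hat, x)       # reconstruction
            + lam1 * (z - z_se).pow(2).mean()      # self-expressiveness
            + lam2 * C.pow(2).mean())              # regularize C

# One illustrative optimization step on random data.
X = torch.randn(100, 32)
model = SelfExpressiveAE(n_samples=100)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
opt.zero_grad()
loss_fn(model, X).backward()
opt.step()
```
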
    In summary, this thesis contributes methods for performing unsupervised learning across several tasks. It starts from the traditional sparse coding and dictionary learning perspective in low-level vision; it then explores how to incorporate unsupervised learning into convolutional neural networks without label information, and how to scale subspace clustering to large datasets; finally, it extends clustering to a dense prediction task (saliency detection).