5,774 research outputs found

    Semi-supervised Spectral Clustering for Classification

    We propose a Classification Via Clustering (CVC) algorithm which enables existing clustering methods to be efficiently employed in classification problems. In CVC, training and test data are co-clustered and class-cluster distributions are used to find the label of the test data. To determine an efficient number of clusters, a Semi-supervised Hierarchical Clustering (SHC) algorithm is proposed. Clusters are obtained by hierarchically applying two-way NCut, using the signs of the Fiedler vector of the normalized graph Laplacian. To this end, a Direct Fiedler Vector Computation algorithm is proposed. The graph cut is based on the data structure and does not consider labels. Labels are used only to define the stopping criterion for the graph cut. We propose clustering to be performed on the Grassmannian manifolds, facilitating the formation of spectral ensembles. The proposed algorithm outperformed state-of-the-art image-set classification algorithms on five standard datasets.
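As a rough illustration of the two-way cut described in this abstract, the sketch below splits a graph by the sign of the Fiedler vector of the symmetric normalized Laplacian. It is not the paper's Direct Fiedler Vector Computation algorithm: it uses a full dense eigendecomposition from NumPy, and the function name `fiedler_bipartition` and the toy affinity matrix are assumptions made for this example.

```python
import numpy as np

def fiedler_bipartition(W):
    """Split a connected graph in two by the sign of the Fiedler vector.

    W: symmetric, non-negative affinity matrix with no isolated nodes.
    Returns a boolean array; True and False mark the two sides of the cut.
    (Hypothetical helper, not the paper's algorithm.)
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so column 1 is the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]
    return fiedler >= 0

# Toy graph: two triangles joined by one weak edge between nodes 2 and 3.
W_toy = np.array([
    [0, 1, 1, 0,    0, 0],
    [1, 0, 1, 0,    0, 0],
    [1, 1, 0, 0.01, 0, 0],
    [0, 0, 0.01, 0, 1, 1],
    [0, 0, 0, 1,    0, 1],
    [0, 0, 0, 1,    1, 0],
], dtype=float)
sides = fiedler_bipartition(W_toy)
```

The sign of an eigenvector is arbitrary, so only the grouping is meaningful, not which side is labeled True. A hierarchical scheme such as the one the abstract describes would apply this split recursively to each side until a stopping criterion is met.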

    Spatial Context based Angular Information Preserving Projection for Hyperspectral Image Classification

    Dimensionality reduction is a crucial preprocessing step for hyperspectral data analysis - finding an appropriate subspace is often required for subsequent image classification. In recent work, we proposed supervised angular information based dimensionality reduction methods to find effective subspaces. Since unlabeled data are often more readily available than labeled data, we propose an unsupervised projection that finds a lower dimensional subspace where local angular information is preserved. To exploit spatial information from the hyperspectral images, we further extend our unsupervised projection to incorporate spatial contextual information around each pixel in the image. Additionally, we propose a sparse representation based classifier which is optimized to exploit spatial information during classification - we hence assert that our proposed projection is particularly suitable for classifiers where local similarity and spatial context are both important. Experimental results with two real-world hyperspectral datasets demonstrate that our proposed methods provide robust classification performance.

    Perceptual Visual Interactive Learning

    Supervised learning methods are widely used in machine learning. However, the lack of labels in existing data limits the application of these technologies. Compared with purely computer-based methods, visual interactive learning (VIL) can avoid the semantic gap and solve the labeling problem of small label quantity (SLQ) samples in a groundbreaking way. In order to fully understand the importance of VIL to the interaction process, we re-summarize interactive-learning-related algorithms (e.g. clustering, classification, retrieval, etc.) from the perspective of VIL. Note that perception and cognition are the two main visual processes of VIL. On this basis, we propose a perceptual visual interactive learning (PVIL) framework, which adopts the Gestalt principle to design the interaction strategy and multi-dimensionality reduction (MDR) to optimize the process of visualization. The advantage of the PVIL framework is that it combines the computer's sensitivity to detailed features with the human's overall understanding of global tasks. Experimental results validate that the framework is superior to traditional computer labeling methods (such as label propagation) in both accuracy and efficiency, and that it achieves significant classification results on datasets with dense distributions and sparse classes.

    Large Margin Low Rank Tensor Analysis

    Other than vector representations, the direct objects of human cognition are generally high-order tensors, such as 2D images and 3D textures. From this fact, two interesting questions naturally arise: How does the human brain represent these tensor perceptions in a "manifold" way, and how can they be recognized on the "manifold"? In this paper, we present a supervised model to learn the intrinsic structure of the tensors embedded in a high dimensional Euclidean space. With the fixed point continuation procedures, our model automatically and jointly discovers the optimal dimensionality and the representations of the low dimensional embeddings. This makes it an effective simulation of the cognitive process of the human brain. Furthermore, the generalization of our model based on similarity between the learned low dimensional embeddings can be viewed as the counterpart of recognition in the human brain. Experiments on applications for object recognition and face recognition demonstrate the superiority of our proposed model over state-of-the-art approaches. Comment: 30 pages

    A Unified Semi-Supervised Dimensionality Reduction Framework for Manifold Learning

    We present a general framework of semi-supervised dimensionality reduction for manifold learning which naturally generalizes existing supervised and unsupervised learning frameworks which apply the spectral decomposition. Algorithms derived under our framework are able to employ both labeled and unlabeled examples and are able to handle complex problems where data form separate clusters of manifolds. Our framework offers simple views, explains relationships among existing frameworks and provides further extensions which can improve existing algorithms. Furthermore, a new semi-supervised kernelization framework called "KPCA trick" is proposed to handle non-linear problems. Comment: 22 pages, 9 figures

    Machine learning based hyperspectral image analysis: A survey

    Hyperspectral sensors enable the study of the chemical properties of scene materials remotely for the purpose of identification, detection, and chemical composition analysis of objects in the environment. Hence, hyperspectral images captured from earth observing satellites and aircraft have become increasingly important in agriculture, environmental monitoring, urban planning, mining, and defense. Machine learning algorithms, due to their outstanding predictive power, have become a key tool for modern hyperspectral image analysis. Therefore, a solid understanding of machine learning techniques has become essential for remote sensing researchers and practitioners. This paper reviews and compares recent machine learning-based hyperspectral image analysis methods published in the literature. We organize the methods by the image analysis task and by the type of machine learning algorithm, and present a two-way mapping between the image analysis tasks and the types of machine learning algorithms that can be applied to them. The paper is comprehensive in coverage of both hyperspectral image analysis tasks and machine learning algorithms. The image analysis tasks considered are land cover classification, target detection, unmixing, and physical parameter estimation. The machine learning algorithms covered are Gaussian models, linear regression, logistic regression, support vector machines, Gaussian mixture model, latent linear models, sparse linear models, Gaussian mixture models, ensemble learning, directed graphical models, undirected graphical models, clustering, Gaussian processes, Dirichlet processes, and deep learning. We also discuss the open challenges in the field of hyperspectral image analysis and explore possible future directions.

    Online Supervised Subspace Tracking

    We present a framework for supervised subspace tracking, when there are two time series x_t and y_t, one being the high-dimensional predictors and the other being the response variables, and the subspace tracking needs to take both sequences into consideration. It extends the classic online subspace tracking work, which can be viewed as tracking of x_t only. Our online sufficient dimensionality reduction (OSDR) is a meta-algorithm that can be applied to various cases including linear regression, logistic regression, multiple linear regression, multinomial logistic regression, support vector machine, the random dot product model and the multi-scale union-of-subspace model. OSDR reduces data dimensionality on-the-fly with low computational complexity and it can also handle missing data and dynamic data. OSDR uses an alternating minimization scheme and updates the subspace via gradient descent on the Grassmannian manifold. The subspace update can be performed efficiently utilizing the fact that the Grassmannian gradient with respect to the subspace in many settings is rank-one (or low-rank in certain cases). The optimization problem for OSDR is non-convex and hard to analyze in general; we provide convergence analysis of OSDR in a simple linear regression setting. The good performance of OSDR compared with the conventional unsupervised subspace tracking is demonstrated via numerical examples on simulated and real data. Comment: Submitted for journal publication
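To make the phrase "gradient descent on the Grassmannian manifold" with a rank-one gradient more concrete, here is a minimal sketch of the classic unsupervised case that the abstract says OSDR extends (tracking of x_t only). It is not OSDR itself. The loss (negative captured energy of one sample), the QR retraction, the step size, and the names `grassmann_step`, `B_true` and `gap` are all assumptions chosen for this illustration.

```python
import numpy as np

def grassmann_step(U, x, step=0.5):
    """One gradient step on the Grassmannian for a single sample x.

    U: (n, k) matrix with orthonormal columns (current subspace estimate).
    Assumed loss: f(U) = -0.5 * ||U^T x||^2. Its Riemannian gradient is
    the rank-one matrix -r w^T, with w = U^T x and r = x - U w.
    """
    w = U.T @ x                 # coordinates of x in the current subspace
    r = x - U @ w               # residual, orthogonal to span(U)
    U_new = U + step * np.outer(r, w)   # descent step along the rank-one gradient
    Q, _ = np.linalg.qr(U_new)  # retraction: re-orthonormalize the columns
    return Q

# Synthetic stream: noise-free samples from a fixed 2-D subspace of R^10.
rng = np.random.default_rng(0)
n, k = 10, 2
B_true, _ = np.linalg.qr(rng.standard_normal((n, k)))
U, _ = np.linalg.qr(rng.standard_normal((n, k)))
for _ in range(500):
    x = B_true @ rng.standard_normal(k)
    x /= np.linalg.norm(x)      # unit-norm samples keep the step size stable
    U = grassmann_step(U, x)

# Part of the true subspace not captured by the estimate (0 means recovered).
gap = np.linalg.norm(B_true - U @ (U.T @ B_true))
```

Because the update is an outer product of two vectors, each step costs O(nk) before re-orthonormalization, which is the efficiency point the abstract makes. A supervised variant would replace the assumed loss with one that also involves y_t.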

    A literature survey of matrix methods for data science

    Efficient numerical linear algebra is a core ingredient in many applications across almost all scientific and industrial disciplines. With this survey we want to illustrate that numerical linear algebra has played and is playing a crucial role in enabling and improving data science computations, with many new developments being fueled by the availability of data and computing resources. We highlight the role of various factorizations and the power of changing the representation of the data, and discuss topics such as randomized algorithms, functions of matrices, and high-dimensional problems. We briefly touch upon the role of techniques from numerical linear algebra used within deep learning.

    Feature Selection and Feature Extraction in Pattern Analysis: A Literature Review

    Pattern analysis often requires a pre-processing stage for extracting or selecting features in order to help the classification, prediction, or clustering stage discriminate or represent the data in a better way. The reason for this requirement is that the raw data are complex and difficult to process without extracting or selecting appropriate features beforehand. This paper reviews theory and motivation of different common methods of feature selection and extraction and introduces some of their applications. Some numerical implementations are also shown for these methods. Finally, the methods in feature selection and extraction are compared. Comment: 14 pages, 1 figure, 2 tables, survey (literature review) paper

    Improved graph Laplacian via geometric self-consistency

    We address the problem of setting the kernel bandwidth used by Manifold Learning algorithms to construct the graph Laplacian. Exploiting the connection between manifold geometry, represented by the Riemannian metric, and the Laplace-Beltrami operator, we set the bandwidth by optimizing the Laplacian's ability to preserve the geometry of the data. Experiments show that this principled approach is effective and robust. Comment: 12 pages
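For readers unfamiliar with where the kernel bandwidth enters, the sketch below builds an unnormalized graph Laplacian from a Gaussian kernel. It only shows the role of the bandwidth parameter and does not implement the paper's geometric self-consistency criterion for choosing it. The kernel convention exp(-d^2 / bandwidth^2), the function name `gaussian_graph_laplacian` and the sample data are assumptions for this example; other scalings of the exponent are common.

```python
import numpy as np

def gaussian_graph_laplacian(X, bandwidth):
    """Unnormalized graph Laplacian L = D - W from a Gaussian kernel.

    X: (n_points, n_features) data matrix.
    bandwidth: kernel width; assumed convention W_ij = exp(-d_ij^2 / bandwidth^2).
    """
    X = np.asarray(X, dtype=float)
    sq_norms = (X ** 2).sum(axis=1)
    # Pairwise squared Euclidean distances, clipped to remove tiny negatives
    # caused by floating-point round-off.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)
    W = np.exp(-sq_dists / bandwidth ** 2)
    np.fill_diagonal(W, 0.0)           # no self-loops
    D = np.diag(W.sum(axis=1))         # degree matrix
    return D - W

rng = np.random.default_rng(0)
points = rng.standard_normal((5, 2))
L_narrow = gaussian_graph_laplacian(points, bandwidth=0.5)
L_wide = gaussian_graph_laplacian(points, bandwidth=2.0)
```

A narrow bandwidth yields a nearly disconnected graph and a wide one connects everything almost uniformly, which is why the choice matters for the resulting embedding.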