3,920 research outputs found

    Robust Unsupervised Flexible Auto-weighted Local-Coordinate Concept Factorization for Image Clustering

    Full text link
    We investigate the high-dimensional data clustering problem by proposing a novel and unsupervised representation learning model called Robust Flexible Auto-weighted Local-coordinate Concept Factorization (RFA-LCF). RFA-LCF integrates the robust flexible CF, robust sparse local-coordinate coding and the adaptive reconstruction weighting learning into a unified model. The adaptive weighting is driven by including the joint manifold preserving constraints on the recovered clean data, basis concepts and new representation. Specifically, our RFA-LCF uses a L2,1-norm based flexible residue to encode the mismatch between clean data and its reconstruction, and also applies the robust adaptive sparse local-coordinate coding to represent the data using a few nearby basis concepts, which can make the factorization more accurate and robust to noise. The robust flexible factorization is also performed in the recovered clean data space for enhancing representations. RFA-LCF also considers preserving the local manifold structures of clean data space, basis concept space and the new coordinate space jointly in an adaptive manner way. Extensive comparisons show that RFA-LCF can deliver enhanced clustering results.Comment: Accepted at the 44th IEEE International Conference on Acoustics, Speech, and Signal Processing(ICASSP 2019

    Global and Local Structure Preserving Sparse Subspace Learning: An Iterative Approach to Unsupervised Feature Selection

    Full text link
    As we aim at alleviating the curse of high-dimensionality, subspace learning is becoming more popular. Existing approaches use either information about global or local structure of the data, and few studies simultaneously focus on global and local structures as the both of them contain important information. In this paper, we propose a global and local structure preserving sparse subspace learning (GLoSS) model for unsupervised feature selection. The model can simultaneously realize feature selection and subspace learning. In addition, we develop a greedy algorithm to establish a generic combinatorial model, and an iterative strategy based on an accelerated block coordinate descent is used to solve the GLoSS problem. We also provide whole iterate sequence convergence analysis of the proposed iterative algorithm. Extensive experiments are conducted on real-world datasets to show the superiority of the proposed approach over several state-of-the-art unsupervised feature selection approaches.Comment: 32 page, 6 figures and 60 reference

    Learning A Task-Specific Deep Architecture For Clustering

    Full text link
    While sparse coding-based clustering methods have shown to be successful, their bottlenecks in both efficiency and scalability limit the practical usage. In recent years, deep learning has been proved to be a highly effective, efficient and scalable feature learning tool. In this paper, we propose to emulate the sparse coding-based clustering pipeline in the context of deep learning, leading to a carefully crafted deep model benefiting from both. A feed-forward network structure, named TAGnet, is constructed based on a graph-regularized sparse coding algorithm. It is then trained with task-specific loss functions from end to end. We discover that connecting deep learning to sparse coding benefits not only the model performance, but also its initialization and interpretation. Moreover, by introducing auxiliary clustering tasks to the intermediate feature hierarchy, we formulate DTAGnet and obtain a further performance boost. Extensive experiments demonstrate that the proposed model gains remarkable margins over several state-of-the-art methods

    Multi-View Surveillance Video Summarization via Joint Embedding and Sparse Optimization

    Full text link
    Most traditional video summarization methods are designed to generate effective summaries for single-view videos, and thus they cannot fully exploit the complicated intra and inter-view correlations in summarizing multi-view videos in a camera network. In this paper, with the aim of summarizing multi-view videos, we introduce a novel unsupervised framework via joint embedding and sparse representative selection. The objective function is two-fold. The first is to capture the multi-view correlations via an embedding, which helps in extracting a diverse set of representatives. The second is to use a `2;1- norm to model the sparsity while selecting representative shots for the summary. We propose to jointly optimize both of the objectives, such that embedding can not only characterize the correlations, but also indicate the requirements of sparse representative selection. We present an efficient alternating algorithm based on half-quadratic minimization to solve the proposed non-smooth and non-convex objective with convergence analysis. A key advantage of the proposed approach with respect to the state-of-the-art is that it can summarize multi-view videos without assuming any prior correspondences/alignment between them, e.g., uncalibrated camera networks. Rigorous experiments on several multi-view datasets demonstrate that our approach clearly outperforms the state-of-the-art methods.Comment: IEEE Trans. on Multimedia, 2017 (In Press

    Feature Selection: A Data Perspective

    Full text link
    Feature selection, as a data preprocessing strategy, has been proven to be effective and efficient in preparing data (especially high-dimensional data) for various data mining and machine learning problems. The objectives of feature selection include: building simpler and more comprehensible models, improving data mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented some substantial challenges and opportunities to feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data. Methodologically, to emphasize the differences and similarities of most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity based, information theoretical based, sparse learning based and statistical based methods. To facilitate and promote the research in this community, we also present an open-source feature selection repository that consists of most of the popular feature selection algorithms (\url{http://featureselection.asu.edu/}). Also, we use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we present a discussion about some open problems and challenges that require more attention in future research

    Are You Imitating Me? Unsupervised Sparse Modeling for Group Activity Analysis from a Single Video

    Full text link
    A framework for unsupervised group activity analysis from a single video is here presented. Our working hypothesis is that human actions lie on a union of low-dimensional subspaces, and thus can be efficiently modeled as sparse linear combinations of atoms from a learned dictionary representing the action's primitives. Contrary to prior art, and with the primary goal of spatio-temporal action grouping, in this work only one single video segment is available for both unsupervised learning and analysis without any prior training information. After extracting simple features at a single spatio-temporal scale, we learn a dictionary for each individual in the video during each short time lapse. These dictionaries allow us to compare the individuals' actions by producing an affinity matrix which contains sufficient discriminative information about the actions in the scene leading to grouping with simple and efficient tools. With diverse publicly available real videos, we demonstrate the effectiveness of the proposed framework and its robustness to cluttered backgrounds, changes of human appearance, and action variability

    Unsupervised Multi-modal Hashing for Cross-modal retrieval

    Full text link
    With the advantage of low storage cost and high efficiency, hashing learning has received much attention in the domain of Big Data. In this paper, we propose a novel unsupervised hashing learning method to cope with this open problem to directly preserve the manifold structure by hashing. To address this problem, both the semantic correlation in textual space and the locally geometric structure in the visual space are explored simultaneously in our framework. Besides, the `2;1-norm constraint is imposed on the projection matrices to learn the discriminative hash function for each modality. Extensive experiments are performed to evaluate the proposed method on the three publicly available datasets and the experimental results show that our method can achieve superior performance over the state-of-the-art methods.Comment: 4 pages, 4 figure

    Deep Sparse Subspace Clustering

    Full text link
    In this paper, we present a deep extension of Sparse Subspace Clustering, termed Deep Sparse Subspace Clustering (DSSC). Regularized by the unit sphere distribution assumption for the learned deep features, DSSC can infer a new data affinity matrix by simultaneously satisfying the sparsity principle of SSC and the nonlinearity given by neural networks. One of the appealing advantages brought by DSSC is: when original real-world data do not meet the class-specific linear subspace distribution assumption, DSSC can employ neural networks to make the assumption valid with its hierarchical nonlinear transformations. To the best of our knowledge, this is among the first deep learning based subspace clustering methods. Extensive experiments are conducted on four real-world datasets to show the proposed DSSC is significantly superior to 12 existing methods for subspace clustering.Comment: The initial version is completed at the beginning of 201

    Cross-modal Subspace Learning for Fine-grained Sketch-based Image Retrieval

    Full text link
    Sketch-based image retrieval (SBIR) is challenging due to the inherent domain-gap between sketch and photo. Compared with pixel-perfect depictions of photos, sketches are iconic renderings of the real world with highly abstract. Therefore, matching sketch and photo directly using low-level visual clues are unsufficient, since a common low-level subspace that traverses semantically across the two modalities is non-trivial to establish. Most existing SBIR studies do not directly tackle this cross-modal problem. This naturally motivates us to explore the effectiveness of cross-modal retrieval methods in SBIR, which have been applied in the image-text matching successfully. In this paper, we introduce and compare a series of state-of-the-art cross-modal subspace learning methods and benchmark them on two recently released fine-grained SBIR datasets. Through thorough examination of the experimental results, we have demonstrated that the subspace learning can effectively model the sketch-photo domain-gap. In addition we draw a few key insights to drive future research.Comment: Accepted by Neurocomputin

    cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey

    Full text link
    The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers on computer vision, pattern recognition, and related fields. For this particular review, we focused on reading the ALL 602 conference papers presented at the CVPR2015, the premier annual computer vision event held in June 2015, in order to grasp the trends in the field. Further, we are proposing "DeepSurvey" as a mechanism embodying the entire process from the reading through all the papers, the generation of ideas, and to the writing of paper.Comment: Survey Pape
    corecore