31 research outputs found

    Dissimilarity-based Sparse Subset Selection

    Finding an informative subset of a large collection of data points or models is at the center of many problems in computer vision, recommender systems, bio/health informatics, as well as image and natural language processing. Given pairwise dissimilarities between the elements of a 'source set' and a 'target set,' we consider the problem of finding a subset of the source set, called representatives or exemplars, that can efficiently describe the target set. We formulate the problem as a row-sparsity regularized trace minimization problem. Since the proposed formulation is, in general, NP-hard, we consider a convex relaxation. The solution of our optimization finds representatives and the assignment of each element of the target set to a representative, hence obtaining a clustering. We analyze the solution of our proposed optimization as a function of the regularization parameter. We show that when the two sets jointly partition into multiple groups, our algorithm finds representatives from all groups and reveals the clustering of the sets. In addition, we show that the proposed framework can effectively deal with outliers. Our algorithm works with arbitrary dissimilarities, which can be asymmetric or violate the triangle inequality. To efficiently implement our algorithm, we consider an Alternating Direction Method of Multipliers (ADMM) framework, which results in quadratic complexity in the problem size. We show that the ADMM implementation allows us to parallelize the algorithm, hence further reducing the computational time. Finally, by experiments on real-world datasets, we show that our proposed algorithm improves the state of the art on the two problems of scene categorization using representative images and time-series modeling and segmentation using representative models.
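
    As a rough illustration of the kind of objective described above, one row-sparsity regularized trace minimization can be sketched as follows; the notation and the exact regularizer/constraints are ours and may differ from the paper's formulation.

```latex
% Hedged sketch of a row-sparsity regularized trace minimization for
% representative selection; notation below is ours, not taken verbatim
% from the paper.
% D in R^{M x N}: dissimilarities between M source and N target elements.
% Z in R^{M x N}: z_{ij} ~ probability that source element i represents target j.
\[
\min_{Z \ge 0} \;
  \lambda \sum_{i=1}^{M} \bigl\| z^{i} \bigr\|_{\infty}
  \;+\; \operatorname{tr}\!\bigl( D^{\top} Z \bigr)
\quad \text{s.t.} \quad \mathbf{1}^{\top} Z = \mathbf{1}^{\top},
\]
% Rows of Z that stay nonzero at the optimum index the representatives;
% the column-wise assignments give the clustering of the target set.
```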

    Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision

    The goal of data selection is to capture the most structural information from a set of data. This paper presents a fast and accurate data selection method, in which the selected samples are optimized to span the subspace of all data. We propose a new selection algorithm, referred to as iterative projection and matching (IPM), with linear complexity w.r.t. the number of data points and no parameters to tune. In our algorithm, at each iteration, the maximum information from the structure of the data is captured by one selected sample, and the captured information is then excluded from subsequent iterations by projecting the data onto the null space of the previously selected samples. The computational efficiency and the selection accuracy of our proposed algorithm outperform those of the conventional methods. Furthermore, the superiority of the proposed algorithm is shown on active learning for video action recognition on UCF-101; learning using representatives on ImageNet; training a generative adversarial network (GAN) to generate multi-view images from a single-view input on the CMU Multi-PIE dataset; and video summarization on the UTE Egocentric dataset. Comment: 11 pages, 5 figures, 5 tables
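
    A minimal sketch of the projection-and-matching idea, not the authors' reference implementation: at each step the sample direction that explains the most residual energy is selected, and the data are then projected onto that direction's orthogonal complement. Function and variable names below are ours.

```python
import numpy as np

def ipm_select(X, k):
    """Greedy structure-preserving sample selection, sketched after the IPM idea.

    X : (n_samples, n_features) data matrix.
    k : number of samples to select.
    Returns the indices of the selected samples.
    """
    R = np.array(X, dtype=float)             # residual data, deflated every iteration
    available = np.ones(len(R), dtype=bool)  # samples not selected yet
    selected = []
    for _ in range(k):
        # Score each candidate by how much of the residual energy its
        # (normalized) direction explains.
        norms = np.linalg.norm(R, axis=1)
        norms[norms == 0] = np.inf            # fully explained samples score zero
        U = R / norms[:, None]                # candidate unit directions
        scores = np.linalg.norm(R @ U.T, axis=0)
        scores[~available] = -np.inf          # never reselect a sample
        j = int(np.argmax(scores))
        selected.append(j)
        available[j] = False
        # Project the data onto the null space of the chosen direction so its
        # information is ignored in subsequent iterations.
        u = U[j]
        R = R - np.outer(R @ u, u)
    return selected
```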

    A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging

    The lack of strong labels has severely limited the ability of state-of-the-art fully supervised audio tagging systems to scale to larger datasets. Meanwhile, audio-visual learning models based on unlabeled videos have been successfully applied to audio tagging, but they are inevitably resource hungry and require a long time to train. In this work, we propose a light-weight, multimodal framework for environmental audio tagging. The audio branch of the framework is a convolutional and recurrent neural network (CRNN) based on multiple instance learning (MIL). It is trained with the audio tracks of a large collection of weakly labeled YouTube video excerpts; the video branch uses pretrained state-of-the-art image recognition networks and word embeddings to extract information from the video track and to map visual objects to sound events. Experiments on the audio tagging task of the DCASE 2017 challenge show that the incorporation of video information improves a strong baseline audio tagging system by 5.3% absolute in terms of F1 score. The entire system can be trained within 6 hours on a single GPU, and can be easily carried over to other audio tasks such as speech sentiment analysis. Comment: 5 pages, 3 figures, Accepted and to appear at ICASSP 201
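
    As a toy illustration of the multiple-instance-learning pooling and audio-visual fusion flavour described above (the paper's actual architecture and fusion rule may well differ; all names, the pooling choice, and the fusion weight are ours):

```python
import numpy as np

def mil_pool(frame_probs):
    """Aggregate frame-level event probabilities into a clip-level score.

    frame_probs : (n_frames, n_events) output of an audio model (e.g. a CRNN).
    Linear-softmax pooling is one common MIL choice; max-pooling is another.
    """
    return (frame_probs ** 2).sum(axis=0) / (frame_probs.sum(axis=0) + 1e-8)

def fuse(audio_scores, visual_scores, alpha=0.8):
    """Late fusion of audio tag scores with visual evidence that has been
    mapped onto the same sound-event vocabulary (e.g. via word embeddings)."""
    return alpha * audio_scores + (1.0 - alpha) * visual_scores
```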

    Multiple Kernel k-Means Clustering by Selecting Representative Kernels

    To cluster data that are not linearly separable in the original feature space, k-means clustering was extended to the kernel version. However, the performance of kernel k-means clustering largely depends on the choice of kernel function. To mitigate this problem, multiple kernel learning has been introduced into k-means clustering to obtain an optimal kernel combination for clustering. Despite the success of multiple kernel k-means clustering in various scenarios, few of the existing works update the combination coefficients based on the diversity of kernels, so the selected kernels can be highly redundant, which degrades both clustering performance and efficiency. In this paper, we propose a simple but efficient strategy that selects a diverse subset of the pre-specified kernels as the representative kernels, and then incorporates the subset selection process into the framework of multiple kernel k-means clustering. The representative kernels can be identified by their significant combination weights. Due to the non-convexity of the resulting objective function, we develop an alternating minimization method that alternately optimizes the combination coefficients of the selected kernels and the cluster memberships. We evaluate the proposed approach on several benchmark and real-world datasets. The experimental results demonstrate the competitiveness of our approach in comparison with the state-of-the-art methods. Comment: 8 pages, 7 figures
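
    For reference, a generic multiple kernel k-means objective of the kind being extended here can be sketched as below; the paper's exact formulation, and in particular its kernel-selection term, is not reproduced.

```latex
% Generic multiple kernel k-means objective; K_1..K_P are the pre-specified
% kernels, w the combination weights, H a relaxed cluster indicator with
% orthonormal columns. Symbols are ours.
\[
\min_{H,\, w} \; \operatorname{tr}\!\Bigl( K_w \, (I - H H^{\top}) \Bigr)
\quad \text{s.t.} \quad
K_w = \sum_{p=1}^{P} w_p K_p, \;\;
H^{\top} H = I, \;\;
w \ge 0, \;\; \mathbf{1}^{\top} w = 1.
\]
% The representative-kernel idea adds a penalty or constraint that keeps only
% a diverse subset of the w_p significantly nonzero; H and w are then updated
% alternately, as described above.
```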

    Batch Active Preference-Based Learning of Reward Functions

    Data generation and labeling are usually expensive steps in learning for robotics. While active learning methods are commonly used to tackle the former problem, preference-based learning is a concept that attempts to solve the latter by querying users with preference questions. In this paper, we develop a new algorithm, batch active preference-based learning, that enables efficient learning of reward functions using as few data samples as possible while keeping query generation times short. We introduce several approximations to the batch active learning problem and provide theoretical guarantees for the convergence of our algorithms. Finally, we present experimental results for a variety of robotics tasks in simulation. Our results suggest that our batch active learning algorithm requires only a few queries, which are computed in a short amount of time. We then showcase our algorithm in a study on learning human users' preferences. Comment: Proceedings of the 2nd Conference on Robot Learning (CoRL), October 201
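
    A schematic of one way such batch query selection can work, assuming a linear reward model and posterior samples over its weights; this is our own illustrative simplification, not the authors' exact acquisition criterion.

```python
import numpy as np

def select_batch(candidate_pairs, feature_fn, weight_samples, batch_size):
    """Pick a batch of preference queries (trajectory pairs).

    candidate_pairs : list of (traj_a, traj_b) candidate queries.
    feature_fn      : maps a trajectory to its feature vector.
    weight_samples  : (n_samples, n_features) samples from the current
                      posterior over linear reward weights.
    Scores each query by how split the posterior is on its answer (maximum
    disagreement ~ most informative), then greedily keeps queries whose
    feature differences are not too similar, as a cheap diversity heuristic.
    """
    diffs = np.array([feature_fn(a) - feature_fn(b) for a, b in candidate_pairs])
    prefer_a = (diffs @ weight_samples.T) > 0          # (n_queries, n_samples)
    p = prefer_a.mean(axis=1)                          # posterior P(answer = a)
    disagreement = 1.0 - np.abs(2.0 * p - 1.0)         # 1 when posterior is split 50/50
    order = np.argsort(-disagreement)
    batch, kept = [], []
    for i in order:
        d = diffs[i] / (np.linalg.norm(diffs[i]) + 1e-8)
        if all(abs(d @ k) < 0.95 for k in kept):       # skip near-duplicate queries
            batch.append(candidate_pairs[i])
            kept.append(d)
        if len(batch) == batch_size:
            break
    return batch
```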

    Approximate Subspace-Sparse Recovery with Corrupted Data via Constrained ℓ1-Minimization

    High-dimensional data often lie in low-dimensional subspaces corresponding to the different classes they belong to. Finding sparse representations of data points in a dictionary built from the collection of data helps to uncover low-dimensional subspaces and address problems such as clustering, classification, subset selection and more. In this paper, we address the problem of recovering sparse representations for noisy data points in a dictionary whose columns correspond to corrupted data lying close to a union of subspaces. We consider a constrained ℓ1-minimization and study conditions under which the solution of the proposed optimization satisfies the approximate subspace-sparse recovery condition. More specifically, we show that each noisy data point, perturbed from a subspace by noise of magnitude ε, will be reconstructed using data points from the same subspace with a small error of the order of O(ε), and that the coefficients corresponding to data points in other subspaces will be sufficiently small, i.e., of the order of O(ε). We do not impose any randomness assumption on the arrangement of subspaces or the distribution of data points in each subspace. Our framework is based on a novel generalization of the null-space property to the setting where data lie in multiple subspaces, the number of data points in each subspace exceeds the dimension of the subspace, and all data points are corrupted by noise. Moreover, assuming a random distribution for the data points, we further show that the coefficients from the desired support not only reconstruct a given point with high accuracy, but also have sufficiently large values, i.e., of the order of O(1).
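
    The constrained ℓ1 program analyzed in such settings typically has the following shape; this is a sketch consistent with the abstract, not necessarily the paper's exact constraint set.

```latex
% y: a noisy data point; B: dictionary whose columns are the other corrupted
% data points; delta(eps): a noise-dependent tolerance (symbols are ours).
\[
\min_{c} \; \| c \|_{1}
\quad \text{s.t.} \quad \| y - B c \|_{2} \le \delta(\varepsilon).
\]
% Approximate subspace-sparse recovery then means the coefficients on points
% from the same subspace reconstruct y up to an O(eps) error, while the
% coefficients on points from other subspaces are O(eps)-small.
```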

    Cluster Representatives Selection in Non-Metric Spaces for Nearest Prototype Classification

    The nearest prototype classification is a less computationally intensive replacement for the k-NN method, especially when large datasets are considered. In metric spaces, centroids are often used as prototypes to represent whole clusters. The selection of cluster prototypes in non-metric spaces is more challenging, as the idea of computing centroids is not directly applicable. In this paper, we present CRS, a novel method for selecting a small yet representative subset of objects as a cluster prototype. Memory- and computationally-efficient selection of representatives is enabled by leveraging the similarity-graph representation of each cluster created by the NN-Descent algorithm. CRS can be used in arbitrary metric or non-metric spaces thanks to its graph-based approach, which requires only a pairwise similarity measure. As we demonstrate in the experimental evaluation, our method outperforms state-of-the-art techniques on multiple datasets from different domains.
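
    One simple way to realize graph-based representative selection of this flavour is a greedy coverage pass over each cluster's k-NN similarity graph; this is a rough sketch under our own simplifications, and CRS itself may use a different selection criterion.

```python
def greedy_representatives(knn_graph, coverage=0.95):
    """Select representatives from a cluster's k-NN similarity graph.

    knn_graph : dict mapping each object id to an iterable of its neighbours
                (e.g. produced by NN-Descent from pairwise similarities).
    Repeatedly picks the object covering the most not-yet-covered objects
    until the requested fraction of the cluster is covered.
    """
    objects = set(knn_graph)
    covered, reps = set(), []
    target = coverage * len(objects)
    while len(covered) < target:
        best = max(objects - set(reps),
                   key=lambda o: len(({o} | set(knn_graph[o])) - covered))
        reps.append(best)
        covered |= {best} | set(knn_graph[best])
    return reps
```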

    Two-way Spectrum Pursuit for CUR Decomposition and Its Application in Joint Column/Row Subset Selection

    The problem of simultaneous column and row subset selection is addressed in this paper. The column space and row space of a matrix are spanned by its left and right singular vectors, respectively. However, the singular vectors are not among the actual columns/rows of the matrix. In this paper, an iterative approach is proposed to capture the most structural information of columns/rows by selecting a subset of actual columns/rows. This algorithm, referred to as two-way spectrum pursuit (TWSP), provides an accurate solution for the CUR matrix decomposition. TWSP is applicable in a wide range of applications since it enjoys linear complexity w.r.t. the number of original columns/rows. We demonstrate the application of TWSP to joint channel and sensor selection in cognitive radio networks, detection of informative users and contents, and efficient supervised data reduction.
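
    Independently of how the columns and rows are chosen, once subsets of actual column and row indices have been selected (e.g. by a pursuit like the one above), the CUR approximation can be assembled as sketched below; the selection step itself is omitted, and the function name is ours.

```python
import numpy as np

def cur_from_indices(A, row_idx, col_idx):
    """Build a CUR approximation A ~ C @ U @ R from selected rows/columns.

    C holds actual columns of A, R holds actual rows, and U is chosen here
    via pseudo-inverses to minimize the Frobenius reconstruction error.
    """
    C = A[:, col_idx]                              # selected actual columns
    R = A[row_idx, :]                              # selected actual rows
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)  # optimal middle factor
    return C, U, R

# Usage: A_hat = C @ U @ R approximates A using only actual columns/rows of A.
```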

    Scaled Simplex Representation for Subspace Clustering

    The self-expressive property of data points, i.e., that each data point can be linearly represented by the other data points in the same subspace, has proven effective in leading subspace clustering methods. Most self-expressive methods construct a feasible affinity matrix from a coefficient matrix obtained by solving an optimization problem. However, the negative entries in the coefficient matrix are forced to be positive when constructing the affinity matrix via exponentiation, absolute symmetrization, or squaring operations, which damages the inherent correlations among the data. Besides, the affine constraint used in these methods is not flexible enough for practical applications. To overcome these problems, in this paper we introduce a scaled simplex representation (SSR) for the subspace clustering problem. Specifically, a non-negativity constraint is used to make the coefficient matrix physically meaningful, and each coefficient vector is constrained to sum to a scalar s < 1 to make it more discriminative. The proposed SSR-based subspace clustering (SSRSC) model is reformulated as a linear equality-constrained problem, which is solved efficiently under the alternating direction method of multipliers framework. Experiments on benchmark datasets demonstrate that the proposed SSRSC algorithm is very efficient and outperforms state-of-the-art subspace clustering methods in accuracy. The code can be found at https://github.com/csjunxu/SSRSC. Comment: Accepted by IEEE Transactions on Cybernetics. 13 pages, 9 figures, 10 tables. Code can be found at https://github.com/csjunxu/SSRSC
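
    In this notation, a scaled-simplex-constrained self-expressive model can be sketched as below; this is our reading of the constraints described above, and the paper's full objective may include additional terms.

```latex
% X: data matrix with points as columns; C: coefficient matrix; 0 < s < 1.
\[
\min_{C} \; \| X - X C \|_{F}^{2}
\quad \text{s.t.} \quad C \ge 0, \;\; \mathbf{1}^{\top} C = s\, \mathbf{1}^{\top}.
\]
% Every point is represented by a non-negative combination of the other points
% whose coefficients sum to s; C is then used to build the affinity matrix for
% spectral clustering.
```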

    Diversity-aware Multi-Video Summarization

    Most video summarization approaches have focused on extracting a summary from a single video; we propose an unsupervised framework for summarizing a collection of videos. We observe that each video in the collection may contain some information that other videos do not have, and thus exploring the underlying complementarity could be beneficial in creating a diverse, informative summary. We develop a novel diversity-aware sparse optimization method for multi-video summarization that explores the complementarity within the videos. Our approach extracts a multi-video summary that is both interesting and representative in describing the whole video collection. To efficiently solve our optimization problem, we develop an alternating minimization algorithm that minimizes the overall objective function with respect to one video at a time while fixing the other videos. Moreover, we introduce a new benchmark dataset, Tour20, that contains 140 videos with multiple human-created summaries, which were acquired in a controlled experiment. Finally, through extensive experiments on the new Tour20 dataset and several other multi-view datasets, we show that the proposed approach clearly outperforms the state-of-the-art methods on two problems: topic-oriented video summarization and multi-view video summarization in a camera network. Comment: IEEE Trans. on Image Processing, 2017 (In Press)
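
    A generic objective conveying the idea of jointly sparse, diversity-regularized selection over a collection of videos might look as follows; this is our own schematic, not the paper's exact formulation.

```latex
% X_v: feature matrix of video v; Z_v: selection coefficients for video v;
% Omega: a diversity term across videos (all symbols are ours).
\[
\min_{\{Z_v\}} \; \sum_{v=1}^{V}
  \Bigl( \| X_v - X_v Z_v \|_{F}^{2} + \lambda \| Z_v \|_{2,1} \Bigr)
  + \gamma \, \Omega(Z_1, \dots, Z_V).
\]
% The row-sparsity term selects a few representative shots per video, Omega
% penalizes redundancy between shots selected from different videos, and the
% alternating scheme described above updates one Z_v at a time with the
% others fixed.
```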