Dissimilarity-based Sparse Subset Selection
Finding an informative subset of a large collection of data points or models
is at the center of many problems in computer vision, recommender systems,
bio/health informatics as well as image and natural language processing. Given
pairwise dissimilarities between the elements of a `source set' and a `target
set,' we consider the problem of finding a subset of the source set, called
representatives or exemplars, that can efficiently describe the target set. We
formulate the problem as a row-sparsity regularized trace minimization problem.
Since the proposed formulation is, in general, NP-hard, we consider a convex
relaxation. The solution of our optimization finds representatives and the
assignment of each element of the target set to a representative, thereby
obtaining a clustering. We analyze the solution of our proposed optimization as
a function of the regularization parameter. We show that when the two sets
jointly partition into multiple groups, our algorithm finds representatives
from all groups and reveals clustering of the sets. In addition, we show that
the proposed framework can effectively deal with outliers. Our algorithm works
with arbitrary dissimilarities, which can be asymmetric or violate the triangle
inequality. To efficiently implement our algorithm, we consider an Alternating
Direction Method of Multipliers (ADMM) framework, which results in quadratic
complexity in the problem size. We show that the ADMM implementation allows us
to parallelize the algorithm, further reducing the computation time.
Finally, by experiments on real-world datasets, we show that our proposed
algorithm improves the state of the art on the two problems of scene
categorization using representative images and time-series modeling and
segmentation using representative models.
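The objective can be made concrete with a toy brute-force search: for a fixed subset size k, the best representatives minimize the total dissimilarity from each target element to its nearest selected source element. The sketch below only illustrates that encoding cost (enumeration is exponential in k); it is not the paper's regularized convex program or its ADMM solver, and the function name is an assumption.

```python
import numpy as np
from itertools import combinations

def best_representatives(D, k):
    """D: (source x target) dissimilarity matrix. Exhaustively find the k
    source elements whose induced assignment cost
    sum_j min_{i in S} D[i, j] is smallest."""
    m = D.shape[0]
    best_cost, best_S = np.inf, None
    for S in combinations(range(m), k):
        # each target element is assigned to its nearest representative
        cost = D[list(S)].min(axis=0).sum()
        if cost < best_cost:
            best_cost, best_S = cost, S
    return list(best_S), best_cost
```

Note that the dissimilarities need not be symmetric or satisfy the triangle inequality here either, since only row-wise minima are used.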
Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision
The goal of data selection is to capture the most structural information from
a set of data. This paper presents a fast and accurate data selection method,
in which the selected samples are optimized to span the subspace of all data.
We propose a new selection algorithm, referred to as iterative projection and
matching (IPM), with linear complexity with respect to the number of data
points, and without
any parameter to be tuned. In our algorithm, at each iteration, the maximum
information from the structure of the data is captured by one selected sample,
and the captured information is neglected in the next iterations by projection
on the null-space of previously selected samples. The computational efficiency
and the selection accuracy of our proposed algorithm outperform those of the
conventional methods. Furthermore, the superiority of the proposed algorithm is
shown on active learning for video action recognition on the UCF-101 dataset;
learning using representatives on ImageNet; training a generative adversarial
network (GAN) to generate multi-view images from a single-view input on CMU
Multi-PIE dataset; and video summarization on the UTE Egocentric dataset.
Comment: 11 pages, 5 figures, 5 tables
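The two steps named in the title, projection and matching, can be sketched directly: at each iteration, match the sample best aligned with the leading singular vector of the residual, then project the residual onto the null space of that sample so its information is neglected afterwards. This is a minimal numpy sketch under those assumptions, not the authors' code.

```python
import numpy as np

def ipm_select(A, k):
    """Select k rows of A (samples x features) by iterative projection
    and matching."""
    R = A.astype(float).copy()
    selected = []
    for _ in range(k):
        # leading right singular vector of the current residual
        _, _, Vt = np.linalg.svd(R, full_matrices=False)
        v = Vt[0]
        # matching: the sample whose direction is closest to v
        norms = np.linalg.norm(R, axis=1)
        norms[norms == 0] = 1e-12
        scores = np.abs(R @ v) / norms
        scores[selected] = -np.inf
        idx = int(np.argmax(scores))
        selected.append(idx)
        # neglect the captured information: project the residual onto
        # the null space of the selected sample
        u = R[idx] / np.linalg.norm(R[idx])
        R = R - np.outer(R @ u, u)
    return selected
```

Each iteration costs one thin SVD of the residual; the per-sample work is linear in the number of data points, matching the complexity claim above.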
A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging
The lack of strong labels has severely limited the ability of state-of-the-art
fully supervised audio tagging systems to scale to larger datasets. Meanwhile,
audio-visual learning models based on unlabeled videos have been successfully
applied to audio tagging, but they are inevitably resource hungry and require a
long time to train. In this work, we propose a light-weight, multimodal
framework for environmental audio tagging. The audio branch of the framework is
a convolutional and recurrent neural network (CRNN) based on multiple instance
learning (MIL). It is trained with the audio tracks of a large collection of
weakly labeled YouTube video excerpts; the video branch uses pretrained
state-of-the-art image recognition networks and word embeddings to extract
information from the video track and to map visual objects to sound events.
Experiments on the audio tagging task of the DCASE 2017 challenge show that the
incorporation of video information improves a strong baseline audio tagging
system by 5.3% absolute in terms of F-score. The entire system can be
trained within 6 hours on a single GPU, and can be easily carried over to other
audio tasks such as speech sentiment analysis.
Comment: 5 pages, 3 figures, Accepted and to appear at ICASSP 201
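The MIL audio branch ultimately has to pool frame-level event probabilities into a single clip-level prediction. Two pooling functions commonly used for weakly labeled audio tagging are sketched below; whether the paper uses these exact pooling operators is an assumption.

```python
import numpy as np

def mil_max_pool(p):
    """p: (frames, classes) frame-level probabilities -> clip-level,
    taking the single strongest frame per class."""
    return p.max(axis=0)

def mil_linear_softmax_pool(p, eps=1e-12):
    """Linear-softmax pooling: frames with higher probability get
    proportionally higher weight, which is less brittle than a hard max."""
    return (p * p).sum(axis=0) / (p.sum(axis=0) + eps)
```

Under weak (clip-level) labels, the pooled value is what the loss sees, so the choice of pooling directly shapes which frames receive gradient.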
Multiple Kernel k-Means Clustering by Selecting Representative Kernels
To cluster data that are not linearly separable in the original feature
space, k-means clustering was extended to the kernel version. However, the
performance of kernel k-means clustering largely depends on the choice of
kernel function. To mitigate this problem, multiple kernel learning has been
introduced into k-means clustering to obtain an optimal kernel
combination for clustering. Despite the success of multiple kernel k-means
clustering in various scenarios, few of the existing works update the
combination coefficients based on the diversity of kernels; as a result, the
selected kernels are highly redundant, which degrades clustering performance
and efficiency. In this paper, we propose a simple but
efficient strategy that selects a diverse subset from the pre-specified kernels
as the representative kernels, and then incorporates the subset selection
process into the framework of multiple kernel k-means clustering. The
representative kernels are indicated by significant combination weights. Due to
the
non-convexity of the obtained objective function, we develop an alternating
minimization method to optimize the combination coefficients of the selected
kernels and the cluster membership alternately. We evaluate the proposed
approach on several benchmark and real-world datasets. The experimental results
demonstrate the competitiveness of our approach in comparison with the
state-of-the-art methods.
Comment: 8 pages, 7 figures
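The building blocks can be sketched in a few lines: form a fixed combination of base kernels and run kernel k-means on the combined kernel matrix. The alternating update of combination weights that the paper proposes is omitted here; the uniform weighting, the two RBF bandwidths, and the deterministic initialization are all assumptions for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def combined_kernel(X):
    # uniform two-kernel combination; the paper instead learns diverse weights
    return 0.5 * (rbf_kernel(X, 0.5) + rbf_kernel(X, 2.0))

def kernel_kmeans(K, n_clusters=2, n_iter=50):
    """Kernel k-means on a precomputed (combined) kernel matrix K."""
    n = K.shape[0]
    # deterministic init: bucket points by their similarity rank w.r.t. point 0
    ranks = np.argsort(np.argsort(-K[:, 0]))
    labels = np.minimum(ranks * n_clusters // n, n_clusters - 1)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                continue
            # ||phi(x_i) - mu_c||^2 up to the constant K_ii term
            dist[:, c] = -2.0 * K[:, idx].mean(axis=1) + K[np.ix_(idx, idx)].mean()
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels
```

Because assignments only need kernel evaluations, swapping in a learned, sparsely weighted kernel combination changes nothing in the clustering loop itself.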
Batch Active Preference-Based Learning of Reward Functions
Data generation and labeling are usually an expensive part of learning for
robotics. While active learning methods are commonly used to tackle the former
problem, preference-based learning is a concept that attempts to solve the
latter by querying users with preference questions. In this paper, we will
develop a new algorithm, batch active preference-based learning, that enables
efficient learning of reward functions using as few data samples as possible
while still having short query generation times. We introduce several
approximations to the batch active learning problem, and provide theoretical
guarantees for the convergence of our algorithms. Finally, we present our
experimental results for a variety of robotics tasks in simulation. Our results
suggest that our batch active learning algorithm requires only a few queries
that are computed in a short amount of time. We then showcase our algorithm in
a study to learn human users' preferences.
Comment: Proceedings of the 2nd Conference on Robot Learning (CoRL), October
2018
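The two ingredients, a Bayesian update from a preference answer and a batch picked for informativeness, can be sketched with a particle approximation to the reward-weight posterior. The particle filter, the logistic preference likelihood, and the uncertainty score below are illustrative assumptions, not the authors' algorithm or its theoretical guarantees.

```python
import numpy as np

def preference_update(particles, weights, phi_a, phi_b, answer):
    """Reweight a particle approximation of p(w | answers). answer = 1
    means the user preferred trajectory a (features phi_a) over b."""
    margin = particles @ (phi_a - phi_b)
    lik = 1.0 / (1.0 + np.exp(-margin)) if answer == 1 else 1.0 / (1.0 + np.exp(margin))
    w = weights * lik
    return w / w.sum()

def pick_batch(particles, weights, queries, batch_size):
    """Pick the queries whose predicted preference probability is closest
    to 1/2, i.e. those the current posterior is most uncertain about."""
    scores = []
    for phi_a, phi_b in queries:
        p = weights @ (1.0 / (1.0 + np.exp(-(particles @ (phi_a - phi_b)))))
        scores.append(abs(p - 0.5))
    return list(np.argsort(scores)[:batch_size])
```

Scoring a whole candidate pool at once is what makes query generation fast: no optimization is re-run between the queries inside a batch.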
Approximate Subspace-Sparse Recovery with Corrupted Data via Constrained ℓ1-Minimization
High-dimensional data often lie in low-dimensional subspaces corresponding to
different classes they belong to. Finding sparse representations of data points
in a dictionary built using the collection of data helps to uncover
low-dimensional subspaces and address problems such as clustering,
classification, subset selection and more. In this paper, we address the
problem of recovering sparse representations for noisy data points in a
dictionary whose columns correspond to corrupted data lying close to a union of
subspaces. We consider a constrained ℓ1-minimization and study conditions
under which the solution of the proposed optimization satisfies the approximate
subspace-sparse recovery condition. More specifically, we show that each noisy
data point, perturbed from a subspace by noise of magnitude ε, will be
reconstructed using data points from the same subspace with a small error of
the order of ε, and that the coefficients corresponding to data points in other
subspaces will be sufficiently small, i.e., of the order of ε. We do not impose
any randomness
assumption on the arrangement of subspaces or distribution of data points in
each subspace. Our framework is based on a novel generalization of the
null-space property to the setting where data lie in multiple subspaces, the
number of data points in each subspace exceeds the dimension of the subspace,
and all data points are corrupted by noise. Moreover, assuming a random
distribution for data points, we further show that the coefficients from the
desired support not only reconstruct a given point with high accuracy, but also
have sufficiently large values.
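The constrained ℓ1 program needs a dedicated solver, so a convenient stand-in for experimentation is the closely related lasso solved by proximal gradient descent (ISTA). The surrogate objective, step size, and regularization weight below are assumptions, not the paper's formulation; the qualitative behavior (coefficients concentrate on the correct subspace's columns) is what the sketch demonstrates.

```python
import numpy as np

def ista_l1(X, y, lam=0.01, n_iter=500):
    """Proximal gradient (ISTA) for the lasso
    min_c 0.5 * ||X c - y||^2 + lam * ||c||_1."""
    t = 1.0 / np.linalg.norm(X, 2) ** 2   # step size from the Lipschitz constant
    c = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = c - t * (X.T @ (X @ c - y))   # gradient step on the smooth part
        c = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)  # soft threshold
    return c
```

With a dictionary whose columns come from two orthogonal subspaces and a query point from the first, the recovered coefficients on the second subspace's columns stay at zero, the subspace-sparse pattern the abstract analyzes.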
Cluster Representatives Selection in Non-Metric Spaces for Nearest Prototype Classification
The nearest prototype classification is a less computationally intensive
replacement for the k-NN method, especially when large datasets are
considered. In metric spaces, centroids are often used as prototypes to
represent whole clusters. The selection of cluster prototypes in non-metric
spaces is more challenging as the idea of computing centroids is not directly
applicable.
In this paper, we present CRS, a novel method for selecting a small yet
representative subset of objects as a cluster prototype. Memory and
computationally efficient selection of representatives is enabled by leveraging
the similarity graph representation of each cluster created by the NN-Descent
algorithm. CRS can be used in an arbitrary metric or non-metric space because
of the graph-based approach, which requires only a pairwise similarity measure.
As we demonstrate in the experimental evaluation, our method outperforms the
state-of-the-art techniques on multiple datasets from different domains.
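The graph-based idea can be imitated with a brute-force k-nearest-neighbour similarity graph (standing in for NN-Descent) followed by a greedy dominating-set pass, so that every object either is a representative or neighbours one. The coverage criterion is an assumption; the abstract does not spell out CRS's actual selection rule.

```python
import numpy as np

def knn_graph(S, k):
    """S: pairwise similarity matrix. Brute-force k-NN lists (the paper
    builds this graph with NN-Descent instead)."""
    return [[j for j in np.argsort(-S[i]) if j != i][:k] for i in range(S.shape[0])]

def greedy_representatives(nbrs):
    """Greedy dominating-set pass: keep adding the node that newly covers
    the most objects until every object is a representative or adjacent
    to one."""
    n = len(nbrs)
    covered = np.zeros(n, dtype=bool)
    reps = []
    while not covered.all():
        gains = [np.sum(~covered[[i] + nbrs[i]]) for i in range(n)]
        best = int(np.argmax(gains))
        reps.append(best)
        covered[[best] + nbrs[best]] = True
    return reps
```

Because only a pairwise similarity enters, nothing here assumes symmetry or the triangle inequality, which is the point of working in non-metric spaces.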
Two-way Spectrum Pursuit for CUR Decomposition and Its Application in Joint Column/Row Subset Selection
The problem of simultaneous column and row subset selection is addressed in
this paper. The column space and row space of a matrix are spanned by its left
and right singular vectors, respectively. However, the singular vectors are not
within actual columns/rows of the matrix. In this paper, an iterative approach
is proposed to capture the most structural information of columns/rows via
selecting a subset of actual columns/rows. This algorithm is referred to as
two-way spectrum pursuit (TWSP) which provides us with an accurate solution for
the CUR matrix decomposition. TWSP is applicable in a wide range of
applications since it enjoys linear complexity with respect to the number of
original columns/rows. We demonstrate the application of TWSP to joint channel
and
sensor selection in cognitive radio networks, informative user and content
detection, and efficient supervised data reduction.
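A rough stand-in for two-way spectrum pursuit: greedily pick actual columns (and, on the transpose, actual rows) by largest residual norm, then form the CUR factors with pseudoinverses. The greedy residual rule is a simplification assumed for illustration, not the authors' algorithm.

```python
import numpy as np

def select_by_residual(M, k):
    """Greedily take the column with the largest residual norm, then
    project that direction out of the residual."""
    R = M.astype(float).copy()
    idx = []
    for _ in range(k):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))
        idx.append(j)
        u = R[:, j] / max(np.linalg.norm(R[:, j]), 1e-12)
        R -= np.outer(u, u @ R)
        R[:, j] = 0.0          # never reselect the same column
    return idx

def cur(A, k):
    cols = select_by_residual(A, k)      # actual columns of A
    rows = select_by_residual(A.T, k)    # actual rows of A
    C, Rm = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(Rm)
    return C, U, Rm
```

When the selected columns and rows span the column and row spaces of a rank-k matrix, C U R reproduces A exactly, which is the sense in which subset selection can substitute for singular vectors that are not actual columns/rows.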
Scaled Simplex Representation for Subspace Clustering
The self-expressive property of data points, i.e., each data point can be
linearly represented by the other data points in the same subspace, has proven
effective in leading subspace clustering methods. Most self-expressive methods
usually construct a feasible affinity matrix from a coefficient matrix,
obtained by solving an optimization problem. However, the negative entries in
the coefficient matrix are forced to be positive when constructing the affinity
matrix via exponentiation, absolute symmetrization, or squaring operations.
This consequently damages the inherent correlations among the data. Besides,
the affine constraint used in these methods is not flexible enough for
practical applications. To overcome these problems, in this paper, we introduce
a scaled simplex representation (SSR) for the subspace clustering problem.
Specifically, the non-negative constraint is used to make the coefficient
matrix physically meaningful, and the coefficient vector is constrained to
sum to a scalar s < 1 to make it more discriminative. The proposed SSR-based
subspace clustering (SSRSC) model is reformulated as a linear
equality-constrained problem, which is solved efficiently under the alternating
direction method of multipliers framework. Experiments on benchmark datasets
demonstrate that the proposed SSRSC algorithm is very efficient and outperforms
state-of-the-art subspace clustering methods in accuracy. The code can be found
at https://github.com/csjunxu/SSRSC.
Comment: Accepted by IEEE Transactions on Cybernetics. 13 pages, 9 figures, 10
tables.
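The constraint set at the heart of SSR, nonnegative coefficients summing to a scalar s < 1, is a scaled simplex, and Euclidean projection onto it is a standard sort-based computation that an ADMM-style solver would call repeatedly. The sketch below shows only that projection, under the assumption that this is the subproblem such a solver needs; the full SSRSC model is not reproduced.

```python
import numpy as np

def project_scaled_simplex(v, s=0.9):
    """Euclidean projection of v onto {c : c >= 0, sum(c) = s}, the
    scaled simplex, via the standard sort-based algorithm."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - s
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)
```

The projection clips small (including negative) entries to zero, which is exactly how the simplex constraint avoids the sign-flipping that exponentiation or absolute symmetrization would otherwise introduce.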
Diversity-aware Multi-Video Summarization
Most video summarization approaches have focused on extracting a summary from
a single video; we propose an unsupervised framework for summarizing a
collection of videos. We observe that each video in the collection may contain
some information that other videos do not have, and thus exploring the
underlying complementarity could be beneficial in creating a diverse
informative summary. We develop a novel diversity-aware sparse optimization
method for multi-video summarization by exploring the complementarity within
the videos. Our approach extracts a multi-video summary which is both
interesting and representative in describing the whole video collection. To
efficiently solve our optimization problem, we develop an alternating
minimization algorithm that minimizes the overall objective function with
respect to one video at a time while fixing the other videos. Moreover, we
introduce a new benchmark dataset, Tour20, that contains 140 videos with
multiple human-created summaries, which were acquired in a controlled
experiment. Finally, by extensive experiments on the new Tour20 dataset and
several other multi-view datasets, we show that the proposed approach clearly
outperforms the state-of-the-art methods on two problems: topic-oriented video
summarization and multi-view video summarization in a camera network.
Comment: IEEE Trans. on Image Processing, 2017 (In Press)
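The video-at-a-time alternating scheme can be caricatured with a greedy score: each video picks the shot that best represents it while being penalized for redundancy with shots currently chosen from the other videos. The cosine similarity, the greedy scoring, and the one-shot-per-video budget are assumptions standing in for the paper's sparse optimization.

```python
import numpy as np

def summarize_collection(videos, sweeps=2, lam=0.5):
    """Pick one shot per video. Each sweep updates one video at a time:
    a shot's score is its mean cosine similarity to its own video
    (representativeness) minus lam times its maximum similarity to shots
    currently selected from the other videos (redundancy)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    chosen = [0] * len(videos)
    for _ in range(sweeps):
        for v, V in enumerate(videos):
            others = [videos[w][chosen[w]] for w in range(len(videos)) if w != v]
            scores = []
            for f in V:
                rep = np.mean([cos(f, g) for g in V])
                red = max((cos(f, g) for g in others), default=0.0)
                scores.append(rep - lam * red)
            chosen[v] = int(np.argmax(scores))
    return chosen
```

With the redundancy term switched off (lam = 0), every video simply picks its own most representative shot; the diversity penalty is what pushes the collection toward complementary, non-overlapping selections.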