Twin Learning for Similarity and Clustering: A Unified Kernel Approach
Many similarity-based clustering methods work in two separate steps:
similarity matrix computation and subsequent spectral clustering. However,
similarity measurement is challenging because it is usually impacted by many
factors, e.g., the choice of similarity metric, neighborhood size, scale of
data, noise and outliers. Thus the learned similarity matrix is often not
suitable, let alone optimal, for the subsequent clustering. In addition,
nonlinear similarity structure exists in much real-world data but has not been
effectively exploited by most existing methods. To tackle these two challenges,
we propose a model that simultaneously learns the cluster indicator matrix and
similarity information in kernel spaces in a principled way. We show
theoretical relationships to kernel k-means, k-means, and spectral clustering
methods. Then, to address the practical issue of how to select the most
suitable kernel for a particular clustering task, we further extend our model
with a multiple kernel learning ability. With this joint model, we can
automatically accomplish three subtasks: finding the best cluster indicator
matrix, the most accurate similarity relations, and the optimal combination of
multiple kernels. By leveraging the interactions between these three subtasks
in a joint framework, each subtask can be iteratively boosted by using the
results of the others towards an overall optimal solution. Extensive
experiments are performed to demonstrate the effectiveness of our method.
Comment: Published in AAAI 201
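As a rough illustration of the joint scheme described above, the following Python sketch alternates between spectral clustering on a weighted combination of base kernels and re-weighting each kernel by its alignment with the current partition. The RBF bandwidths and the alignment-based weight update are illustrative assumptions, not the paper's actual objective or optimization.

```python
# Minimal sketch of alternating multiple-kernel clustering; illustrative only.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

def multi_kernel_clustering(X, n_clusters, gammas=(0.1, 1.0, 10.0), n_iter=5):
    kernels = [rbf_kernel(X, gamma=g) for g in gammas]   # base kernels (assumed)
    w = np.ones(len(kernels)) / len(kernels)             # uniform initial weights
    labels = None
    for _ in range(n_iter):
        K = sum(wi * Ki for wi, Ki in zip(w, kernels))   # combined kernel
        labels = SpectralClustering(n_clusters=n_clusters,
                                    affinity='precomputed').fit_predict(K)
        # build a column-normalized cluster indicator matrix H (n x c)
        H = np.zeros((X.shape[0], n_clusters))
        H[np.arange(X.shape[0]), labels] = 1.0
        H /= np.sqrt(np.maximum(H.sum(axis=0, keepdims=True), 1.0))
        # re-weight each kernel by its alignment with the partition, trace(H^T K_i H)
        align = np.array([np.trace(H.T @ Ki @ H) for Ki in kernels])
        w = align / align.sum()
    return labels, w
```

In the paper the three subtasks are optimized jointly under a single objective; this sketch only mimics the iterative boosting between them.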
Unified Spectral Clustering with Optimal Graph
Spectral clustering has found extensive use in many areas. Most traditional
spectral clustering algorithms work in three separate steps: similarity graph
construction; continuous labels learning; discretizing the learned labels by
k-means clustering. Such common practice has two potential flaws, which may
lead to severe information loss and performance degradation. First, the
predefined similarity graph might not be optimal for the subsequent clustering.
It is well accepted that the similarity graph strongly affects the clustering
results. To
this end, we propose to automatically learn similarity information from data
and simultaneously impose the constraint that the similarity matrix has exactly
c connected components if there are c clusters. Second, the discrete solution
may deviate from the spectral solution, since the k-means method is well known
to be sensitive to the initialization of cluster centers. In this work, we
transform
the candidate solution into a new one that better approximates the discrete
one. Finally, those three subtasks are integrated into a unified framework,
with each subtask iteratively boosted by using the results of the others
towards an overall optimal solution. It is known that the performance of a
kernel method is largely determined by the choice of kernels. To tackle this
practical problem of how to select the most suitable kernel for a particular
data set, we further extend our model to incorporate a multiple kernel
learning capability. Extensive experiments demonstrate the superiority of our
proposed
method compared to existing clustering approaches.
Comment: Accepted by AAAI 201
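For context, here is a minimal sketch of the traditional three-step pipeline the abstract criticizes: a predefined similarity graph, a spectral embedding from the normalized Laplacian, and k-means discretization of the continuous labels. The k-NN graph and Laplacian normalization are common choices assumed for illustration; the paper's contribution is precisely to learn the graph and to avoid the initialization-sensitive discretization.

```python
# The traditional three-step spectral clustering pipeline; illustrative parameters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import kneighbors_graph

def three_step_spectral(X, c, n_neighbors=10):
    # Step 1: predefined similarity graph (fixed k-NN connectivity)
    W = kneighbors_graph(X, n_neighbors, mode='connectivity').toarray()
    W = np.maximum(W, W.T)                        # symmetrize
    # Step 2: continuous labels = c smallest eigenvectors of the
    # normalized Laplacian I - D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(L)
    F = vecs[:, :c]
    # Step 3: discretize the continuous labels with k-means
    # (well known to be sensitive to center initialization)
    return KMeans(n_clusters=c, n_init=10).fit_predict(F)
```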
Automatic Discovery, Association Estimation and Learning of Semantic Attributes for a Thousand Categories
Attribute-based recognition models, due to their impressive performance and
their ability to generalize well on novel categories, have been widely adopted
for many computer vision applications. However, usually both the attribute
vocabulary and the class-attribute associations have to be provided manually by
domain experts or a large number of annotators. This is very costly and not
necessarily optimal for recognition performance; most importantly, it limits
the applicability of attribute-based models to large-scale data sets. To
tackle this problem, we propose an end-to-end unsupervised attribute learning
approach. We utilize online text corpora to automatically discover a salient
and discriminative vocabulary that correlates well with the human concept of
semantic attributes. Moreover, we propose a deep convolutional model to
optimize class-attribute associations with a linguistic prior that accounts for
noise and missing data in text. In a thorough evaluation on ImageNet, we
demonstrate that our model is able to efficiently discover and learn semantic
attributes at a large scale. Furthermore, we demonstrate that our model
outperforms the state-of-the-art in zero-shot learning on three data sets:
ImageNet, Animals with Attributes and aPascal/aYahoo. Finally, we enable
attribute-based learning on ImageNet and will share the attributes and
associations for future research.
Comment: Accepted as a conference paper at CVPR 201
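To make the attribute-based recognition pattern concrete, the sketch below scores unseen classes by matching predicted attribute probabilities against a class-attribute association matrix. The cosine-matching rule and all names are illustrative assumptions, not the paper's model.

```python
# Hypothetical sketch of attribute-based zero-shot scoring.
import numpy as np

def zero_shot_scores(attr_probs, class_attr):
    """attr_probs: (n_attributes,) predicted attribute probabilities for one image.
    class_attr: (n_classes, n_attributes) association matrix (e.g. discovered
    from text corpora, as the abstract describes). Returns per-class scores."""
    # cosine similarity between the predicted attribute signature and
    # each class's attribute signature
    a = attr_probs / (np.linalg.norm(attr_probs) + 1e-12)
    C = class_attr / (np.linalg.norm(class_attr, axis=1, keepdims=True) + 1e-12)
    return C @ a

# Usage: scores = zero_shot_scores(p, M); predicted_class = scores.argmax()
```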
Unsupervised Feature Selection with Adaptive Structure Learning
The problem of feature selection has attracted considerable interest over the
past decade. Traditional unsupervised methods select the features which can
faithfully preserve the intrinsic structures of data, where the intrinsic
structures are estimated using all the input features of data. However, the
estimated intrinsic structures are unreliable and inaccurate when redundant and
noisy features have not been removed. This creates a dilemma: one needs the true
structures of the data to identify the informative features, and one needs the
informative features to accurately estimate the true structures of the data. To
address this, we propose a unified learning framework which performs structure
learning and feature selection simultaneously. The structures are adaptively
learned from the results of feature selection, and the informative features are
reselected to preserve the refined structures of data. By leveraging the
interactions between these two essential tasks, we are able to capture accurate
structures and select more informative features. Experimental results on many
benchmark data sets demonstrate that the proposed method outperforms many
state-of-the-art unsupervised feature selection methods.
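A simplified stand-in for this alternation (not the paper's joint objective) can be sketched as follows: estimate a k-NN graph from the currently selected features, score every feature by a Laplacian-score-style smoothness criterion on that graph, then reselect the top-scoring features and repeat.

```python
# Illustrative alternation between structure learning and feature selection.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def adaptive_feature_selection(X, n_select, n_iter=5, k=10):
    selected = np.arange(X.shape[1])              # start from all features
    for _ in range(n_iter):
        # structure learning: k-NN graph on the current feature subset
        W = kneighbors_graph(X[:, selected], k, mode='connectivity').toarray()
        W = np.maximum(W, W.T)
        d = W.sum(axis=1)
        scores = []
        for j in range(X.shape[1]):
            f = X[:, j] - X[:, j].mean()
            cut = f @ (d * f) - f @ W @ f         # f^T L f: roughness on graph
            scores.append(cut / (f @ (d * f) + 1e-12))
        # feature selection: keep the features that vary most smoothly
        # over the learned structure (smallest score)
        selected = np.argsort(scores)[:n_select]
    return selected
```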
Infinite Latent Feature Selection: A Probabilistic Latent Graph-Based Ranking Approach
Feature selection plays an increasingly significant role in many computer
vision applications, ranging from object recognition to visual object tracking.
However, most recent feature selection solutions are not robust across
different and heterogeneous sets of data. In this paper, we address this issue
by proposing a robust probabilistic latent graph-based feature
selection algorithm that performs the ranking step while considering all the
possible subsets of features, as paths on a graph, bypassing the combinatorial
problem analytically. An appealing characteristic of the approach is that it
aims to discover an abstraction behind low-level sensory data, that is,
relevancy. Relevancy is modelled as a latent variable in a PLSA-inspired
generative process that allows the investigation of the importance of a feature
when injected into an arbitrary set of cues. The proposed method has been
tested on ten diverse benchmarks, and compared against eleven state of the art
feature selection methods. Results show that the proposed approach attains the
highest performance levels across many different scenarios and difficulties,
thereby confirming its strong robustness while setting a new state of the art
in the feature selection domain.
Comment: Accepted at the IEEE International Conference on Computer Vision
(ICCV), 2017, Venice. Preprint copy
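The analytic bypass of path enumeration can be illustrated with a geometric matrix series: if A is a weighted feature-feature graph and the scaling factor is small enough for convergence, then (I - aA)^{-1} - I sums the contributions of paths of every length at once. The correlation-based adjacency below is an assumption for illustration; the paper instead derives the graph weights from a PLSA-inspired latent generative process.

```python
# Generic sketch of path-based feature ranking via a geometric matrix series.
import numpy as np

def path_based_ranking(X, alpha=0.9):
    # feature-feature graph; absolute correlation is an illustrative choice
    A = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(A, 0.0)
    # scale so the series sum_{l>=1} (aA)^l converges
    a = alpha / max(1e-12, np.abs(np.linalg.eigvalsh(A)).max())
    # closed form for the sum over paths of all lengths, no enumeration
    S = np.linalg.inv(np.eye(A.shape[0]) - a * A) - np.eye(A.shape[0])
    energy = S.sum(axis=1)                        # total path energy per feature
    return np.argsort(-energy)                    # most relevant features first
```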
A Matlab Toolbox for Feature Importance Ranking
Increasing attention is being paid to feature importance ranking (FIR), in
particular when thousands of features can be extracted for intelligent
diagnosis and personalized medicine. A large number of FIR approaches have been
proposed, but few have been integrated for comparison and real-life
application. In this study, a Matlab toolbox is presented in which a total of
30 algorithms are
collected. Moreover, the toolbox is evaluated on a database of 163 ultrasound
images. For each breast mass lesion, 15 features are extracted. To find the
optimal subset of features for classification, all combinations of features are
tested, and a linear support vector machine is used to predict the malignancy
of lesions annotated in the ultrasound images. Finally, the effectiveness of
FIR is analyzed through performance comparison. The
toolbox is online (https://github.com/NicoYuCN/matFIR). In future work, more
FIR methods, feature selection methods and machine learning classifiers will be
integrated.
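The toolbox itself is written in Matlab; as a language-neutral sketch of the evaluation protocol described above (exhaustively testing all feature combinations with a linear SVM), the following Python snippet uses scikit-learn. The cross-validation setup is an assumption; with 15 features the search covers 2^15 - 1 = 32767 subsets, which is what makes the exhaustive comparison feasible.

```python
# Exhaustive feature-subset search with a linear SVM; illustrative protocol.
from itertools import combinations
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def best_feature_subset(X, y, max_features=None):
    n = X.shape[1]
    best_score, best_subset = -np.inf, None
    for k in range(1, (max_features or n) + 1):
        for subset in combinations(range(n), k):  # all subsets of size k
            score = cross_val_score(LinearSVC(dual=False), X[:, subset], y,
                                    cv=5).mean()
            if score > best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score
```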
One-Step Clustering with Adaptively Local Kernels and a Neighborhood Kernel
Among multiple kernel clustering (MKC) methods, some adopt a neighborhood kernel as the optimal kernel, while others use local base kernels to generate an optimal kernel. However, these two strategies have not been combined to leverage their complementary advantages, which limits the quality of the optimal kernel. Furthermore, most existing MKC methods require a two-step strategy to cluster, i.e., first learn an indicator matrix, then execute clustering; this does not guarantee the optimality of the final results. To overcome these drawbacks, a one-step clustering with adaptively local kernels and a neighborhood kernel (OSC-ALK-ONK) is proposed in this paper, in which the two strategies are combined to produce an optimal kernel. In particular, the neighborhood kernel improves the expressive capability of the optimal kernel and enlarges its search range, while local base kernels avoid redundancy among base kernels and promote their diversity. Accordingly, the quality of the optimal kernel is enhanced. Further, a soft block diagonal (BD) regularizer is utilized to encourage the indicator matrix to be block diagonal, which helps obtain explicit clustering results directly, achieves one-step clustering, and overcomes the disadvantage of the two-step strategy. Extensive experiments on eight data sets and comparisons with six clustering methods show that OSC-ALK-ONK is effective.
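As a loose illustration of the neighborhood-kernel idea (not the paper's construction, which is learned jointly with the clustering), the sketch below localizes a precomputed kernel by keeping each sample's similarities only to its k nearest neighbors in the kernel-induced distance and symmetrizing the support.

```python
# Hypothetical sketch of localizing a kernel to sample neighborhoods.
import numpy as np

def neighborhood_kernel(K, k=10):
    n = K.shape[0]
    d = np.diag(K)
    dist = d[:, None] + d[None, :] - 2 * K        # squared kernel distance
    mask = np.zeros_like(K, dtype=bool)
    for i in range(n):
        nn = np.argsort(dist[i])[:k + 1]          # self plus k nearest neighbors
        mask[i, nn] = True
    mask = mask | mask.T                          # symmetrize the support
    return np.where(mask, K, 0.0)                 # zero out non-neighbor entries
```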