Generalization Guarantees for a Binary Classification Framework for Two-Stage Multiple Kernel Learning
We present generalization bounds for the TS-MKL framework for two-stage
multiple kernel learning. We also present bounds for sparse kernel learning
formulations within the TS-MKL framework.
A Theoretical Analysis of Contrastive Unsupervised Representation Learning
Recent empirical works have successfully used unlabeled data to learn feature
representations that are broadly useful in downstream classification tasks.
Several of these methods are reminiscent of the well-known word2vec embedding
algorithm: leveraging availability of pairs of semantically "similar" data
points and "negative samples," the learner forces the inner product of
representations of similar pairs with each other to be higher on average than
with negative samples. The current paper uses the term contrastive learning for
such algorithms and presents a theoretical framework for analyzing them by
introducing latent classes and hypothesizing that semantically similar points
are sampled from the same latent class. This framework allows us to show
provable guarantees on the performance of the learned representations on the
average classification task composed of a subset of the same set of
latent classes. Our generalization bound also shows that learned
representations can reduce (labeled) sample complexity on downstream tasks. We
conduct controlled experiments in both the text and image domains to support
the theory.
Comment: 19 pages, 5 figures
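The word2vec-style objective the abstract describes can be sketched numerically: force the inner product of a point's representation with a similar point to exceed its inner product with a negative sample. This is a minimal illustration with a logistic surrogate; the function names and the choice of surrogate are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def contrastive_loss(f_x, f_pos, f_neg):
    """Logistic contrastive loss on representation triples.

    Encourages <f(x), f(x+)> to be higher than <f(x), f(x-)>,
    in the spirit of the word2vec-style objective described above.
    Each argument is a (batch, dim) array of representations.
    """
    # Margin between similar-pair and negative-sample inner products.
    margin = np.sum(f_x * f_pos, axis=1) - np.sum(f_x * f_neg, axis=1)
    # Logistic surrogate log(1 + exp(-margin)), averaged over the batch.
    return float(np.mean(np.log1p(np.exp(-margin))))

# Toy check: when positives are aligned with the anchor and negatives
# are anti-aligned, the loss is much smaller than with the roles swapped.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
loss_aligned = contrastive_loss(x, x, -x)
loss_swapped = contrastive_loss(x, -x, x)
print(loss_aligned < loss_swapped)  # True
```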
Deep Attributes Driven Multi-Camera Person Re-identification
The visual appearance of a person is easily affected by many factors like
pose variations, viewpoint changes and camera parameter differences. This makes
person Re-Identification (ReID) among multiple cameras a very challenging task.
This work is motivated to learn mid-level human attributes which are robust to
such visual appearance variations. We therefore propose a semi-supervised
attribute learning framework that progressively boosts attribute accuracy
using only a limited amount of labeled data. Specifically, the framework
involves three-stage training. A deep Convolutional Neural Network (dCNN) is first
trained on an independent dataset labeled with attributes. Then it is
fine-tuned on another dataset only labeled with person IDs using our defined
triplet loss. Finally, the updated dCNN predicts attribute labels for the
target dataset, which is combined with the independent dataset for the final
round of fine-tuning. The predicted attributes, namely \emph{deep attributes}
exhibit superior generalization ability across different datasets. By directly
using the deep attributes with simple Cosine distance, we have obtained
surprisingly good accuracy on four person ReID datasets. Experiments also show
that a simple metric learning module further boosts our method, making it
significantly outperform many recent works.
Comment: Person Re-identification; 17 pages; 5 figures; In IEEE ECCV 201
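The abstract mentions fine-tuning with a triplet loss defined by the authors but gives no formula. As a hedged sketch, a standard hinge-style triplet loss (the margin value and function names here are illustrative assumptions) looks like this:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style triplet loss over batches of embeddings.

    Pulls same-ID pairs (anchor, positive) together and pushes
    different-ID pairs (anchor, negative) at least `margin` further apart.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # distance to same person
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # distance to other person
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

# If anchors coincide with positives and sit far from negatives,
# the hinge is inactive and the loss is zero; reversing the roles is penalized.
a = np.zeros((3, 4))
p = np.zeros((3, 4))
n = np.ones((3, 4))
print(triplet_loss(a, p, n))      # 0.0
print(triplet_loss(a, n, p) > 0)  # True
```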
Supervised Learning with Similarity Functions
We address the problem of general supervised learning when data can only be
accessed through an (indefinite) similarity function between data points.
Existing work on learning with indefinite kernels has concentrated solely on
binary/multi-class classification problems. We propose a model that is generic
enough to handle any supervised learning task and also subsumes the model
previously proposed for classification. We give a "goodness" criterion for
similarity functions w.r.t. a given supervised learning task and then adapt a
well-known landmarking technique to provide efficient algorithms for supervised
learning using "good" similarity functions. We demonstrate the effectiveness of
our model on three important supervised learning problems: a) real-valued
regression, b) ordinal regression and c) ranking where we show that our method
guarantees bounded generalization error. Furthermore, for the case of
real-valued regression, we give a natural goodness definition that, when used
in conjunction with a recent result in sparse vector recovery, guarantees a
sparse predictor with bounded generalization error. Finally, we report results
of our learning algorithms on regression and ordinal regression tasks using
non-PSD similarity functions and demonstrate the effectiveness of our
algorithms, especially that of the sparse landmark selection algorithm that
achieves significantly higher accuracies than the baseline methods while
offering reduced computational costs.
Comment: To appear in the proceedings of NIPS 2012, 30 pages