222 research outputs found
Learning multi-view neighborhood preserving projections
We address the problem of metric learning for multi-view data, namely the construction of embedding projections from data in different representations into a shared feature space, such that the Euclidean distance in this space provides a meaningful within-view as well as between-view similarity. Our motivation stems from the problem of cross-media retrieval tasks, where the availability of a joint Euclidean distance function is a prerequisite to allow fast, in particular hashing-based, nearest neighbor queries. We formulate an objective function that expresses the intuitive concept that matching samples are mapped closely together in the output space, whereas non-matching samples are pushed apart, no matter in which view they are available. The resulting optimization problem is not convex, but it can be decomposed explicitly into a convex and a concave part, thereby allowing efficient optimization using the convex-concave procedure. Experiments on an image retrieval task show that nearest-neighbor based cross-view retrieval is indeed possible, and the proposed technique improves the retrieval accuracy over baseline techniques
Data-Dependent Stability of Stochastic Gradient Descent
We establish a data-dependent notion of algorithmic stability for Stochastic
Gradient Descent (SGD), and employ it to develop novel generalization bounds.
This is in contrast to previous distribution-free algorithmic stability results
for SGD which depend on the worst-case constants. By virtue of the
data-dependent argument, our bounds provide new insights into learning with SGD
on convex and non-convex problems. In the convex case, we show that the bound
on the generalization error depends on the risk at the initialization point. In
the non-convex case, we prove that the expected curvature of the objective
function around the initialization point has crucial influence on the
generalization error. In both cases, our results suggest a simple data-driven
strategy to stabilize SGD by pre-screening its initialization. As a corollary,
our results allow us to show optimistic generalization bounds that exhibit fast
convergence rates for SGD subject to a vanishing empirical risk and low noise
of stochastic gradient
A Multi-Plane Block-Coordinate Frank-Wolfe Algorithm for Training Structural SVMs with a Costly max-Oracle
Structural support vector machines (SSVMs) are amongst the best performing
models for structured computer vision tasks, such as semantic image
segmentation or human pose estimation. Training SSVMs, however, is
computationally costly, because it requires repeated calls to a structured
prediction subroutine (called \emph{max-oracle}), which has to solve an
optimization problem itself, e.g. a graph cut.
In this work, we introduce a new algorithm for SSVM training that is more
efficient than earlier techniques when the max-oracle is computationally
expensive, as it is frequently the case in computer vision tasks. The main idea
is to (i) combine the recent stochastic Block-Coordinate Frank-Wolfe algorithm
with efficient hyperplane caching, and (ii) use an automatic selection rule for
deciding whether to call the exact max-oracle or to rely on an approximate one
based on the cached hyperplanes.
We show experimentally that this strategy leads to faster convergence to the
optimum with respect to the number of requires oracle calls, and that this
translates into faster convergence with respect to the total runtime when the
max-oracle is slow compared to the other steps of the algorithm.
A publicly available C++ implementation is provided at
http://pub.ist.ac.at/~vnk/papers/SVM.html
Probabilistic Image Colorization
We develop a probabilistic technique for colorizing grayscale natural images.
In light of the intrinsic uncertainty of this task, the proposed probabilistic
framework has numerous desirable properties. In particular, our model is able
to produce multiple plausible and vivid colorizations for a given grayscale
image and is one of the first colorization models to provide a proper
stochastic sampling scheme. Moreover, our training procedure is supported by a
rigorous theoretical framework that does not require any ad hoc heuristics and
allows for efficient modeling and learning of the joint pixel color
distribution. We demonstrate strong quantitative and qualitative experimental
results on the CIFAR-10 dataset and the challenging ILSVRC 2012 dataset
Learning to rank using privileged information
Many computer vision problems have an asymmetric distribution of information between training and test time. In this work, we study the case where we are given additional information about the training data, which however will not be available at test time. This situation is called learning using privileged information (LUPI). We introduce two maximum-margin techniques that are able to make use of this additional source of information, and we show that the framework is applicable to several scenarios that have been studied in computer vision before. Experiments with attributes, bounding boxes, image tags and rationales as additional information in object classification show promising results
- …