99 research outputs found

Data-Dependent Stability of Stochastic Gradient Descent

Authors: Kuzborskij Ilja; Lampert Christoph H.
Publication date: 01/01/2018

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD, which depend on worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non-convex problems. In the convex case, we show that the bound on the generalization error depends on the risk at the initialization point. In the non-convex case, we prove that the expected curvature of the objective function around the initialization point has a crucial influence on the generalization error. In both cases, our results suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization. As a corollary, our results allow us to show optimistic generalization bounds that exhibit fast convergence rates for SGD subject to a vanishing empirical risk and low noise of stochastic gradient.
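
The pre-screening idea mentioned in the abstract can be illustrated with a rough sketch: from a few candidate starting points, pick the one with the lowest empirical risk, then run plain SGD from it. The one-dimensional least-squares model, the function names (`empirical_risk`, `prescreen_init`, `sgd`), the candidate list, and the step size are all assumptions made for illustration. The paper's actual screening criterion is not stated in this listing.

```python
import random


def empirical_risk(w, data):
    """Mean squared error of the toy linear model y ~ w * x (an assumed model)."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def prescreen_init(candidates, data):
    """Return the candidate initialization with the lowest empirical risk."""
    return min(candidates, key=lambda w: empirical_risk(w, data))


def sgd(w, data, lr=0.01, steps=200, seed=0):
    """Plain SGD on the squared loss, one randomly drawn example per step."""
    rng = random.Random(seed)
    for _ in range(steps):
        x, y = rng.choice(data)
        w -= lr * 2.0 * (w * x - y) * x  # gradient of (w*x - y)^2 w.r.t. w
    return w


if __name__ == "__main__":
    toy_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # hypothetical sample
    start = prescreen_init([-5.0, 0.0, 1.5, 10.0], toy_data)
    print("screened start:", start, "after SGD:", sgd(start, toy_data))
```

Screening on the training sample is itself an assumption here; a held-out sample would work the same way in this sketch.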

arXiv.org e-Print Archive

IST Austria: PubRep (Institute of Science and Technology)

A Multi-Plane Block-Coordinate Frank-Wolfe Algorithm for Training Structural SVMs with a Costly max-Oracle

Authors: Kolmogorov Vladimir; Lampert Christoph H.; Shah Neel
Publication date: 18/11/2014

Structural support vector machines (SSVMs) are amongst the best performing models for structured computer vision tasks, such as semantic image segmentation or human pose estimation. Training SSVMs, however, is computationally costly, because it requires repeated calls to a structured prediction subroutine (called max-oracle), which has to solve an optimization problem itself, e.g. a graph cut. In this work, we introduce a new algorithm for SSVM training that is more efficient than earlier techniques when the max-oracle is computationally expensive, as is frequently the case in computer vision tasks. The main idea is to (i) combine the recent stochastic Block-Coordinate Frank-Wolfe algorithm with efficient hyperplane caching, and (ii) use an automatic selection rule for deciding whether to call the exact max-oracle or to rely on an approximate one based on the cached hyperplanes. We show experimentally that this strategy leads to faster convergence to the optimum with respect to the number of required oracle calls, and that this translates into faster convergence with respect to the total runtime when the max-oracle is slow compared to the other steps of the algorithm. A publicly available C++ implementation is provided at http://pub.ist.ac.at/~vnk/papers/SVM.html

arXiv.org e-Print Archive

CiteSeerX

Crossref

IST Austria: PubRep (Institute of Science and Technology)

Probabilistic Image Colorization

Authors: Kolesnikov Alexander; Lampert Christoph H.; Royer Amelie
Publication date: 01/01/2017

We develop a probabilistic technique for colorizing grayscale natural images. In light of the intrinsic uncertainty of this task, the proposed probabilistic framework has numerous desirable properties. In particular, our model is able to produce multiple plausible and vivid colorizations for a given grayscale image and is one of the first colorization models to provide a proper stochastic sampling scheme. Moreover, our training procedure is supported by a rigorous theoretical framework that does not require any ad hoc heuristics and allows for efficient modeling and learning of the joint pixel color distribution. We demonstrate strong quantitative and qualitative experimental results on the CIFAR-10 dataset and the challenging ILSVRC 2012 dataset.

arXiv.org e-Print Archive

Crossref

IST Austria: PubRep (Institute of Science and Technology)

Boundary regularity of admissible operators

Author: Lampert Christoph H.
Publication venue: Universitat Autonoma de Barcelona
Publication date: 01/01/2005

In strictly pseudoconvex domains with smooth boundary, we prove a commutator relationship between admissible integral operators, as introduced by Lieb and Range, and smooth vector fields which are tangential at boundary points. This makes it possible to gain estimates for admissible operators in function spaces which involve tangential derivatives. Examples are given of the circumstances under which these can be transformed into genuine Sobolev and C^k estimates.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Revistes Catalanes amb Accés Obert

Diposit Digital de Documents de la UAB

The most persistent soft-clique in a set of sampled graphs

Authors: Quadrianto Novi; Chen Chao; Lampert Christoph H.
Publication venue: Omnipress
Publication date: 26/03/2005

When searching for characteristic subpatterns in potentially noisy graph data, it appears self-evident that having multiple observations would be better than having just one. However, it turns out that the inconsistencies introduced when different graph instances have different edge sets pose a serious challenge. In this work we address this challenge for the problem of finding maximum weighted cliques. We introduce the concept of the most persistent soft-clique. This is a subset of vertices that 1) is almost fully or at least densely connected, 2) occurs in all or almost all graph instances, and 3) has the maximum weight. We present a measure of clique-ness that essentially counts the number of edges missing to make a subset of vertices into a clique. With this measure, we show that the problem of finding the most persistent soft-clique can be cast either as: a) a max-min two-person game optimization problem, or b) a min-min soft margin optimization problem. Both formulations lead to the same solution when using a partial Lagrangian method to solve the optimization problems. By experiments on synthetic data and on real social network data, we show that the proposed method is able to reliably find soft cliques in graph data, even if it is distorted by random noise or unreliable observations.
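
The clique-ness measure described in the abstract counts the edges missing from a vertex subset. A minimal sketch of that count follows, assuming undirected graphs given as plain edge lists. The function names `missing_edges` and `missing_per_instance` are hypothetical, and the sketch covers only the counting step, not the game or soft margin formulations.

```python
from itertools import combinations


def missing_edges(vertices, edges):
    """Count the vertex pairs in `vertices` that are not joined by an edge.

    A result of 0 means the subset is a clique in this graph instance.
    """
    edge_set = {frozenset(edge) for edge in edges}  # undirected: order ignored
    return sum(
        1
        for u, v in combinations(sorted(vertices), 2)
        if frozenset((u, v)) not in edge_set
    )


def missing_per_instance(vertices, graphs):
    """Apply the count to every sampled graph instance (a list of edge lists)."""
    return [missing_edges(vertices, edges) for edges in graphs]


if __name__ == "__main__":
    # Two hypothetical observations of the same vertex set with different edges.
    graph_a = [(1, 2), (2, 3), (1, 3), (3, 4)]
    graph_b = [(1, 2), (2, 3)]
    print(missing_per_instance({1, 2, 3}, [graph_a, graph_b]))
```

A subset that scores low in most instances is the kind of candidate the abstract calls persistent; how the scores are weighted and combined is left to the paper.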

arXiv.org e-Print Archive

CiteSeerX

CERN Document Server

Sussex Research Online

University of St. Andrews - Pure

Learning to rank using privileged information

Authors: Lampert Christoph H.; Quadrianto Novi; Sharmanska Viktoriia
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

Many computer vision problems have an asymmetric distribution of information between training and test time. In this work, we study the case where we are given additional information about the training data which, however, will not be available at test time. This situation is called learning using privileged information (LUPI). We introduce two maximum-margin techniques that are able to make use of this additional source of information, and we show that the framework is applicable to several scenarios that have been studied in computer vision before. Experiments with attributes, bounding boxes, image tags and rationales as additional information in object classification show promising results.

CiteSeerX

Crossref

IST Austria: PubRep (Institute of Science and Technology)

Spiral - Imperial College Digital Repository

Sussex Research Online