IST Austria Thesis
An instance of the Constraint Satisfaction Problem (CSP) is given by a finite set of
variables, a finite domain of labels, and a set of constraints, each constraint acting on
a subset of the variables. The goal is to find an assignment of labels to its variables
that satisfies all constraints (or decide whether one exists). If we allow more general
“soft” constraints, which come with (possibly infinite) costs of particular assignments,
we obtain instances from a richer class called Valued Constraint Satisfaction Problem
(VCSP). There the goal is to find an assignment with minimum total cost.
In this thesis, we focus (assuming that P ≠ NP) on classifying the computational
complexity of CSPs and VCSPs under certain restricting conditions. Two results are the core
content of the work. In one of them, we consider VCSPs parametrized by a constraint
language, that is, the set of “soft” constraints allowed to form the instances, and finish
the complexity classification modulo the (missing pieces of the) complexity classification for
the analogously parametrized CSP. The other result is a generalization of Edmonds’ perfect
matching algorithm. This generalization contributes to complexity classifications in two
ways. First, it gives a new (largest known) polynomial-time solvable class of Boolean
CSPs in which every variable may appear in at most two constraints, and second, it
settles the full classification of Boolean CSPs with a planar drawing (again parametrized by a
constraint language).
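As a hedged illustration of the definitions above (not of the thesis's algorithms), the sketch below represents a CSP instance by its variables, a finite label domain, and constraints given as predicates over subsets of the variables, and looks for a satisfying assignment by brute force; the helper name solve_csp and the example variables are assumptions made for this illustration.

```python
import itertools

def solve_csp(variables, domain, constraints):
    """Brute-force search over all assignments (exponential; for illustration only).
    `constraints` is a list of (scope, predicate) pairs, where `scope` is a tuple
    of variables and `predicate` accepts the corresponding labels."""
    for labels in itertools.product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, labels))
        if all(pred(*(assignment[v] for v in scope)) for scope, pred in constraints):
            return assignment
    return None

# A CSP with domain {0, 1}: proper 2-colouring of a triangle (unsatisfiable)
# versus a path on three variables (satisfiable).
neq = lambda a, b: a != b
print(solve_csp("xyz", [0, 1], [(("x", "y"), neq), (("y", "z"), neq), (("x", "z"), neq)]))  # None
print(solve_csp("xyz", [0, 1], [(("x", "y"), neq), (("y", "z"), neq)]))  # e.g. {'x': 0, 'y': 1, 'z': 0}
```

A VCSP instance would additionally attach a (possibly infinite) cost to each assignment of a constraint's scope, with the goal of minimizing the total cost rather than satisfying every constraint.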
Total variation on a tree
We consider the problem of minimizing the continuous valued total variation
subject to different unary terms on trees and propose fast direct algorithms
based on dynamic programming to solve these problems. We treat both the convex
and the non-convex case and derive worst case complexities that are equal or
better than existing methods. We show applications to total variation based 2D
image processing and computer vision problems based on a Lagrangian
decomposition approach. The resulting algorithms are very efficient, offer a
high degree of parallelism, and have memory requirements that are
only on the order of the number of image pixels. Comment: accepted to SIAM Journal on Imaging Sciences (SIIMS).
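A minimal sketch of the idea, under the assumption that the signal values are restricted to a finite label set (the paper itself treats the continuous-valued problem exactly): on a chain, the simplest tree, total variation minimization reduces to a standard dynamic program over labels. The name tv_chain_dp and the discretization are illustrative choices, not the paper's algorithm.

```python
import numpy as np

def tv_chain_dp(y, lam, labels):
    """Dynamic program for min_x sum_i (x_i - y_i)^2 + lam * sum_i |x_{i+1} - x_i|
    on a chain, with every x_i restricted to the finite set `labels`."""
    y, labels = np.asarray(y, float), np.asarray(labels, float)
    n, k = len(y), len(labels)
    pair = lam * np.abs(labels[:, None] - labels[None, :])   # transition costs
    cost = (labels - y[0]) ** 2                               # best cost ending in each label
    back = np.zeros((n, k), dtype=int)
    for i in range(1, n):
        total = cost[:, None] + pair                          # [previous label, current label]
        back[i] = np.argmin(total, axis=0)
        cost = total[back[i], np.arange(k)] + (labels - y[i]) ** 2
    # Recover the optimal labelling by backtracking.
    idx = np.empty(n, dtype=int)
    idx[-1] = int(np.argmin(cost))
    for i in range(n - 1, 0, -1):
        idx[i - 1] = back[i, idx[i]]
    return labels[idx]

# Toy usage: a noisy step signal is smoothed into a piecewise-constant one.
print(tv_chain_dp([0.1, -0.2, 0.0, 1.1, 0.9, 1.0], lam=1.0, labels=np.linspace(-1, 2, 31)))
```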
L4: Practical loss-based stepsize adaptation for deep learning
We propose a stepsize adaptation scheme for stochastic gradient descent. It
operates directly with the loss function and rescales the gradient in order to
make fixed predicted progress on the loss. We demonstrate its capabilities by
conclusively improving the performance of Adam and Momentum optimizers. The
enhanced optimizers with default hyperparameters consistently outperform their
constant stepsize counterparts, even the best ones, without a measurable
increase in computational cost. The performance is validated on multiple
architectures including dense nets, CNNs, ResNets, and the recurrent
Differential Neural Computer on classical datasets MNIST, fashion MNIST,
CIFAR10 and others. Comment: NeurIPS, 2018.
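The stepsize rule described above can be sketched as follows (a hedged reconstruction from the abstract, not the authors' code): the base optimizer's update direction is rescaled so that the first-order predicted decrease equals a fixed fraction of the current loss. The names l4_step, direction_fn and the default alpha are illustrative assumptions.

```python
import numpy as np

def l4_step(params, loss_fn, grad_fn, direction_fn, alpha=0.15, eps=1e-12):
    """One parameter update with a loss-based stepsize: rescale the base
    direction v so the first-order predicted decrease eta * g.v equals a
    fraction alpha of the current loss value."""
    loss = loss_fn(params)
    g = grad_fn(params)
    v = direction_fn(params, g)               # e.g. the momentum or Adam direction
    eta = alpha * loss / (np.dot(g, v) + eps)
    return params - eta * v

# Toy usage on f(w) = 0.5 * ||w||^2, with the plain gradient as the base direction.
w = np.array([3.0, -4.0])
for _ in range(20):
    w = l4_step(w, lambda p: 0.5 * p @ p, lambda p: p, lambda p, g: g)
print(w)  # shrinks towards the minimizer at the origin
```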
Playful Math - An introduction to mathematical games
It is difficult to give a precise definition of the concept of a mathematical game. Instead, we list some features which apply to most games (in particular to those we introduce in the next section). Usually there are two players (or teams) playing against each other. Complete information: this roughly means that in every turn, each of the players can make a perfectly logical decision based on the history of the game. The game has no gambling part; the outcome does not depend on luck, but purely on the strategy of the players. The rules are simple to understand, and usually there are only a few of them. The ultimate goal is not winning, but understanding the structure of the game. If a game always comes to an end in finitely many turns, one of the players always has a winning strategy.
All the games we present are of this type.
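As an illustration of the last claim (chosen here for concreteness, not taken from the paper), backward induction classifies every position of a finite two-player game with no draws as winning or losing for the player to move. The subtraction game below is a standard example; the function name is made up.

```python
from functools import lru_cache

# In the subtraction game a player removes 1, 2 or 3 tokens from a pile, and the
# player who takes the last token wins.  Because the game is finite and drawless,
# the recursion labels every position as winning or losing for the player to move.

@lru_cache(maxsize=None)
def player_to_move_wins(tokens: int) -> bool:
    if tokens == 0:   # no move available: the previous player took the last token and won
        return False
    return any(not player_to_move_wins(tokens - k) for k in (1, 2, 3) if k <= tokens)

print([n for n in range(1, 13) if not player_to_move_wins(n)])  # losing positions: 4, 8, 12
```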
Efficient Optimization for Rank-based Loss Functions
The accuracy of information retrieval systems is often measured using complex
loss functions such as the average precision (AP) or the normalized discounted
cumulative gain (NDCG). Given a set of positive and negative samples, the
parameters of a retrieval system can be estimated by minimizing these loss
functions. However, the non-differentiability and non-decomposability of these
loss functions do not allow for simple gradient-based optimization
algorithms. This issue is generally circumvented by either optimizing a
structured hinge-loss upper bound to the loss function or by using asymptotic
methods like the direct-loss minimization framework. Yet, the high
computational complexity of loss-augmented inference, which is necessary for
both frameworks, prohibits its use in large training data sets. To
alleviate this deficiency, we present a novel quicksort flavored algorithm for
a large class of non-decomposable loss functions. We provide a complete
characterization of the loss functions that are amenable to our algorithm, and
show that it includes both AP and NDCG based loss functions. Furthermore, we
prove that no comparison based algorithm can improve upon the computational
complexity of our approach asymptotically. We demonstrate the effectiveness of
our approach in the context of optimizing the structured hinge loss upper bound
of AP and NDCG loss for learning models for a variety of vision tasks. We show
that our approach provides significantly better results than simpler
decomposable loss functions, while requiring a comparable training time. Comment: 15 pages, 2 figures.
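For reference, the average precision of a ranking can be computed as in the sketch below; this only illustrates the metric mentioned above, not the paper's quicksort-flavored loss-augmented inference, and the function name is an assumption.

```python
import numpy as np

def average_precision(scores, labels):
    """Average precision of the ranking induced by `scores` over binary `labels`."""
    order = np.argsort(-np.asarray(scores, float))       # sort samples by descending score
    rel = np.asarray(labels)[order]
    ranks = np.arange(1, len(rel) + 1)
    precision_at_hits = np.cumsum(rel)[rel == 1] / ranks[rel == 1]
    return float(precision_at_hits.mean()) if rel.any() else 0.0

# Toy usage: positives ranked 1st and 3rd give AP = (1/1 + 2/3) / 2.
print(average_precision([0.9, 0.7, 0.6, 0.2], [1, 0, 1, 0]))  # 0.8333...
```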
Superconcentrators of density 25.3
An N-superconcentrator is a directed acyclic graph with N input nodes and N output nodes such that every subset of the inputs and every subset of the outputs of the same cardinality can be connected by node-disjoint paths. It is known that linear-size and bounded-degree superconcentrators exist. We prove the existence of such superconcentrators with asymptotic density 25.3 (where the density is the number of edges divided by N). The previously best known densities were 28 [12] and 27.4136 [17].
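To make the node-disjoint-path condition concrete, the sketch below checks it by brute force (exponential in N, so only for tiny examples): each node is split into an in/out pair with unit capacity, and a max-flow computation decides whether a given input subset can reach a given output subset along vertex-disjoint paths. The helper name is_superconcentrator and the use of the networkx library are assumptions for this illustration.

```python
import itertools
import networkx as nx

def is_superconcentrator(dag, inputs, outputs):
    """Brute-force check of the superconcentrator property for a small DAG."""
    flow_net = nx.DiGraph()
    for v in dag.nodes:                                   # unit node capacities via splitting
        flow_net.add_edge((v, "in"), (v, "out"), capacity=1)
    for u, v in dag.edges:
        flow_net.add_edge((u, "out"), (v, "in"), capacity=1)
    for k in range(1, len(inputs) + 1):
        for S in itertools.combinations(inputs, k):
            for T in itertools.combinations(outputs, k):
                g = flow_net.copy()
                for u in S:
                    g.add_edge("s", (u, "in"), capacity=1)
                for v in T:
                    g.add_edge((v, "out"), "t", capacity=1)
                if nx.maximum_flow_value(g, "s", "t") < k:
                    return False                          # some subset pair cannot be linked
    return True

# Toy usage: the directed complete bipartite graph on {a, b} -> {c, d} is a 2-superconcentrator.
g = nx.DiGraph([("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")])
print(is_superconcentrator(g, inputs=["a", "b"], outputs=["c", "d"]))  # True
```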