Hierarchies of Relaxations for Online Prediction Problems with Evolving Constraints
We study online prediction where the regret of the algorithm is measured against
a benchmark defined via evolving constraints. This framework captures online
prediction on graphs, as well as other prediction problems with combinatorial
structure. A key aspect here is that finding the optimal benchmark predictor
(even in hindsight, given all the data) might be computationally hard due to
the combinatorial nature of the constraints. Despite this, we provide
polynomial-time \emph{prediction} algorithms that achieve low regret against
combinatorial benchmark sets. We do so by building improper learning algorithms
based on two ideas that work together. The first is to alleviate part of the
computational burden through random playout, and the second is to employ
Lasserre semidefinite hierarchies to approximate the resulting integer program.
Interestingly, for our prediction algorithms, we only need to compute the
values of the semidefinite programs and not the rounded solutions. However, the
integrality gap of the Lasserre hierarchy \emph{does} enter the generic regret
bound in terms of the Rademacher complexity of the benchmark set. This establishes
a trade-off between the computation time and the regret bound of the algorithm.
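For intuition only, here is a minimal Python sketch of the random-playout idea: at each round the learner hallucinates the unseen future outcomes, queries only the *value* of a convex relaxation (in the paper, a Lasserre SDP) under each candidate outcome, and randomizes its prediction according to the value gap. The `relaxation_value` stub and the toy surrogate inside it are hypothetical placeholders, not the paper's relaxation.

```python
import random

def relaxation_value(past_labels, playout_labels):
    """Hypothetical stand-in for the value of a convex (e.g. Lasserre SDP)
    relaxation of the benchmark's hindsight problem; the body is a toy
    surrogate so the control flow below can run."""
    labels = past_labels + playout_labels
    return -abs(sum(labels))  # toy value, NOT the paper's SDP

def predict_round(past_labels, rounds_remaining):
    # Random playout: draw the unseen future outcomes as fair coin flips.
    playout = [random.choice([-1, +1]) for _ in range(rounds_remaining - 1)]
    # Only the relaxation *values* are needed, one per candidate outcome;
    # no rounded (integral) solution is ever computed.
    v_plus = relaxation_value(past_labels + [+1], playout)
    v_minus = relaxation_value(past_labels + [-1], playout)
    # Convert the value gap into a randomized forecast in [0, 1].
    q = min(1.0, max(0.0, 0.5 + (v_minus - v_plus) / 2.0))
    return +1 if random.random() < q else -1

# Example: predict the 6th of 10 binary outcomes after observing five labels.
print(predict_round([+1, -1, +1, +1, -1], rounds_remaining=5))
```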
Functional Liftings of Vectorial Variational Problems with Laplacian Regularization
We propose a functional lifting-based convex relaxation of variational
problems with Laplacian-based second-order regularization. The approach rests
on ideas from the calibration method as well as from sublabel-accurate
continuous multilabeling approaches, and makes these approaches amenable to
variational problems with vectorial data and higher-order regularization, as is
common in image processing applications. We motivate the approach in the
function space setting and prove that, for absolute Laplacian regularization,
it encompasses the discretization-first sublabel-accurate continuous
multilabeling approach as a special case. We
present a mathematical connection between the lifted and original functional
and discuss possible interpretations of minimizers in the lifted function
space. Finally, as an example, we apply the proposed approach to 2D image
registration problems.
Comment: 12 pages, 3 figures; accepted at the conference "Scale Space and Variational Methods" in Hofgeismar, Germany 201
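For concreteness, a prototypical instance of the problem class the abstract refers to can be written as follows; the notation (data term ρ, weight λ, absolute Laplacian regularizer) is assumed for illustration and is not taken verbatim from the paper.

```latex
% Assumed prototypical form: vectorial variational problem with
% absolute Laplacian (second-order) regularization.
\min_{u \colon \Omega \to \mathbb{R}^n}
  \int_\Omega \rho\bigl(x, u(x)\bigr)\,\mathrm{d}x
  \;+\; \lambda \int_\Omega \bigl\lVert \Delta u(x) \bigr\rVert \,\mathrm{d}x
```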
A generalized risk approach to path inference based on hidden Markov models
Motivated by the unceasing interest in hidden Markov models (HMMs), this
paper re-examines hidden path inference in these models, using primarily a
risk-based framework. While the most common maximum a posteriori (MAP), or
Viterbi, path estimator and the minimum error, or Posterior Decoder (PD), have
long been around, other path estimators, or decoders, have been either only
hinted at or applied more recently and in dedicated applications generally
unfamiliar to the statistical learning community. Over a decade ago, however, a
family of algorithmically defined decoders aiming to hybridize the two standard
ones was proposed (Brushe et al., 1998). The present paper gives a careful
analysis of this hybridization approach, identifies several problems and issues
with it and other previously proposed approaches, and proposes practical
resolutions to them. Furthermore, simple modifications of the classical
criteria for hidden path recognition are shown to lead to a new class of
decoders. Dynamic programming algorithms to compute these decoders in the usual
forward-backward manner are presented. A particularly interesting subclass of
such estimators can also be viewed as hybrids of the MAP and PD estimators.
Similar to previously proposed MAP-PD hybrids, the new class is parameterized
by a small number of tunable parameters. Unlike their algorithmic predecessors,
the new risk-based decoders are more clearly interpretable, and, most
importantly, work "out of the box" in practice, which is demonstrated on some
real bioinformatics tasks and data. Some further generalizations and
applications are discussed in conclusion.
Comment: Section 5: corrected denominators of the scaled beta variables (pp. 27-30), with corresponding corrections in Claims 1 and 3, Prop. 12, and the bottom of Table 1. Decoder (49) and Corol. 14 are generalized to handle zero probabilities. Notation is more closely aligned with (Bishop, 2006). Details are inserted in equations (43); the positivity assumption in Prop. 11 is made explicit. Fixed typing errors in equation (41), Example
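As background for one of the two classical decoders the paper hybridizes, here is a minimal NumPy sketch of the Posterior Decoder (the pointwise maximum of the state marginals computed by forward-backward); it is standard textbook material, not the paper's generalized risk-based decoders.

```python
import numpy as np

def posterior_decode(pi, A, B, obs):
    """Pointwise-MAP ("Posterior") decoding via forward-backward.
    pi: (K,) initial state distribution
    A:  (K, K) transitions, A[i, j] = P(s_{t+1} = j | s_t = i)
    B:  (K, M) emissions,   B[i, o] = P(obs = o | state = i)
    obs: sequence of observation indices
    Returns, for each position t, the state maximizing P(s_t | obs)."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))
    # Forward pass, normalized at each step for numerical stability.
    alpha[0] = pi * B[:, obs[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        alpha[t] /= alpha[t].sum()
    # Backward pass (also normalized; posteriors are renormalized below).
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()
    # Per-position posteriors and their argmax.
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma.argmax(axis=1)
```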
Learning Differentiable Programs with Admissible Neural Heuristics
We study the problem of learning differentiable functions expressed as
programs in a domain-specific language. Such programmatic models can offer
benefits such as composability and interpretability; however, learning them
requires optimizing over a combinatorial space of program "architectures". We
frame this optimization problem as a search in a weighted graph whose paths
encode top-down derivations of program syntax. Our key innovation is to view
various classes of neural networks as continuous relaxations over the space of
programs, which can then be used to complete any partial program. This relaxed
program is differentiable and can be trained end-to-end, and the resulting
training loss is an approximately admissible heuristic that can guide the
combinatorial search. We instantiate our approach on top of the A-star
algorithm and an iteratively deepened branch-and-bound search, and use these
algorithms to learn programmatic classifiers in three sequence classification
tasks. Our experiments show that the algorithms outperform state-of-the-art
methods for program learning, and that they discover programmatic classifiers
that yield natural interpretations and achieve competitive accuracy.
Comment: 9 pages, published in NeurIPS 202
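A hedged sketch of the search loop described above: A* over partial programs, with the training loss of a neural relaxation serving as the heuristic for each open hole. All helper callables (`expand`, `path_cost`, `neural_relaxation_loss`, `is_complete`) are hypothetical stand-ins, not the authors' implementation.

```python
import heapq

def a_star_program_search(empty_program, expand, path_cost,
                          neural_relaxation_loss, is_complete):
    """expand(p): children of partial program p (one grammar rule applied)
    path_cost(p): structural cost g(p) accumulated so far
    neural_relaxation_loss(p): loss of a trained neural completion of p's
        holes, used as an (approximately) admissible heuristic h(p)
    is_complete(p): True if p has no remaining holes"""
    frontier = [(neural_relaxation_loss(empty_program), 0, empty_program)]
    counter = 1  # tie-breaker so heapq never compares programs directly
    while frontier:
        _, _, prog = heapq.heappop(frontier)
        if is_complete(prog):
            return prog  # complete program with the lowest f = g + h so far
        for child in expand(prog):
            f = path_cost(child) + neural_relaxation_loss(child)
            heapq.heappush(frontier, (f, counter, child))
            counter += 1
    return None  # grammar exhausted without finding a complete program
```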