Reparameterizing the Birkhoff Polytope for Variational Permutation Inference
Many matching, tracking, sorting, and ranking problems require probabilistic
reasoning about possible permutations, a set that grows factorially with
dimension. Combinatorial optimization algorithms may enable efficient point
estimation, but fully Bayesian inference poses a severe challenge in this
high-dimensional, discrete space. To surmount this challenge, we start with the
usual step of relaxing a discrete set (here, of permutation matrices) to its
convex hull, which here is the Birkhoff polytope: the set of all
doubly-stochastic matrices. We then introduce two novel transformations: first,
an invertible and differentiable stick-breaking procedure that maps
unconstrained space to the Birkhoff polytope; second, a map that rounds points
toward the vertices of the polytope. Both transformations include a temperature
parameter that, in the limit, concentrates the densities on permutation
matrices. We then exploit these transformations and reparameterization
gradients to introduce variational inference over permutation matrices, and we
demonstrate its utility in a series of experiments.
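The relaxation described in this abstract can be illustrated with a minimal sketch. It is not the paper's stick-breaking or rounding transformation; it only shows that a convex combination of permutation matrices lands in the Birkhoff polytope, i.e. is doubly stochastic. The function names (`permutation_matrix`, `random_birkhoff_point`, `is_doubly_stochastic`) are hypothetical and chosen for illustration.

```python
import itertools
import random


def permutation_matrix(perm):
    """Return the n x n 0/1 matrix with a 1 at (i, perm[i]) for each row i."""
    n = len(perm)
    return [[1.0 if perm[i] == j else 0.0 for j in range(n)] for i in range(n)]


def random_birkhoff_point(n, seed=0):
    """Illustrative only: mix all n! permutation matrices with random convex weights."""
    rng = random.Random(seed)
    perms = list(itertools.permutations(range(n)))
    weights = [rng.random() for _ in perms]
    total = sum(weights)
    weights = [w / total for w in weights]  # weights are nonnegative and sum to 1
    X = [[0.0] * n for _ in range(n)]
    for w, perm in zip(weights, perms):
        for i, j in enumerate(perm):
            X[i][j] += w
    return X


def is_doubly_stochastic(X, tol=1e-9):
    """Check nonnegativity and that every row and every column sums to 1."""
    n = len(X)
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in X)
    cols_ok = all(abs(sum(X[i][j] for i in range(n)) - 1.0) <= tol for j in range(n))
    nonneg = all(x >= -tol for row in X for x in row)
    return rows_ok and cols_ok and nonneg


X = random_birkhoff_point(3)
print(is_doubly_stochastic(X))
```

Enumerating all n! permutations is only feasible for very small n, which is the factorial growth the abstract points to as the obstacle to exact inference.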
Probabilistic Models over Ordered Partitions with Application in Learning to Rank
This paper addresses the general problem of modelling and learning rank data
with ties. We propose a probabilistic generative model that models the process
as permutations over partitions. This results in a super-exponential
combinatorial state space with unknown numbers of partitions and unknown
ordering among them. We approach the problem from the discrete choice theory,
where subsets are chosen in a stagewise manner, significantly reducing the
state space at each stage. Further, we show that, with a suitable parameterisation,
we can still learn the models in linear time. We evaluate the proposed models
on the problem of learning to rank with the data from the recently held Yahoo!
challenge, and demonstrate that the models are competitive against well-known
rivals.
Comment: 19 pages, 2 figures
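To give a sense of the state-space size mentioned in this abstract, the sketch below counts rankings with ties (ordered partitions) of n items using the standard recurrence for the ordered Bell (Fubini) numbers: choose which k items form the first tied group, then order the rest. This is a counting illustration only, not the paper's model; the function name `ordered_partition_count` is hypothetical.

```python
from math import comb, factorial


def ordered_partition_count(n):
    """Number of ways to rank n items when ties are allowed (ordered Bell number)."""
    counts = [1]  # one (empty) ranking of zero items
    for m in range(1, n + 1):
        # k items share the top position; the remaining m - k are ranked recursively.
        counts.append(sum(comb(m, k) * counts[m - k] for k in range(1, m + 1)))
    return counts[n]


for n in range(1, 6):
    # Compare against n!, the number of rankings without ties.
    print(n, factorial(n), ordered_partition_count(n))
```

Already at n = 5 there are 541 rankings with ties against 120 strict permutations, which is why a stagewise choice of subsets is attractive.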
Algorithms for Approximate Minimization of the Difference Between Submodular Functions, with Applications
We extend the work of Narasimhan and Bilmes [30] for minimizing set functions
representable as a difference between submodular functions. Similar to [30],
our new algorithms are guaranteed to monotonically reduce the objective
function at every step. We empirically and theoretically show that the
per-iteration cost of our algorithms is much lower than that of [30], and our algorithms
can be used to efficiently minimize a difference between submodular functions
under various combinatorial constraints, a problem not previously addressed. We
provide computational bounds and a hardness result on the multiplicative
inapproximability of minimizing the difference between submodular functions. We
show, however, that it is possible to give worst-case additive bounds by
providing a polynomial-time computable lower bound on the minima. Finally, we
show how a number of machine learning problems can be modeled as minimizing the
difference between submodular functions. We experimentally show the validity of
our algorithms by testing them on the problem of feature selection with
submodular cost features.
Comment: 17 pages, 8 figures. A shorter version of this appeared in Proc.
Uncertainty in Artificial Intelligence (UAI), Catalina Islands, 201
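The objective class in this abstract can be made concrete with a small brute-force sketch. It does not implement the paper's algorithms; it only checks the defining inequality f(A) + f(B) >= f(A ∪ B) + f(A ∩ B) on a toy ground set and shows that a difference of two submodular functions need not be submodular. The functions `f`, `g`, `v` and the data in `COVER` are hypothetical examples.

```python
from itertools import combinations

GROUND = (0, 1, 2)
# Hypothetical coverage data: each element covers a small set of labels.
COVER = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}}


def f(S):
    """Coverage function (submodular): number of labels covered by S."""
    covered = set()
    for i in S:
        covered |= COVER[i]
    return len(covered)


def g(S):
    """Concave function of cardinality (submodular): flat cost once S is nonempty."""
    return 2 * min(len(S), 1)


def v(S):
    """Difference of two submodular functions."""
    return f(S) - g(S)


def subsets(ground):
    """All subsets of the ground set as frozensets."""
    return [frozenset(c) for r in range(len(ground) + 1) for c in combinations(ground, r)]


def is_submodular(func, ground):
    """Brute-force check of func(A) + func(B) >= func(A | B) + func(A & B)."""
    all_sets = subsets(ground)
    return all(
        func(A) + func(B) >= func(A | B) + func(A & B)
        for A in all_sets
        for B in all_sets
    )


print(is_submodular(f, GROUND), is_submodular(g, GROUND), is_submodular(v, GROUND))
```

Here `f` and `g` each pass the check while `v` fails it (take A = {0} and B = {2}), so methods for plain submodular minimization do not apply directly to `v`.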