Rethinking Initialization of the Sinkhorn Algorithm
While the optimal transport (OT) problem was originally formulated as a
linear program, the addition of entropic regularization has proven beneficial,
both computationally and statistically, for many applications. The Sinkhorn
fixed-point algorithm is the most popular approach to solve this regularized
problem, and, as a result, multiple attempts have been made to reduce its
runtime using, e.g., annealing in the regularization parameter, momentum or
acceleration. The premise of this work is that initialization of the Sinkhorn
algorithm has received comparatively little attention, possibly due to two
preconceptions: first, because the regularized OT problem is convex, it may not
seem worth crafting a good initialization, since any is guaranteed to work;
second, because the outputs of the Sinkhorn algorithm are often unrolled in
end-to-end pipelines, a data-dependent initialization would bias Jacobian
computations. We
challenge this conventional wisdom, and show that data-dependent initializers
result in dramatic speed-ups, with no effect on differentiability as long as
implicit differentiation is used. Our initializations rely on closed-form
expressions for exact or approximate OT solutions that are known in the 1D,
Gaussian or GMM settings. They can be used with minimal tuning, and result in
consistent speed-ups for a wide variety of OT problems.
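To make the role of initialization concrete, here is a minimal pure-Python sketch of a log-domain Sinkhorn loop that accepts an optional starting dual potential. It is not the initializer proposed in the paper: the names `sinkhorn`, `transport_plan` and `f_init` are hypothetical, and the "good" initial potential is simply taken from an earlier solve, standing in for a data-dependent initializer.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) for a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def transport_plan(f, g, cost, eps):
    # Entropic plan P_ij = exp((f_i + g_j - C_ij) / eps).
    return [[math.exp((f[i] + g[j] - cost[i][j]) / eps) for j in range(len(g))]
            for i in range(len(f))]

def sinkhorn(a, b, cost, eps=1.0, f_init=None, tol=1e-6, max_iter=10_000):
    # Alternating dual updates; f_init is the (assumed) initial potential.
    n, m = len(a), len(b)
    f = list(f_init) if f_init is not None else [0.0] * n
    g = [0.0] * m
    for it in range(1, max_iter + 1):
        # g update: makes the column marginals of the plan equal to b.
        g = [eps * (math.log(b[j])
                    - logsumexp([(f[i] - cost[i][j]) / eps for i in range(n)]))
             for j in range(m)]
        # f update: makes the row marginals of the plan equal to a.
        f = [eps * (math.log(a[i])
                    - logsumexp([(g[j] - cost[i][j]) / eps for j in range(m)]))
             for i in range(n)]
        # Stop once the column marginals are within tol (L1 error).
        plan = transport_plan(f, g, cost, eps)
        err = sum(abs(sum(plan[i][j] for i in range(n)) - b[j]) for j in range(m))
        if err < tol:
            break
    return f, g, it

# Toy 1D problem with squared-distance cost and uniform weights.
x = [0.0, 1.0, 2.0]
y = [0.2, 0.9, 2.3]
cost = [[(xi - yj) ** 2 for yj in y] for xi in x]
a = [1.0 / 3] * 3
b = [1.0 / 3] * 3

f_cold, g_cold, cold_iters = sinkhorn(a, b, cost)           # zero initialization
_, _, warm_iters = sinkhorn(a, b, cost, f_init=f_cold)      # informed initialization
```

Comparing `cold_iters` with `warm_iters` shows how much an informed starting potential can shorten the loop. The fixed point is the same in both cases, which is consistent with the abstract's remark that the initializer need not affect the solution being differentiated.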
Self-Ordering Point Clouds
In this paper we address the task of finding representative subsets of points
in a 3D point cloud by means of a point-wise ordering. Only a few works have
tried to address this challenging vision problem, all with the help of
hard-to-obtain point and cloud labels. Different from these works, we introduce the
task of point-wise ordering in 3D point clouds through self-supervision, which
we call self-ordering. We further contribute the first end-to-end trainable
network that learns a point-wise ordering in a self-supervised fashion. It
utilizes a novel differentiable point scoring-sorting strategy and
constructs a hierarchical contrastive scheme to obtain self-supervision
signals. We extensively ablate the method and show its scalability and superior
performance, even compared to supervised ordering methods, on multiple datasets
and tasks, including zero-shot ordering of point clouds from unseen categories.
Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
The top-k operator returns a sparse vector, where the non-zero values
correspond to the k largest values of the input. Unfortunately, because it is a
discontinuous function, it is difficult to incorporate in neural networks
trained end-to-end with backpropagation. Recent works have considered
differentiable relaxations, based either on regularization or perturbation
techniques. However, to date, no approach is fully differentiable and sparse.
In this paper, we propose new differentiable and sparse top-k operators. We
view the top-k operator as a linear program over the permutahedron, the convex
hull of permutations. We then introduce a p-norm regularization term to smooth
out the operator, and show that its computation can be reduced to isotonic
optimization. Our framework is significantly more general than the existing one
and makes it possible, for example, to express top-k operators that select
values in magnitude. On the algorithmic side, in addition to pool adjacent violator (PAV)
algorithms, we propose a new GPU/TPU-friendly Dykstra algorithm to solve
isotonic optimization problems. We successfully use our operators to prune
weights in neural networks, to fine-tune vision transformers, and as a router
in sparse mixture of experts.
Comment: ICML 2023 18 page
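As a point of reference for the abstract above, the following pure-Python sketch shows the hard (non-differentiable) top-k operator, a variant that selects by magnitude, and a textbook pool adjacent violators routine for isotonic regression. These are the standard building blocks the abstract mentions, not the relaxed operators or the Dykstra algorithm proposed in the paper. All function names are hypothetical, and the PAV routine assumes a non-decreasing constraint with a squared-error objective.

```python
def hard_top_k(x, k):
    # Keep the k largest values of x, zero out everything else.
    keep = set(sorted(range(len(x)), key=lambda i: x[i], reverse=True)[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]

def hard_top_k_magnitude(x, k):
    # Keep the k values of x that are largest in absolute value.
    keep = set(sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]

def pav(y):
    # Isotonic regression: the non-decreasing sequence closest to y
    # in squared error, computed by pooling adjacent violators.
    blocks = []  # each block is [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out

scores = [0.5, -3.0, 2.0, 1.0]
top2 = hard_top_k(scores, 2)                  # [0.0, 0.0, 2.0, 1.0]
top2_mag = hard_top_k_magnitude(scores, 2)    # [0.0, -3.0, 2.0, 0.0]
fitted = pav([1.0, 3.0, 2.0, 4.0])            # [1.0, 2.5, 2.5, 4.0]
```

The output of `hard_top_k` changes abruptly when two input values swap rank, which is the discontinuity that motivates smoothed relaxations.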
About latent roles in forecasting players in team sports
Forecasting players in sports has grown in popularity due to the potential
for a tactical advantage and the applicability of such research to multi-agent
interaction systems. Team sports contain a significant social component that
influences interactions between teammates and opponents. However, this
component has yet to be fully exploited. In this work, we hypothesize that each participant
has a specific function in each action and that role-based interaction is
critical for predicting players' future moves. We create RolFor, a novel
end-to-end model for Role-based Forecasting. RolFor uses a new module we
developed called Ordering Neural Networks (OrderNN) to permute the order of the
players such that each player is assigned to a latent role. The latent role is
then modeled with a RoleGCN. Thanks to its graph representation, it provides a
fully learnable adjacency matrix that captures the relationships between roles
and is subsequently used to forecast the players' future trajectories.
Extensive experiments on a challenging NBA basketball dataset back up the
importance of roles and justify our goal of modeling them using optimizable
models. When an oracle provides roles, the proposed RolFor compares favorably
to the current state-of-the-art (it ranks first in terms of ADE and second in
terms of FDE errors). However, training RolFor end-to-end runs into the
differentiability issues of permutation methods, which we review experimentally.
Finally, this work reaffirms that differentiable ranking is a difficult open
problem with great potential when combined with graph-based interaction models.
Project is available at: https://www.pinlab.org/aboutlatentroles
Comment: AI4ABM@ICLR2023 Worksho
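The ADE and FDE metrics cited above are commonly defined as the average and final displacement errors between a predicted and a ground-truth trajectory. The sketch below computes both for a single player under that common definition. The function name and the toy trajectories are hypothetical, and the paper may aggregate over players and samples differently.

```python
import math

def displacement_errors(pred, truth):
    # pred, truth: lists of (x, y) positions, one per future timestep.
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)   # mean Euclidean error over all timesteps
    fde = dists[-1]                 # Euclidean error at the last timestep
    return ade, fde

pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)   # (1.0, 2.0)
```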