211 research outputs found
Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning
Latent structure models are a powerful tool for modeling language data: they
can mitigate the error propagation and annotation bottleneck in pipeline
systems, while simultaneously uncovering linguistic insights about the data.
One challenge with end-to-end training of these models is the argmax operation,
which has null gradient. In this paper, we focus on surrogate gradients, a
popular strategy to deal with this problem. We explore latent structure
learning through the angle of pulling back the downstream learning objective.
In this paradigm, we discover a principled motivation for both the
straight-through estimator (STE) as well as the recently-proposed SPIGOT - a
variant of STE for structured models. Our perspective leads to new algorithms
in the same family. We empirically compare the known and the novel pulled-back
estimators against the popular alternatives, yielding new insight for
practitioners and revealing intriguing failure cases.
Comment: EMNLP 202
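As a rough illustration of the null-gradient problem and the straight-through idea mentioned in the abstract above, the following sketch applies a plain STE to a one-hot argmax. It is not SPIGOT and not the authors' code; the squared-error downstream loss, the function names, and the learning rate are all assumptions chosen for the example.

```python
import numpy as np

def one_hot_argmax(scores):
    """Forward pass: discrete choice, whose true gradient w.r.t. scores is zero almost everywhere."""
    z = np.zeros_like(scores)
    z[np.argmax(scores)] = 1.0
    return z

def ste_gradient(scores, target):
    """Straight-through surrogate gradient (illustrative).

    Assumed downstream loss: L(z) = 0.5 * ||z - target||^2, so dL/dz = z - target.
    STE pretends dz/dscores is the identity and passes dL/dz straight to the scores.
    """
    z = one_hot_argmax(scores)
    return z - target

def run_ste_descent(scores, target, lr=0.5, steps=5):
    """Plain gradient descent on the scores using the surrogate gradient."""
    scores = np.array(scores, dtype=float)  # copy so the caller's array is untouched
    for _ in range(steps):
        scores -= lr * ste_gradient(scores, target)
    return scores

target = np.array([1.0, 0.0, 0.0])
final_scores = run_ste_descent([0.2, 1.5, -0.3], target)
print(one_hot_argmax(final_scores))
```

In this toy setting the surrogate gradient lowers the score of the wrongly selected index and raises the score of the target index until the argmax flips, after which the gradient vanishes.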
Nonparametric ridge estimation
We study the problem of estimating the ridges of a density function. Ridge
estimation is an extension of mode finding and is useful for understanding the
structure of a density. It can also be used to find hidden structure in point
cloud data. We show that, under mild regularity conditions, the ridges of the
kernel density estimator consistently estimate the ridges of the true density.
When the data are noisy measurements of a manifold, we show that the ridges are
close and topologically similar to the hidden manifold. To find the estimated
ridges in practice, we adapt the modified mean-shift algorithm proposed by
Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical
experiments verify that the algorithm is accurate.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1218 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
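Since the abstract above describes ridge estimation as an extension of mode finding, a minimal sketch of the simpler base case may help: ordinary mean-shift iteration toward a mode of a Gaussian kernel density estimate. This is not the modified (ridge-seeking) mean-shift of Ozertem and Erdogmus; the function names, bandwidth, and toy data are assumptions.

```python
import numpy as np

def mean_shift_step(x, data, bandwidth):
    """One mean-shift update: Gaussian-weighted average of the data around x."""
    diff = data - x
    weights = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / bandwidth ** 2)
    return weights @ data / weights.sum()

def find_mode(x0, data, bandwidth, tol=1e-10, max_iter=500):
    """Iterate mean-shift from x0 until the update is smaller than tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = mean_shift_step(x, data, bandwidth)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Toy point cloud, symmetric about the origin (hypothetical data).
data = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0], [0.0, 0.0]])
mode = find_mode([0.3, 0.2], data, bandwidth=2.0)
print(mode)
```

A ridge-seeking variant would, roughly speaking, restrict each update to a subspace determined by the local Hessian of the density estimate; that step is omitted here.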
Large-Scale Multi-Agent Transport: Theory, Algorithms and Analysis
The problem of transport of multi-agent systems has received much attention in a wide range of engineering and biological contexts, such as spatial coverage optimization, collective migration, estimation and mapping of unknown environments. In particular, the emphasis has been on the search for scalable decentralized algorithms that are applicable to large-scale multi-agent systems.
For large multi-agent collectives, it is appropriate to describe the configuration of the collective and its evolution using macroscopic quantities, while actuation rests at the microscopic scale at the level of individual agents. Moreover, the control problem faces a multitude of information constraints imposed by the multi-agent setting, such as limitations in sensing, communication and localization. Viewed in this way, the problem naturally extends across scales and this motivates a search for algorithms that respect information constraints at the microscopic level while guaranteeing performance at the macroscopic level.
We address the above concerns in this dissertation on three fronts: theory, algorithms and analysis. We begin with the development of a multiscale theory of gradient descent-based multi-agent transport that bridges the microscopic and macroscopic perspectives and sets out a general framework for the design and analysis of decentralized algorithms for transport. We then consider the problem of optimal transport of multi-agent systems, wherein the objective is the minimization of the net cost of transport under constraints of distributed computation. This is followed by a treatment of multi-agent transport under constraints on sensing and communication, in the absence of location information, where we study the problem of self-organization in swarms of agents. Motivated by the problem of multi-agent navigation and tracking of moving targets, we then present a study of moving-horizon estimation of nonlinear systems viewed as a transport of probability measures. Finally, we investigate the robustness of multi-agent networks to agent failure, via the problem of identifying critical nodes in large-scale networks.
Self-Consistent Velocity Matching of Probability Flows
We present a discretization-free scalable framework for solving a large class
of mass-conserving partial differential equations (PDEs), including the
time-dependent Fokker-Planck equation and the Wasserstein gradient flow. The
main observation is that the time-varying velocity field of the PDE solution
needs to be self-consistent: it must satisfy a fixed-point equation involving
the probability flow characterized by the same velocity field. Instead of
directly minimizing the residual of the fixed-point equation with neural
parameterization, we use an iterative formulation with a biased gradient
estimator that bypasses significant computational obstacles with strong
empirical performance. Compared to existing approaches, our method does not
suffer from temporal or spatial discretization, covers a wider range of PDEs,
and scales to high dimensions. Experimentally, our method recovers analytical
solutions accurately when they are available and achieves superior performance
in high dimensions with less training time compared to alternatives.
On topological data analysis for structural dynamics: an introduction to persistent homology
Topological methods can provide a way of proposing new metrics and methods of
scrutinising data that may otherwise be overlooked. In this work, a method of
quantifying the shape of data, via a topic called topological data analysis,
will be introduced. The main tool within topological data analysis (TDA) is
persistent homology. Persistent homology is a method of quantifying the shape
of data over a range of length scales. The required background and a method of
computing persistent homology are briefly discussed in this work. Ideas from
topological data analysis are then used for nonlinear dynamics to analyse some
common attractors, by calculating their embedding dimension, and then to assess
their general topologies. A method is also proposed that uses topological
data analysis to determine the optimal delay for a time-delay embedding. TDA
will also be applied to a Z24 Bridge case study in structural health
monitoring, where it will be used to scrutinise different data partitions,
classified by the conditions at which the data were collected. A metric, from
topological data analysis, is used to compare data between the partitions. The
results presented demonstrate that the presence of damage alters the manifold
shape more significantly than the effects of temperature.
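The time-delay embedding mentioned in the abstract above can be sketched as follows. This shows only the standard construction of delay vectors from a scalar series; it does not implement the paper's TDA-based choice of the optimal delay, and the function name and parameters are assumptions.

```python
import numpy as np

def delay_embed(series, dim, delay):
    """Build delay vectors [x(t), x(t + delay), ..., x(t + (dim - 1) * delay)].

    Returns an array of shape (n_vectors, dim), one delay vector per row.
    """
    series = np.asarray(series, dtype=float)
    n_vectors = len(series) - (dim - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and delay")
    columns = [series[i * delay : i * delay + n_vectors] for i in range(dim)]
    return np.column_stack(columns)

# A sampled sine wave embedded in two dimensions (hypothetical signal).
t = np.linspace(0.0, 4.0 * np.pi, 200)
cloud = delay_embed(np.sin(t), dim=2, delay=25)
print(cloud.shape)
```

The resulting point cloud is the kind of object on which persistent homology could then be computed for different candidate delays.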
Courbure discrète : théorie et applications
The present volume contains the proceedings of the 2013 Meeting on discrete curvature, held at CIRM, Luminy, France. The aim of this meeting was to bring together researchers from various backgrounds, ranging from mathematics to computer science, with a focus on both theory and applications. With 27 invited talks and 8 posters, the conference attracted 70 researchers from all over the world. The challenge of finding a common ground on the topic of discrete curvature was met with success, and these proceedings are a testimony of this wor