Geometrically Enriched Latent Spaces
A common assumption in generative models is that the generator immerses the
latent space into a Euclidean ambient space. Instead, we consider the ambient
space to be a Riemannian manifold, which allows for encoding domain knowledge
through the associated Riemannian metric. Shortest paths can then be defined
accordingly in the latent space to both follow the learned manifold and respect
the ambient geometry. Through careful design of the ambient metric we can
ensure that shortest paths are well-behaved even for deterministic generators
that otherwise would exhibit a misleading bias. Experimentally, we show that our
approach improves the interpretability of learned representations using both
stochastic and deterministic generators.
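As an illustration of how an ambient metric changes what "short" means for a latent curve, the following is a minimal sketch and not the authors' implementation. The toy generator, the height-penalizing ambient metric, and all function names are assumptions introduced here. The sketch measures a discretized latent curve by mapping it through the generator and summing segment lengths under the chosen ambient metric.

```python
import numpy as np

def generator(z):
    # Hypothetical toy generator: lifts 2-D latent points onto a paraboloid in R^3.
    z = np.atleast_2d(z)
    return np.column_stack([z[:, 0], z[:, 1], z[:, 0] ** 2 + z[:, 1] ** 2])

def euclidean_metric(x):
    # The usual assumption: the ambient space is Euclidean.
    return np.eye(x.shape[0])

def height_penalizing_metric(x):
    # Hypothetical ambient Riemannian metric: moving at larger height
    # (third coordinate) costs more. This stands in for domain knowledge.
    return (1.0 + x[2]) * np.eye(x.shape[0])

def curve_length(latent_curve, ambient_metric):
    # Map the latent curve to the ambient space, then sum the metric
    # length of each segment, evaluating the metric at the segment midpoint.
    points = generator(latent_curve)
    length = 0.0
    for a, b in zip(points[:-1], points[1:]):
        d = b - a
        M = ambient_metric(0.5 * (a + b))
        length += np.sqrt(d @ M @ d)
    return length

# A straight latent segment from (0, 0) to (1, 0).
t = np.linspace(0.0, 1.0, 200)
straight_line = np.column_stack([t, np.zeros_like(t)])

euclidean_length = curve_length(straight_line, euclidean_metric)
penalized_length = curve_length(straight_line, height_penalizing_metric)
print(euclidean_length, penalized_length)
```

Minimizing such a length (or the corresponding energy) over latent curves with fixed endpoints would give a discrete shortest path. The sketch only evaluates one fixed curve under two ambient metrics.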
A locally adaptive normal distribution
The multivariate normal density is a monotonic function of the distance to
the mean, and its ellipsoidal shape is due to the underlying Euclidean metric.
We suggest replacing this metric with a locally adaptive, smoothly changing
(Riemannian) metric that favors regions of high local density. The resulting
locally adaptive normal distribution (LAND) is a generalization of the normal
distribution to the "manifold" setting, where data is assumed to lie near a
potentially low-dimensional manifold embedded in $\mathbb{R}^D$. The LAND is
parametric, depending only on a mean and a covariance, and is the maximum
entropy distribution under the given metric. The underlying metric is, however,
non-parametric. We develop a maximum likelihood algorithm to infer the
distribution parameters that relies on a combination of gradient descent and
Monte Carlo integration. We further extend the LAND to mixture models, and
provide the corresponding EM algorithm. We demonstrate the efficiency of the
LAND in fitting non-trivial probability distributions over both synthetic data
and EEG measurements of human sleep.
Fast and Robust Shortest Paths on Manifolds Learned from Data
We propose a fast, simple and robust algorithm for computing shortest paths
and distances on Riemannian manifolds learned from data. This amounts to
solving a system of ordinary differential equations (ODEs) subject to boundary
conditions. Here standard solvers perform poorly because they require
well-behaved Jacobians of the ODE, and usually, manifolds learned from data
imply unstable and ill-conditioned Jacobians. Instead, we propose a fixed-point
iteration scheme for solving the ODE that avoids Jacobians. This enhances the
stability of the solver while reducing the computational cost. In experiments
involving both Riemannian metric learning and deep generative models, we
demonstrate significant improvements in speed and stability over both
general-purpose state-of-the-art solvers and specialized solvers.
Comment: Accepted at Artificial Intelligence and Statistics (AISTATS) 201
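To make the idea of a Jacobian-free fixed-point iteration for a boundary value problem concrete, here is a minimal sketch. It uses a plain Picard-style iteration with trapezoidal integration, which is a swapped-in textbook technique and not the solver proposed in the paper. The 1-D metric m(c) = exp(c), the function names, and the grid size are assumptions chosen so that the geodesic has a closed form to compare against.

```python
import numpy as np

def cumulative_trapezoid(y, t):
    # Cumulative trapezoidal integral of y over t, starting at zero.
    increments = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate([[0.0], np.cumsum(increments)])

def geodesic_acceleration(c, dc):
    # Geodesic equation for the hypothetical 1-D metric m(c) = exp(c):
    # c'' = -(m'(c) / (2 m(c))) * c'^2 = -0.5 * c'^2
    return -0.5 * dc ** 2

def solve_bvp_fixed_point(a, b, acceleration, n=201, tol=1e-10, max_iter=200):
    # Solve c'' = acceleration(c, c') on [0, 1] with c(0) = a, c(1) = b.
    t = np.linspace(0.0, 1.0, n)
    c = a + (b - a) * t          # initial guess: straight line
    dc = np.full(n, b - a)       # its constant velocity
    for _ in range(max_iter):
        rhs = acceleration(c, dc)            # freeze the right-hand side
        v = cumulative_trapezoid(rhs, t)     # integrate once: velocity up to a constant
        w = cumulative_trapezoid(v, t)       # integrate twice: curve up to a linear term
        slope = b - a - w[-1]                # linear term that restores c(1) = b
        c_new = a + w + slope * t
        dc_new = v + slope
        converged = np.max(np.abs(c_new - c)) < tol
        c, dc = c_new, dc_new
        if converged:
            break
    return t, c

t, c = solve_bvp_fixed_point(0.0, 1.0, geodesic_acceleration)

# Closed-form geodesic for this toy metric, used only as a sanity check.
exact = 2.0 * np.log(1.0 + (np.exp(0.5) - 1.0) * t)
max_error = np.max(np.abs(c - exact))
print(max_error)
```

Each sweep only evaluates the right-hand side of the ODE and integrates it, so no Jacobian of the ODE is formed. Convergence of this simple iteration is not guaranteed in general; it works here because the toy problem is mildly nonlinear.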
Variational Autoencoders with Riemannian Brownian Motion Priors
Variational Autoencoders (VAEs) represent the given data in a low-dimensional
latent space, which is generally assumed to be Euclidean. This assumption
naturally leads to the common choice of a standard Gaussian prior over
continuous latent variables. Recent work has, however, shown that this prior
has a detrimental effect on model capacity, leading to subpar performance. We
propose that the Euclidean assumption lies at the heart of this failure mode.
To counter this, we assume a Riemannian structure over the latent space, which
constitutes a more principled geometric view of the latent codes, and replace
the standard Gaussian prior with a Riemannian Brownian motion prior. We propose
an efficient inference scheme that does not rely on the unknown normalizing
factor of this prior. Finally, we demonstrate that this prior significantly
increases model capacity using only one additional scalar parameter.
Comment: Published in ICML 202
Metagenomics : tools and insights for analyzing next-generation sequencing data derived from biodiversity studies
Advances in next-generation sequencing (NGS) have allowed significant breakthroughs in microbial ecology studies. This has led to the rapid expansion of research in the field and the establishment of “metagenomics”, often defined as the analysis of DNA from microbial communities in environmental samples without prior need for culturing. Many statistical and computational tools and databases for metagenomics have been developed to exploit the huge influx of data. In this review article, we provide an overview of the sequencing technologies and how they are uniquely suited to various types of metagenomic studies. We focus on the currently available bioinformatics techniques, tools, and methodologies for performing each individual step of a typical metagenomic dataset analysis. We also outline future trends in the field with respect to tools and technologies currently under development. Moreover, we discuss data management, distribution, and integration tools that are capable of performing comparative metagenomic analyses of multiple datasets using well-established databases, as well as commonly used annotation standards.
Neural Contractive Dynamical Systems
Stability guarantees are crucial when ensuring a fully autonomous robot does
not take undesirable or potentially harmful actions. Unfortunately, global
stability guarantees are hard to provide in dynamical systems learned from
data, especially when the learned dynamics are governed by neural networks. We
propose a novel methodology to learn neural contractive dynamical systems,
where our neural architecture ensures contraction, and hence, global stability.
To efficiently scale the method to high-dimensional dynamical systems, we
develop a variant of the variational autoencoder that learns dynamics in a
low-dimensional latent representation space while retaining contractive
stability after decoding. We further extend our approach to learning
contractive systems on the Lie group of rotations to account for full-pose
end-effector dynamic motions. The result is the first highly flexible learning
architecture that provides contractive stability guarantees with the capability
to perform obstacle avoidance. Empirically, we demonstrate that our approach
encodes the desired dynamics more accurately than the current state-of-the-art,
which provides weaker stability guarantees.
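For readers unfamiliar with contraction, the following minimal sketch checks the standard sufficient condition numerically: a system x' = f(x) is contracting (in the Euclidean metric) when the symmetric part of its Jacobian is uniformly negative definite. The vector field, the sampling region, and all function names are assumptions made for illustration. This is a sampled check on a hand-written toy system, not the paper's architecture, which is described as ensuring contraction by construction.

```python
import numpy as np

A = np.array([[-1.0, 2.0], [-2.0, -1.0]])

def dynamics(x):
    # Hypothetical vector field. Its Jacobian is A + 0.5 * diag(1 - tanh(x)^2),
    # whose symmetric part has eigenvalues of at most -0.5 everywhere.
    return A @ x + 0.5 * np.tanh(x)

def numerical_jacobian(f, x, eps=1e-6):
    # Central finite-difference Jacobian of f at x.
    n = x.size
    J = np.zeros((n, n))
    for i in range(n):
        step = np.zeros(n)
        step[i] = eps
        J[:, i] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return J

def contraction_rate(f, x):
    # Largest eigenvalue of the symmetric part of the Jacobian at x.
    # A negative value means nearby trajectories approach each other at x.
    J = numerical_jacobian(f, x)
    return np.linalg.eigvalsh(0.5 * (J + J.T)).max()

def simulate(f, x0, dt=0.01, steps=500):
    # Explicit Euler integration of x' = f(x).
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * f(x)
    return x

# Sampled check of the contraction condition (evidence, not a proof).
rng = np.random.default_rng(0)
samples = rng.uniform(-3.0, 3.0, size=(100, 2))
worst_rate = max(contraction_rate(dynamics, x) for x in samples)

# Two trajectories from different starting points should move closer together.
start_a, start_b = np.array([2.0, -1.0]), np.array([-1.5, 2.5])
initial_gap = np.linalg.norm(start_a - start_b)
final_gap = np.linalg.norm(simulate(dynamics, start_a) - simulate(dynamics, start_b))
print(worst_rate, initial_gap, final_gap)
```

A sampled check like this can only suggest contraction on the sampled region, which is why an architecture that guarantees the condition everywhere is the stronger statement.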
Polytraits : a database on biological traits of marine polychaetes
The study of ecosystem functioning – the role which organisms play in an ecosystem – is becoming increasingly important in marine ecological research. The functional structure of a community can be represented by a set of functional traits assigned to behavioural, reproductive and morphological characteristics. The collection of these traits from the literature is, however, a laborious and time-consuming process, and gaps in knowledge and restricted availability of literature are common problems. Trait data are not yet readily shared by research communities, and even when they are, a lack of trait data repositories and standards for data formats leads to the publication of trait information in forms which cannot be processed by computers. This paper describes Polytraits (http://polytraits.lifewatchgreece.eu), a database on biological traits of marine polychaetes (bristle worms, Polychaeta: Annelida). At present, the database contains almost 20,000 records on morphological, behavioural and reproductive characteristics of more than 1,000 marine polychaete species, all referenced by literature sources. All data can be freely accessed through the project website in different ways and formats, both human-readable and machine-readable, and have been submitted to the Encyclopedia of Life for archival and integration with trait information from other sources.