Convolutional neural networks: a magic bullet for gravitational-wave detection?
In the last few years, machine learning techniques, in particular
convolutional neural networks, have been investigated as a method to replace or
complement traditional matched filtering techniques that are used to detect the
gravitational-wave signature of merging black holes. However, to date, these
methods have not yet been successfully applied to the analysis of long
stretches of data recorded by the Advanced LIGO and Virgo gravitational-wave
observatories. In this work, we critically examine the use of convolutional
neural networks as a tool to search for merging black holes. We identify the
strengths and limitations of this approach, highlight some common pitfalls in
translating between machine learning and gravitational-wave astronomy, and
discuss the interdisciplinary challenges. In particular, we explain in detail
why convolutional neural networks alone cannot be used to claim a statistically
significant gravitational-wave detection. However, we demonstrate how they can
still be used to rapidly flag the times of potential signals in the data for a
more detailed follow-up. Our convolutional neural network architecture as well
as the proposed performance metrics are better suited for this task than a
standard binary classification scheme. A detailed evaluation of our approach
on Advanced LIGO data demonstrates the potential of such systems as trigger
generators. Finally, we sound a note of caution by constructing adversarial
examples, which showcase interesting "failure modes" of our model, where inputs
with no visible resemblance to real gravitational-wave signals are identified
as such by the network with high confidence.
Comment: First two authors contributed equally; appeared at Phys. Rev.
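The trigger-generation idea described above can be sketched in miniature. In the toy example below (not the paper's trained network: a single hand-chosen convolution kernel stands in for learned filters, and the synthetic strain data, sampling rate, and threshold are all invented for illustration), a 1D convolutional response is thresholded to flag candidate signal times for follow-up:

```python
import numpy as np

def conv1d(x, kernel):
    # "valid" cross-correlation: the basic operation of a 1D CNN layer
    return np.correlate(x, kernel, mode="valid")

def flag_triggers(strain, kernel, threshold, fs):
    """Return times (in seconds) where the filter response exceeds threshold."""
    response = conv1d(strain, kernel)
    idx = np.where(response > threshold)[0]  # crude peak logic for the sketch
    return idx / fs

# toy data: Gaussian noise with a short chirp-like burst injected at t = 1.0 s
fs = 1024
rng = np.random.default_rng(0)
t = np.arange(2 * fs) / fs
strain = 0.1 * rng.standard_normal(t.size)
burst = np.sin(2 * np.pi * 50 * t[:64] * (1 + 5 * t[:64]))  # rising frequency
strain[fs : fs + 64] += burst

# using the burst itself as the kernel (a matched-filter-like best case)
triggers = flag_triggers(strain, burst, threshold=5.0, fs=fs)
print(triggers[:3])  # trigger times cluster near 1.0 s
```

A trained network would replace the hand-set kernel with many learned filters, but the output plays the same role: a list of candidate times, not a detection statistic.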
Learning Independent Causal Mechanisms
Statistical learning relies upon data sampled from a distribution, and we
usually do not care what actually generated it in the first place. From the
point of view of causal modeling, the structure of each distribution is induced
by physical mechanisms that give rise to dependences between observables.
Mechanisms, however, can be meaningful autonomous modules of generative models
that make sense beyond a particular entailed data distribution, lending
themselves to transfer between problems. We develop an algorithm to recover a
set of independent (inverse) mechanisms from a set of transformed data points.
The approach is unsupervised and based on a set of experts that compete for
data generated by the mechanisms, driving specialization. We analyze the
proposed method in a series of experiments on image data. Each expert learns to
map a subset of the transformed data back to a reference distribution. The
learned mechanisms generalize to novel domains. We discuss implications for
transfer learning and links to recent trends in generative modeling.
Comment: ICML 201
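The competition mechanism can be sketched under strong simplifying assumptions (scalar data, mechanisms that are pure shifts, and experts each parameterized by a single inverse-shift; none of this is the paper's actual model, which uses neural-network experts on images). Each sample is scored by every expert under a reference distribution, and only the winning expert updates, which drives specialization:

```python
import numpy as np

rng = np.random.default_rng(1)

# reference distribution: standard normal; two unknown mechanisms shift it
shifts = np.array([5.0, -5.0])
z = rng.standard_normal(4000)
x = z + shifts[rng.integers(0, 2, size=z.size)]  # transformed data, unlabeled

# two "experts", each a candidate inverse shift; they compete per sample
theta = np.array([1.0, -1.0])
lr = 0.1
for _ in range(200):
    batch = rng.choice(x, size=64)
    # score of expert i on a sample: log-density of N(0,1) at (sample - theta_i)
    scores = -0.5 * (batch[:, None] - theta[None, :]) ** 2
    winner = scores.argmax(axis=1)  # winner-take-all competition
    for i in range(theta.size):
        sel = batch[winner == i]
        if sel.size:
            theta[i] += lr * (sel - theta[i]).mean()  # only the winner updates

print(np.sort(theta))  # each expert recovers one inverse mechanism (about ±5)
```

The key property illustrated is that no labels assign samples to mechanisms; the competition itself partitions the data and specializes the experts.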
Universal hydrodynamic flow in holographic planar shock collisions
We study the collision of planar shock waves in AdS as a function of
shock profile. In the dual field theory the shock waves describe planar sheets
of energy whose collision results in the formation of a plasma which behaves
hydrodynamically at late times. We find that the post-collision stress tensor
near the light cone exhibits transient non-universal behavior which depends on
both the shock width and the precise functional form of the shock profile.
However, over a large range of shock widths, including those which yield
qualitatively different behavior near the future light cone, and for different
shock profiles, we find universal behavior in the subsequent hydrodynamic
evolution. Additionally, we compute the rapidity distribution of produced
particles and find it to be well described by a Gaussian.
Comment: 23 pages, 15 figures, published version
Avoiding Discrimination through Causal Reasoning
Recent work on fairness in machine learning has focused on various
statistical discrimination criteria and how they trade off. Most of these
criteria are observational: They depend only on the joint distribution of
predictor, protected attribute, features, and outcome. While convenient to work
with, observational criteria have severe inherent limitations that prevent them
from resolving matters of fairness conclusively.
Going beyond observational criteria, we frame the problem of discrimination
based on protected attributes in the language of causal reasoning. This
viewpoint shifts attention from "What is the right fairness criterion?" to
"What do we want to assume about the causal data generating process?" Through
the lens of causality, we make several contributions. First, we crisply
articulate why and when observational criteria fail, thus formalizing what was
before a matter of opinion. Second, our approach exposes previously ignored
subtleties and why they are fundamental to the problem. Finally, we put forward
natural causal non-discrimination criteria and develop algorithms that satisfy
them.
Comment: Advances in Neural Information Processing Systems 30, 2017,
http://papers.nips.cc/paper/6668-avoiding-discrimination-through-causal-reasonin
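To illustrate what "observational" means here (criteria computable from the joint distribution of predictor, protected attribute, and outcome alone), here is a hypothetical sketch; the synthetic data and variable names are invented, not from the paper. It also shows the trade-off the abstract alludes to: a predictor can satisfy one criterion while violating another.

```python
import numpy as np

def demographic_parity_gap(r, a):
    """|P(R=1 | A=1) - P(R=1 | A=0)| -- a function of the joint law only."""
    return abs(r[a == 1].mean() - r[a == 0].mean())

def equalized_odds_gaps(r, a, y):
    """True-positive-rate and false-positive-rate gaps between groups."""
    gaps = []
    for yv in (1, 0):
        m = y == yv
        gaps.append(abs(r[m & (a == 1)].mean() - r[m & (a == 0)].mean()))
    return gaps  # [TPR gap, FPR gap]

rng = np.random.default_rng(0)
a = rng.integers(0, 2, 10000)                        # protected attribute
y = (rng.random(10000) < 0.3 + 0.2 * a).astype(int)  # outcome correlated with A
r = y.copy()                                         # a "perfect" predictor

print(demographic_parity_gap(r, a))  # nonzero: parity fails even when R = Y
print(equalized_odds_gaps(r, a, y))  # zero gaps: equalized odds holds
```

Both quantities are computed from samples of the joint distribution with no reference to how the data were generated; that is exactly the limitation the causal viewpoint in the paper addresses.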
Supervised Learning and Model Analysis with Compositional Data
The compositionality and sparsity of high-throughput sequencing data pose a
challenge for regression and classification. However, in microbiome research in
particular, conditional modeling is an essential tool to investigate
relationships between phenotypes and the microbiome. Existing techniques are
often inadequate: they either rely on extensions of the linear log-contrast
model (which adjusts for compositionality, but is often unable to capture
useful signals), or they are based on black-box machine learning methods (which
may capture useful signals, but ignore compositionality in downstream
analyses).
We propose KernelBiome, a kernel-based nonparametric regression and
classification framework for compositional data. It is tailored to sparse
compositional data and is able to incorporate prior knowledge, such as
phylogenetic structure. KernelBiome captures complex signals, including in the
zero-structure, while automatically adapting model complexity. We demonstrate
predictive performance on par with or better than state-of-the-art machine
learning methods. Additionally, our framework provides two key
advantages: (i) We propose two novel quantities to interpret contributions of
individual components and prove that they consistently estimate average
perturbation effects of the conditional mean, extending the interpretability of
linear log-contrast models to nonparametric models. (ii) We show that the
connection between kernels and distances aids interpretability and provides a
data-driven embedding that can augment further analysis. Finally, we apply the
KernelBiome framework to two public microbiome studies and illustrate the
proposed model analysis. KernelBiome is available as an open-source Python
package at https://github.com/shimenghuang/KernelBiome
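As a rough illustration of kernel methods on compositional data, here is a generic kernel-ridge sketch using a centered log-ratio transform with a pseudocount; this is one standard construction, not the KernelBiome package's API or its actual kernels, and the data are synthetic:

```python
import numpy as np

def clr(x, eps=1e-6):
    """Centered log-ratio transform; eps handles zeros in sparse compositions."""
    logx = np.log(x + eps)
    return logx - logx.mean(axis=1, keepdims=True)

def kernel_ridge_fit(K, y, lam=1e-2):
    # dual coefficients: alpha = (K + lam * I)^{-1} y
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

rng = np.random.default_rng(0)
n, p = 200, 10
comp = rng.dirichlet(np.ones(p), size=n)               # points on the simplex
y = 2 * clr(comp)[:, 0] + 0.1 * rng.standard_normal(n) # signal in log-ratio space

Z = clr(comp)
K = Z @ Z.T                                            # linear kernel after clr
alpha = kernel_ridge_fit(K, y)
pred = K @ alpha
print(np.corrcoef(pred, y)[0, 1])                      # in-sample fit, close to 1
```

Because the kernel acts on clr-transformed data, the model respects the scale-invariance of compositions, which is the basic point the log-contrast literature makes; swapping in kernels that encode phylogenetic structure is where frameworks like KernelBiome go further.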
Stabilized Neural Differential Equations for Learning Dynamics with Explicit Constraints
Many successful methods to learn dynamical systems from data have recently
been introduced. However, ensuring that the inferred dynamics preserve known
constraints, such as conservation laws or restrictions on the allowed system
states, remains challenging. We propose stabilized neural differential
equations (SNDEs), a method to enforce arbitrary manifold constraints for
neural differential equations. Our approach is based on a stabilization term
that, when added to the original dynamics, renders the constraint manifold
provably asymptotically stable. Due to its simplicity, our method is compatible
with all common neural differential equation (NDE) models and broadly
applicable. In extensive empirical evaluations, we demonstrate that SNDEs
outperform existing methods while broadening the types of constraints that can
be incorporated into NDE training.
Comment: 22 pages, 8 figures. Accepted at NeurIPS 202
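The stabilization idea can be sketched on a hand-written vector field standing in for a learned NDE: adding a term of the form -gamma * grad g(x) * g(x) to the dynamics pushes trajectories back toward the constraint manifold {g = 0}. The specific constraint (energy of a harmonic oscillator), integrator, and gamma below are illustrative choices, not the paper's setup:

```python
import numpy as np

def f(x):
    q, p = x
    return np.array([p, -q])       # harmonic oscillator (stand-in for a learned NDE)

def g(x):
    return 0.5 * (x @ x) - 0.5     # constraint: energy should stay at 0.5

def grad_g(x):
    return x

def integrate(x0, dt, steps, gamma):
    x = x0.copy()
    for _ in range(steps):
        # stabilized vector field: original dynamics minus gamma * grad g * g
        x = x + dt * (f(x) - gamma * grad_g(x) * g(x))
    return x

x0 = np.array([1.0, 0.0])          # starts exactly on the constraint manifold
drift = abs(g(integrate(x0, 1e-2, 5000, gamma=0.0)))  # plain Euler drifts
stab = abs(g(integrate(x0, 1e-2, 5000, gamma=5.0)))   # stabilized stays close
print(drift, stab)
```

With gamma = 0 the explicit Euler steps inflate the energy multiplicatively each step, so the constraint violation grows; with gamma > 0 the added term makes g = 0 attracting, and the violation stays small, which is the asymptotic-stability property the method provides.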
Discovering ordinary differential equations that govern time-series
Natural laws are often described through differential equations yet finding a
differential equation that describes the governing law underlying observed data
is a challenging and still mostly manual task. In this paper we take a step
toward automating this process: we propose a transformer-based
sequence-to-sequence model that recovers scalar autonomous ordinary
differential equations (ODEs) in symbolic form from time-series data of a
single observed solution of the ODE. Our method is efficiently scalable: after
one-time pretraining on a large set of ODEs, we can infer the governing laws of
a new observed solution in a few forward passes of the model. We then show that
our model performs on par with or better than existing methods in various test cases
in terms of accurate symbolic recovery of the ODE, especially for more complex
expressions.
Comment: Workshop paper at NeurIPS 2022 workshop "AI for Science"
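For contrast with the transformer approach, a classical route to the same task is sparse regression over a library of candidate terms (in the spirit of SINDy; this is not the paper's method). A minimal sketch with invented data, recovering dx/dt = -0.5 x from a single observed trajectory:

```python
import numpy as np

t = np.linspace(0, 5, 501)
x = 2.0 * np.exp(-0.5 * t)                 # observed solution of dx/dt = -0.5 x

dxdt = np.gradient(x, t)                   # numerical derivative of the series
# candidate right-hand-side terms: 1, x, x^2, sin(x)
library = np.column_stack([np.ones_like(x), x, x**2, np.sin(x)])
coef, *_ = np.linalg.lstsq(library, dxdt, rcond=None)
coef[np.abs(coef) < 1e-2] = 0.0            # hard-threshold negligible terms

print(coef)  # only the linear term should survive, with value near -0.5
```

The contrast highlights the scaling argument in the abstract: this route refits a regression for every new trajectory and needs a hand-chosen library, whereas the pretrained sequence-to-sequence model amortizes that cost into a few forward passes.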