73,078 research outputs found
Finding Statistically Significant Interactions between Continuous Features
The search for higher-order feature interactions that are statistically
significantly associated with a class variable is of high relevance in fields
such as Genetics or Healthcare, but the combinatorial explosion of the
candidate space makes this problem extremely challenging in terms of
computational efficiency and proper correction for multiple testing. While
recent progress has been made regarding this challenge for binary features, we
here present the first solution for continuous features. We propose an
algorithm which overcomes the combinatorial explosion of the search space of
higher-order interactions by deriving a lower bound on the p-value for each
interaction, which enables us to massively prune interactions that can never
reach significance and to thereby gain more statistical power. In our
experiments, our approach efficiently detects all significant interactions in a
variety of synthetic and real-world datasets.Comment: 13 pages, 5 figures, 2 tables, accepted to the 28th International
Joint Conference on Artificial Intelligence (IJCAI 2019
A Physics-Based Approach to Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems
Given that observational and numerical climate data are being produced at
ever more prodigious rates, increasingly sophisticated and automated analysis
techniques have become essential. Deep learning is quickly becoming a standard
approach for such analyses and, while great progress is being made, major
challenges remain. Unlike commercial applications in which deep learning has
led to surprising successes, scientific data is highly complex and typically
unlabeled. Moreover, interpretability and detecting new mechanisms are key to
scientific discovery. To enhance discovery we present a complementary
physics-based, data-driven approach that exploits the causal nature of
spatiotemporal data sets generated by local dynamics (e.g. hydrodynamic flows).
We illustrate how novel patterns and coherent structures can be discovered in
cellular automata and outline the path from them to climate data.Comment: 4 pages, 1 figure;
http://csc.ucdavis.edu/~cmg/compmech/pubs/ci2017_Rupe_et_al.ht
Discovering Functional Communities in Dynamical Networks
Many networks are important because they are substrates for dynamical
systems, and their pattern of functional connectivity can itself be dynamic --
they can functionally reorganize, even if their underlying anatomical structure
remains fixed. However, the recent rapid progress in discovering the community
structure of networks has overwhelmingly focused on that constant anatomical
connectivity. In this paper, we lay out the problem of discovering_functional
communities_, and describe an approach to doing so. This method combines recent
work on measuring information sharing across stochastic networks with an
existing and successful community-discovery algorithm for weighted networks. We
illustrate it with an application to a large biophysical model of the
transition from beta to gamma rhythms in the hippocampus.Comment: 18 pages, 4 figures, Springer "Lecture Notes in Computer Science"
style. Forthcoming in the proceedings of the workshop "Statistical Network
Analysis: Models, Issues and New Directions", at ICML 2006. Version 2: small
clarifications, typo corrections, added referenc
Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for {exploratory
data analysis} are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19
A Survey on Graph Kernels
Graph kernels have become an established and widely-used technique for
solving classification tasks on graphs. This survey gives a comprehensive
overview of techniques for kernel-based graph classification developed in the
past 15 years. We describe and categorize graph kernels based on properties
inherent to their design, such as the nature of their extracted graph features,
their method of computation and their applicability to problems in practice. In
an extensive experimental evaluation, we study the classification accuracy of a
large suite of graph kernels on established benchmarks as well as new datasets.
We compare the performance of popular kernels with several baseline methods and
study the effect of applying a Gaussian RBF kernel to the metric induced by a
graph kernel. In doing so, we find that simple baselines become competitive
after this transformation on some datasets. Moreover, we study the extent to
which existing graph kernels agree in their predictions (and prediction errors)
and obtain a data-driven categorization of kernels as result. Finally, based on
our experimental results, we derive a practitioner's guide to kernel-based
graph classification
Dynamic mode decomposition in vector-valued reproducing kernel Hilbert spaces for extracting dynamical structure among observables
Understanding nonlinear dynamical systems (NLDSs) is challenging in a variety
of engineering and scientific fields. Dynamic mode decomposition (DMD), which
is a numerical algorithm for the spectral analysis of Koopman operators, has
been attracting attention as a way of obtaining global modal descriptions of
NLDSs without requiring explicit prior knowledge. However, since existing DMD
algorithms are in principle formulated based on the concatenation of scalar
observables, it is not directly applicable to data with dependent structures
among observables, which take, for example, the form of a sequence of graphs.
In this paper, we formulate Koopman spectral analysis for NLDSs with structures
among observables and propose an estimation algorithm for this problem. This
method can extract and visualize the underlying low-dimensional global dynamics
of NLDSs with structures among observables from data, which can be useful in
understanding the underlying dynamics of such NLDSs. To this end, we first
formulate the problem of estimating spectra of the Koopman operator defined in
vector-valued reproducing kernel Hilbert spaces, and then develop an estimation
procedure for this problem by reformulating tensor-based DMD. As a special case
of our method, we propose the method named as Graph DMD, which is a numerical
algorithm for Koopman spectral analysis of graph dynamical systems, using a
sequence of adjacency matrices. We investigate the empirical performance of our
method by using synthetic and real-world data.Comment: 34 pages with 4 figures, Published in Neural Networks, 201
- …