Detecting Parameter Symmetries in Probabilistic Models
Probabilistic models often have parameters that can be translated, scaled,
permuted, or otherwise transformed without changing the model. These symmetries
can lead to strong correlation and multimodality in the posterior distribution
over the model's parameters, which can pose challenges both for performing
inference and interpreting the results. In this work, we address the automatic
detection of common problematic model symmetries. To do so, we introduce local
symmetries, which cover many common cases and are amenable to automatic
detection. We show how to derive algorithms to detect several broad classes of
local symmetries. Our algorithms are compatible with probabilistic programming
constructs such as arrays, for loops, and if statements, and they scale to
models with many variables.
Comment: 24 pages, 8 figures
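To make the notion of a problematic parameter symmetry concrete, here is a minimal NumPy sketch (not the paper's detection algorithm) of a translation symmetry: in a hypothetical model y ~ Normal(a + b, 1), the likelihood depends on a and b only through their sum, so it is invariant under (a, b) → (a + c, b − c) and the posterior has a flat, perfectly correlated direction:

```python
import numpy as np

# Hypothetical model: y ~ Normal(a + b, 1). The parameters a and b are
# identified only through their sum, so the log-likelihood is invariant
# under the translation (a, b) -> (a + c, b - c).
def log_lik(a, b, y):
    return -0.5 * np.sum((y - (a + b)) ** 2)

rng = np.random.default_rng(0)
y = rng.normal(3.0, 1.0, size=100)

base = log_lik(1.0, 2.0, y)
shifted = log_lik(1.0 + 5.0, 2.0 - 5.0, y)  # translated along the symmetry
print(np.isclose(base, shifted))  # the posterior has a flat direction
```

Detecting such directions automatically, rather than by inspection as above, is the problem the paper addresses.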
alphastable: An R Package for Modelling Multivariate Stable and Mixture of Symmetric Stable Distributions
The family of stable distributions has received extensive application in many
fields of study, since it incorporates both skewness and heavy tails. In this
paper, we introduce a package written in the R language called alphastable.
The alphastable package performs a variety of tasks, including: (1) generating
random numbers from univariate, truncated, and multivariate stable
distributions; (2) computing the probability density function of univariate
and multivariate elliptically contoured stable distributions; (3) computing
the distribution function of univariate stable distributions; and (4)
estimating the parameters of univariate symmetric stable, univariate Cauchy,
mixture of Cauchy, mixture of univariate symmetric stable, multivariate
elliptically contoured stable, and multivariate strictly stable
distributions. As we show, this package is very useful for modelling
univariate and multivariate data arising in finance and economics.
Comment: 35 pages, 14 figures
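The package itself is written in R; as a language-agnostic illustration of the random-number-generation task, here is a NumPy sketch of the classical Chambers-Mallows-Stuck sampler for symmetric stable variates (an assumption of this sketch: symmetric case only, beta = 0, unit scale; the package handles far more general cases):

```python
import numpy as np

def rstable_sym(alpha, n, rng):
    """Symmetric alpha-stable variates via the Chambers-Mallows-Stuck
    construction (beta = 0, unit scale). For alpha = 2 this reduces to
    sqrt(2) * Normal(0, 1); for alpha = 1 it reduces to standard Cauchy."""
    V = rng.uniform(-np.pi / 2, np.pi / 2, n)
    W = rng.exponential(1.0, n)
    return (np.sin(alpha * V) / np.cos(V) ** (1 / alpha)
            * (np.cos(V - alpha * V) / W) ** ((1 - alpha) / alpha))

rng = np.random.default_rng(1)
x = rstable_sym(2.0, 200_000, rng)
print(x.var())  # close to 2 in the Gaussian (alpha = 2) case
```

For alpha < 2 the variance is infinite, which is exactly the heavy-tail behavior that motivates stable models in finance.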
Learning Hierarchical Information Flow with Recurrent Neural Modules
We propose ThalNet, a deep learning model inspired by neocortical
communication via the thalamus. Our model consists of recurrent neural modules
that send features through a routing center, endowing the modules with the
flexibility to share features over multiple time steps. We show that our model
learns to route information hierarchically, processing input data by a chain of
modules. We observe common architectures, such as feed forward neural networks
and skip connections, emerging as special cases of our architecture, while
novel connectivity patterns are learned for the text8 compression task. Our
model outperforms standard recurrent neural networks on several sequential
benchmarks.
Comment: NIPS 201
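The wiring described above, with modules exchanging features through a shared center, can be sketched minimally as follows. This is only an illustration of the routing idea under invented shapes: the real ThalNet uses learned recurrent (e.g. GRU) modules and learned read mechanisms, none of which appear here:

```python
import numpy as np

rng = np.random.default_rng(0)
n_modules, feat = 4, 8
center_dim = n_modules * feat          # center = concatenated module features

# Hypothetical parameters: each module maps (its input + a read of the
# shared center) to a feature vector.
W = [rng.normal(0, 0.1, (feat + center_dim, feat)) for _ in range(n_modules)]

def step(center, x):
    # Every module reads the previous center plus the current input frame;
    # the concatenated module outputs form the next center, so features
    # written by one module are visible to all modules one step later.
    outs = [np.tanh(np.concatenate([x, center]) @ W[m]) for m in range(n_modules)]
    return np.concatenate(outs)

center = np.zeros(center_dim)
for t in range(5):                     # unroll a few time steps
    x = rng.normal(size=feat)          # dummy input frame
    center = step(center, x)
print(center.shape)  # (32,)
```

Hierarchical routing emerges when, after training, each module learns to read mostly the slice of the center written by the module "before" it.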
A data-driven approach to precipitation parameterizations using convolutional encoder-decoder neural networks
Numerical Weather Prediction (NWP) models represent sub-grid processes using
parameterizations, which are often complex and a major source of uncertainty in
weather forecasting. In this work, we devise a simple machine learning (ML)
methodology to learn parameterizations from basic NWP fields. Specifically, we
demonstrate how encoder-decoder Convolutional Neural Networks (CNN) can be used
to derive total precipitation using geopotential height as the only input.
Several popular neural network architectures, from the field of image
processing, are considered and a comparison with baseline ML methodologies is
provided. We use NWP reanalysis data to train different ML models showing how
encoder-decoder CNNs are able to interpret the spatial information contained in
the geopotential field to infer total precipitation with a high degree of
accuracy. We also provide a method to identify the levels of the geopotential
height that have a higher influence on precipitation through a variable
selection process. To the best of our knowledge, this paper represents the
first attempt to model NWP parameterizations using CNN methodologies.
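The encoder-decoder shape of the approach can be sketched with a toy NumPy pipeline. Everything below is illustrative assumption, not the paper's networks: the field, its size, and the use of plain pooling/upsampling in place of learned convolutions:

```python
import numpy as np

def pool2(x):
    """Encoder stage: 2x2 mean pooling halves each spatial dimension."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(x):
    """Decoder stage: nearest-neighbour upsampling doubles each dimension."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

# Toy "geopotential" field in, precipitation-shaped field out: the encoder
# compresses spatial context into a coarse latent map, and the decoder
# recovers a field at the input resolution (real encoder-decoder CNNs
# interleave learned convolutions with these resizing stages).
z = np.random.default_rng(0).normal(size=(32, 32))
code = pool2(pool2(z))                  # 32x32 -> 8x8 latent representation
precip_map = upsample2(upsample2(code)) # 8x8 -> 32x32 output field
print(precip_map.shape)  # (32, 32)
```

The learned version of this pipeline is what lets the network exploit spatial structure in the geopotential field rather than treating grid points independently, which is the advantage over the pointwise baseline ML methods.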
Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference
Historically, light curve studies of supernovae (SNe) and other transient
classes have focused on individual objects with copious and high
signal-to-noise observations. In the nascent era of wide field transient
searches, objects with detailed observations are decreasing as a fraction of
the overall known SN population, and this strategy sacrifices the majority of
the information contained in the data about the underlying population of
transients. A population level modeling approach, simultaneously fitting all
available observations of objects in a transient sub-class of interest, fully
mines the data to infer the properties of the population and avoids certain
systematic biases. We present a novel hierarchical Bayesian statistical model
for population level modeling of transient light curves, and discuss its
implementation using an efficient Hamiltonian Monte Carlo technique. As a test
case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium
Deep Survey, consisting of 18,837 photometric observations of 76 SNe,
corresponding to a joint posterior distribution with 9,176 parameters under our
model. Our hierarchical model fits provide improved constraints on light curve
parameters relevant to the physical properties of their progenitor stars
relative to modeling individual light curves alone. Moreover, we directly
evaluate the probability for occurrence rates of unseen light curve
characteristics from the model hyperparameters, addressing observational biases
in survey methodology. We view this modeling framework as an unsupervised
machine learning technique with the ability to maximize scientific returns from
data to be collected by future wide field transient searches like LSST.
Comment: Submitted; 10 pages, 11 figures, plus appendix with code
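The population-level benefit described above is partial pooling: poorly observed objects borrow strength from the rest of the sample. A one-parameter conjugate analogue makes this concrete (the paper fits a far richer light-curve model with Hamiltonian Monte Carlo; all numbers below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy population: 20 "supernovae", each with a true peak magnitude drawn
# from a population-level normal, observed with noise and uneven sampling.
mu_pop, tau, sigma = -17.0, 0.5, 1.0
truth = rng.normal(mu_pop, tau, 20)
n_obs = rng.integers(2, 10, 20)
obs = [rng.normal(t, sigma, n) for t, n in zip(truth, n_obs)]

# Conjugate normal-normal partial pooling: each object's estimate is
# shrunk toward the population mean, more strongly when the object has
# fewer observations -- the mechanism behind the "improved constraints"
# relative to fitting each light curve alone.
ybar = np.array([o.mean() for o in obs])
w = (n_obs / sigma**2) / (n_obs / sigma**2 + 1 / tau**2)
pooled = w * ybar + (1 - w) * ybar.mean()   # plug-in population mean
print(np.all(np.abs(pooled - ybar.mean()) <= np.abs(ybar - ybar.mean())))
```

The hierarchical model additionally infers the population hyperparameters (here fixed at mu_pop, tau) jointly, which is what enables the occurrence-rate statements for unseen light-curve characteristics.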
Joint Parameter Discovery and Generative Modeling of Dynamic Systems
Given an unknown dynamic system, such as a coupled harmonic oscillator with
springs and point masses, we are often interested in gaining insight into its
physical parameters, i.e., the stiffnesses and masses, by observing its
trajectories of motion. How do we achieve this from video frames or time-series data and
without the knowledge of the dynamics model? We present a neural framework for
estimating physical parameters in a manner consistent with the underlying
physics. The neural framework uses a deep latent variable model to disentangle
the system physical parameters from canonical coordinate observations. It then
returns a Hamiltonian parameterization that generalizes well with respect to
the discovered physical parameters. We tested our framework with simple
harmonic oscillators and noisy observations, and show that it discovers
the underlying system parameters and generalizes well with respect to these
discovered parameters. Our model also extrapolates the dynamics of the system
beyond the training interval and outperforms a non-physically constrained
baseline model. Our source code and datasets can be found at this URL:
https://github.com/gbarber94/ConSciNet.
Comment: 11 pages, 7 figures
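A stripped-down version of the estimation problem shows what "discovering physical parameters from trajectories" means. The paper does this with a latent-variable neural network from raw observations; in this toy sketch, where the dynamics x'' = -(k/m) x are assumed known in form, plain least squares already recovers the parameter:

```python
import numpy as np

# Simulated observation of a harmonic oscillator with (invented) k/m = 4.
k_over_m, dt = 4.0, 0.01
t = np.arange(0, 10, dt)
x = np.cos(np.sqrt(k_over_m) * t)

# Finite-difference acceleration, then least-squares fit of the slope in
# x'' = -(k/m) x; the estimate should sit near the true value of 4.
acc = np.gradient(np.gradient(x, dt), dt)
est = -(acc @ x) / (x @ x)
print(est)  # close to 4.0
```

The hard part the paper tackles is doing this without assuming the equation of motion, and from video frames rather than clean state observations.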
A survey on trajectory clustering analysis
This paper comprehensively surveys the development of trajectory clustering.
Considering the critical role of trajectory data mining in modern intelligent
systems for surveillance security, abnormal behavior detection, crowd behavior
analysis, and traffic control, trajectory clustering has attracted growing
attention. Existing trajectory clustering methods can be grouped into three
categories: unsupervised, supervised and semi-supervised algorithms. In spite
of achieving a certain level of development, trajectory clustering is limited
in its success by complex conditions such as application scenarios and data
dimensions. This paper provides a holistic understanding and deep insight into
trajectory clustering, and presents a comprehensive analysis of representative
methods and promising future directions.
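As a minimal concrete instance of the unsupervised category, the sketch below groups toy 2-D trajectories by thresholding a symmetric Hausdorff distance, a common trajectory similarity measure. The threshold, trajectories, and single-link grouping rule are illustrative choices, not a method from the survey:

```python
import numpy as np

def traj_dist(a, b):
    """Symmetric Hausdorff distance between two point-sequence trajectories."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Three toy trajectories: two nearby straight paths and one far offset path.
t = np.linspace(0, 1, 50)
trajs = [np.stack([t, 0.0 * t], axis=1),
         np.stack([t, 0.1 + 0.0 * t], axis=1),
         np.stack([t, 5.0 + 0.0 * t], axis=1)]

# Unsupervised single-link grouping with a distance threshold.
labels, thresh = [0], 1.0
for i in range(1, len(trajs)):
    match = [labels[j] for j in range(i) if traj_dist(trajs[i], trajs[j]) < thresh]
    labels.append(match[0] if match else max(labels) + 1)
print(labels)  # [0, 0, 1]
```

The complications the survey emphasizes (varying sampling rates, trajectory length, and data dimension) are exactly what such a fixed-threshold, fixed-metric scheme fails to handle.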
A Roadmap Towards Resilient Internet of Things for Cyber-Physical Systems
The Internet of Things (IoT) is a ubiquitous system connecting many different
devices - the things - which can be accessed remotely. Cyber-physical systems
(CPS) monitor and control the things from a distance. As a result, the
concepts of dependability and security become deeply intertwined.
The increasing level of dynamicity, heterogeneity, and complexity adds to the
system's vulnerability, and challenges its ability to react to faults. This
paper summarizes the state of the art of existing work on anomaly detection,
fault tolerance, and self-healing, and adds a number of other methods
applicable to achieving resilience in the IoT. We particularly focus on non-intrusive methods
ensuring data integrity in the network. Furthermore, this paper presents the
main challenges in building a resilient IoT for CPS which is crucial in the era
of smart CPS with enhanced connectivity (an excellent example of such a system
is connected autonomous vehicles). It further summarizes our solutions,
work-in-progress and future work to this topic to enable "Trustworthy IoT for
CPS". Finally, this framework is illustrated on a selected use case: A smart
sensor infrastructure in the transport domain.
Comment: preprint (2018-10-29)
Principal Manifold Estimation and Model Complexity Selection
We propose a framework of principal manifolds to model high-dimensional data.
This framework is based on Sobolev spaces and designed to model data of any
intrinsic dimension. It includes principal component analysis and the
principal curve algorithm as special cases. We propose a novel method for model
complexity selection to avoid overfitting, eliminate the effects of outliers,
and improve the computation speed. Additionally, we propose a method for
identifying the interiors of circle-like curves and cylinder/ball-like
surfaces. The proposed approach is compared to existing methods by simulations
and applied to estimate tumor surfaces and interiors in a lung cancer study.
Comment: 34 pages, 9 figures
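The "special case" claim has a simple concrete form: PCA is the linear principal manifold, the best-fitting flat of a chosen intrinsic dimension. The NumPy sketch below fits a 1-D linear manifold by SVD to toy data that is approximately one-dimensional (data and noise level invented):

```python
import numpy as np

rng = np.random.default_rng(0)
# 500 points spread along the direction (3, 1) in 2-D, plus small noise:
# data with intrinsic dimension ~1 embedded in dimension 2.
x = rng.normal(size=(500, 1)) @ np.array([[3.0, 1.0]]) \
    + 0.1 * rng.normal(size=(500, 2))

xc = x - x.mean(axis=0)
_, s, vt = np.linalg.svd(xc, full_matrices=False)
direction = vt[0]                      # first principal axis = linear 1-D manifold
var_explained = s[0]**2 / (s**2).sum()
print(var_explained > 0.95)            # the flat captures almost all variance
```

The Sobolev-space framework generalizes this fit from flats to smooth curved manifolds, which is where the model-complexity selection problem (how much curvature to allow before overfitting) arises.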
Noisy Activation Functions
Common nonlinear activation functions used in neural networks can cause
training difficulties due to the saturation behavior of the activation
function, which may hide dependencies that are not visible to vanilla-SGD
(using first order gradients only). Gating mechanisms that use softly
saturating activation functions to emulate the discrete switching of digital
logic circuits are good examples of this. We propose to exploit the injection
of appropriate noise so that the gradients may flow easily, even if the
noiseless application of the activation function would yield zero gradient.
Large noise will dominate the noise-free gradient and allow stochastic
gradient descent to explore more. By adding noise only to the problematic
parts of the
activation function, we allow the optimization procedure to explore the
boundary between the degenerate (saturating) and the well-behaved parts of the
activation function. We also establish connections to simulated annealing, when
the amount of noise is annealed down, making it easier to optimize hard
objective functions. We find experimentally that replacing such saturating
activation functions by noisy variants helps training in many contexts,
yielding state-of-the-art or competitive results on different datasets and
tasks, especially when training seems to be the most difficult, e.g., when
curriculum learning is necessary to obtain good results.
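The core idea, noise injected only where the activation saturates, can be sketched in a few lines. This is a simplified illustration of the mechanism, not one of the paper's specific noisy-activation schemes, which scale and anneal the noise more carefully:

```python
import numpy as np

def hard_tanh(x):
    return np.clip(x, -1.0, 1.0)   # saturates: zero gradient for |x| > 1

def noisy_hard_tanh(x, sigma, rng):
    """Add noise only in the saturated region, so SGD still receives a
    stochastic training signal where the noiseless gradient is zero."""
    saturated = np.abs(x) > 1.0
    return hard_tanh(x) + sigma * saturated * rng.normal(size=x.shape)

rng = np.random.default_rng(0)
x = np.array([-3.0, -0.5, 0.5, 3.0])
y = noisy_hard_tanh(x, 0.1, rng)
print(np.all(y[1:3] == hard_tanh(x)[1:3]))  # unsaturated part is untouched
```

Annealing sigma toward zero over training recovers the deterministic activation, which is the connection to simulated annealing mentioned above.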