4,494 research outputs found

    Detecting Parameter Symmetries in Probabilistic Models

    Probabilistic models often have parameters that can be translated, scaled, permuted, or otherwise transformed without changing the model. These symmetries can lead to strong correlation and multimodality in the posterior distribution over the model's parameters, which can pose challenges both for performing inference and for interpreting the results. In this work, we address the automatic detection of common problematic model symmetries. To do so, we introduce local symmetries, which cover many common cases and are amenable to automatic detection. We show how to derive algorithms to detect several broad classes of local symmetries. Our algorithms are compatible with probabilistic programming constructs such as arrays, for loops, and if statements, and they scale to models with many variables. Comment: 24 pages, 8 figures.
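    The kind of symmetry the abstract describes can be seen in a toy model: when the likelihood depends on two parameters only through their sum, translating one up and the other down leaves it unchanged, producing a posterior ridge. The sketch below is an illustrative example of such a translation symmetry, not the detection algorithm from the paper.

    ```python
    import math

    # Toy model: y ~ Normal(a + b, 1). The likelihood depends on the
    # parameters only through a + b, so the map (a, b) -> (a + c, b - c)
    # leaves it unchanged -- a translation symmetry that makes the
    # posterior over (a, b) a ridge of perfectly correlated values.
    def log_lik(a, b, data):
        return sum(-0.5 * (y - (a + b)) ** 2 - 0.5 * math.log(2 * math.pi)
                   for y in data)

    data = [0.3, -1.2, 0.8, 0.1]
    base = log_lik(1.0, 2.0, data)
    for c in (-5.0, 0.7, 3.2):
        # Translating the pair by any c leaves the likelihood invariant.
        assert abs(log_lik(1.0 + c, 2.0 - c, data) - base) < 1e-9
    ```

    A sampler run on this model would wander along the ridge without converging on a single (a, b), which is exactly the inference pathology that automatic symmetry detection aims to flag.
    
    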

    alphastable: An R Package for Modelling Multivariate Stable and Mixture of Symmetric Stable Distributions

    The family of stable distributions has received extensive application in many fields of study, since it incorporates both skewness and heavy tails. In this paper, we introduce a package written in the R language called alphastable. The alphastable package performs a variety of tasks, including: (1) generating random numbers from univariate, truncated, and multivariate stable distributions; (2) computing the probability density function of univariate and multivariate elliptically contoured stable distributions; (3) computing the distribution function of univariate stable distributions; and (4) estimating the parameters of univariate symmetric stable, univariate Cauchy, mixture of Cauchy, mixture of univariate symmetric stable, multivariate elliptically contoured stable, and multivariate strictly stable distributions. This package, as will be shown, is very useful for modelling univariate and multivariate data that arise in the fields of finance and economics. Comment: 35 pages, 14 figures.
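    Task (1), generating stable random numbers, is classically done with the Chambers-Mallows-Stuck (CMS) transformation. The sketch below implements the symmetric (beta = 0) case of CMS in plain Python; the function name and interface are our own, not alphastable's.

    ```python
    import math
    import random

    def rsymstable(alpha, n, rng=random):
        """Draw n samples from a standard symmetric alpha-stable law,
        0 < alpha <= 2, via the Chambers-Mallows-Stuck method.
        (Sketch of the symmetric case only; not the alphastable API.)"""
        out = []
        for _ in range(n):
            v = rng.uniform(-math.pi / 2, math.pi / 2)
            w = rng.expovariate(1.0)
            if alpha == 1.0:
                out.append(math.tan(v))  # alpha = 1, beta = 0 is the Cauchy law
            else:
                x = (math.sin(alpha * v) / math.cos(v) ** (1 / alpha)
                     * (math.cos((1 - alpha) * v) / w) ** ((1 - alpha) / alpha))
                out.append(x)
        return out

    random.seed(1)
    samples = rsymstable(2.0, 100_000)  # alpha = 2 reduces to Normal(0, 2)
    var = sum(x * x for x in samples) / len(samples)
    ```

    As a sanity check, the alpha = 2 boundary case is Gaussian with variance 2, which the sample variance above reproduces; for alpha < 2 the variance is infinite and no such check applies.
    
    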

    Learning Hierarchical Information Flow with Recurrent Neural Modules

    We propose ThalNet, a deep learning model inspired by neocortical communication via the thalamus. Our model consists of recurrent neural modules that send features through a routing center, endowing the modules with the flexibility to share features over multiple time steps. We show that our model learns to route information hierarchically, processing input data through a chain of modules. We observe common architectures, such as feed-forward neural networks and skip connections, emerging as special cases of our architecture, while novel connectivity patterns are learned for the text8 compression task. Our model outperforms standard recurrent neural networks on several sequential benchmarks. Comment: NIPS 201
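    The routing-center idea can be sketched as a forward pass: each module reads from a shared center vector through its own read weights, updates its recurrent state, and writes its features back, so the center accumulates everything the modules choose to share across time steps. The sizes and the tanh cell below are our own simplifications, not ThalNet's exact architecture.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_modules, feat, center = 3, 8, 24  # center holds all modules' features

    # Per-module parameters: a read matrix from the center, and a simple
    # recurrent cell mixing the read vector with the module's own state.
    # (Minimal sketch of the routing-center mechanism only.)
    read = [rng.standard_normal((feat, center)) * 0.1 for _ in range(n_modules)]
    rec = [rng.standard_normal((feat, 2 * feat)) * 0.1 for _ in range(n_modules)]

    state = [np.zeros(feat) for _ in range(n_modules)]
    center_vec = np.zeros(center)

    for t in range(5):  # unrolled over time steps
        feats = []
        for m in range(n_modules):
            r = read[m] @ center_vec               # module reads from the center
            state[m] = np.tanh(rec[m] @ np.concatenate([r, state[m]]))
            feats.append(state[m])                 # module writes its features
        center_vec = np.concatenate(feats)         # center gathers all features
    ```

    Because every module's read weights span the whole center, training can discover feed-forward chains (each module reading mostly its predecessor) or skip connections (reading modules further back) as special cases, which is the emergence the abstract describes.
    
    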

    A data-driven approach to precipitation parameterizations using convolutional encoder-decoder neural networks

    Numerical Weather Prediction (NWP) models represent sub-grid processes using parameterizations, which are often complex and a major source of uncertainty in weather forecasting. In this work, we devise a simple machine learning (ML) methodology to learn parameterizations from basic NWP fields. Specifically, we demonstrate how encoder-decoder Convolutional Neural Networks (CNN) can be used to derive total precipitation using geopotential height as the only input. Several popular neural network architectures from the field of image processing are considered, and a comparison with baseline ML methodologies is provided. We use NWP reanalysis data to train different ML models, showing how encoder-decoder CNNs are able to interpret the spatial information contained in the geopotential field to infer total precipitation with a high degree of accuracy. We also provide a method to identify, through a variable selection process, the levels of the geopotential height that have the strongest influence on precipitation. To the best of our knowledge, this paper represents the first attempt to model NWP parameterizations using CNN methodologies.
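    The encoder-decoder shape of such a network can be sketched as a single-channel numpy forward pass: convolve, downsample to a coarse code, then upsample and convolve back to a field of the original size. This is a toy illustration of the architecture class only; real models stack many layers and are trained on reanalysis data.

    ```python
    import numpy as np

    def conv3x3(x, k):
        """Same-padded 3x3 convolution on a 2-D field."""
        h, w = x.shape
        p = np.pad(x, 1)  # zero padding keeps the spatial size
        out = np.zeros_like(x)
        for i in range(h):
            for j in range(w):
                out[i, j] = np.sum(p[i:i + 3, j:j + 3] * k)
        return out

    def downsample2(x):   # 2x2 max pooling (encoder)
        h, w = x.shape
        return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

    def upsample2(x):     # nearest-neighbour upsampling (decoder)
        return x.repeat(2, axis=0).repeat(2, axis=1)

    rng = np.random.default_rng(0)
    z = rng.standard_normal((16, 16))          # stand-in geopotential field
    k1, k2 = rng.standard_normal((2, 3, 3)) * 0.1  # untrained toy kernels
    code = downsample2(np.maximum(conv3x3(z, k1), 0.0))  # encode (ReLU)
    precip = conv3x3(upsample2(code), k2)                # decode to field
    ```

    The bottleneck `code` forces the network to summarize the spatial pattern of the input field before reconstructing a same-sized output field, which is what lets such models map geopotential structure to precipitation.
    
    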

    Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference

    Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9,176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST. Comment: Submitted; 10 pages, 11 figures, plus appendix with code.
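    The mechanism by which a hierarchical fit improves constraints for poorly sampled objects is partial pooling: each object's estimate is pulled toward the population mean, with sparsely observed objects shrunk the most. The closed-form normal-normal toy below illustrates this; it is not the paper's light-curve model or its Hamiltonian Monte Carlo implementation, and all the numbers are made up.

    ```python
    # Toy normal-normal hierarchy with known variances.
    sigma2 = 1.0   # observation variance
    tau2 = 0.5     # population (between-object) variance
    mu = 10.0      # population mean

    def shrunk_mean(ybar, n):
        """Posterior mean for one object's parameter given n observations:
        a precision-weighted blend of the object's own sample mean and
        the population mean (partial pooling)."""
        w_data, w_pop = n / sigma2, 1.0 / tau2
        return (w_data * ybar + w_pop * mu) / (w_data + w_pop)

    # Two objects with the same sample mean but very different coverage:
    well_obs = shrunk_mean(ybar=12.0, n=100)  # many points: little shrinkage
    poorly_obs = shrunk_mean(ybar=12.0, n=2)  # few points: strong shrinkage
    ```

    The poorly observed object is pulled much further toward the population mean than the well observed one, which is how simultaneously fitting the whole sample transfers information to sparsely covered light curves.
    
    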

    Joint Parameter Discovery and Generative Modeling of Dynamic Systems

    Given an unknown dynamic system, such as a coupled harmonic oscillator with n springs and point masses, we are often interested in gaining insight into its physical parameters, i.e. stiffnesses and masses, by observing trajectories of motion. How do we achieve this from video frames or time-series data, without knowledge of the dynamics model? We present a neural framework for estimating physical parameters in a manner consistent with the underlying physics. The neural framework uses a deep latent variable model to disentangle the system's physical parameters from canonical coordinate observations. It then returns a Hamiltonian parameterization that generalizes well with respect to the discovered physical parameters. We tested our framework on simple harmonic oscillators (n = 1) with noisy observations and show that it discovers the underlying system parameters and generalizes well with respect to these discovered parameters. Our model also extrapolates the dynamics of the system beyond the training interval and outperforms a non-physically constrained baseline model. Our source code and datasets can be found at this URL: https://github.com/gbarber94/ConSciNet. Comment: 11 pages, 7 figures.
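    One reason a Hamiltonian parameterization extrapolates well is structural: given candidate parameters (mass m, stiffness k), the n = 1 oscillator's dynamics follow from H(q, p) = p^2/(2m) + kq^2/2, and integrating Hamilton's equations with a symplectic step keeps the energy bounded far beyond any training interval. The plain integrator demo below illustrates this property; the paper's contribution is learning the parameters with a deep latent variable model, not the integrator itself.

    ```python
    # Symplectic (semi-implicit) Euler for H(q, p) = p^2/(2m) + k q^2/2.
    m, k = 1.0, 4.0    # made-up "discovered" physical parameters
    q, p = 1.0, 0.0    # initial condition
    dt = 1e-3

    def energy(q, p):
        return p * p / (2 * m) + k * q * q / 2

    e0 = energy(q, p)
    for _ in range(50_000):        # integrate far past one period
        p -= dt * k * q            # dp/dt = -dH/dq
        q += dt * p / m            # dq/dt =  dH/dp
    drift = abs(energy(q, p) - e0) / e0   # stays O(dt), does not grow
    ```

    A generic non-symplectic scheme (e.g. explicit Euler) would show energy drifting systematically over the same horizon, which is the failure mode of non-physically constrained baselines on long extrapolations.
    
    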

    A survey on trajectory clustering analysis

    This paper comprehensively surveys the development of trajectory clustering. Given the critical role of trajectory data mining in modern intelligent systems for surveillance security, abnormal behavior detection, crowd behavior analysis, and traffic control, trajectory clustering has attracted growing attention. Existing trajectory clustering methods can be grouped into three categories: unsupervised, supervised, and semi-supervised algorithms. Despite this progress, the success of trajectory clustering remains limited by complicating factors such as application scenarios and data dimensionality. This paper provides a holistic understanding of and deep insight into trajectory clustering, and presents a comprehensive analysis of representative methods and promising future directions.

    A Roadmap Towards Resilient Internet of Things for Cyber-Physical Systems

    The Internet of Things (IoT) is a ubiquitous system connecting many different devices - the things - which can be accessed remotely. Cyber-physical systems (CPS) monitor and control the things remotely. As a result, the concepts of dependability and security become deeply intertwined. The increasing level of dynamicity, heterogeneity, and complexity adds to the system's vulnerability and challenges its ability to react to faults. This paper summarizes the state of the art of existing work on anomaly detection, fault tolerance, and self-healing, and adds a number of other methods applicable to achieving resilience in the IoT. We particularly focus on non-intrusive methods for ensuring data integrity in the network. Furthermore, this paper presents the main challenges in building a resilient IoT for CPS, which is crucial in the era of smart CPS with enhanced connectivity (an excellent example of such a system is connected autonomous vehicles). It further summarizes our solutions, work in progress, and future work on this topic to enable "Trustworthy IoT for CPS". Finally, this framework is illustrated on a selected use case: a smart sensor infrastructure in the transport domain. Comment: preprint (2018-10-29)

    Principal Manifold Estimation and Model Complexity Selection

    We propose a framework of principal manifolds to model high-dimensional data. This framework is based on Sobolev spaces and designed to model data of any intrinsic dimension. It includes principal component analysis and the principal curve algorithm as special cases. We propose a novel method for model complexity selection to avoid overfitting, eliminate the effects of outliers, and improve the computation speed. Additionally, we propose a method for identifying the interiors of circle-like curves and cylinder/ball-like surfaces. The proposed approach is compared to existing methods in simulations and applied to estimating tumor surfaces and interiors in a lung cancer study. Comment: 34 pages, 9 figures.
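    Principal component analysis is the special case in which the principal manifold is constrained to be linear: the best-fitting flat subspace through the data. A one-dimensional illustration of that special case, fitting a principal line via the SVD, is sketched below (this is the classical baseline, not the Sobolev-space estimator proposed in the paper).

    ```python
    import numpy as np

    # Generate points along a known line direction, plus small noise.
    rng = np.random.default_rng(42)
    direction = np.array([1.0, 2.0]) / np.sqrt(5.0)
    t = rng.standard_normal(500)
    data = np.outer(t, direction) + 0.05 * rng.standard_normal((500, 2))

    # PCA via SVD of the centered data: the top right singular vector is
    # the direction of the best-fitting line (the "principal manifold"
    # of intrinsic dimension 1, constrained to be linear).
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = vt[0]
    alignment = abs(pc1 @ direction)  # near 1 when the line is recovered
    ```

    A principal curve generalizes this by letting the fitted line bend, and a principal manifold generalizes further to higher intrinsic dimension, which is where complexity selection becomes necessary to avoid overfitting.
    
    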

    Noisy Activation Functions

    Common nonlinear activation functions used in neural networks can cause training difficulties due to the saturation behavior of the activation function, which may hide dependencies that are not visible to vanilla SGD (using first-order gradients only). Gating mechanisms that use softly saturating activation functions to emulate the discrete switching of digital logic circuits are good examples of this. We propose to exploit the injection of appropriate noise so that gradients may flow easily even when the noiseless application of the activation function would yield zero gradient. Large noise will dominate the noise-free gradient and allow stochastic gradient descent to explore more. By adding noise only to the problematic parts of the activation function, we allow the optimization procedure to explore the boundary between the degenerate (saturating) and the well-behaved parts of the activation function. We also establish connections to simulated annealing: when the amount of noise is annealed down, it becomes easier to optimize hard objective functions. We find experimentally that replacing such saturating activation functions by noisy variants helps training in many contexts, yielding state-of-the-art or competitive results on different datasets and tasks, especially when training seems to be most difficult, e.g., when curriculum learning is necessary to obtain good results.
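    The core idea can be sketched with a hard sigmoid, which is exactly flat (zero gradient) outside [-2, 2]: inject noise only in that saturated region, directed back toward the unsaturated side, so the optimizer keeps receiving a signal there. This is a simplified illustration; the paper's formulation additionally scales the noise by how far the unit is into saturation.

    ```python
    import random

    def hard_sigmoid(x):
        """Piecewise-linear sigmoid: saturates at 0 below -2 and 1 above 2."""
        return max(0.0, min(1.0, 0.25 * x + 0.5))

    def noisy_hard_sigmoid(x, rng, scale=0.1):
        h = hard_sigmoid(x)
        if -2.0 < x < 2.0:
            return h  # linear part: leave the clean value and gradient alone
        # Saturated part: add half-normal noise pointing back toward the
        # unsaturated region (downward at the ceiling, upward at the floor).
        return h + scale * abs(rng.gauss(0.0, 1.0)) * (1.0 if x < 0 else -1.0)

    rng = random.Random(0)
    inside = noisy_hard_sigmoid(1.0, rng)   # unchanged in the linear region
    outside = noisy_hard_sigmoid(5.0, rng)  # perturbed below the ceiling of 1
    ```

    Because the noise is zero-mean only in expectation of direction toward the interior, a saturated unit occasionally lands back in the linear region, where the ordinary gradient takes over; annealing `scale` toward zero recovers the deterministic activation, mirroring the simulated-annealing connection above.
    
    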