13 research outputs found

    Truncated Variational Sampling for "Black Box" Optimization of Generative Models

    Get PDF
    We investigate the optimization of two probabilistic generative models with binary latent variables using a novel variational EM approach. The approach distinguishes itself from previous variational approaches by using latent states as variational parameters. Here we use efficient and general-purpose sampling procedures to vary the latent states, and investigate the "black box" applicability of the resulting optimization procedure. For general-purpose applicability, samples are drawn from approximate marginal distributions of the considered generative model as well as from the model's prior distribution. As such, variational sampling is defined in a generic form and is directly executable for a given model. As a proof of concept, we then apply the novel procedure (A) to Binary Sparse Coding (a model with continuous observables), and (B) to basic Sigmoid Belief Networks (models with binary observables). Numerical experiments verify that the investigated approach both efficiently and effectively increases a variational free energy objective without requiring any additional analytical steps.
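    The truncated variational idea described above (a posterior approximation restricted to a finite set of latent states, with candidate states proposed by generic sampling, e.g. from the prior) can be sketched for a toy Binary Sparse Coding model. All dimensions, parameter values, and variable names below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

H, D = 5, 8                      # latent dimension, observed dimension (toy sizes)
W = rng.normal(size=(D, H))      # dictionary / generative weights (illustrative)
pi, sigma = 0.2, 0.5             # Bernoulli prior parameter, observation noise

def log_joint(y, S):
    """log p(s, y) up to a constant, for each binary state s in the rows of S."""
    log_prior = (S * np.log(pi) + (1 - S) * np.log(1 - pi)).sum(axis=1)
    resid = y - S @ W.T
    log_lik = -0.5 * (resid ** 2).sum(axis=1) / sigma ** 2
    return log_prior + log_lik

def truncated_posterior(y, K):
    """q(s) proportional to p(s, y), renormalized over the truncated set K."""
    lj = log_joint(y, K)
    w = np.exp(lj - lj.max())    # subtract max for numerical stability
    return w / w.sum()

y = rng.normal(size=D)                               # one toy data point
K = rng.binomial(1, pi, size=(20, H)).astype(float)  # states sampled from the prior
K = np.unique(K, axis=0)                             # keep distinct latent states
q = truncated_posterior(y, K)                        # approximate posterior weights
```

Within the truncated set, the approximate posterior is just the joint probability renormalized over the set, so no model-specific analytical derivations are required; this is what makes the scheme "black box" executable for a given model.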

    ProSper -- A Python Library for Probabilistic Sparse Coding with Non-Standard Priors and Superpositions

    Get PDF
    ProSper is a Python library containing probabilistic algorithms to learn dictionaries. Given a set of data points, the implemented algorithms seek to learn the elementary components that have generated the data. The library widens the scope of dictionary learning approaches beyond implementations of standard approaches such as ICA, NMF or standard L1 sparse coding. The implemented algorithms are especially well-suited in cases when data consist of components that combine non-linearly and/or for data requiring flexible prior distributions. Furthermore, the implemented algorithms go beyond standard approaches by inferring prior and noise parameters of the data, and they provide rich a-posteriori approximations for inference. The library is designed to be extendable and it currently includes: Binary Sparse Coding (BSC), Ternary Sparse Coding (TSC), Discrete Sparse Coding (DSC), Maximal Causes Analysis (MCA), Maximum Magnitude Causes Analysis (MMCA), and Gaussian Sparse Coding (GSC, a recent spike-and-slab sparse coding approach). The algorithms are scalable due to a combination of variational approximations and parallelization. Implementations of all algorithms allow for parallel execution on multiple CPUs and multiple machines for medium to large-scale applications. Typical large-scale runs of the algorithms can use hundreds of CPUs to learn hundreds of dictionary elements from data with tens of millions of floating-point numbers, such that models with several hundred thousand parameters can be optimized. The library is designed to have minimal dependencies and to be easy to use. It targets users of dictionary learning algorithms and machine learning researchers.
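    The contrast between standard linear superposition (as in BSC) and the non-linear combination handled by models such as MCA can be illustrated in a few lines of NumPy. This is a hedged sketch of the generative assumptions only, not the ProSper API; all names and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
H, D, N = 4, 6, 3                        # causes, observed dims, data points (toy)
W = rng.uniform(0.0, 1.0, size=(H, D))   # H non-negative dictionary elements
s = rng.binomial(1, 0.3, size=(N, H))    # sparse binary causes for N data points

# linear superposition (as in BSC): active components add up
y_linear = s @ W

# non-linear superposition (as in MCA): active components combine
# through a point-wise maximum, e.g. for occlusion-like data
y_max = np.max(s[:, :, None] * W[None, :, :], axis=1)
```

With non-negative components, the max combination never exceeds the sum, which is one reason linear models systematically mis-explain occlusion-like data that max-based models capture.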

    Prediction of cesarean delivery in class III obese nulliparous women: An externally validated model using machine learning

    Get PDF
    Background: Class III obese women are at higher risk of cesarean section during labor, and cesarean section is responsible for increased maternal and neonatal morbidity in this population. Objective: The objective of this project was to develop a method with which to quantify cesarean section risk before labor. Methods: This is a multicentric retrospective cohort study conducted on 410 nulliparous class III obese pregnant women who attempted vaginal delivery in two French university hospitals. We developed two predictive algorithms (a logistic regression model and a random forest model), assessed their performance, and compared them. Results: The logistic regression model found that only initial weight and labor induction were significant predictors of unplanned cesarean section. The probability forest was able to predict cesarean section probability using the same two pre-labor characteristics: initial weight and labor induction. Its performance was higher; at a cut-point of 49.5% risk, the results (with 95% confidence intervals) were: area under the curve 0.70 (0.62, 0.78), accuracy 0.66 (0.58, 0.73), specificity 0.87 (0.77, 0.93), and sensitivity 0.44 (0.32, 0.55). Conclusions: This is an innovative and effective approach to predicting unplanned cesarean section risk in this population and could play a role in the choice of a trial of labor versus planned cesarean section. Further studies are needed, especially a prospective clinical trial. Funding: French state funds “Plan Investissements d'Avenir” and Agence Nationale de la Recherche.
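    The cut-point metrics reported above follow mechanically from predicted risks and observed outcomes. The sketch below uses small synthetic vectors (not the study's data) purely to show how sensitivity, specificity, and accuracy are derived from the 49.5% risk threshold:

```python
import numpy as np

# synthetic predicted cesarean probabilities and true outcomes (1 = cesarean);
# illustrative values only, not the study's data
p_hat = np.array([0.8, 0.3, 0.6, 0.2, 0.55, 0.4, 0.9, 0.1])
y_true = np.array([1,   0,   1,   0,   0,    1,   1,   0])

cutoff = 0.495                           # the paper's 49.5% risk cut-point
y_pred = (p_hat >= cutoff).astype(int)   # flag high-risk patients

tp = np.sum((y_pred == 1) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

sensitivity = tp / (tp + fn)   # fraction of actual cesareans flagged as high risk
specificity = tn / (tn + fp)   # fraction of vaginal deliveries correctly cleared
accuracy = (tp + tn) / len(y_true)
```

The study's profile (specificity 0.87, sensitivity 0.44) reflects a deliberately high cut-point: few women are flagged, but those flagged are flagged reliably.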

    Kymatio: Scattering Transforms in Python

    Full text link
    The wavelet scattering transform is an invariant signal representation suitable for many signal processing and machine learning applications. We present the Kymatio software package, an easy-to-use, high-performance Python implementation of the scattering transform in 1D, 2D, and 3D that is compatible with modern deep learning frameworks. All transforms may be executed on a GPU (in addition to CPU), offering a considerable speed-up over CPU implementations. The package also has a small memory footprint, resulting in efficient memory usage. The source code, documentation, and examples are available under a BSD license at https://www.kymat.io
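    Conceptually, a first-order scattering coefficient is a band-pass filtering followed by a complex modulus and a low-pass average, which makes the descriptor locally translation-invariant. The NumPy sketch below illustrates this pipeline on a toy 1D signal; it is not the Kymatio API, and the Gabor-like filter and all parameters are illustrative assumptions:

```python
import numpy as np

T = 256
t = np.arange(T)
x = np.sin(2 * np.pi * 8 * t / T)            # toy periodic signal

def scatter1(x, freq):
    """One first-order scattering coefficient: band-pass -> modulus -> average."""
    # Gabor-like analytic filter: complex exponential under a Gaussian window
    psi = np.exp(2j * np.pi * freq * t / T) * np.exp(-0.5 * ((t - T / 2) / 16) ** 2)
    band = np.convolve(x, psi, mode="same")  # wavelet-like band-pass filtering
    return np.abs(band).mean()               # modulus + global average (low-pass)

freqs = (4, 8, 16)
coeffs = np.array([scatter1(x, f) for f in freqs])
coeffs_shift = np.array([scatter1(np.roll(x, 5), f) for f in freqs])
```

Shifting the input barely changes the coefficients, since the modulus discards phase and the averaging discards position; this is the invariance the abstract refers to.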

    Predicting COVID-19 positivity and hospitalization with multi-scale graph neural networks

    No full text
    The COVID-19 pandemic is undoubtedly one of the biggest challenges for modern healthcare. In order to analyze the spatio-temporal aspects of the spread of COVID-19, technology has helped us to track, identify, and store information regarding positivity and hospitalization across different levels of municipal entities. In this work, we present a method for predicting the number of positive and hospitalized cases via a novel multi-scale graph neural network, integrating information from fine-scale geographical zones of a few thousand inhabitants. By leveraging population mobility data and other features, the model utilizes message passing to model interaction between areas. Our proposed model manages to outperform baselines and deep learning models, presenting low errors in both prediction tasks. We specifically point out the importance of our contribution in predicting hospitalization, since hospitals became critical infrastructure during the pandemic. To the best of our knowledge, this is the first work to exploit high-resolution spatio-temporal data in a multi-scale manner, incorporating additional knowledge such as vaccination rates and population mobility data. We believe that our method may improve future estimations of positivity and hospitalization, which is crucial for healthcare planning.
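    The message-passing step at the heart of such a graph neural network can be sketched in a few lines of NumPy: each zone aggregates its neighbors' features weighted by mobility flows, then applies a learned transformation. The graph, features, and weights below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

# toy mobility graph: 4 zones, A[i, j] = flow of people between zones i and j
A = np.array([[0, 2, 0, 1],
              [2, 0, 3, 0],
              [0, 3, 0, 1],
              [1, 0, 1, 0]], dtype=float)
X = rng.random((4, 3))           # per-zone features (e.g. cases, vaccination rate)

def message_passing(A, X, W):
    """One round of neighborhood aggregation: each zone averages its neighbors'
    features weighted by mobility, then applies a linear map and a ReLU."""
    A_hat = A + np.eye(len(A))                      # keep each zone's own signal
    A_norm = A_hat / A_hat.sum(axis=1, keepdims=True)
    return np.maximum(A_norm @ X @ W, 0.0)

W = rng.normal(size=(3, 2))      # learned weights (random here for illustration)
H = message_passing(A, X, W)     # new per-zone embeddings
```

Stacking such rounds lets information propagate over longer mobility paths, and the multi-scale aspect corresponds to running this aggregation over graphs at several geographic resolutions.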

    Efficient spatio-temporal feature clustering for large event-based datasets

    No full text
    Event-based cameras encode changes in a visual scene with high temporal precision and low power consumption, generating millions of events per second in the process. Current event-based processing algorithms do not scale well in terms of runtime and computational resources when applied to large amounts of data. This problem is further exacerbated by the development of high-spatial-resolution vision sensors. We introduce a fast and computationally efficient clustering algorithm that is particularly designed for dealing with large event-based datasets. The approach is based on the expectation-maximization (EM) algorithm and relies on a stochastic approximation of the E-step over a truncated space to reduce the computational burden and speed up the learning process. We evaluate the quality, complexity, and stability of the clustering algorithm on a variety of large event-based datasets, and then validate our approach with a classification task. The proposed algorithm is significantly faster than standard k-means and reduces computational demands by two to three orders of magnitude while being more stable, interpretable, and close to the state of the art in terms of classification accuracy.
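    The truncation idea can be sketched for a toy mixture-style E-step: each point keeps responsibilities for only its C closest clusters and treats the rest as exactly zero, so the M-step works with sparse responsibilities. The sketch below uses deterministic nearest-cluster truncation for clarity, whereas the paper uses a stochastic approximation; it also evaluates all distances up front, which a scalable implementation would avoid. All sizes and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# toy event features (e.g. x, y, t) and K cluster centers
X = rng.normal(size=(500, 3))
K, C = 8, 2                       # C = truncation: clusters kept per point
mu = rng.normal(size=(K, 3))

for _ in range(5):
    # truncated E-step: keep responsibilities only for the C nearest clusters
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d2, axis=1)[:, :C]
    w = np.exp(-0.5 * np.take_along_axis(d2, nearest, axis=1))
    r = np.zeros((len(X), K))     # sparse responsibilities, renormalized over C
    np.put_along_axis(r, nearest, w / w.sum(axis=1, keepdims=True), axis=1)

    # M-step: update the means from the sparse responsibilities
    denom = r.sum(axis=0)[:, None]
    mu = np.where(denom > 0, (r.T @ X) / np.maximum(denom, 1e-12), mu)
```

Because each point touches only C of the K clusters, the per-iteration cost of the updates scales with C rather than K, which is where the claimed computational savings come from.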