
    Structured probabilistic inference

    Probabilistic inference is among the main topics in reasoning under uncertainty in AI. For this purpose, Bayesian Networks (BNs) are one of the most successful and efficient Probabilistic Graphical Models (PGMs) to date. Since the mid-90s, a growing number of BN extensions have been proposed. Object-oriented, entity-relationship and first-order logic are the main representation paradigms used to extend BNs. While entity-relationship and first-order models have been successfully used in machine learning to define lifted probabilistic inference, object-oriented models have been mostly underused. Structured inference, which exploits the structural knowledge encoded in an object-oriented PGM, is a surprisingly understudied technique. In this paper we propose a full object-oriented framework for Probabilistic Relational Models (PRMs) and two extensions of the state-of-the-art structured inference algorithm: SPI, which removes the major flaws of existing algorithms, and SPISBB, which largely enhances SPI by using d-separation.
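
    The abstract does not spell out the SPI and SPISBB algorithms, so the sketch below only illustrates the standard d-separation test that structured inference exploits: X and Y are d-separated by Z in a DAG exactly when they are disconnected in the moralized ancestral graph with Z removed. The dictionary graph encoding and helper names are illustrative assumptions, not the paper's data structures.

        from itertools import combinations

        def ancestors(dag, nodes):
            """All ancestors of `nodes` (inclusive) in a DAG given as {child: [parents]}."""
            seen, stack = set(), list(nodes)
            while stack:
                n = stack.pop()
                if n in seen:
                    continue
                seen.add(n)
                stack.extend(dag.get(n, []))
            return seen

        def d_separated(dag, xs, ys, zs):
            """True iff every node in xs is d-separated from every node in ys given zs."""
            keep = ancestors(dag, set(xs) | set(ys) | set(zs))
            # Moralize the ancestral graph: connect co-parents, drop edge directions.
            adj = {n: set() for n in keep}
            for child, parents in dag.items():
                if child not in keep:
                    continue
                ps = [p for p in parents if p in keep]
                for p in ps:
                    adj[child].add(p)
                    adj[p].add(child)
                for a, b in combinations(ps, 2):
                    adj[a].add(b)
                    adj[b].add(a)
            # Remove the conditioning set and test reachability from xs to ys.
            blocked = set(zs)
            frontier, reached = [x for x in xs if x not in blocked], set()
            while frontier:
                n = frontier.pop()
                if n in reached or n in blocked:
                    continue
                reached.add(n)
                frontier.extend(adj[n] - blocked)
            return reached.isdisjoint(ys)

        # Classic collider: A -> C <- B.  A and B are d-separated marginally, but not given C.
        dag = {"C": ["A", "B"], "A": [], "B": []}
        print(d_separated(dag, {"A"}, {"B"}, set()))   # True
        print(d_separated(dag, {"A"}, {"B"}, {"C"}))   # False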

    Anomaly Detection in Networks via Score-Based Generative Models

    Node outlier detection in attributed graphs is a challenging problem for which no single method works well across different datasets. Motivated by the state-of-the-art results of score-based models in graph generative modeling, we propose to incorporate them into this problem. Our method achieves competitive results on small-scale graphs. We provide an empirical analysis of the Dirichlet energy and show that generative models might struggle to reconstruct it accurately.
    Comment: 16 pages, 8 figures, ICML workshop on Structured Probabilistic Inference & Generative Modeling
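
    As a companion to the Dirichlet energy analysis mentioned above, here is a minimal sketch of that quantity under its common definition E(X) = tr(X^T L X), split into per-node contributions. This is not the paper's score-based detector, and the per-node scoring is an illustrative choice, not the authors' method.

        import numpy as np

        def dirichlet_energy_per_node(adj, X):
            """Split E(X) = 0.5 * sum_ij A_ij * ||x_i - x_j||^2 into per-node contributions."""
            n = adj.shape[0]
            contrib = np.zeros(n)
            for i in range(n):
                for j in range(n):
                    if adj[i, j]:
                        contrib[i] += 0.5 * adj[i, j] * np.sum((X[i] - X[j]) ** 2)
            return contrib

        rng = np.random.default_rng(0)
        adj = (rng.random((20, 20)) < 0.2).astype(float)
        adj = np.triu(adj, 1)
        adj = adj + adj.T                     # undirected graph, no self-loops
        X = rng.normal(size=(20, 4))
        X[3] += 5.0                           # inject one attribute outlier
        scores = dirichlet_energy_per_node(adj, X)
        print("total Dirichlet energy:", round(float(scores.sum()), 2))
        print("highest-energy node:", int(scores.argmax()))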

    10 Years of Probabilistic Querying – What Next?

    Over the past decade, the two research areas of probabilistic databases and probabilistic programming have intensively studied the problem of making structured probabilistic inference scalable, but so far both areas have developed almost independently of one another. While probabilistic databases have focused on describing tractable query classes based on the structure of query plans and data lineage, probabilistic programming has contributed sophisticated inference techniques based on knowledge compilation and lifted (first-order) inference. Both fields have developed their own variants of exact and approximate top-k algorithms for query evaluation, and both investigate query optimization techniques known from SQL, Datalog, and Prolog, all of which calls for a more intensive study of the commonalities and integration of the two fields. Moreover, we believe that natural-language processing and information extraction will remain a driving factor, and in fact a longstanding challenge, for developing expressive representation models that can be combined with structured probabilistic inference for decades to come.

    Beyond Intuition, a Framework for Applying GPs to Real-World Data

    Gaussian Processes (GPs) offer an attractive method for regression over small, structured and correlated datasets. However, their deployment is hindered by computational costs and limited guidelines on how to apply GPs beyond simple low-dimensional datasets. We propose a framework to identify the suitability of GPs to a given problem and to set up a robust and well-specified GP model. The guidelines formalise the decisions of experienced GP practitioners, with an emphasis on kernel design and options for computational scalability. The framework is then applied to a case study of glacier elevation change, yielding more accurate results at test time.
    Comment: Accepted at the 1st ICML Workshop on Structured Probabilistic Inference and Generative Modelling (2023)
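
    A minimal sketch, not the authors' framework, of the kind of well-specified GP setup such guidelines point toward: an interpretable composite kernel plus an explicit noise term, fit with scikit-learn. The kernel choice and the synthetic data are illustrative assumptions, not the paper's case study.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

        rng = np.random.default_rng(0)
        X = rng.uniform(-3, 3, size=(40, 1))
        y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)

        # Smooth trend (scaled RBF) plus an explicit observation-noise term.
        kernel = ConstantKernel(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True, n_restarts_optimizer=5)
        gp.fit(X, y)

        X_test = np.linspace(-3, 3, 7).reshape(-1, 1)
        mean, std = gp.predict(X_test, return_std=True)
        print(gp.kernel_)                   # learned hyperparameters
        print(np.c_[X_test, mean, std])     # predictive mean and uncertainty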

    Geometric Constraints in Probabilistic Manifolds: A Bridge from Molecular Dynamics to Structured Diffusion Processes

    Understanding the macroscopic characteristics of biological complexes demands precision and specificity in statistical ensemble modeling. One of the primary challenges in this domain lies in sampling from particular subsets of the state-space, driven either by existing structural knowledge or specific areas of interest within the state-space. We propose a method that enables sampling from distributions that rigorously adhere to arbitrary sets of geometric constraints in Euclidean spaces. This is achieved by integrating a constraint projection operator within the well-regarded architecture of Denoising Diffusion Probabilistic Models, a framework founded in generative modeling and probabilistic inference. The significance of this work becomes apparent, for instance, in the context of deep learning-based drug design, where it is imperative to maintain specific molecular profile interactions to realize the desired therapeutic outcomes and guarantee safety.
    Comment: Published at ICML 2023 Workshop on Structured Probabilistic Inference and Generative Modeling
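
    A minimal sketch of the idea described above: interleave a constraint projection with the reverse (denoising) steps of a diffusion-style sampler. The denoiser below is a toy stand-in rather than a trained score network, and the constraint (a fixed distance between two 3-D points) is an illustrative example, not the paper's operator.

        import numpy as np

        TARGET_DIST = 1.5  # geometric constraint: the two points must sit this far apart

        def project(x):
            """Project a (2, 3) configuration onto the set {|x0 - x1| = TARGET_DIST}."""
            center = x.mean(axis=0)
            d = x[0] - x[1]
            d = d / (np.linalg.norm(d) + 1e-12) * (TARGET_DIST / 2)
            return np.stack([center + d, center - d])

        def toy_denoise_step(x, t, rng):
            """Stand-in for a learned reverse-diffusion step: shrink toward the origin, add noise."""
            return 0.9 * x + np.sqrt(t) * 0.1 * rng.normal(size=x.shape)

        rng = np.random.default_rng(0)
        x = rng.normal(size=(2, 3))           # start from pure noise
        for t in np.linspace(1.0, 0.0, 50):
            x = toy_denoise_step(x, t, rng)
            x = project(x)                    # re-impose the geometric constraint at every step

        print("final distance:", round(float(np.linalg.norm(x[0] - x[1])), 3))  # 1.5 by construction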

    Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation

    The practical utility of causality in decision-making is widely recognized, with causal discovery and inference being inherently intertwined. Nevertheless, a notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference. To address this gap, we evaluate six established baseline causal discovery methods and a newly proposed method based on GFlowNets, on the downstream task of treatment effect estimation. Through the implementation of a robust evaluation procedure, we offer valuable insights into the efficacy of these causal discovery methods for treatment effect estimation, considering both synthetic and real-world scenarios, as well as low-data scenarios. Furthermore, the results of our study demonstrate that GFlowNets possess the capability to effectively capture a wide range of useful and diverse ATE modes.
    Comment: Peer-Reviewed and Accepted to ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling
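
    A minimal sketch, not the benchmark itself, of the downstream step the abstract describes: average a treatment effect estimate over a hypothetical posterior of candidate graphs, adjusting for each graph's implied covariates. The synthetic data, the two-graph posterior and the linear-adjustment estimator are all illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        n = 2000
        Z = rng.normal(size=n)                       # confounder
        T = (Z + rng.normal(size=n) > 0).astype(float)
        Y = 2.0 * T + 1.5 * Z + rng.normal(size=n)   # true ATE = 2.0
        data = {"Z": Z, "T": T, "Y": Y}

        def linear_ate(data, adjustment):
            """OLS coefficient of T on Y after adjusting for the given covariates."""
            cols = [data["T"]] + [data[c] for c in adjustment] + [np.ones(len(data["T"]))]
            X = np.column_stack(cols)
            beta, *_ = np.linalg.lstsq(X, data["Y"], rcond=None)
            return beta[0]

        # Toy "posterior" over graphs: (adjustment set implied by the graph, posterior weight).
        posterior = [(("Z",), 0.7),   # graph with Z -> T and Z -> Y: adjust for Z
                     ((), 0.3)]       # graph missing the confounding path: no adjustment

        ates = np.array([linear_ate(data, adj) for adj, _ in posterior])
        weights = np.array([w for _, w in posterior])
        print("per-graph ATE estimates:", np.round(ates, 2))
        print("posterior-averaged ATE:", round(float(weights @ ates), 2))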

    BatchGFN: Generative Flow Networks for Batch Active Learning

    We introduce BatchGFN, a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward. With an appropriate reward function to quantify the utility of acquiring a batch, such as the joint mutual information between the batch and the model parameters, BatchGFN is able to construct highly informative batches for active learning in a principled way. In toy regression problems, we show that our approach enables sampling near-optimal utility batches at inference time with a single forward pass per point in the batch. This alleviates the computational complexity of batch-aware algorithms and removes the need for greedy approximations to find maximizers for the batch reward. We also present early results on amortizing training across acquisition steps, which will enable scaling to real-world tasks.
    Comment: Accepted at the Structured Probabilistic Inference & Generative Modeling workshop, ICML 2023
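
    A minimal sketch of the quantity BatchGFN amortizes rather than of the GFlowNet itself: enumerate candidate batches from a tiny pool, score each with a joint mutual information reward estimated from an ensemble, and sample a batch with probability proportional to that reward. The ensemble predictions below are synthetic placeholders.

        from itertools import combinations, product
        import numpy as np

        rng = np.random.default_rng(0)
        n_pool, n_models, k = 8, 5, 2
        # p[m, i] = ensemble member m's predicted P(y_i = 1) for pool point i.
        p = rng.uniform(0.05, 0.95, size=(n_models, n_pool))

        def entropy(q):
            q = np.clip(q, 1e-12, 1.0)
            return -(q * np.log(q)).sum(axis=-1)

        def joint_mi(idx):
            """I(y_batch; theta) = H(E_theta[p(y_batch)]) - E_theta[H(p(y_batch))]."""
            labels = list(product([0, 1], repeat=len(idx)))
            p_joint = np.ones((n_models, len(labels)))   # labels independent given theta
            for col, i in enumerate(idx):
                for l, lab in enumerate(labels):
                    p_joint[:, l] *= p[:, i] if lab[col] == 1 else 1 - p[:, i]
            return entropy(p_joint.mean(axis=0)) - entropy(p_joint).mean()

        batches = list(combinations(range(n_pool), k))
        rewards = np.array([joint_mi(b) for b in batches])
        probs = rewards / rewards.sum()                  # sample batches proportional to reward
        chosen = rng.choice(len(batches), p=probs)
        print("chosen batch:", batches[chosen], "reward:", round(float(rewards[chosen]), 4))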

    Augmenting Control over Exploration Space in Molecular Dynamics Simulators to Streamline De Novo Analysis through Generative Control Policies

    This study introduces the P5 model, a foundational method that utilizes reinforcement learning (RL) to augment control, effectiveness, and scalability in molecular dynamics (MD) simulations. Our strategy optimizes the sampling of target polymer chain conformations, marking an efficiency improvement of over 37.1%. The RL-induced control policies function as an inductive bias, modulating Brownian forces to steer the system towards the preferred state, thereby expanding the exploration of the configuration space beyond what traditional MD allows. This broadened exploration generates a more varied set of conformations and targets specific properties, a feature pivotal for progress in polymer development, drug discovery, and material design. Our technique offers significant advantages when investigating new systems with limited prior knowledge, opening up new methodologies for tackling complex simulation problems with generative techniques.
    Comment: ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling (SPIGM), at the International Conference on Machine Learning (ICML)
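
    The P5 policy itself is not described in enough detail to reproduce here; the sketch below only shows the general mechanism of adding a control force to overdamped Brownian (Langevin) dynamics so that sampling is steered toward a target region. The double-well potential and the hand-written bias force stand in for the real energy landscape and the learned RL policy.

        import numpy as np

        def potential_grad(x):
            """Gradient of the toy double-well potential U(x) = (x^2 - 1)^2."""
            return 4 * x * (x ** 2 - 1)

        def control_force(x, target=1.0, gain=2.0):
            """Stand-in for the learned policy: a pull toward the target conformation."""
            return -gain * (x - target)

        def simulate(n_steps=20_000, dt=1e-3, kT=0.5, use_control=False, seed=0):
            rng = np.random.default_rng(seed)
            x, traj = -1.0, []                      # start in the left well
            for _ in range(n_steps):
                force = -potential_grad(x)
                if use_control:
                    force += control_force(x)       # bias modulating the Brownian forces
                x += dt * force + np.sqrt(2 * kT * dt) * rng.normal()
                traj.append(x)
            return np.array(traj)

        plain, steered = simulate(use_control=False), simulate(use_control=True)
        print("time in the target (right) well, plain MD:  ", round(float((plain > 0).mean()), 3))
        print("time in the target (right) well, steered MD:", round(float((steered > 0).mean()), 3))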

    Diffusion Generative Inverse Design

    Inverse design refers to the problem of optimizing the input of an objective function in order to enact a target outcome. For many real-world engineering problems, the objective function takes the form of a simulator that predicts how the system state will evolve over time, and the design challenge is to optimize the initial conditions that lead to a target outcome. Recent developments in learned simulation have shown that graph neural networks (GNNs) can be used for accurate, efficient, differentiable estimation of simulator dynamics, and support high-quality design optimization with gradient- or sampling-based optimization procedures. However, optimizing designs from scratch requires many expensive model queries, and these procedures exhibit basic failures on either non-convex or high-dimensional problems. In this work, we show how denoising diffusion models (DDMs) can be used to solve inverse design problems efficiently and propose a particle sampling algorithm for further improving their efficiency. We perform experiments on a number of fluid dynamics design challenges, and find that our approach substantially reduces the number of calls to the simulator compared to standard techniques.
    Comment: ICML workshop on Structured Probabilistic Inference & Generative Modeling
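
    A minimal sketch of the sample-evaluate-resample pattern such an approach relies on, not the paper's algorithm: a generative prior proposes candidate designs, a simulator scores them, and a simple particle step reweights and perturbs the best candidates to limit simulator calls. The prior and the simulator below are toy stand-ins for a trained diffusion model and a learned GNN simulator.

        import numpy as np

        rng = np.random.default_rng(0)
        TARGET = np.array([0.3, -0.7])

        def sample_prior(n):
            """Stand-in for a trained diffusion model over initial conditions."""
            return rng.normal(size=(n, 2))

        def simulator(designs):
            """Stand-in for a learned simulator: negative distance of the final state to the target."""
            final_state = 0.5 * designs + 0.1 * designs ** 2       # toy dynamics rollout
            return -np.linalg.norm(final_state - TARGET, axis=1)   # higher is better

        particles = sample_prior(64)
        for step in range(5):
            scores = simulator(particles)
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()
            idx = rng.choice(len(particles), size=len(particles), p=weights)     # resample
            particles = particles[idx] + 0.1 * rng.normal(size=particles.shape)  # perturb

        best = particles[simulator(particles).argmax()]
        print("best design:", np.round(best, 3), "score:", round(float(simulator(best[None])[0]), 3))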

    An Empirical Study of the Effectiveness of Using a Replay Buffer on Mode Discovery in GFlowNets

    Reinforcement Learning (RL) algorithms aim to learn an optimal policy by iteratively sampling actions to learn how to maximize the total expected return, R(x). GFlowNets are a special class of algorithms designed to generate diverse candidates, x, from a discrete set, by learning a policy that samples them approximately in proportion to R(x). GFlowNets exhibit improved mode discovery compared to conventional RL algorithms, which is very useful for applications such as drug discovery and combinatorial search. However, since GFlowNets are a relatively recent class of algorithms, many techniques which are useful in RL have not yet been associated with them. In this paper, we study the utilization of a replay buffer for GFlowNets. We empirically explore various replay buffer sampling techniques and assess their impact on the speed of mode discovery and the quality of the modes discovered. Our experimental results in the Hypergrid toy domain and a molecule synthesis environment demonstrate significant improvements in mode discovery when training with a replay buffer, compared to training only with trajectories generated on-policy.
    Comment: Accepted to ICML 2023 workshop on Structured Probabilistic Inference & Generative Modeling
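
    A minimal sketch of the kind of replay buffer studied above, not the paper's exact setup: store completed trajectories with their rewards and expose two commonly compared sampling strategies, uniform and reward-prioritized. How the buffered trajectories are mixed into a GFlowNet training batch is left out, and the class and method names are our own.

        import numpy as np

        class TrajectoryReplayBuffer:
            def __init__(self, capacity=10_000, seed=0):
                self.capacity = capacity
                self.trajectories, self.rewards = [], []
                self.rng = np.random.default_rng(seed)

            def add(self, trajectory, reward):
                if len(self.trajectories) >= self.capacity:    # drop the oldest entry
                    self.trajectories.pop(0)
                    self.rewards.pop(0)
                self.trajectories.append(trajectory)
                self.rewards.append(float(reward))

            def sample(self, batch_size, strategy="uniform"):
                r = np.asarray(self.rewards)
                p = r / r.sum() if strategy == "prioritized" else None   # None = uniform
                idx = self.rng.choice(len(self.trajectories), size=batch_size, p=p)
                return [self.trajectories[i] for i in idx], r[idx]

        rng = np.random.default_rng(1)
        buffer = TrajectoryReplayBuffer()
        for i in range(100):
            buffer.add(trajectory=[("s0", f"a{i}")], reward=rng.uniform(0.1, 1.0))
        _, sampled_rewards = buffer.sample(8, strategy="prioritized")
        print("mean reward of sampled trajectories:", round(float(sampled_rewards.mean()), 3))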