
    Deep transfer learning for partial differential equations under conditional shift with DeepONet

    Traditional machine learning algorithms are designed to learn in isolation, i.e., to address single tasks. The core idea of transfer learning (TL) is that knowledge gained in learning to perform one task (source) can be leveraged to improve learning performance in a related, but different, task (target). TL leverages and transfers previously acquired knowledge to address the expense of data acquisition and labeling, potential limitations in computational power, and dataset distribution mismatches. Although significant progress has been made with TL in image processing, speech recognition, and natural language processing (for classification and regression), little work has been done in scientific machine learning for functional regression and uncertainty quantification in partial differential equations. In this work, we propose a novel TL framework for task-specific learning under conditional shift with a deep operator network (DeepONet). Inspired by conditional embedding operator theory, we measure the statistical distance between the source domain and the target feature domain by embedding conditional distributions onto a reproducing kernel Hilbert space. Task-specific operator learning is accomplished by fine-tuning task-specific layers of the target DeepONet using a hybrid loss function that allows for the matching of individual target samples while also preserving the global properties of the conditional distribution of the target data. We demonstrate the advantages of our approach for various TL scenarios involving nonlinear PDEs under conditional shift. Our results include geometry domain adaptation and show that the proposed TL framework enables fast and efficient multi-task operator learning, despite significant differences between the source and target domains. Comment: 19 pages, 3 figures.
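
    As a loose illustration of a hybrid objective of this kind, the sketch below combines a per-sample regression loss with a kernel-based distance between embedded feature distributions in an RKHS. The array names, the Gaussian kernel, the plain (rather than conditional) MMD term, and the weight `lam` are all illustrative assumptions, not the estimator used in the paper.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between rows of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    # (Biased) squared maximum mean discrepancy between two samples in an RKHS.
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean())

def hybrid_loss(pred, target, feat_src, feat_tgt, lam=0.1, sigma=1.0):
    # Per-sample fit on the target task plus a distribution-matching penalty
    # between source and target feature embeddings (illustrative only).
    fit = np.mean((pred - target) ** 2)
    return fit + lam * mmd2(feat_src, feat_tgt, sigma)

# Toy usage with random stand-ins for operator-network outputs and features.
rng = np.random.default_rng(0)
pred, target = rng.normal(size=(32, 8)), rng.normal(size=(32, 8))
feat_src, feat_tgt = rng.normal(size=(64, 16)), rng.normal(size=(64, 16)) + 0.5
print(hybrid_loss(pred, target, feat_src, feat_tgt))
```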

    Causal Modeling with Stationary Diffusions

    We develop a novel approach to causal inference. Rather than structural equations over a causal graph, we learn stochastic differential equations (SDEs) whose stationary densities model a system's behavior under interventions. These stationary diffusion models do not require the formalism of causal graphs, let alone the common assumption of acyclicity. We show that in several cases they generalize to unseen interventions on their variables, often better than classical approaches. Our inference method is based on a new theoretical result that expresses a stationarity condition on the diffusion's generator in a reproducing kernel Hilbert space. The resulting kernel deviation from stationarity (KDS) is an objective function of independent interest.
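
    For intuition about the stationarity condition: under the stationary density, the expectation of the SDE generator applied to any smooth test function vanishes. The toy 1D check below averages the generator applied to Gaussian test functions over samples; it is only a rough illustration of this idea, with an assumed Ornstein-Uhlenbeck drift, and not the paper's KDS estimator.

```python
import numpy as np

def generator_on_gaussian(x, c, h, drift, sigma):
    # Apply the 1D SDE generator A f = b f' + 0.5 sigma^2 f'' to the
    # Gaussian test function f_c(x) = exp(-(x - c)^2 / (2 h^2)).
    f = np.exp(-(x - c) ** 2 / (2.0 * h ** 2))
    f1 = -(x - c) / h ** 2 * f
    f2 = ((x - c) ** 2 / h ** 4 - 1.0 / h ** 2) * f
    return drift(x) * f1 + 0.5 * sigma ** 2 * f2

def stationarity_deviation(samples, drift, sigma, centers, h=0.5):
    # Average |E[A f_c(X)]| over test-function centres: this is close to zero
    # when the samples come from the diffusion's stationary density.
    vals = [np.abs(generator_on_gaussian(samples, c, h, drift, sigma).mean())
            for c in centers]
    return float(np.mean(vals))

# Ornstein-Uhlenbeck drift b(x) = -x with sigma = 1 has stationary density N(0, 1/2).
rng = np.random.default_rng(0)
good = rng.normal(0.0, np.sqrt(0.5), size=20000)   # stationary samples
bad = rng.normal(1.0, 1.0, size=20000)             # mismatched samples
centers = np.linspace(-2, 2, 9)
print(stationarity_deviation(good, lambda x: -x, 1.0, centers))  # near zero
print(stationarity_deviation(bad, lambda x: -x, 1.0, centers))   # clearly larger
```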

    Probabilistic learning and computation in brains and machines

    Humans and animals are able to solve a wide variety of perceptual, decision making and motor tasks with great flexibility. Moreover, behavioural evidence shows that this flexibility extends to situations where accuracy requires the correct treatment of uncertainty induced by noise and ambiguity in the available sensory information, as well as noise internal to the brain. It has been suggested that this adequate handling of uncertainty is based on a learned internal model, e.g. in the case of perception, a generative model of sensory observations. Learning latent variable models and performing inference in them is a key challenge for both biological and artificial learning systems. Here, we introduce a new approach to learning in hierarchical latent variable models called the Distributed Distributional Code Helmholtz Machine (DDC-HM), which emphasises flexibility and accuracy in the inferential process. The approximate posterior over unobserved variables is represented implicitly as a set of expectations, corresponding to mean parameters of an exponential family distribution. To train the generative and recognition models we develop an extended wake-sleep algorithm inspired by the original Helmholtz Machine. As a result, the DDC-HM is able to learn hierarchical latent models without having to propagate gradients across different stochastic layers, making our approach biologically appealing. In the second part of the thesis, we review existing proposals for neural representations of uncertainty with a focus on representational and computational flexibility as well as experimental support. Finally, we consider inference and learning in dynamical environment models using Distributed Distributional Codes to represent both the stochastic latent transition model and the inferred posterior distributions. We show that this model makes it possible to generalise successor representations to biologically more realistic, partially observed settings.
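
    As a rough picture of a distributed distributional code, a distribution can be summarised by the expectations of a fixed bank of nonlinear features, and expectations of other functions can then be read out approximately linearly from that summary. The sketch below uses random cosine features and a least-squares readout fitted on prior samples; all names and modelling choices are illustrative assumptions rather than the DDC-HM architecture itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(z, W, b):
    # Fixed bank of nonlinear features psi(z); the code is E[psi(z)].
    return np.cos(z @ W + b)

# Random feature bank on a 2D latent space (illustrative sizes).
dim, n_feat = 2, 200
W, b = rng.normal(size=(dim, n_feat)), rng.uniform(0, 2 * np.pi, n_feat)

# Fit a linear readout so that g(z) ~ alpha . psi(z) on prior samples.
prior = rng.normal(size=(5000, dim))
g = lambda z: np.sin(z[:, 0]) + z[:, 1] ** 2          # some target function
alpha, *_ = np.linalg.lstsq(features(prior, W, b), g(prior), rcond=None)

# Represent a "posterior" implicitly by its mean feature vector ...
post = rng.normal(loc=[1.0, -0.5], scale=0.3, size=(5000, dim))
code = features(post, W, b).mean(axis=0)

# ... and decode an expectation from the code alone.
print("decoded E[g]:", code @ alpha)
print("Monte Carlo E[g]:", g(post).mean())
```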

    Contributions in functional data analysis and functional-analytic statistics

    Functional data analysis is the study of statistical algorithms applied when the observed data is a collection of functions. Since this type of data is becoming cheaper and easier to collect, there is an increased need to develop statistical tools to handle it. The first part of this thesis focuses on deriving distances between distributions over function spaces and applying these to two-sample testing, goodness-of-fit testing and sample quality assessment. This represents a wide range of contributions, since very few (or no) methods currently exist to tackle these problems for functional data. The second part of this thesis adopts the functional-analytic perspective on two statistical algorithms. This is a perspective in which functions are viewed as living in specific function spaces and the toolbox of functional analysis is applied to identify and prove properties of the algorithms. The two algorithms are variational Gaussian processes, used widely throughout machine learning for function modelling with large observation data sets, and functional statistical depth, used widely as a means to evaluate outliers and perform testing for functional data sets. The results presented contribute a taxonomy of the variational Gaussian process methodology and multiple new results in the theory of functional depth, including the open problem of providing a depth which characterises distributions on function spaces.
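
    One concrete way to obtain a distance between distributions over function spaces is an MMD-style statistic in which each observation is a curve recorded on a common grid and the kernel acts on an approximate L2 distance between curves. The grid, kernel and sample curves below are illustrative assumptions, not a specific construction from the thesis.

```python
import numpy as np

def l2_dist2(F, G, dx):
    # Approximate squared L2 distances between all pairs of curves,
    # where each row of F and G is one function sampled on a common grid.
    return ((F[:, None, :] - G[None, :, :]) ** 2).sum(-1) * dx

def sqexp_kernel_curves(F, G, dx, ell=1.0):
    # Squared-exponential kernel on curves via the approximate L2 distance.
    return np.exp(-l2_dist2(F, G, dx) / (2.0 * ell ** 2))

def mmd2_functional(F, G, dx, ell=1.0):
    # (Biased) squared MMD between two samples of curves.
    return (sqexp_kernel_curves(F, F, dx, ell).mean()
            + sqexp_kernel_curves(G, G, dx, ell).mean()
            - 2.0 * sqexp_kernel_curves(F, G, dx, ell).mean())

# Two samples of noisy curves on [0, 1]: same mean versus a shifted mean.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
dx = grid[1] - grid[0]
F = np.sin(2 * np.pi * grid) + 0.2 * rng.normal(size=(30, 50))
G = np.sin(2 * np.pi * grid) + 0.5 + 0.2 * rng.normal(size=(30, 50))
print(mmd2_functional(F, G, dx))   # noticeably above zero under this shift
print(mmd2_functional(F, F, dx))   # zero by construction
```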

    GraphiT: Encoding Graph Structure in Transformers

    We show that viewing graphs as sets of node features and incorporating structural and positional information into a transformer architecture can outperform representations learned with classical graph neural networks (GNNs). Our model, GraphiT, encodes such information by (i) leveraging relative positional encoding strategies in self-attention scores based on positive definite kernels on graphs, and (ii) enumerating and encoding local sub-structures such as paths of short length. We thoroughly evaluate these two ideas on many classification and regression tasks, demonstrating the effectiveness of each of them independently, as well as their combination. In addition to performing well on standard benchmarks, our model also admits natural visualization mechanisms for interpreting graph motifs explaining the predictions, making it a potentially strong candidate for scientific applications where interpretation is important. Code available at https://github.com/inria-thoth/GraphiT.
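
    A rough sketch of how a positive definite kernel on graph nodes might modulate self-attention scores is given below, using a diffusion (heat) kernel exp(-beta L) of the graph Laplacian and a single attention head. The exact way GraphiT combines the kernel with attention may differ, so treat this as an assumption-laden illustration rather than the model's implementation.

```python
import numpy as np
from scipy.linalg import expm

def diffusion_kernel(adj, beta=1.0):
    # Heat kernel exp(-beta * L): a positive definite kernel on graph nodes.
    lap = np.diag(adj.sum(axis=1)) - adj
    return expm(-beta * lap)

def kernel_modulated_attention(X, adj, Wq, Wk, Wv, beta=1.0):
    # Single-head self-attention whose unnormalised scores are reweighted
    # by a diffusion kernel on the graph, then renormalised row-wise.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = np.exp(Q @ K.T / np.sqrt(K.shape[1]))
    scores = scores * diffusion_kernel(adj, beta)      # positional reweighting
    scores = scores / scores.sum(axis=1, keepdims=True)
    return scores @ V

# Toy graph: a 4-node path, with random node features and projections.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(kernel_modulated_attention(X, adj, Wq, Wk, Wv).shape)  # (4, 8)
```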