Machine learning in solar physics
The application of machine learning in solar physics has the potential to
greatly enhance our understanding of the complex processes that take place in
the atmosphere of the Sun. By using techniques such as deep learning, we are
now in the position to analyze large amounts of data from solar observations
and identify patterns and trends that may not have been apparent using
traditional methods. This can help us improve our understanding of explosive
events such as solar flares, which can strongly affect the Earth's environment;
predicting such hazardous events is crucial for our technological society.
Machine learning can also improve our understanding of the inner workings of
the Sun itself by allowing us to go deeper into the data
and to propose more complex models to explain them. Additionally, the use of
machine learning can help to automate the analysis of solar data, reducing the
need for manual labor and increasing the efficiency of research in this field.
Comment: 100 pages, 13 figures, 286 references, accepted for publication as a Living Review in Solar Physics (LRSP).
Pupil-driven quantitative differential phase contrast imaging
In this research, we reveal an inherent but hitherto overlooked property of
quantitative differential phase contrast (qDPC) imaging: the phase transfer
function acts as an edge detection filter. Inspired by this, we highlight the
duality of qDPC between optics and pattern recognition and propose a simple
and effective reconstruction algorithm, termed Pupil-Driven qDPC (pd-qDPC),
to improve the phase reconstruction quality of the family of qDPC-based phase
reconstruction algorithms. We formulate a new cost function in which a
modified L0-norm represents the pupil-driven edge sparsity, and the qDPC
convolution operator is duplicated in the data fidelity term to achieve
automatic background removal. Further, we develop an iterative reweighted
soft-thresholding algorithm based on the split Bregman method to solve this
modified L0-norm problem. We test pd-qDPC on both simulated and experimental
data and compare it against state-of-the-art (SOTA) methods, including the
L2-norm, total variation regularization (TV-qDPC), isotropic-qDPC, and Retinex
qDPC algorithms. Results show that the proposed model is superior in terms of
phase reconstruction quality and implementation efficiency, significantly
increasing experimental robustness while maintaining data fidelity. In
general, pd-qDPC enables high-quality qDPC reconstruction without any
modification of the optical system. It reduces system complexity and benefits
the qDPC community and beyond, including but not limited to cell segmentation
and PTF learning based on the edge filtering property.
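The solver builds on iteratively reweighted soft-thresholding, which is standard machinery; below is a minimal Python sketch of that building block. The specific reweighting rule is a common generic choice and an assumption here, not taken from the paper.

import numpy as np

def soft_threshold(x, tau):
    # Element-wise soft-thresholding: the proximal operator of the L1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def reweighted_soft_threshold(x, tau, weights):
    # Reweighted variant: each element gets its own threshold tau * w_i.
    # Iteratively reweighted schemes use this to approximate an L0 penalty.
    return np.sign(x) * np.maximum(np.abs(x) - tau * weights, 0.0)

# Toy step: weights w_i = 1 / (|x_i| + eps) penalise small coefficients more
# strongly, driving them to zero faster than plain L1 shrinkage would.
x = np.array([0.05, -0.3, 1.2, -2.0])
w = 1.0 / (np.abs(x) + 1e-2)
print(reweighted_soft_threshold(x, tau=0.1, weights=w))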
A DeepONet multi-fidelity approach for residual learning in reduced order modeling
In the present work, we introduce a novel approach to enhance the precision
of reduced order models by exploiting a multi-fidelity perspective and
DeepONets. Reduced models provide a real-time numerical approximation by
simplifying the original model. The error introduced by this simplification is
usually neglected and sacrificed in order to reach a fast computation. We
propose to couple the model reduction with machine-learned residual learning,
such that the above-mentioned error can be learned by a neural network and
inferred for new predictions. We emphasize that the framework maximizes the
exploitation of high-fidelity information, using it for building the reduced
order model and for learning the residual. In this work, we explore the
integration of proper orthogonal decomposition (POD), and gappy POD for sensors
data, with the recent DeepONet architecture. Numerical investigations for a
parametric benchmark function and a nonlinear parametric Navier-Stokes problem
are presented.
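A minimal sketch of the residual-learning idea: the reduced model's cheap prediction is corrected by a surrogate fitted to high-fidelity residuals. The two toy functions and the polynomial surrogate below are placeholders; the paper uses POD/gappy POD for the reduced model and a DeepONet for the residual.

import numpy as np

def rom_predict(mu):
    # Cheap reduced-order approximation at parameter mu (placeholder).
    return np.sin(mu)

def hifi_solve(mu):
    # Expensive high-fidelity solution (placeholder).
    return np.sin(mu) + 0.1 * mu**2

# 1) High-fidelity snapshots are used twice: to build the reduced model and
#    to form the residuals it leaves behind.
mus = np.linspace(0.0, 1.0, 20)
residuals = np.array([hifi_solve(m) - rom_predict(m) for m in mus])

# 2) Fit a surrogate to the residual (a polynomial stands in for a DeepONet).
coeffs = np.polyfit(mus, residuals, deg=2)

# 3) Corrected prediction at a new parameter: reduced output plus residual.
mu_new = 0.37
corrected = rom_predict(mu_new) + np.polyval(coeffs, mu_new)
print(corrected, hifi_solve(mu_new))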
SignReLU neural network and its approximation ability
Deep neural networks (DNNs) have garnered significant attention in various
fields of science and technology in recent years. Activation functions define
how neurons in DNNs process incoming signals. They are essential for
learning non-linear transformations and for performing diverse computations
among successive neuron layers. In the last few years, researchers have
investigated the approximation ability of DNNs to explain their power and
success. In this paper, we explore the approximation ability of DNNs using a
different activation function, called SignReLU. Our theoretical results
demonstrate that SignReLU networks outperform rational and ReLU networks in
terms of approximation performance. Numerical experiments comparing SignReLU
with existing activations such as ReLU, Leaky ReLU, and ELU illustrate its
competitive practical performance.
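For concreteness, a small sketch comparing these activations. The SignReLU formula used here (identity on the positive side, the bounded rational branch x / (1 - x) on the negative side, saturating to -1) is an illustrative guess; consult the paper for the exact definition and any slope parameter.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    return np.where(x >= 0, x, alpha * x)

def elu(x, alpha=1.0):
    return np.where(x >= 0, x, alpha * (np.exp(x) - 1.0))

def sign_relu(x):
    # Assumed form: x for x >= 0 and x / (1 - x) for x < 0. The single
    # expression below evaluates both branches without warnings.
    return x / (1.0 - np.minimum(x, 0.0))

x = np.linspace(-5.0, 5.0, 11)
for f in (relu, leaky_relu, elu, sign_relu):
    print(f"{f.__name__:>12}:", np.round(f(x), 3))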
Networked Time Series Prediction with Incomplete Data
A networked time series (NETS) is a family of time series on a given graph,
one for each node. It has a wide range of applications, from intelligent
transportation and environment monitoring to smart grid management. An important
task in such applications is to predict the future values of a NETS based on
its historical values and the underlying graph. Most existing methods require
complete data for training. However, in real-world scenarios, it is not
uncommon to have missing data due to sensor malfunction, incomplete sensing
coverage, etc. In this paper, we study the problem of NETS prediction with
incomplete data. We propose NETS-ImpGAN, a novel deep learning framework that
can be trained on incomplete data with missing values in both history and
future. Furthermore, we propose Graph Temporal Attention Networks, which
incorporate the attention mechanism to capture both inter-time series and
temporal correlations. We conduct extensive experiments on four real-world
datasets under different missing patterns and missing rates. The experimental
results show that NETS-ImpGAN outperforms existing methods, reducing the MAE by
up to 25%.
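To make the missing-data handling concrete, here is a minimal sketch of attention over one series' history where unobserved steps are masked out before the softmax. It illustrates the general mechanism only; it is not the paper's Graph Temporal Attention Network.

import numpy as np

def masked_attention(q, k, v, observed):
    # q: (d,), k/v: (T, d), observed: (T,) boolean mask of valid time steps.
    scores = k @ q / np.sqrt(q.shape[0])
    scores = np.where(observed, scores, -np.inf)  # exclude missing steps
    weights = np.exp(scores - scores.max())       # stable softmax
    weights /= weights.sum()
    return weights @ v

T, d = 6, 4
rng = np.random.default_rng(0)
k = rng.normal(size=(T, d))
v = rng.normal(size=(T, d))
q = rng.normal(size=d)
observed = np.array([True, True, False, True, False, True])  # gaps in history
print(masked_attention(q, k, v, observed))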
The Inductive Bias of Flatness Regularization for Deep Matrix Factorization
Recent works on over-parameterized neural networks have shown that the
stochasticity in optimizers has the implicit regularization effect of
minimizing the sharpness of the loss function (in particular, the trace of its
Hessian) over the family of zero-loss solutions. More explicit forms of flatness
regularization also empirically improve the generalization performance.
However, it remains unclear why and when flatness regularization leads to
better generalization. This work takes the first step toward understanding the
inductive bias of the minimum trace of the Hessian solutions in an important
setting: learning deep linear networks from linear measurements, also known as
\emph{deep matrix factorization}. We show that for any depth greater than one,
with the standard Restricted Isometry Property (RIP) on the measurements,
minimizing the trace of Hessian is approximately equivalent to minimizing the
Schatten 1-norm of the corresponding end-to-end matrix parameters (i.e., the
product of all layer matrices), which in turn leads to better generalization.
We empirically verify our theoretical findings on synthetic datasets.
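The quantity at the heart of the result is easy to compute: the sketch below forms the end-to-end matrix of a random deep linear network and evaluates its Schatten 1-norm (the sum of its singular values). Verifying the stated equivalence itself would additionally require the Hessian of the measurement loss, which is beyond this sketch.

import numpy as np

rng = np.random.default_rng(1)
depth, n = 3, 4
layers = [rng.normal(size=(n, n)) / np.sqrt(n) for _ in range(depth)]

# End-to-end matrix: the product of all layer matrices, W_L ... W_1.
end_to_end = layers[0]
for W in layers[1:]:
    end_to_end = W @ end_to_end

# Schatten 1-norm (nuclear norm) = sum of singular values.
singular_values = np.linalg.svd(end_to_end, compute_uv=False)
print("nuclear norm:", singular_values.sum())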
Modeling of a Liquid Leaf Target TNSA Experiment using Particle-In-Cell Simulations and Deep Learning
Liquid leaf targets show promise as high repetition rate targets for
laser-based ion acceleration using the Target Normal Sheath Acceleration (TNSA)
mechanism and are currently under development. In this work, we discuss the
effects of different ion species and investigate how they can be leveraged for
use as a possible laser-driven neutron source. To aid in this research, we
develop a surrogate model for liquid leaf target laser-ion acceleration
experiments, based on artificial neural networks. The model is trained using
data from Particle-In-Cell (PIC) simulations. The fast inference speed of our
deep learning model allows us to optimize experimental parameters for maximum
ion energy and laser-energy conversion efficiency. An analysis of parameter
influence on our model output, using Sobol and PAWN indices, provides deeper
insights into the laser-plasma system.
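As an illustration of the sensitivity-analysis step, the sketch below computes first-order Sobol indices with the SALib package on a toy surrogate. The parameter names, bounds, and the function itself are invented stand-ins, not the paper's trained neural network.

import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 3,
    "names": ["laser_energy", "pulse_duration", "target_thickness"],
    "bounds": [[0.1, 10.0], [30.0, 500.0], [0.5, 5.0]],
}

# Saltelli sampling generates the input matrix Sobol analysis expects.
X = saltelli.sample(problem, 1024)

def surrogate(x):
    # Placeholder response standing in for the trained deep learning model.
    e, tau, d = x
    return e**0.8 / (tau * 1e-3) * np.exp(-d)

Y = np.apply_along_axis(surrogate, 1, X)
Si = sobol.analyze(problem, Y)
print(dict(zip(problem["names"], np.round(Si["S1"], 3))))  # first-order indices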
Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
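The core modular mechanism, reusing frozen pre-trained modules while training only new ones, can be sketched in a few lines of PyTorch. The library contents and shapes below are invented for illustration; HOUDINI's program synthesis and PICLE's probabilistic search over combinations are not shown.

import torch
import torch.nn as nn

# A library of modules accumulated over earlier problems (hypothetical).
library = {
    "vision_encoder": nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU()),
    "shared_core": nn.Sequential(nn.Linear(128, 128), nn.ReLU()),
}

def assemble(module_names, new_head):
    # Compose frozen library modules with a fresh trainable head.
    parts = []
    for name in module_names:
        m = library[name]
        for p in m.parameters():
            p.requires_grad = False   # freezing protects old knowledge
        parts.append(m)
    parts.append(new_head)            # only this part learns the new problem
    return nn.Sequential(*parts)

net = assemble(["vision_encoder", "shared_core"], nn.Linear(128, 10))
x = torch.randn(2, 1, 28, 28)
print(net(x).shape)  # torch.Size([2, 10])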
Interpretable and explainable machine learning for ultrasonic defect sizing
Despite its popularity in the literature, there are few examples of machine learning (ML) being used for industrial nondestructive evaluation (NDE) applications. A significant barrier is the ‘black box’ nature of most ML algorithms. This paper aims to improve the interpretability and explainability of ML for ultrasonic NDE by presenting a novel dimensionality reduction method: Gaussian feature approximation (GFA). GFA involves fitting a 2D elliptical Gaussian function to an ultrasonic image and storing the seven parameters that describe each Gaussian. These seven parameters can then be used as inputs to data analysis methods such as the defect sizing neural network presented in this paper. GFA is applied to ultrasonic defect sizing for inline pipe inspection as an example application. This approach is compared to sizing with the same neural network using two other dimensionality reduction methods (the parameters of 6 dB drop boxes and principal component analysis), as well as with a convolutional neural network applied to raw ultrasonic images. Of the dimensionality reduction methods tested, GFA features produce the closest sizing accuracy to sizing from the raw images, with only a 23% increase in RMSE despite a 96.5% reduction in the dimensionality of the input data. Implementing ML with GFA is inherently more interpretable than doing so with principal component analysis or raw images as inputs, and gives significantly better sizing accuracy than 6 dB drop boxes. Shapley additive explanations (SHAP) are used to calculate how each feature contributes to the prediction of an individual defect’s length. Analysis of SHAP values demonstrates that the proposed GFA-based neural network displays many of the same relationships between defect indications and their predicted size as occur in traditional NDE sizing methods.
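A sketch of the GFA fitting step on synthetic data: a seven-parameter elliptical Gaussian (amplitude, centre x/y, widths x/y, rotation, offset) fitted by least squares. This parameterisation is a common seven-parameter choice and is assumed here; the paper's exact convention may differ.

import numpy as np
from scipy.optimize import curve_fit

def gaussian_2d(coords, amp, x0, y0, sx, sy, theta, offset):
    # Rotated 2D elliptical Gaussian, flattened for curve_fit.
    x, y = coords
    a = np.cos(theta)**2 / (2 * sx**2) + np.sin(theta)**2 / (2 * sy**2)
    b = -np.sin(2 * theta) / (4 * sx**2) + np.sin(2 * theta) / (4 * sy**2)
    c = np.sin(theta)**2 / (2 * sx**2) + np.cos(theta)**2 / (2 * sy**2)
    g = amp * np.exp(-(a * (x - x0)**2 + 2 * b * (x - x0) * (y - y0)
                       + c * (y - y0)**2)) + offset
    return g.ravel()

# Synthetic stand-in for an ultrasonic image, and its 7-parameter compression.
y, x = np.mgrid[0:32, 0:32]
true = (5.0, 16, 14, 3.0, 6.0, 0.4, 0.1)
img = gaussian_2d((x, y), *true).reshape(32, 32)
img += 0.05 * np.random.default_rng(0).normal(size=img.shape)

p0 = (img.max(), 16, 16, 4, 4, 0, 0)
params, _ = curve_fit(gaussian_2d, (x, y), img.ravel(), p0=p0)
print(np.round(params, 2))  # the seven GFA features for this image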
Online Network Source Optimization with Graph-Kernel MAB
We propose Grab-UCB, a graph-kernel multi-armed bandit algorithm to learn
online the optimal source placement in large scale networks, such that the
reward obtained from a priori unknown network processes is maximized. The
uncertainty calls for online learning, which suffers however from the curse of
dimensionality. To achieve sample efficiency, we describe the network processes
with an adaptive graph dictionary model, which typically leads to sparse
spectral representations. This enables a data-efficient learning framework,
whose learning rate scales with the dimension of the spectral representation
model rather than with that of the network. We then propose Grab-UCB, an online
sequential decision strategy that learns the parameters of the spectral
representation while optimizing the action strategy. We derive performance
guarantees that depend on network parameters, which further influence the
learning curve of the sequential decision strategy. We introduce a
computationally simplified solving method, Grab-arm-Light, an algorithm that
walks along the edges of the polytope representing the objective function.
Simulation results show that the proposed online learning algorithm
outperforms baseline offline methods that typically separate the learning phase
from the testing one. The results confirm the theoretical findings, and further
highlight the gain of the proposed online learning strategy in terms of
cumulative regret, sample efficiency and computational complexity.
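To illustrate the explore/exploit mechanism underlying Grab-UCB, here is plain UCB1 over candidate source nodes with invented reward means. The actual algorithm instead learns in the low-dimensional spectral representation described above.

import numpy as np

rng = np.random.default_rng(2)
true_means = rng.uniform(0.2, 0.8, size=10)    # unknown reward per node
counts = np.zeros(10)
sums = np.zeros(10)

for t in range(1, 2001):
    if t <= 10:
        arm = t - 1                            # play each arm once first
    else:
        # UCB1 index: empirical mean plus an exploration bonus.
        ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
        arm = int(np.argmax(ucb))
    reward = rng.normal(true_means[arm], 0.1)  # noisy network-process reward
    counts[arm] += 1
    sums[arm] += reward

print("best true arm:", true_means.argmax(), "most played:", counts.argmax())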