773 research outputs found
Complexity of Many-Body Interactions in Transition Metals via Machine-Learned Force Fields from the TM23 Data Set
This work examines challenges associated with the accuracy of machine-learned
force fields (MLFFs) for bulk solid and liquid phases of d-block elements. In
exhaustive detail, we contrast the performance of force, energy, and stress
predictions across the transition metals for two leading MLFF models: a
kernel-based atomic cluster expansion method implemented using sparse Gaussian
processes (FLARE), and an equivariant message-passing neural network (NequIP).
Early transition metals show higher relative errors and are more difficult to
learn than the late platinum- and coinage-group elements, and this trend
persists across model architectures. Trends in the complexity of interatomic
interactions for different metals are revealed via comparison of the
performance of representations with different many-body order and angular
resolution. Using arguments based on perturbation theory on the occupied and
unoccupied d states near the Fermi level, we determine that the large, sharp d
density of states both above and below the Fermi level in early transition
metals leads to a more complex, harder-to-learn potential energy surface for
these metals. Increasing the fictitious electronic temperature (smearing)
modifies the angular sensitivity of forces and makes the early transition metal
forces easier to learn. This work illustrates challenges in capturing intricate
properties of metallic bonding with current leading MLFFs and provides a
reference data set for transition metals, aimed at benchmarking the accuracy
and improving the development of emerging machine-learned approximations.
Comment: main text: 21 pages, 9 figures, 2 tables; supplementary information: 57 pages, 83 figures, 20 tables.
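The smearing effect described in the abstract can be illustrated with the Fermi-Dirac occupation function it refers to. This is a sketch for intuition only, not code from the paper; the helper name `fermi_occupation` and the example energies are ours.

```python
import numpy as np

def fermi_occupation(energies, mu, sigma):
    """Fermi-Dirac occupations f(e) = 1 / (1 + exp((e - mu) / sigma)).

    Here sigma plays the role of the smearing width (the fictitious
    electronic temperature k_B * T_el): a larger sigma broadens the
    occupation step at the Fermi level mu, washing out the sharp
    d-state features that make early-transition-metal forces harder
    to learn.
    """
    x = np.clip((np.asarray(energies, dtype=float) - mu) / sigma, -700, 700)
    return 1.0 / (1.0 + np.exp(x))

# With small smearing the occupation is nearly a step function;
# a tenfold larger smearing visibly softens it near mu = 0.
e = np.array([-1.0, 0.0, 1.0])  # energies in eV, Fermi level at 0
print(fermi_occupation(e, mu=0.0, sigma=0.05))
print(fermi_occupation(e, mu=0.0, sigma=0.50))
```

States well below the Fermi level stay essentially fully occupied and states well above stay empty; only the window of width ~sigma around mu is affected, which is exactly where the sharp d density of states sits in the early transition metals.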
Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
Mathematical Problems in Rock Mechanics and Rock Engineering
With increasing requirements for energy, resources and space, rock engineering projects are being constructed more often and are operated in large-scale environments with complex geology. Meanwhile, rock failures and rock instabilities occur more frequently, severely threatening the safety and stability of rock engineering projects. It is well recognized that rock has multi-scale structures and involves multi-scale fracture processes. Moreover, rocks are commonly subjected simultaneously to complex static stress and strong dynamic disturbance, providing a hotbed for the occurrence of rock failures. In addition, many multi-physics coupling processes take place in a rock mass. It remains difficult to understand these rock mechanics phenomena and to characterize rock behavior under complex stress conditions, multi-physics processes, and multi-scale changes. Therefore, our understanding of rock mechanics and the prevention and control of failure and instability in rock engineering need to be furthered. The primary aim of this Special Issue "Mathematical Problems in Rock Mechanics and Rock Engineering" is to bring together original research discussing innovative efforts regarding in situ observations, laboratory experiments and theoretical, numerical, and big-data-based methods to overcome the mathematical problems related to rock mechanics and rock engineering. It includes 12 manuscripts that illustrate valuable efforts to address mathematical problems in rock mechanics and rock engineering.
Decision-making with gaussian processes: sampling strategies and monte carlo methods
We study Gaussian processes and their application to decision-making in the real world. We begin by reviewing the foundations of Bayesian decision theory and show how these ideas give rise to methods such as Bayesian optimization. We investigate practical techniques for carrying out these strategies, with an emphasis on estimating and maximizing acquisition functions. Finally, we introduce pathwise approaches to conditioning Gaussian processes and demonstrate key benefits of representing random variables in this manner.
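The pathwise conditioning mentioned above can be sketched via Matheron's rule: a joint prior sample is corrected by a deterministic, data-dependent update rather than sampling from the posterior covariance directly. This is a minimal illustration under our own assumptions (an RBF kernel on 1-D inputs; the function names are ours, not the thesis's):

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel matrix between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def pathwise_posterior_sample(x_test, x_train, y_train, noise=1e-2, seed=None):
    """Draw one GP posterior sample via Matheron's rule:

        f_post(x) = f_prior(x) + k(x, X) (K + noise*I)^{-1} (y - f_prior(X) - eps)

    The prior sample f_prior is drawn jointly over test and training inputs,
    then shifted by a deterministic update computed from the residuals.
    """
    rng = np.random.default_rng(seed)
    x_all = np.concatenate([x_test, x_train])
    K_all = rbf(x_all, x_all) + 1e-8 * np.eye(len(x_all))  # jitter for stability
    f_all = rng.multivariate_normal(np.zeros(len(x_all)), K_all)
    f_test, f_train = f_all[:len(x_test)], f_all[len(x_test):]
    eps = rng.normal(0.0, np.sqrt(noise), size=len(x_train))
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    update = rbf(x_test, x_train) @ np.linalg.solve(K, y_train - f_train - eps)
    return f_test + update

# Posterior samples pass close to the observations when the noise is small:
x_train = np.array([-2.0, 0.0, 2.0])
y_train = np.array([0.0, 1.0, 0.0])
sample = pathwise_posterior_sample(x_train, x_train, y_train, noise=1e-4, seed=0)
print(sample)
```

A practical appeal of this decomposition is that the prior sample can be approximated cheaply (e.g. with random features) while the update term stays exact, which is one of the benefits the thesis develops.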
Geometric Data Analysis: Advancements of the Statistical Methodology and Applications
Data analysis has become fundamental to our society and comes in multiple facets and approaches. Nevertheless, in research and applications, the focus has primarily been on data from Euclidean vector spaces. Consequently, the majority of methods that are applied today are not suited for more general data types. Driven by needs from fields like image processing, (medical) shape analysis, and network analysis, more and more attention has recently been given to data from non-Euclidean spaces, particularly (curved) manifolds. This has led to the field of geometric data analysis, whose methods explicitly take the structure (for example, the topology and geometry) of the underlying space into account.
This thesis contributes to the methodology of geometric data analysis by generalizing several fundamental notions from multivariate statistics to manifolds. We thereby focus on two different viewpoints.
First, we use Riemannian structures to derive a novel regression scheme for general manifolds that relies on splines of generalized Bézier curves. It can accurately model non-geodesic relationships, for example, time-dependent trends with saturation effects or cyclic trends. Since Bézier curves can be evaluated with the constructive de Casteljau algorithm, working with data from manifolds of high dimensions (for example, a hundred thousand or more) is feasible. Relying on the regression, we further develop a hierarchical statistical model for an adequate analysis of longitudinal data in manifolds, and a method to control for confounding variables.
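In the Euclidean case, the de Casteljau algorithm mentioned above reduces to repeated linear interpolation of control points; in the manifold setting of the thesis, each interpolation is replaced by a point on a geodesic. A minimal Euclidean sketch (the function name is ours):

```python
import numpy as np

def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1].

    control_points has shape (n + 1, d) for a degree-n curve in R^d.
    Each pass replaces consecutive pairs of points by their affine
    interpolation; on a manifold, this step would instead evaluate
    the geodesic between the two points at parameter t.
    """
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# A quadratic curve with a raised middle control point:
P = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]])
print(de_casteljau(P, 0.0))  # [0. 0.], the first control point
print(de_casteljau(P, 0.5))  # [1. 1.], the curve's midpoint
```

Each evaluation costs O(n^2) interpolations for a degree-n curve but only ever stores the control points themselves, which is why the construction scales to very high-dimensional ambient spaces.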
Second, we focus on data that is not only manifold- but even Lie group-valued, which is frequently the case in applications. We can achieve this only by endowing the group with an affine connection structure that is generally not Riemannian. Utilizing it, we derive generalizations of several well-known dissimilarity measures between data distributions that can be used for various tasks, including hypothesis testing. Invariance under data translations is proven, and a connection to continuous distributions is given for one measure.
A further central contribution of this thesis is that it shows use cases for all notions in real-world applications, particularly in problems from shape analysis in medical imaging and archaeology. We can replicate or further quantify several known findings for shape changes of the femur and the right hippocampus under osteoarthritis and Alzheimer's, respectively. Furthermore, in an archaeological application, we obtain new insights into the construction principles of ancient sundials. Last but not least, we use the geometric structure underlying human brain connectomes to predict cognitive scores. Utilizing a sample selection procedure, we obtain state-of-the-art results
Thermodynamic and kinetic investigations of tannins using quantum chemistry
The minimum energy paths and transition states for the first two pyrolysis reactions of the tannin building blocks gallic acid and (+)-catechin were calculated by combining density functional theory with the climbing-image nudged elastic band method. For both investigated compounds, the combined pyrolysis reaction was found to be endothermic across the full investigated temperature range and exergonic for temperatures of 1000 K and above when evaluated with the quantum chemical 'gold standard' approach CCSD(T). In the case of gallic acid, the dehydrogenation of pyrogallol was identified as the rate-determining pyrolysis step, whereas the catechol split-off was determined to be the rate-determining step of (+)-catechin pyrolysis. Additionally, simulated Raman spectra were able to explain the presence of subtle shoulder peaks in the spectrum of the binder Carbores®P. Another series of spectra assisted the identification of an ellagic acid pyrolysis product.
Generalised latent variable models for location, scale, and shape parameters
Latent Variable Models (LVM) are widely used in social, behavioural, and educational sciences to uncover underlying associations in multivariate data using a smaller number of latent variables. However, the classical LVM framework has certain assumptions that can be restrictive in empirical applications. In particular, the distribution of the observed variables is assumed to belong to the exponential family, and the latent variables are assumed to influence only the conditional mean of the observed variables. This thesis addresses these limitations and contributes to the current literature in two ways. First, we propose a novel class of models called Generalised Latent Variable Models for Location, Scale, and Shape parameters (GLVM-LSS). These models use linear functions of latent factors to model the location, scale, and shape parameters of the items' conditional distributions. By doing so, we model higher-order moments such as variance, skewness, and kurtosis in terms of the latent variables, providing a more flexible framework compared to classical factor models. The model parameters are estimated using maximum likelihood estimation. Second, we address the challenge of interpreting the GLVM-LSS, which can be complex due to its increased number of parameters. We propose a penalised maximum likelihood estimation approach with automatic selection of tuning parameters. This extends previous work on penalised estimation in the LVM literature to cases without closed-form solutions. Our findings suggest that modelling the entire distribution of items, not just the conditional mean, leads to improved model fit and deeper insights into how the items reflect the latent constructs they are intended to measure. To assess the performance of the proposed methods, we conduct extensive simulation studies and apply them to real-world data from educational testing and public opinion research.
The results highlight the efficacy of the GLVM-LSS framework in capturing complex relationships between observed variables and latent factors, providing valuable insights for researchers in various fields
From a better use of instrumentation to new detection methods in NMR and EPR spectroscopy
NMR and EPR spectroscopy are two of the most important techniques to get quantitative, structural or dynamical information on molecular systems. After covering the fundamentals of these magnetic resonance techniques, this thesis explores ways to improve the usage of current spectrometers and to create new instruments altogether using different detection methods with quantum sensing.
First, to deal with bandwidth and oscillating magnetic field limitations typically present for 19F nuclei in NMR or unpaired electrons in EPR, improved methods based on frequency-swept pulses are presented. The implementation of the CHORUS sequence in EPR spectroscopy is detailed. New pulse sequences, namely CHORUSCPMG, PROCHORUS and superposed frequency-swept pulses, are presented in the context of solution-state NMR spectroscopy.
Then, on-the-fly optimisation is proposed as a tool to automate EPR experiments and even develop new ones. A software package, ESR-POISE, was released to allow EPR users with commercial spectrometers to access such methods.
Finally, the construction of a spectrometer that can conduct magnetic resonance at unconventionally small scales thanks to quantum sensors (NV centres) is detailed. After describing design choices for the different elements of the instrument, particular attention is given to the static magnetic field, which is modelled with Finite Element Analysis.
Brain-wide representations of behavior spanning multiple timescales and states in C. elegans.
Changes in an animal's behavior and internal state are accompanied by widespread changes in activity across its brain. However, how neurons across the brain encode behavior and how this is impacted by state is poorly understood. We recorded brain-wide activity and the diverse motor programs of freely moving C. elegans and built probabilistic models that explain how each neuron encodes quantitative behavioral features. By determining the identities of the recorded neurons, we created an atlas of how the defined neuron classes in the C. elegans connectome encode behavior. Many neuron classes have conjunctive representations of multiple behaviors. Moreover, although many neurons encode current motor actions, others integrate recent actions. Changes in behavioral state are accompanied by widespread changes in how neurons encode behavior, and we identify these flexible nodes in the connectome. Our results provide a global map of how the cell types across an animal's brain encode its behavior