Curvature-informed multi-task learning for graph networks
Properties of interest for crystals and molecules, such as band gap,
elasticity, and solubility, are generally related to each other: they are
governed by the same underlying laws of physics. However, when state-of-the-art
graph neural networks attempt to predict multiple properties simultaneously
(the multi-task learning (MTL) setting), they frequently underperform a suite
of single-property predictors. This suggests that graph networks may not be fully
leveraging these underlying similarities. Here we investigate a potential
explanation for this phenomenon: the curvature of each property's loss surface
varies significantly, leading to inefficient learning. This difference in
curvature can be assessed by looking at spectral properties of the Hessians of
each property's loss function, which is done in a matrix-free manner via
randomized numerical linear algebra. We evaluate our hypothesis on two
benchmark datasets (Materials Project (MP) and QM8) and consider how these
findings can inform the training of novel multi-task learning models.
Comment: Published at the ICML 2022 AI for Science workshop:
https://openreview.net/forum?id=m5RYtApKFO
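The curvature assessment described in the abstract can be sketched in a matrix-free way. The snippet below is a minimal illustration on a toy quadratic loss, where the Hessian-vector product is known analytically; in the paper's setting, hvp() would instead come from automatic differentiation of each property's loss, and the extreme eigenvalues would be compared across properties to quantify the curvature mismatch. All names here are illustrative, not the authors' code.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

rng = np.random.default_rng(0)

# Toy quadratic loss L(w) = 0.5 * w^T A w, whose Hessian is A.
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(0.1, 10.0, n)) @ Q.T  # eigenvalues 0.1..10

def hvp(v):
    """Matrix-free Hessian-vector product: the solver never forms A."""
    return A @ v

H = LinearOperator((n, n), matvec=hvp, dtype=np.float64)

# Extreme curvature via matrix-free Lanczos iteration.
lam_max = eigsh(H, k=1, which="LA", return_eigenvectors=False)[0]

# Hutchinson's randomized trace estimator with Rademacher probes:
# E[z^T H z] = trace(H), the sum of all curvatures.
probes = rng.choice([-1.0, 1.0], size=(200, n))
trace_est = float(np.mean([z @ hvp(z) for z in probes]))

print(lam_max)    # top eigenvalue, close to 10.0
print(trace_est)  # close to the true trace, 252.5
```

Both quantities require only Hessian-vector products, which is what makes the approach tractable for large graph networks.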
Evaluating the diversity and utility of materials proposed by generative models
Generative machine learning models can use data generated by scientific
modeling to create large quantities of novel material structures. Here, we
assess how one state-of-the-art generative model, the physics-guided crystal
generation model (PGCGM), can be used as part of the inverse design process. We
show that the default PGCGM's input space is not smooth with respect to
parameter variation, making material optimization difficult and limited. We
also demonstrate that most generated structures are predicted to be
thermodynamically unstable by a separate property-prediction model, partially
due to out-of-domain data challenges. Our findings suggest how generative
models might be improved to enable better inverse design.
Comment: 12 pages, 9 figures. Published at SynS & ML @ ICML2023:
https://openreview.net/forum?id=2ZYbmYTKo
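The smoothness claim above can be probed generically: perturb a latent input by a small fixed amount and measure how much the decoded output moves. The sketch below is an illustrative probe, not the PGCGM code; decode, the distance metric, and the function names are assumptions, and the sanity check uses an identity "decoder" where the answer is known.

```python
import numpy as np

rng = np.random.default_rng(0)

def smoothness_probe(decode, dim, eps=1e-2, trials=100):
    """Median output change per unit of latent perturbation.

    A smooth generator yields small, stable ratios; large spikes mean
    nearby latent points decode to very different structures, which is
    the failure mode that makes gradient-style optimization difficult.
    """
    ratios = []
    for _ in range(trials):
        z = rng.standard_normal(dim)
        dz = rng.standard_normal(dim)
        dz *= eps / np.linalg.norm(dz)   # perturbation of fixed size eps
        jump = np.linalg.norm(decode(z + dz) - decode(z))
        ratios.append(jump / eps)
    return float(np.median(ratios))

# Sanity check on an identity "decoder": the ratio is 1 for a
# perfectly smooth map.
print(smoothness_probe(lambda z: z, dim=16))
```

In the paper's setting, decode would wrap the generative model and the norm would be replaced by a structure-aware distance between generated crystals.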
A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems
Despite the advancement of machine learning techniques in recent years,
state-of-the-art systems lack robustness to "real world" events, where the
input distributions and tasks encountered by the deployed systems will not be
limited to the original training context, and systems will instead need to
adapt to novel distributions and tasks while deployed. This critical gap may be
addressed through the development of "Lifelong Learning" systems that are
capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3)
Scalability. Unfortunately, efforts to improve these capabilities are typically
treated as distinct areas of research that are assessed independently, without
regard to the impact of each separate capability on other aspects of the
system. We instead propose a holistic approach, using a suite of metrics and an
evaluation framework to assess Lifelong Learning in a principled way that is
agnostic to specific domains or system techniques. Through five case studies,
we show that this suite of metrics can inform the development of varied and
complex Lifelong Learning systems. We highlight how the proposed suite of
metrics quantifies performance trade-offs present during Lifelong Learning
system development: both the widely discussed Stability-Plasticity dilemma and
the newly proposed relationship between Sample-Efficient and Robust Learning.
Further, we make recommendations for the formulation and use of metrics to
guide the continuing development of Lifelong Learning systems and assess their
progress in the future.
Comment: To appear in Neural Network
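Metrics of the kind the abstract describes can be computed from a simple record of per-task performance over a training sequence. The sketch below is illustrative only: the metric names follow the spirit of the framework (forgetting on earlier tasks, transfer to later ones), not the paper's exact definitions, and the accuracy values are made up.

```python
import numpy as np

# Toy per-task accuracy records: row t holds accuracy on every task
# after training on task t; columns are tasks.
acc = np.array([
    [0.90, 0.20],   # after training task 0
    [0.75, 0.85],   # after training task 1: task 0 degraded
])

def performance_maintenance(acc):
    """Mean drop on earlier tasks after later training (0 = no forgetting)."""
    drops = [acc[t, t] - acc[-1, t] for t in range(acc.shape[1] - 1)]
    return float(np.mean(drops))

def forward_transfer(acc, chance=0.10):
    """Mean lift above chance on a task before that task is trained."""
    lifts = [acc[t - 1, t] - chance for t in range(1, acc.shape[1])]
    return float(np.mean(lifts))

print(performance_maintenance(acc))  # task 0 lost ~15 points: forgetting
print(forward_transfer(acc))         # task 1 started ~10 points above chance
```

The point of a shared record like acc is exactly the holistic view the abstract argues for: the same data exposes the stability-plasticity trade-off (maintenance vs. new-task accuracy) rather than measuring each capability in isolation.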