    Analyzing machine learning models to accelerate generation of fundamental materials insights

    Machine learning for materials science envisions the acceleration of basic science research through automated identification of key data relationships to augment human interpretation and gain scientific understanding. A primary role of scientists is extraction of fundamental knowledge from data, and we demonstrate that this extraction can be accelerated using neural networks via analysis of the trained data model itself rather than its application as a prediction tool. Convolutional neural networks excel at modeling complex data relationships in multi-dimensional parameter spaces, such as that mapped by a combinatorial materials science experiment. Measuring a performance metric in a given materials space provides direct information about (locally) optimal materials but not the underlying materials science that gives rise to the variation in performance. By building a model that predicts performance (in this case photoelectrochemical power generation of a solar fuels photoanode) from materials parameters (in this case composition and Raman signal), subsequent analysis of gradients in the trained model reveals key data relationships that are not readily identified by human inspection or traditional statistical analyses. Human interpretation of these key relationships produces the desired fundamental understanding, demonstrating a framework in which machine learning accelerates data interpretation by leveraging the expertise of the human scientist. We also demonstrate the use of neural network gradient analysis to automate prediction of the directions in parameter space, such as the addition of specific alloying elements, that may increase performance by moving beyond the confines of existing data.
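    As an illustration of the gradient-analysis idea described above, here is a minimal sketch in PyTorch, assuming a toy performance model over composition fractions and a Raman spectrum; the architecture, input sizes, and data are placeholders, not the authors' code.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the trained performance model described above:
# it maps materials parameters (composition fractions + a Raman spectrum)
# to a scalar performance metric such as photoelectrochemical power.
model = nn.Sequential(
    nn.Linear(3 + 128, 64),  # 3 composition fractions + 128 Raman channels (assumed sizes)
    nn.ReLU(),
    nn.Linear(64, 1),
)

# One sample point in the materials parameter space (random placeholder data).
composition = torch.rand(3, requires_grad=True)
raman = torch.rand(128, requires_grad=True)

# Predict performance and backpropagate to obtain gradients with respect to the inputs.
performance = model(torch.cat([composition, raman]))
performance.backward()

# The gradient indicates which direction in parameter space (e.g. which
# composition change) the model predicts would increase performance the most.
print("d(performance)/d(composition):", composition.grad)
print("suggested composition step:", composition.grad / composition.grad.norm())
```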

    Graph neural networks for materials science and chemistry

    Machine learning plays an increasingly important role in many areas of chemistry and materials science, being used to predict materials properties, accelerate simulations, design new structures, and predict synthesis routes of new materials. Graph neural networks (GNNs) are one of the fastest growing classes of machine learning models. They are of particular relevance for chemistry and materials science, as they work directly on a graph or structural representation of molecules and materials and therefore have full access to all relevant information required to characterize materials. In this Review, we provide an overview of the basic principles of GNNs, widely used datasets, and state-of-the-art architectures, followed by a discussion of a wide range of recent applications of GNNs in chemistry and materials science, and concluding with a roadmap for the further development and application of GNNs.
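    The core message-passing idea behind most GNN architectures can be sketched in a few lines of plain PyTorch; the layer below is a generic sum-aggregation example with assumed feature sizes, not any specific architecture from the review.

```python
import torch
import torch.nn as nn

class SimpleMessagePassing(nn.Module):
    """One round of sum-aggregation message passing over node features."""
    def __init__(self, dim):
        super().__init__()
        self.message = nn.Linear(2 * dim, dim)  # message from each (sender, receiver) pair
        self.update = nn.Linear(2 * dim, dim)   # update from (node, aggregated messages)

    def forward(self, x, edge_index):
        # x: (num_nodes, dim) node features; edge_index: (2, num_edges) sender/receiver indices
        src, dst = edge_index
        msgs = torch.relu(self.message(torch.cat([x[src], x[dst]], dim=-1)))
        agg = torch.zeros_like(x).index_add_(0, dst, msgs)  # sum messages arriving at each node
        return torch.relu(self.update(torch.cat([x, agg], dim=-1)))

# Toy molecule-like graph: 4 atoms, bonds given as directed edge pairs.
x = torch.rand(4, 16)
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3], [1, 0, 2, 1, 3, 2]])
layer = SimpleMessagePassing(16)
print(layer(x, edge_index).shape)  # torch.Size([4, 16])
```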

    Optimized Crystallographic Graph Generation for Material Science

    Graph neural networks are widely used in machine learning applied to chemistry, and in particular for materials science discovery. For crystalline materials, however, generating a graph-based representation from geometrical information for neural networks is not a trivial task. The periodicity of crystalline structures requires efficient implementations to be processed in real time in a massively parallel environment. With the aim of training graph-based generative models for new material discovery, we propose an efficient tool to generate cutoff graphs and k-nearest-neighbour graphs of periodic structures with GPU optimization. We provide pyMatGraph, a PyTorch-compatible framework to generate graphs in real time during the training of a neural network architecture. Our tool can update the graph of a structure, making generative models able to update the geometry and process the updated graph during forward propagation on the GPU side. Our code is publicly available at https://github.com/aklipf/mat-graph.
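    A minimal sketch of cutoff-graph generation for a periodic structure is shown below, using NumPy on the CPU rather than the pyMatGraph GPU implementation; the function name and the restriction to first-neighbour image cells are illustrative assumptions.

```python
import itertools
import numpy as np

def periodic_cutoff_graph(frac_coords, lattice, cutoff):
    """Edges (i, j, image) between atoms within `cutoff`, including periodic images.

    frac_coords: (N, 3) fractional coordinates; lattice: (3, 3) row-vector cell;
    cutoff: distance in the same units as the lattice vectors.
    """
    cart = frac_coords @ lattice
    edges = []
    # Check the home cell and its 26 neighbouring images (sufficient if cutoff < cell size).
    for image in itertools.product((-1, 0, 1), repeat=3):
        shift = np.array(image) @ lattice
        dists = np.linalg.norm(cart[None, :, :] + shift - cart[:, None, :], axis=-1)
        src, dst = np.nonzero((dists < cutoff) & (dists > 1e-8))  # drop self-loops
        edges += [(i, j, image) for i, j in zip(src, dst)]
    return edges

# Toy cubic cell with two atoms.
lattice = 4.0 * np.eye(3)
frac = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
print(periodic_cutoff_graph(frac, lattice, cutoff=3.6))
```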

    Putting Chemical Knowledge to Work in Machine Learning for Reactivity

    Machine learning has long been used to study chemical reactivity in fields such as physical organic chemistry, chemometrics, and cheminformatics. Recent advances in computer science have resulted in deep neural networks that can learn directly from the molecular structure. Neural networks are a good choice when large amounts of data are available. However, many datasets in chemistry are small, and models utilizing chemical knowledge are required for good performance. Adding chemical knowledge can be achieved either by adding more information about the molecules or by adjusting the model architecture itself. The current method of choice for adding more information is descriptors based on computed quantum-chemical properties. Exciting new research directions show that it is possible to augment deep learning with such descriptors for better performance in the low-data regime. To modify the models, differentiable programming enables seamless merging of neural networks with mathematical models from chemistry and physics. The resulting methods are also more data-efficient and make better predictions for molecules that differ from the initial dataset on which they were trained. Application of these chemistry-informed machine learning methods promises to accelerate research in fields such as drug design, materials design, catalysis, and reactivity.
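    One common way to add chemical knowledge as extra information is to concatenate computed quantum-chemical descriptors with a learned molecular representation before the prediction head. The sketch below illustrates this in PyTorch with assumed fingerprint and descriptor sizes; it is not a method from the paper.

```python
import torch
import torch.nn as nn

class DescriptorAugmentedModel(nn.Module):
    """Predicts a reactivity target from a learned embedding plus computed descriptors."""
    def __init__(self, fp_dim=1024, desc_dim=4, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(fp_dim, hidden), nn.ReLU())
        # Computed descriptors (e.g. HOMO/LUMO energies, partial charges) are
        # concatenated with the learned features, injecting chemical knowledge.
        self.head = nn.Sequential(nn.Linear(hidden + desc_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, fingerprint, descriptors):
        return self.head(torch.cat([self.encoder(fingerprint), descriptors], dim=-1))

# Placeholder batch: 8 molecules, each with a 1024-bit fingerprint and 4 quantum-chemical descriptors.
model = DescriptorAugmentedModel()
pred = model(torch.rand(8, 1024), torch.rand(8, 4))
print(pred.shape)  # torch.Size([8, 1])
```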

    DiSCoMaT: Distantly Supervised Composition Extraction from Tables in Materials Science Articles

    A crucial component in the curation of a knowledge base (KB) for a scientific domain (e.g., materials science, foods & nutrition, fuels) is information extraction from tables in the domain's published research articles. To facilitate research in this direction, we define a novel NLP task of extracting compositions of materials (e.g., glasses) from tables in materials science papers. The task involves solving several challenges in concert: tables that mention compositions have highly varying structures; text in captions and in the full paper needs to be incorporated along with the data in tables; and regular languages for numbers, chemical compounds, and composition expressions must be integrated into the model. We release a training dataset comprising 4,408 distantly supervised tables, along with 1,475 manually annotated dev and test tables. We also present a strong baseline, DISCOMAT, that combines multiple graph neural networks with several task-specific regular expressions, features, and constraints. We show that DISCOMAT outperforms recent table processing architectures by significant margins. Comment: Accepted as a long paper at ACL 2023 (https://2023.aclweb.org/program/accepted_main_conference/).
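    The kind of task-specific regular expression mentioned above can be illustrated as follows; the patterns are hypothetical examples for glass-style composition strings, not the expressions used in DISCOMAT.

```python
import re

# Illustrative patterns only: match chemical formulas like "SiO2" and
# glass-style composition expressions like "70SiO2-20Na2O-10CaO".
FORMULA = r"(?:[A-Z][a-z]?\d*(?:\.\d+)?)+"
COMPOSITION = rf"(?:\d+(?:\.\d+)?\s*{FORMULA})(?:\s*[-–]\s*\d+(?:\.\d+)?\s*{FORMULA})*"

cells = ["70SiO2-20Na2O-10CaO", "Tg (degC)", "45P2O5 - 30CaO - 25Na2O", "Sample ID"]
for cell in cells:
    if re.fullmatch(COMPOSITION, cell):
        print("composition cell:", cell)
```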

    Material Informatics through Neural Networks on Ab-Initio Electron Charge Densities: the Role of Transfer Learning

    In this work, advances in materials science and computer science meet the critical challenge of identifying efficient descriptors capable of capturing the essential features of physical systems. This task has remained formidable, with solutions often involving ad-hoc scalar and vectorial sets of materials properties, making optimization and transferability challenging. We extract representations directly from ab-initio differential electron charge density profiles using neural networks, highlighting the pivotal role of transfer learning in this task. First, we demonstrate significant improvements in the regression of a specific defected-materials property relative to training a deep network from scratch, both in the predictions and in their reproducibility, by considering various pre-trained models and selecting the optimal one after fine-tuning. The remarkable performance obtained confirms the transferability of existing pre-trained convolutional neural networks (CNNs) to physics-domain data very different from their original training data. Second, we demonstrate a saturation in the regression capabilities of computer vision models for properties of an extensive variety of undefected systems, and show how it can be overcome with the help of large language model (LLM) transformers, using as little text information as composition names. Finally, we show the insufficiency of open models, such as GPT-4, in achieving tasks and performance analogous to the proposed domain-specific ones. The work offers a promising avenue for enhancing the effectiveness of descriptor identification in complex physical systems, shedding light on the power of transfer learning to easily adapt and combine available models with different modalities to the physics domain, while also opening space for a benchmark of LLM capabilities in this domain.
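    A typical fine-tuning setup for this kind of transfer learning is sketched below: an ImageNet-pretrained CNN has its classification head replaced by a scalar regression head and is trained on charge-density profiles rendered as images. The choice of ResNet-18 and all data shapes are assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained CNN (the architecture choice is an assumption)
# and repurpose its head for regression of a scalar materials property.
cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
cnn.fc = nn.Linear(cnn.fc.in_features, 1)

# Placeholder batch of charge-density profiles rendered as 3-channel images.
densities = torch.rand(4, 3, 224, 224)
targets = torch.rand(4, 1)

# One fine-tuning step: all layers are updated, starting from the pretrained weights.
optimizer = torch.optim.Adam(cnn.parameters(), lr=1e-4)
loss = nn.functional.mse_loss(cnn(densities), targets)
loss.backward()
optimizer.step()
print("fine-tuning step done, loss =", loss.item())
```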

    Quality by design approach for tablet formulations containing spray coated ramipril by using artificial intelligence techniques

    Different software programs based on mathematical models have been developed to aid the product development process. Recent developments in mathematics and computer science have resulted in new programs based on artificial neural network (ANN) techniques. These programs have been used to develop and formulate pharmaceutical products. In this study, intelligent software was used to predict the relationship between the materials used in tablet formulation and the tablet specifications, and to obtain highly detailed information about the interactions between the formulation parameters and the specifications. The input data were generated from historical data and from the results obtained by analyzing tablets produced with different formulations. The relative significance of inputs on various outputs, such as assay, dissolution in 30 min, and crushing strength, was investigated using artificial neural networks, neurofuzzy logic, and genetic programming (FormRules, INForm ANN, and GEP). This study indicated that ANN and GEP can be used effectively for optimizing formulations, and that GEP can be evaluated statistically because of the openness of its equations. Additionally, FormRules was very helpful for teasing out the relationships between the inputs (formulation variables) and the outputs.
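    A small feedforward ANN of the kind used for such formulation-to-specification modelling can be sketched as follows; the inputs, outputs, and network size are illustrative assumptions, and the commercial tools named above (FormRules, INForm, GEP) are not reproduced here.

```python
import torch
import torch.nn as nn

# Illustrative only: maps formulation variables (e.g. excipient fractions,
# compression force) to tablet specifications (assay, dissolution at 30 min,
# crushing strength). Sizes and variables are assumptions, not the study's setup.
model = nn.Sequential(
    nn.Linear(6, 16), nn.ReLU(),
    nn.Linear(16, 3),
)

formulations = torch.rand(32, 6)   # placeholder historical formulation data
specs = torch.rand(32, 3)          # placeholder measured specifications

optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(formulations), specs)
    loss.backward()
    optimizer.step()
print("final training loss:", loss.item())
```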
