35 research outputs found
Explainable AI reveals changes in skin microbiome composition linked to phenotypic differences
Alterations in the human microbiome have been observed in a variety of conditions such as asthma, gingivitis, dermatitis and cancer, and much remains to be learned about the links between the microbiome and human health. The fusion of artificial intelligence with rich microbiome datasets can offer an improved understanding of the microbiome’s role in human health. To gain actionable insights it is essential to consider both the predictive power and the transparency of the models by providing explanations for the predictions. We combine the collection of leg skin microbiome samples from two healthy cohorts of women with the application of an explainable artificial intelligence (EAI) approach that provides accurate predictions of phenotypes with explanations. The explanations are expressed in terms of variations in the relative abundance of key microbes that drive the predictions. We predict skin hydration, subject's age, pre/post-menopausal status and smoking status from the leg skin microbiome. The changes in microbial composition linked to skin hydration can accelerate the development of personalized treatments for healthy skin, while those associated with age may offer insights into the skin aging process. The leg microbiome signatures associated with smoking and menopausal status are consistent with previous findings from oral/respiratory tract microbiomes and vaginal/gut microbiomes respectively. This suggests that easily accessible microbiome samples could be used to investigate health-related phenotypes, offering potential for non-invasive diagnosis and condition monitoring. Our EAI approach sets the stage for new work focused on understanding the complex relationships between microbial communities and phenotypes. Our approach can be applied to predict any condition from microbiome samples and has the potential to accelerate the development of microbiome-based personalized therapeutics and non-invasive diagnostics
Functional materials discovery using energy–structure–function maps
Molecular crystals cannot be designed in the same manner as macroscopic objects, because they do not assemble according to simple, intuitive rules. Their structures result from the balance of many weak interactions, rather than from the strong and predictable bonding patterns found in metal–organic frameworks and covalent organic frameworks. Hence, design strategies that assume a topology or other structural blueprint will often fail. Here we combine computational crystal structure prediction and property prediction to build energy–structure–function maps that describe the possible structures and properties that are available to a candidate molecule. Using these maps, we identify a highly porous solid, which has the lowest density reported for a molecular crystal so far. Both the structure of the crystal and its physical properties, such as methane storage capacity and guest-molecule selectivity, are predicted using the molecular structure as the only input. More generally, energy–structure–function maps could be used to guide the experimental discovery of materials with any target function that can be calculated from predicted crystal structures, such as electronic structure or mechanical properties
Predicting the outcomes of organic reactions via machine learning: are current descriptors sufficient?
As machine learning/artificial intelligence algorithms are defeating chess masters and, most recently, Go champions, there is interest, and hope, that they will prove equally useful in assisting chemists in predicting outcomes of organic reactions. This paper demonstrates, however, that the applicability of machine learning to the problems of chemical reactivity over diverse types of chemistries remains limited. In particular, with the currently available chemical descriptors, fundamental mathematical theorems impose upper bounds on the accuracy with which reaction yields and times can be predicted. Improving the performance of machine-learning methods calls for the development of fundamentally new chemical descriptors
Reticular synthesis of porous molecular 1D nanotubes and 3D networks
Synthetic control over pore size and pore connectivity is the crowning achievement for porous metal–organic frameworks (MOFs). The same level of control has not been achieved for molecular crystals, which are not defined by strong, directional intermolecular coordination bonds. Hence, molecular crystallization is inherently less controllable than framework crystallization, and there are fewer examples of ‘reticular synthesis’, in which multiple building blocks can be assembled according to a common assembly motif. Here we apply a chiral recognition strategy to a new family of tubular covalent cages to create both 1D porous nanotubes and 3D diamondoid pillared porous networks. The diamondoid networks are analogous to MOFs prepared from tetrahedral metal nodes and linear ditopic organic linkers. The crystal structures can be rationalized by computational lattice-energy searches, which provide an in silico screening method to evaluate candidate molecular building blocks. These results are a blueprint for applying the ‘node and strut’ principles of reticular synthesis to molecular crystals
Parallel and distributed Thompson sampling for large-scale accelerated exploration of chemical space
Chemical space is so large that brute force searches for new interesting molecules are infeasible. High-throughput virtual screening via computer cluster simulations can speed up the discovery process by collecting very large amounts of data in parallel, e.g., up to hundreds or thousands of parallel measurements. Bayesian optimization (BO) can produce additional acceleration by sequentially identifying the most useful simulations or experiments to be performed next. However, current BO methods cannot scale to the large numbers of parallel measurements and the massive libraries of molecules currently used in high-throughput screening. Here, we propose a scalable solution based on a parallel and distributed implementation of Thompson sampling (PDTS). We show that, in small scale problems, PDTS performs similarly to parallel expected improvement (EI), a batch version of the most widely used BO heuristic. Additionally, in settings where parallel EI does not scale, PDTS outperforms other scalable baselines such as a greedy search, ε-greedy approaches and a random search method. These results show that PDTS is a successful solution for large-scale parallel BO
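The batch-selection idea behind parallel Thompson sampling can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy two-feature Bayesian linear model with a standard normal prior, a synthetic library of random feature vectors, and a hidden scoring function, all of which are hypothetical stand-ins for real molecular descriptors and simulations. The function names (`posterior`, `sample_weights`, `select_batch`, `run`) are invented for illustration. Each simulated worker draws its own sample from the posterior and picks the unevaluated candidate that looks best under that sample.

```python
import math
import random


def posterior(data, noise_var=0.25):
    """Gaussian posterior over weights for y = w . x + noise, with prior w ~ N(0, I)."""
    a, b, d = 1.0, 0.0, 1.0   # precision matrix [[a, b], [b, d]], initialised to the prior
    r0, r1 = 0.0, 0.0         # accumulates X^T y / noise_var
    for (x0, x1), y in data:
        a += x0 * x0 / noise_var
        b += x0 * x1 / noise_var
        d += x1 * x1 / noise_var
        r0 += x0 * y / noise_var
        r1 += x1 * y / noise_var
    det = a * d - b * b
    cov = (d / det, -b / det, a / det)  # covariance entries (c00, c01, c11)
    mean = (cov[0] * r0 + cov[1] * r1, cov[1] * r0 + cov[2] * r1)
    return mean, cov


def sample_weights(mean, cov, rng):
    """Draw one weight vector from the posterior via a hand-written 2x2 Cholesky factor."""
    c00, c01, c11 = cov
    l00 = math.sqrt(c00)
    l10 = c01 / l00
    l11 = math.sqrt(max(c11 - l10 * l10, 0.0))
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mean[0] + l00 * z0, mean[1] + l10 * z0 + l11 * z1)


def select_batch(library, data, evaluated, batch_size, rng):
    """One posterior sample per (simulated) worker; each picks its own best candidate."""
    mean, cov = posterior(data)
    chosen = []
    for _ in range(batch_size):
        w = sample_weights(mean, cov, rng)
        # Skipping already-chosen candidates is a design choice of this sketch.
        candidates = [i for i in range(len(library))
                      if i not in evaluated and i not in chosen]
        best = max(candidates,
                   key=lambda i: w[0] * library[i][0] + w[1] * library[i][1])
        chosen.append(best)
    return chosen


def run(n_rounds=5, batch_size=4, seed=0):
    """Toy loop: select a batch, 'measure' it with a hidden noisy score, update the data."""
    rng = random.Random(seed)
    library = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
    true_w = (1.5, -0.7)  # hypothetical ground truth, unknown to the model

    def score(x):
        # Noise std 0.5 matches the default noise_var=0.25 used in posterior().
        return true_w[0] * x[0] + true_w[1] * x[1] + rng.gauss(0.0, 0.5)

    data, evaluated = [], set()
    for _ in range(n_rounds):
        for i in select_batch(library, data, evaluated, batch_size, rng):
            data.append((library[i], score(library[i])))
            evaluated.add(i)
    return library, data, evaluated
```

Here the loop over `batch_size` stands in for independent workers. In a distributed setting each worker would draw its sample and score the library on its own node, which is what lets the approach scale without a joint batch acquisition function.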
Utilizing Machine Learning for Efficient Parameterization of Coarse Grained Molecular Force Fields
This work demonstrates the use of open literature data for force field parameterization via a novel approach applying Bayesian optimization. We have selected Dissipative Particle Dynamics (DPD) as the simulation method in this proof-of-concept work