
    Mechanism Deduction from Noisy Chemical Reaction Networks

    We introduce KiNetX, a fully automated meta-algorithm for the kinetic analysis of complex chemical reaction networks derived from semi-accurate but efficient electronic structure calculations. It is designed to (i) accelerate the automated exploration of such networks, and (ii) cope with model-inherent errors in electronic structure calculations on elementary reaction steps. We developed and implemented KiNetX to possess three features. First, KiNetX evaluates the kinetic relevance of every species in a (yet incomplete) reaction network to confine the search for new elementary reaction steps only to those species that are considered possibly relevant. Second, KiNetX identifies and eliminates all kinetically irrelevant species and elementary reactions to reduce a complex network graph to a comprehensible mechanism. Third, KiNetX estimates the sensitivity of species concentrations toward changes in individual rate constants (derived from relative free energies), which allows us to systematically select the most efficient electronic structure model for each elementary reaction given a predefined accuracy. The novelty of KiNetX lies in the rigorous propagation of correlated free-energy uncertainty through all steps of our kinetic analysis. To examine the performance of KiNetX, we developed AutoNetGen. It semirandomly generates chemistry-mimicking reaction networks by encoding chemical logic into their underlying graph structure. AutoNetGen allows us to consider a vast number of distinct chemistry-like scenarios and, hence, to assess the importance of rigorous uncertainty propagation in a statistical context. Our results reveal that KiNetX reliably supports the deduction of product ratios, dominant reaction pathways, and possibly other network properties from semi-accurate electronic structure data. Comment: 36 pages, 4 figures, 2 tables
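The abstract's central point, that correlated free-energy uncertainty must be propagated rigorously into kinetic quantities, can be illustrated with a minimal Monte Carlo sketch. All barrier heights and uncertainty widths below are hypothetical and not taken from KiNetX:

```python
import math
import random

random.seed(0)

R, T = 8.314e-3, 298.15   # gas constant (kJ/(mol K)) and temperature (K)
KBT_H = 6.21e12           # Eyring prefactor kB*T/h at 298.15 K, in 1/s

def branching(n_samples=20_000):
    """Propagate correlated barrier uncertainty into a branching ratio."""
    total = 0.0
    for _ in range(n_samples):
        shared = random.gauss(0.0, 1.7)   # method error hitting both barriers
        dg1 = 70.0 + shared + random.gauss(0.0, 1.0)   # kJ/mol, hypothetical
        dg2 = 75.0 + shared + random.gauss(0.0, 1.0)
        k1 = KBT_H * math.exp(-dg1 / (R * T))          # Eyring rate constants
        k2 = KBT_H * math.exp(-dg2 / (R * T))
        total += k1 / (k1 + k2)           # fraction funneled to product 1
    return total / n_samples

# The shared (correlated) error cancels in the ratio, so the branching
# ratio is far better determined than either rate constant alone.
print(f"mean branching ratio to product 1: {branching():.2f}")
```

Treating the two barrier errors as independent would overestimate the uncertainty of the product ratio, which is why correlated propagation matters.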

    Heuristics-Guided Exploration of Reaction Mechanisms

    For the investigation of chemical reaction networks, the efficient and accurate determination of all relevant intermediates and elementary reactions is mandatory. The complexity of such a network may grow rapidly, in particular if reactive species are involved that might cause a myriad of side reactions. Without automation, a complete investigation of complex reaction mechanisms is tedious and possibly infeasible. Therefore, only the expected dominant reaction paths of a chemical reaction network (e.g., a catalytic cycle or an enzymatic cascade) are usually explored in practice. Here, we present a computational protocol that constructs such networks in a parallelized and automated manner. Molecular structures of reactive complexes are generated based on heuristic rules derived from conceptual electronic-structure theory and subsequently optimized by quantum chemical methods to produce stable intermediates of an emerging reaction network. Pairs of intermediates in this network that might be related by an elementary reaction according to some structural similarity measure are then automatically detected and subjected to an automated search for the connecting transition state. The results are visualized as an automatically generated network graph, from which a comprehensive picture of the mechanism of a complex chemical process can be obtained, greatly facilitating the analysis of the whole network. We apply our protocol to the Schrock dinitrogen-fixation catalyst to study alternative pathways of catalytic ammonia production. Comment: 27 pages, 9 figures
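A highly simplified skeleton of the exploration loop described above can be sketched as follows. The candidate generator and the structural similarity measure are toy stand-ins; the real protocol relies on quantum chemical structure optimizations and transition-state searches:

```python
from itertools import combinations

def similarity(a, b):
    """Toy structural similarity: overlap of 'fragments' (here: characters)."""
    return len(set(a) & set(b)) / len(set(a) | set(b))

def explore(seed_species, generate_candidates, threshold=0.5):
    # 1) heuristically generate reactive complexes and new intermediates
    intermediates = set(seed_species)
    for s in seed_species:
        intermediates |= set(generate_candidates(s))
    # 2) structurally related pairs become candidates for an automated
    #    transition-state search, i.e. edges of the emerging network graph
    edges = [(a, b) for a, b in combinations(sorted(intermediates), 2)
             if similarity(a, b) >= threshold]
    return intermediates, edges

# toy "chemistry": each candidate adds one fragment to the seed species
nodes, network = explore(["AB"], lambda s: [s + "C", s + "D"])
print(sorted(nodes), network)
```

In the actual protocol, each surviving edge would trigger a transition-state optimization, and the loop would restart from the newly confirmed intermediates.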

    Electrosynthetic screening and modern optimization strategies for electrosynthesis of highly value-added products

    Unlike common analytical techniques such as cyclic voltammetry, statistics-based optimization tools are not yet an established part of the preparative organic electrochemist's toolbox. In general, experimental effort is not used optimally because experimental conditions are selected according to the one-variable-at-a-time principle. We summarize statistically motivated optimization approaches that have already been used in the context of electroorganic synthesis, and we discuss the central ideas of these methods, which originate from other fields of chemistry, in relation to electrosynthetic applications.
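A minimal sketch of why statistical designs beat the one-variable-at-a-time principle: a 2² full factorial design recovers an interaction effect that OVAT, which varies each factor with the other held at its center, cannot see. The response surface below is invented for illustration:

```python
from itertools import product

def yield_pct(current, potential):
    """Invented response surface with an interaction term."""
    return 50 + 5 * current + 5 * potential + 10 * current * potential

levels = (-1, +1)                      # coded factor levels
runs = list(product(levels, levels))   # 2^2 full factorial: only 4 runs
y = {run: yield_pct(*run) for run in runs}

# classical 2^2 effect estimates from the four corner runs
effect_current = (y[(1, 1)] + y[(1, -1)] - y[(-1, 1)] - y[(-1, -1)]) / 2
interaction = (y[(1, 1)] - y[(-1, 1)] - y[(1, -1)] + y[(-1, -1)]) / 2
print(f"main effect: {effect_current}, interaction: {interaction}")
```

OVAT at the center of the other factor would estimate the main effects but return zero for the interaction, missing the largest term of this surface.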

    A multi-label, dual-output deep neural network for automated bug triaging

    Bug tracking enables the monitoring and resolution of issues and bugs within organizations. Bug triaging, or assigning bugs to the owner(s) who will resolve them, is a critical component of this process because incorrect assignments waste developer time and reduce bug resolution throughput. In this work, we explore the use of a novel two-output deep neural network architecture (Dual DNN) for triaging a bug to both an individual team and developer, simultaneously. Dual DNN leverages this simultaneous prediction by exploiting its own guess of the team classes to aid in developer assignment. A multi-label classification approach is used for each of the two outputs to learn from all interim owners, not just the last one who closed the bug. We make use of a heuristic combination of the interim owners (owner-importance-weighted labeling) which is converted into a probability mass function (pmf). We employ a two-stage learning scheme, whereby the team portion of the model is trained first and then held static to train the team--developer and bug--developer relationships. The scheme employed to encode the team--developer relationships is based on an organizational chart (org chart), which renders the model robust to organizational changes as it can adapt to role changes within an organization. There is an observed average lift (with respect to both team and developer assignment) of 13%-points in 11-fold incremental-learning cross-validation (IL-CV) accuracy for Dual DNN utilizing owner-weighted labels compared with the traditional multi-class classification approach. Furthermore, Dual DNN with owner-weighted labels achieves average 11-fold IL-CV accuracies of 76% (team assignment) and 55% (developer assignment), outperforming reference models by 14%- and 25%-points, respectively, on a proprietary dataset with 236,865 entries. Comment: 8 pages, 2 figures, 9 tables
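The owner-importance-weighted labeling can be sketched as follows. The concrete weighting rule here (position-weighted, with the closing owner counted double) is a hypothetical stand-in for the paper's heuristic; only the idea of normalizing weighted interim owners to a pmf is taken from the abstract:

```python
def owner_pmf(interim_owners, closer_boost=2.0):
    """Heuristic owner-importance weights, normalized to a pmf.

    Hypothetical rule: later owners weigh more (position weight i),
    and the owner who closed the bug is counted double.
    """
    weights = {}
    for i, owner in enumerate(interim_owners, start=1):
        w = i * (closer_boost if i == len(interim_owners) else 1.0)
        weights[owner] = weights.get(owner, 0.0) + w
    total = sum(weights.values())
    return {owner: w / total for owner, w in weights.items()}

# alice touched the bug first and closed it; bob handled it in between
pmf = owner_pmf(["alice", "bob", "alice"])
print(pmf)   # soft multi-label target over developers, sums to 1
```

The resulting pmf serves as a soft training target, so the network learns from every developer who handled the bug rather than a single hard label.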

    Reliable estimation of prediction uncertainty for physico-chemical property models

    The predictions of parametric property models and their uncertainties are sensitive to systematic errors such as inconsistent reference data, parametric model assumptions, or inadequate computational methods. Here, we discuss the calibration of property models in the light of bootstrapping, a sampling method akin to Bayesian inference that can be employed for identifying systematic errors and for reliable estimation of the prediction uncertainty. We apply bootstrapping to assess a linear property model linking the 57Fe Mössbauer isomer shift to the contact electron density at the iron nucleus for a diverse set of 44 molecular iron compounds. The contact electron density is calculated with twelve density functionals across Jacob's ladder (PWLDA, BP86, BLYP, PW91, PBE, M06-L, TPSS, B3LYP, B3PW91, PBE0, M06, TPSSh). We provide systematic-error diagnostics and reliable, locally resolved uncertainties for isomer-shift predictions. Pure and hybrid density functionals yield average prediction uncertainties of 0.06-0.08 mm/s and 0.04-0.05 mm/s, respectively, the latter being close to the average experimental uncertainty of 0.02 mm/s. Furthermore, we show that both model parameters and prediction uncertainty depend significantly on the composition and number of reference data points. Accordingly, we suggest that rankings of density functionals based on performance measures (e.g., the coefficient of correlation, r2, or the root-mean-square error, RMSE) should not be inferred from a single data set. This study presents the first statistically rigorous calibration analysis for theoretical Mössbauer spectroscopy, which is of general applicability for physico-chemical property models and not restricted to isomer-shift predictions. We provide the statistically meaningful reference data set MIS39 and a new calibration of the isomer shift based on the PBE0 functional. Comment: 49 pages, 9 figures, 7 tables
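The bootstrapping procedure described above can be sketched for a generic linear property model y = a·x + b. The data below are synthetic, not the Mössbauer reference set; the point is that the spread of refitted predictions across resampled reference sets directly yields a prediction uncertainty:

```python
import random
import statistics

random.seed(1)

def fit_line(points):
    """Ordinary least squares for y = a*x + b."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    a = sxy / sxx
    return a, my - a * mx

# synthetic reference data for a linear property model (true line: 0.5x - 1)
data = [(x, 0.5 * x - 1.0 + random.gauss(0.0, 0.05)) for x in range(10)]

# bootstrap: resample the reference set with replacement, refit, predict
preds = []
for _ in range(2000):
    sample = [random.choice(data) for _ in data]
    a, b = fit_line(sample)
    preds.append(a * 4.5 + b)   # prediction at a new query point

print(f"prediction: {statistics.mean(preds):.3f} "
      f"+/- {statistics.stdev(preds):.3f}")
```

Repeating this with different compositions of the reference set reveals the sensitivity of both the fitted parameters and the uncertainty estimate, which is the abstract's argument against single-data-set functional rankings.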

    Computational Systems Chemistry with Rigorous Uncertainty Quantification

    The success of in silico design approaches for molecules and materials that attempt to solve major technological issues of our society depends crucially on knowing the uncertainty of property predictions. Calibration is an essential model-building approach in this respect as it renders the inference of uncertainty-equipped predictions based on computer simulations possible. However, there exist various pitfalls that may affect the transferability of a property model to new data. By resorting to Bayesian inference and resampling methods (bootstrapping and cross-validation), we discuss issues such as the proper selection of reference data and property models, the identification and elimination of systematic errors, and the rigorous quantification of prediction uncertainty. We apply this statistical calibration approach to the prediction of 57Fe Mössbauer isomer shifts from electron densities obtained with density functional theory. Our findings reveal that the specific selection of reference iron complexes can have a significant effect on the ranking of density functionals with respect to model transferability. Furthermore, we show that bootstrapping can be harnessed to determine the sensitivity of such model rankings to changes in the reference data set, which is indispensable for guiding future computational studies. Such a statistically rigorous approach to calibration is almost unknown in chemistry. Our study is one of the very few addressing this issue and its results can be applied by all chemists to arbitrary property models with our open-source software reBoot. In this thesis, we define a new standard for the calibration of computational results due to the rigor, transparency, and generality of our statistical approach, which is completely automatable.
Black-box uncertainty quantification can also be applied to macroscopic systems by propagating the uncertainties inferred for single-molecule properties, which will ultimately allow modeling in chemistry to accelerate the discovery of important drugs, organic materials for solar cells, electrolytes for flow batteries, etc. A rather fundamental application area of this systems-focused uncertainty quantification approach is the understanding of complex chemical reaction mechanisms, which is therefore another focus of this thesis. For an approach that accounts for all elementary processes within a reactive mixture, it is essential to know all relevant intermediates and transition states, to determine relative (free) energies, to quantify their uncertainties, and to model the system's kinetics based on uncertainty propagation. The advantage of a holistic in silico approach to chemistry is that the origin of all data can be rigorously controlled, which allows for reliable uncertainty quantification and propagation. In this thesis, we present the first automated exploration of parts of chemical reaction space based on quantum mechanical descriptors, using the example of synthetic nitrogen fixation. Moreover, an extension of the exploration strategy that considers uncertainty propagation through all stages of in silico modeling is presented in detail using the example of the formose reaction. It is generally hard to model the kinetics of such complex reactive systems as they usually constitute processes spanning multiple time scales. Here, we present a simple and efficient strategy based on computational singular perturbation, which allows us to model the kinetics of complex chemical systems at arbitrary time scales. To study arbitrary reaction networks of dilute chemical systems (low-pressure gas or low-concentration solution phase), we implemented a generalized scheme of our kinetic modeling approach referred to as KiNetX.
Main features of the completely automated KiNetX meta-algorithm are hierarchical network reduction, uncertainty propagation, and global sensitivity analysis, the latter of which detects critical (uncertainty-amplifying) regions of a network so that more complex electronic structure models are employed only where necessary. We also developed an automatic generator of abstract reaction networks encoding chemical logic, named AutoNetGen, which is coupled to KiNetX and allows us to examine a multitude of different chemical scenarios in a short time. In a final case study, we apply the insights gained from computational systems chemistry with rigorous uncertainty quantification to model the thermochemistry, kinetics, and spectroscopic properties of iron porphyrin compounds, which constitute a crucial type of active center in metalloenzyme research.
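The multi-time-scale difficulty mentioned above can be illustrated with the simplest relative of computational singular perturbation, the quasi-steady-state approximation (QSSA), in which a fast intermediate is slaved analytically to the slow degrees of freedom. The A → B → C model and its rate constants below are invented for illustration:

```python
import math

# A -> B -> C with a fast intermediate B (k2 >> k1)
k1, k2 = 1.0, 100.0   # 1/s, invented
a0, t = 1.0, 2.0      # initial [A] and an observation time, s

a = a0 * math.exp(-k1 * t)                 # slow decay of A
b_qssa = k1 * a / k2                       # QSSA: fast mode slaved to slow one
b_exact = a0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))

# once the fast transient has relaxed, the reduced model tracks the full one
print(f"QSSA: {b_qssa:.6f}   exact: {b_exact:.6f}")
```

Computational singular perturbation generalizes this idea by identifying the fast modes of an arbitrary network numerically instead of by inspection, which is what makes kinetic modeling at arbitrary time scales tractable.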

    Uncertainty Quantification of Reactivity Scales

    According to Mayr, polar organic synthesis can be rationalized by a simple empirical relationship linking bimolecular rate constants to as few as three reactivity parameters. Here, we propose an extension to Mayr's reactivity method that is rooted in uncertainty quantification and transforms the reactivity parameters into probability distributions. Through uncertainty propagation, these distributions can be transformed into uncertainty estimates for bimolecular rate constants. Chemists can exploit these virtual error bars to enhance synthesis planning and to decrease the ambiguity of conclusions drawn from experimental data. We demonstrate the above using the reference data set released by Mayr and co-workers [J. Am. Chem. Soc. 2001, 123, 9500; J. Am. Chem. Soc. 2012, 134, 13902]. As a by-product of the new approach, we obtain revised reactivity parameters for 36 π-nucleophiles and 32 benzhydrylium ions.
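Mayr's relationship, log10 k = sN(N + E), makes the proposed propagation easy to sketch: sample the three reactivity parameters from probability distributions and read the spread of log10 k as a virtual error bar. The parameter values and distribution widths below are hypothetical, not the revised parameters of the paper:

```python
import math
import random

random.seed(2)

# hypothetical parameter distributions (mean, standard deviation)
N,  sig_N  = 5.0, 0.2     # nucleophilicity
sN, sig_sN = 0.9, 0.05    # nucleophile-specific sensitivity
E,  sig_E  = -3.0, 0.2    # electrophilicity

# Monte Carlo propagation through log10 k = sN * (N + E)
samples = [
    random.gauss(sN, sig_sN) * (random.gauss(N, sig_N) + random.gauss(E, sig_E))
    for _ in range(20_000)
]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))
print(f"log10 k = {mean:.2f} +/- {std:.2f}")   # virtual error bar
```

A synthesis planner can then compare such error bars for competing nucleophile/electrophile pairs instead of bare point estimates.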

    Error Assessment of Computational Models in Chemistry

    Computational models in chemistry rely on a number of approximations. The effect of such approximations on observables derived from them is often unpredictable. Therefore, it is challenging to quantify the uncertainty of a computational result, which, however, is necessary to assess the suitability of a computational model. Common performance statistics such as the mean absolute error are prone to failure as they do not distinguish the explainable (systematic) part of the errors from their unexplainable (random) part. In this paper, we discuss problems and solutions for the performance assessment of computational models based on several examples from the quantum chemistry literature. For this purpose, we elucidate the different sources of uncertainty, the elimination of systematic errors, and the combination of individual uncertainty components into the uncertainty of a prediction.
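A toy illustration of the point about the mean absolute error: the two invented error sets below share the same MAE, yet one is almost entirely systematic (removable by calibration) and the other almost entirely random (an irreducible uncertainty):

```python
import statistics

systematic = [0.50, 0.48, 0.52, 0.49, 0.51]    # near-constant offset
random_err = [0.90, -0.40, 0.55, -0.45, 0.20]  # scattered around ~0

for name, errors in (("systematic", systematic), ("random", random_err)):
    mae = sum(abs(e) for e in errors) / len(errors)   # identical: 0.50
    bias = statistics.mean(errors)      # explainable part (calibratable)
    spread = statistics.stdev(errors)   # unexplainable part (remains)
    print(f"{name}: MAE={mae:.2f}  bias={bias:+.2f}  spread={spread:.2f}")
```

Reporting bias and spread separately, rather than a single MAE, tells the reader how much of a model's error could be removed by recalibration.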