
    Charge-Transfer Selectivity and Quantum Interference in Real-Time Electron Dynamics: Gaining Insights from Time-Dependent Configuration Interaction Simulations

    Many-electron wavepacket dynamics based on time-dependent configuration interaction (TDCI) is a numerically rigorous approach for quantitatively modeling electron transfer across molecular junctions. TDCI simulations of cyanobenzene thiolates, para- and meta-linked to an acceptor gold atom, show donor states \emph{conjugating} with the benzene π-network to allow better through-molecule electron migration in the para isomer than in the meta counterpart. For dynamics involving \emph{non-conjugating} states, we find electron injection to stem exclusively from distance-dependent, non-resonant quantum mechanical tunneling, in which case the meta isomer exhibits the more efficient dynamics. The computed trend in donor-to-acceptor net electron transfer through differently linked azulene bridges agrees with the trend seen in low-bias conductivity measurements. Disruption of π-conjugation is shown to cause the diminished electron injection through 1,3-azulene, a pathological case for graph-based diagnosis of destructive quantum interference. Furthermore, we demonstrate that quantum interference of many-electron wavefunctions drives para- vs. meta-selectivity in the coherent evolution of superposed π(CN)- and σ(NC-C)-type wavepackets. Analyses reveal that in the para-linked benzene the σ and π MOs localized at the donor terminal are \emph{in phase}, leading to constructive interference of the electron density distribution, while a phase flip of one of the MOs in the meta isomer results in destructive interference. These findings suggest that \emph{a priori} detection of orbital phase flips and quantum coherence conditions can aid molecular device design strategies.
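
    The coherent, field-free evolution referred to above can be illustrated with a purely schematic wavepacket propagation in a CI eigenbasis. The Hamiltonian, initial superposition, and acceptor projector below are toy placeholders, not the authors' TDCI setup; the sketch only shows how an acceptor population would be tracked over time.

    # Minimal sketch of CI wavepacket propagation (illustrative, not the authors' code).
    import numpy as np

    def propagate_population(H, c0, P_acc, t_max=100.0, n_steps=400):
        """Propagate |c(t)> = exp(-i H t) |c0> and record the acceptor population."""
        E, U = np.linalg.eigh(H)                      # CI eigenvalues/eigenvectors (hbar = 1)
        c_eig = U.conj().T @ c0                       # initial wavepacket in the eigenbasis
        times = np.linspace(0.0, t_max, n_steps)
        pops = []
        for t in times:
            c_t = U @ (np.exp(-1j * E * t) * c_eig)   # coherent, field-free evolution
            pops.append(np.real(np.vdot(c_t, P_acc @ c_t)))
        return times, np.array(pops)

    # Toy three-state donor-bridge-acceptor model; energies and couplings are arbitrary.
    H = np.array([[0.00, 0.02, 0.00],
                  [0.02, 0.10, 0.02],
                  [0.00, 0.02, 0.01]])
    c0 = np.array([1.0, 0.0, 0.0], dtype=complex)     # wavepacket starts on the donor
    P_acc = np.diag([0.0, 0.0, 1.0])                  # projector onto the acceptor state
    t, p = propagate_population(H, c0, P_acc)
    print("maximum acceptor population:", p.max())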

    Many Molecular Properties from One Kernel in Chemical Space

    We introduce property-independent kernels for machine learning models of arbitrarily many molecular properties. The kernels encode molecular structures for training sets of varying size, with similarity measures sufficiently diffuse in chemical space to sample over all training molecules. Provided the corresponding molecular reference properties, they enable the instantaneous generation of ML models that can be systematically improved through the addition of more data. The idea is exemplified by single-kernel models of internal energy, enthalpy, free energy, heat capacity, polarizability, electronic spread, zero-point vibrational energy, frontier orbital energies, HOMO-LUMO gap, and the highest fundamental vibrational wavenumber. Models of these properties are trained and tested using 112 k organic molecules of similar size. We discuss the resulting models as well as the use of the same kernels for generating models of further properties.
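
    A minimal sketch of the single-kernel idea, assuming a generic Gaussian kernel on some vector representation as a stand-in for the diffuse, property-independent kernel described above: the regularized kernel matrix is factorized once and reused to solve for the regression coefficients of every property. All descriptors and hyperparameters below are illustrative.

    # One kernel, many property models via kernel ridge regression (schematic).
    import numpy as np
    from scipy.spatial.distance import cdist
    from scipy.linalg import cho_factor, cho_solve

    def gaussian_kernel(A, B, sigma=10.0):
        """Gaussian kernel between two sets of row-vector descriptors."""
        return np.exp(-cdist(A, B, "sqeuclidean") / (2.0 * sigma ** 2))

    def train_many_properties(X_train, properties, lam=1e-6):
        """Factorize the regularized kernel once; solve for every property's coefficients."""
        K = gaussian_kernel(X_train, X_train)
        K[np.diag_indices_from(K)] += lam
        factor = cho_factor(K)
        return {name: cho_solve(factor, y) for name, y in properties.items()}

    def predict(X_train, X_test, alphas):
        """Reuse the same kernel for every property at prediction time."""
        K_test = gaussian_kernel(X_test, X_train)
        return {name: K_test @ a for name, a in alphas.items()}

    # Toy usage: random descriptors stand in for molecular representations.
    rng = np.random.default_rng(0)
    X_train, X_test = rng.normal(size=(200, 30)), rng.normal(size=(20, 30))
    reference = {"enthalpy": rng.normal(size=200), "polarizability": rng.normal(size=200)}
    coeffs = train_many_properties(X_train, reference)
    predictions = predict(X_train, X_test, coeffs)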

    Electronic Spectra from TDDFT and Machine Learning in Chemical Space

    Due to its favorable computational efficiency, time-dependent (TD) density functional theory (DFT) enables the prediction of electronic spectra in a high-throughput manner across chemical space. Its predictions, however, can be quite inaccurate. We resolve this issue with machine learning models trained on the deviations of reference second-order approximate coupled-cluster singles and doubles (CC2) spectra from their TDDFT counterparts, or even from the DFT gap. We applied this approach to the low-lying singlet-singlet vertical electronic spectra of over 20 thousand synthetically feasible small organic molecules with up to eight CONF (C, O, N, F) atoms. The prediction errors decay monotonically as a function of training set size. For a training set of 10 thousand molecules, CC2 excitation energies can be reproduced to within ±0.1 eV for the remaining molecules. Analysis of our spectral database via chromophore counting suggests that even higher accuracies can be achieved. Based on the evidence collected, we discuss open challenges associated with data-driven modeling of high-lying spectra and transition intensities.
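
    The correction scheme can be sketched as learning only the CC2 minus TDDFT deviation and adding it back to the cheap TDDFT value; the snippet below also traces how the out-of-sample error shrinks with training set size. Descriptors, energies, and hyperparameters are synthetic placeholders, not the data or settings used in the study.

    # Learn the deviation of reference excitation energies from a cheap baseline (schematic).
    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    rng = np.random.default_rng(1)
    X = rng.normal(size=(2000, 40))                  # stand-in molecular representations
    e_tddft = rng.normal(5.0, 0.5, size=2000)        # synthetic "TDDFT" excitation energies (eV)
    e_cc2 = e_tddft + 0.2 * np.tanh(X[:, 0])         # synthetic "CC2" reference with a learnable shift

    X_test, base_test, ref_test = X[-500:], e_tddft[-500:], e_cc2[-500:]
    for n in (125, 250, 500, 1000):                  # learning curve over training-set size
        model = KernelRidge(kernel="rbf", gamma=1e-2, alpha=1e-8)
        model.fit(X[:n], e_cc2[:n] - e_tddft[:n])    # learn only the CC2 - TDDFT deviation
        pred = base_test + model.predict(X_test)     # corrected TDDFT as a CC2-quality estimate
        print(n, "training molecules -> MAE:", np.abs(pred - ref_test).mean())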

    Genetic optimization of training sets for improved machine learning models of molecular properties

    The training of molecular models of quantum mechanical properties by statistical machine learning requires large datasets which exemplify the map from chemical structure to molecular property. Intelligent a priori selection of training examples is often difficult or impossible to achieve, as prior knowledge may be sparse or unavailable. Ordinarily, a representative selection of training molecules from such datasets is achieved through random sampling. We use genetic algorithms to optimize the composition of training sets drawn from tens of thousands of small organic molecules. The resulting machine learning models are considerably more accurate than those built on small, randomly selected training sets: mean absolute errors for out-of-sample predictions are reduced to ~25% for enthalpies, free energies, and zero-point vibrational energies, to ~50% for heat capacity, electronic spread, and polarizability, and by more than ~20% for electronic properties such as frontier orbital eigenvalues or dipole moments. We discuss and present optimized training sets consisting of 10 molecular classes for all molecular properties studied, and show that these classes can be used to design improved training sets for machine learning models of the same properties in similar but unrelated molecular sets.
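
    A schematic version of such a genetic optimization, assuming a simple kernel ridge surrogate and a held-out validation set as the fitness signal: individuals are index sets over a candidate pool, crossover merges two parents, and mutation swaps a single molecule. Population sizes, rates, and hyperparameters are placeholders, not the settings used in the study.

    # Genetic optimization of training-set composition (illustrative sketch).
    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    def fitness(idx, X, y, X_val, y_val):
        """Negative out-of-sample MAE of a surrogate model trained on the candidate subset."""
        model = KernelRidge(kernel="rbf", gamma=1e-2, alpha=1e-8).fit(X[idx], y[idx])
        return -np.abs(model.predict(X_val) - y_val).mean()

    def evolve(X, y, X_val, y_val, set_size=100, pop_size=20, generations=30, seed=0):
        rng = np.random.default_rng(seed)
        pool = np.arange(len(X))
        population = [rng.choice(pool, set_size, replace=False) for _ in range(pop_size)]
        for _ in range(generations):
            scores = np.array([fitness(ind, X, y, X_val, y_val) for ind in population])
            parents = [population[i] for i in np.argsort(scores)[-pop_size // 2:]]  # keep the fittest half
            children = []
            for _ in range(pop_size - len(parents)):
                a, b = rng.choice(len(parents), 2, replace=False)
                child = rng.choice(np.union1d(parents[a], parents[b]), set_size, replace=False)  # crossover
                child[rng.integers(set_size)] = rng.choice(np.setdiff1d(pool, child))            # mutation
                children.append(child)
            population = parents + children
        return max(population, key=lambda ind: fitness(ind, X, y, X_val, y_val))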

    Big Data meets Quantum Chemistry Approximations: The Δ\Delta-Machine Learning Approach

    Chemically accurate and comprehensive studies of the virtual space of all possible molecules are severely limited by the computational cost of quantum chemistry. We introduce a composite strategy that adds machine learning corrections to computationally inexpensive, approximate legacy quantum methods. After training, highly accurate predictions of enthalpies, free energies, entropies, and electron correlation energies are possible for significantly larger molecular sets than those used for training. For thermochemical properties of up to 16k constitutional isomers of C7H10O2 we present numerical evidence that chemical accuracy can be reached. We also predict electron correlation energies of post-Hartree-Fock methods at the computational cost of Hartree-Fock, and we establish a qualitative relationship between molecular entropy and electron correlation. The transferability of our approach is demonstrated, using semi-empirical quantum chemistry and machine learning models trained on 1% and 10% of 134k organic molecules, by reproducing the enthalpies of all remaining molecules at density functional theory level of accuracy.
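
    A minimal sketch of the composite idea, assuming precomputed baseline values from a cheap method and target-level values for the training molecules only; the representation, kernel, and hyperparameters are illustrative stand-ins rather than the published protocol.

    # Delta-style composite: target-level estimate = cheap baseline + learned correction (schematic).
    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    class DeltaModel:
        """Adds a machine-learned correction on top of an inexpensive baseline value."""

        def __init__(self, gamma=1e-2, alpha=1e-8):
            self.krr = KernelRidge(kernel="rbf", gamma=gamma, alpha=alpha)

        def fit(self, X, baseline, target):
            self.krr.fit(X, target - baseline)        # learn only the correction term
            return self

        def predict(self, X, baseline):
            return baseline + self.krr.predict(X)     # add the correction to the cheap baseline

    # Usage pattern (arrays are placeholders for descriptors and precomputed energies):
    #   model = DeltaModel().fit(X_train, cheap_train, accurate_train)
    #   mae = np.abs(model.predict(X_test, cheap_test) - accurate_test).mean()
    # Chemical accuracy corresponds to an MAE of about 1 kcal/mol.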

    Resolution-vs.-Accuracy Dilemma in Machine Learning Modeling of Electronic Excitation Spectra

    In this study, we explore the potential of machine learning for modeling molecular electronic spectral intensities as a continuous function over a given wavelength range. Since presently available chemical space datasets provide excitation energies and corresponding oscillator strengths for only a few valence transitions, we present a new dataset, bigqm7ω, with 12,880 molecules containing up to seven CONF (C, O, N, F) atoms, and report several ground state and excited state properties. A publicly accessible web-based data-mining platform is presented to facilitate on-the-fly screening of several molecular properties, including harmonic vibrational and electronic spectra. For all molecules, we present all singlet electronic transitions calculated within the time-dependent density functional theory framework using the ωB97XD exchange-correlation functional and a diffuse-function-augmented basis set. The resulting spectra predominantly span the X-ray to deep-UV region (10-120 nm). To compare the target spectra with predictions based on small basis sets, we integrate spectral intensities in bins and show that good agreement (confidence score >75) is obtained only at the expense of resolution. In contrast, machine learning models based on the latest structural representations, trained directly on <10% of the target data, recover the spectra of the remaining molecules with better accuracy at a desirable <1 nm wavelength resolution.
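
    The binning step can be illustrated with a short helper that integrates the oscillator strengths of a stick spectrum into fixed-width wavelength bins over the 10-120 nm window; the bin width and the example transitions are arbitrary illustrative choices, not values from the dataset.

    # Bin a stick spectrum into fixed-width wavelength bins (schematic).
    import numpy as np

    def bin_spectrum(wavelengths_nm, osc_strengths, lo=10.0, hi=120.0, width_nm=1.0):
        """Integrate oscillator strength per fixed-width wavelength bin."""
        edges = np.arange(lo, hi + width_nm, width_nm)
        binned, _ = np.histogram(wavelengths_nm, bins=edges, weights=osc_strengths)
        centers = 0.5 * (edges[:-1] + edges[1:])
        return centers, binned

    # Toy stick spectrum with three transitions; widening the bins trades resolution
    # for easier agreement between methods, mirroring the dilemma discussed above.
    centers, spectrum = bin_spectrum([58.3, 61.7, 95.2], [0.02, 0.15, 0.07], width_nm=1.0)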