Prediction of Atomization Energy Using Graph Kernel and Active Learning
Data-driven prediction of molecular properties presents unique challenges to
the design of machine learning methods concerning data
structure/dimensionality, symmetry adaptation, and confidence management. In this
paper, we present a kernel-based pipeline that can learn and predict the
atomization energy of molecules with high accuracy. The framework employs
Gaussian process regression to perform predictions based on the similarity
between molecules, which is computed using the marginalized graph kernel. To
apply the marginalized graph kernel, a spatial adjacency rule is first employed
to convert molecules into graphs whose vertices and edges are labeled by
elements and interatomic distances, respectively. We then derive formulas for
the efficient evaluation of the kernel. Specific functional components for the
marginalized graph kernel are proposed, and the effect of the associated
hyperparameters on accuracy and predictive confidence is examined. We show
that the graph kernel is particularly suitable for predicting extensive
properties because its convolutional structure coincides with that of the
covariance formula between sums of random variables. Using an active learning
procedure, we demonstrate that the proposed method can achieve a mean absolute
error of 0.62 ± 0.01 kcal/mol using as few as 2000 training samples on the QM7
data set.
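As a rough illustration of the pipeline described above, the following sketch combines Gaussian process regression with an uncertainty-driven active learning loop. An RBF kernel over synthetic fixed-length descriptors stands in for the marginalized graph kernel, and all data, hyperparameters, and selection counts are illustrative rather than taken from the paper.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    """Stand-in similarity; the paper uses a marginalized graph kernel instead."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gpr_predict(X_train, y_train, X_test, noise=1e-2):
    """Gaussian process regression posterior mean and standard deviation."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train)
    alpha = np.linalg.solve(K, y_train)
    mean = K_star @ alpha
    v = np.linalg.solve(K, K_star.T)
    var = rbf_kernel(X_test, X_test).diagonal() - np.sum(K_star * v.T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

# Active learning: repeatedly move the pool sample with the largest
# predictive uncertainty into the training set.
rng = np.random.default_rng(0)
X_pool = rng.normal(size=(500, 8))                         # toy molecular descriptors
y_pool = X_pool.sum(axis=1) + 0.1 * rng.normal(size=500)   # toy "energies"
train_idx = list(rng.choice(500, size=10, replace=False))

for _ in range(20):
    pool_idx = [i for i in range(500) if i not in train_idx]
    _, std = gpr_predict(X_pool[train_idx], y_pool[train_idx], X_pool[pool_idx])
    train_idx.append(pool_idx[int(np.argmax(std))])
```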
Unfolding Quantum Computer Readout Noise
In the current era of noisy intermediate-scale quantum (NISQ) computers,
noisy qubits can produce biased results for early quantum algorithm
applications. This is a significant challenge for interpreting results from
quantum computer simulations for quantum chemistry, nuclear physics, high
energy physics, and other emerging scientific applications. An important class
of qubit errors is readout errors. The most basic method to correct readout
errors is matrix inversion, using a response matrix built from simple
operations to probe the rate of transitions from known initial quantum states
to readout outcomes. One challenge with inverting matrices with large
off-diagonal components is that the results are sensitive to statistical
fluctuations. This challenge is familiar to high energy physics, where
prior-independent regularized matrix inversion techniques ("unfolding") have
been developed for years to correct for acceptance and detector effects when
performing differential cross section measurements. We study various unfolding
methods in the context of universal gate-based quantum computers with the goal
of connecting the fields of quantum information science and high energy physics
and providing a reference for future work. The method known as iterative
Bayesian unfolding is shown to avoid pathologies from commonly used matrix
inversion and least squares methods.
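For orientation, here is a minimal sketch of iterative Bayesian unfolding applied to a toy single-qubit readout problem. The response matrix, the uniform prior, and the iteration count are placeholders chosen for illustration, not values from the study.

```python
import numpy as np

def iterative_bayesian_unfolding(response, measured, n_iter=4):
    """Iteratively update the estimate of the true counts.

    response[j, i] = P(measure outcome j | true state i)
    measured[j]    = observed counts of outcome j
    """
    truth = np.full(response.shape[1], measured.sum() / response.shape[1])  # uniform prior
    for _ in range(n_iter):
        folded = response @ truth                         # expected measured spectrum
        # Bayes' theorem: P(true i | measured j) under the current prior
        posterior = response * truth[None, :] / folded[:, None]
        truth = posterior.T @ measured                    # redistribute measured counts
    return truth

# Toy single-qubit readout: slightly noisy response matrix and measured counts
R = np.array([[0.95, 0.03],
              [0.05, 0.97]])
m = np.array([480.0, 520.0])
print(iterative_bayesian_unfolding(R, m))
```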
Chemistry on quantum computers with virtual quantum subspace expansion
Several novel methods for performing calculations relevant to quantum
chemistry on quantum computers have been proposed but not yet explored
experimentally. Virtual quantum subspace expansion [T. Takeshita et al., Phys.
Rev. X 10, 011004 (2020)] is one such algorithm developed for modeling complex
molecules using their full orbital space and without the need for additional
quantum resources. We implement this method on the IBM Q platform and calculate
the potential energy curves of the hydrogen and lithium dimers using only two
qubits and simple classical post-processing. A comparable level of accuracy
would require twenty qubits with previous approaches. We also develop an
approach to minimize the impact of experimental noise on the stability of a
generalized eigenvalue problem that is a crucial component of the algorithm.
Our results demonstrate that virtual quantum subspace expansion works well in
practice.
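One common way to stabilize a noisy generalized eigenvalue problem H c = E S c is to project out near-singular directions of the overlap matrix S before diagonalizing. The sketch below illustrates that idea on toy matrices; the threshold and the matrices are illustrative and do not reproduce the paper's actual stabilization procedure.

```python
import numpy as np

def solve_regularized_gevp(H, S, threshold=1e-2):
    """Solve H c = E S c after discarding overlap directions with small eigenvalues."""
    s_vals, s_vecs = np.linalg.eigh(S)
    keep = s_vals > threshold                      # drop noise-dominated directions
    # Canonical orthogonalization restricted to the well-conditioned subspace
    X = s_vecs[:, keep] / np.sqrt(s_vals[keep])
    energies, vecs = np.linalg.eigh(X.T @ H @ X)
    return energies, X @ vecs

# Toy 3x3 subspace matrices with two nearly linearly dependent basis vectors
S = np.array([[1.0, 0.6, 0.995],
              [0.6, 1.0, 0.6],
              [0.995, 0.6, 1.0]])
H = np.array([[-1.1, -0.4, -1.0],
              [-0.4, -0.9, -0.4],
              [-1.0, -0.4, -1.1]])
print(solve_regularized_gevp(H, S)[0])
```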
Resource-Efficient Chemistry on Quantum Computers with the Variational Quantum Eigensolver and the Double Unitary Coupled-Cluster Approach.
Applications of quantum simulation algorithms to obtain electronic energies of molecules on noisy intermediate-scale quantum (NISQ) devices require careful consideration of the quantum resources needed to describe complex electron correlation effects. In modeling second-quantized problems, the biggest challenge is that the number of qubits scales linearly with the size of the molecular basis. This poses a significant limitation on the size of the basis sets and the number of correlated electrons that can be included in quantum simulations of chemical processes. To address this issue and enable more realistic simulations on NISQ computers, we employ the double unitary coupled-cluster (DUCC) method to effectively downfold correlation effects into a reduced-size orbital space, commonly referred to as the active space. Using downfolding techniques, we demonstrate that properly constructed effective Hamiltonians can capture the effect of the whole orbital space in small-size active spaces. Combining this downfolding preprocessing technique with the variational quantum eigensolver, we solve for the ground-state energy of H2, Li2, and BeH2 in the cc-pVTZ basis using the DUCC-reduced active spaces. We compare these results to full configuration-interaction and high-level coupled-cluster reference calculations.
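As a point of reference for the variational step, the following sketch minimizes the energy of a toy one-qubit Hamiltonian with a single-parameter ansatz in a classical state-vector simulation. The Hamiltonian, ansatz, and optimizer settings are illustrative and unrelated to the DUCC-downfolded Hamiltonians used in the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Toy one-qubit Hamiltonian in the Pauli basis: H = 0.5*Z + 0.3*X
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [1.0, 0.0]])
H = 0.5 * Z + 0.3 * X

def ansatz(theta):
    """Single-parameter ansatz |psi(theta)> = Ry(theta)|0>."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def energy(theta):
    """Expectation value <psi|H|psi> minimized by the classical optimizer."""
    psi = ansatz(theta[0])
    return float(psi @ H @ psi)

result = minimize(energy, x0=[0.1], method="COBYLA")
print("VQE estimate:", result.fun, "exact:", np.linalg.eigvalsh(H)[0])
```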
ML4Chem: A Machine Learning Package for Chemistry and Materials Science
ML4Chem is an open-source machine learning library for chemistry and
materials science. It provides an extendable platform to develop and deploy
machine learning models and pipelines, and it targets both expert and
non-expert users. ML4Chem follows user-experience design principles and offers
the tools needed to go from data preparation to inference. Here we introduce its atomistic
module for the implementation, deployment, and reproducibility of atom-centered
models. This module is composed of six core building blocks: data,
featurization, models, model optimization, inference, and visualization. We
present their functionality and ease of use with demonstrations using neural
networks and kernel ridge regression algorithms.
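The building blocks listed above (data, featurization, model optimization, and inference) can be pictured with a generic scikit-learn pipeline like the one below. This sketch does not use the ML4Chem API; the descriptors, targets, and hyperparameters are synthetic placeholders.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

# Stand-in "featurization": random descriptors for a set of structures
rng = np.random.default_rng(42)
features = rng.normal(size=(200, 16))                       # e.g. atom-centered fingerprints
energies = features @ rng.normal(size=16) + 0.05 * rng.normal(size=200)

# data -> featurization -> model optimization -> inference
X_train, X_test, y_train, y_test = train_test_split(features, energies, test_size=0.2)
model = KernelRidge(kernel="rbf", alpha=1e-3, gamma=0.1)    # model + hyperparameters
model.fit(X_train, y_train)
mae = np.mean(np.abs(model.predict(X_test) - y_test))       # inference + evaluation
print(f"MAE: {mae:.4f}")
```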
ArQTiC: A full-stack software package for simulating materials on quantum computers
ArQTiC is an open-source, full-stack software package built for simulating
materials on quantum computers. It can currently simulate materials that can be
modeled by any Hamiltonian derived from a generic, one-dimensional,
time-dependent Heisenberg Hamiltonian. ArQTiC includes modules
for generating quantum programs for real- and imaginary-time evolution, quantum
circuit optimization, connection to various quantum backends via the cloud, and
post-processing of quantum results. By enabling users to seamlessly perform and
analyze materials simulations on quantum computers by simply providing a
minimal input text file, ArQTiC opens this field to a broader community of
scientists from a wider range of scientific domains.
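As a rough illustration of the kind of simulation ArQTiC automates, the sketch below builds a small, time-independent one-dimensional Heisenberg Hamiltonian with NumPy and evolves a product state in real time by exact exponentiation. It does not use ArQTiC's interface; the chain length, couplings, field strength, and evolution time are arbitrary.

```python
import numpy as np
from scipy.linalg import expm

# Pauli matrices
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron_all(ops):
    """Tensor product of a list of single-site operators."""
    out = np.array([[1.0 + 0j]])
    for op in ops:
        out = np.kron(out, op)
    return out

def heisenberg_hamiltonian(n_sites, J=1.0, h=0.5):
    """1D nearest-neighbour Heisenberg chain plus an external field along Z."""
    H = np.zeros((2 ** n_sites, 2 ** n_sites), dtype=complex)
    for i in range(n_sites - 1):
        for P in (X, Y, Z):
            ops = [I2] * n_sites
            ops[i], ops[i + 1] = P, P
            H += J * kron_all(ops)
    for i in range(n_sites):
        ops = [I2] * n_sites
        ops[i] = Z
        H += h * kron_all(ops)
    return H

# Real-time evolution of |000> under the chain Hamiltonian
n = 3
H = heisenberg_hamiltonian(n)
psi0 = np.zeros(2 ** n, dtype=complex)
psi0[0] = 1.0
psi_t = expm(-1j * H * 0.5) @ psi0        # state at time t = 0.5
print("survival probability:", abs(psi_t[0]) ** 2)
```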
Detecting Label Noise via Leave-One-Out Cross-Validation
We present a simple algorithm for identifying and correcting real-valued
noisy labels from a mixture of clean and corrupted sample points using Gaussian
process regression. A heteroscedastic noise model is employed, in which an
additive Gaussian noise term with its own variance is associated with each
observed label. Optimizing the noise model by maximum likelihood estimation
keeps the GPR model's predictive error within the posterior standard deviation
in leave-one-out cross-validation. A
multiplicative update scheme is proposed for solving the maximum likelihood
estimation problem under non-negativity constraints. While we provide a proof
of convergence only for certain special cases, the multiplicative scheme
exhibits monotonic convergence in virtually all of our numerical experiments.
We show that the presented method can pinpoint corrupted sample points and lead
to better regression models when trained on synthetic and real-world scientific
data sets.
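The detection step can be pictured with the closed-form Gaussian process leave-one-out formulas: points whose leave-one-out residual greatly exceeds the posterior standard deviation are flagged as likely corrupted. The sketch below uses a fixed noise variance and a simple z-score rule rather than the paper's multiplicative maximum likelihood update, and all data are synthetic.

```python
import numpy as np

def loo_flags(K, y, noise_var, z_threshold=3.0):
    """Flag likely-corrupted labels via leave-one-out GP residuals.

    K: kernel (similarity) matrix, y: labels, noise_var: per-sample noise variances.
    Uses the closed-form GP leave-one-out predictions (Rasmussen & Williams, Ch. 5).
    """
    K_y = K + np.diag(noise_var)
    K_inv = np.linalg.inv(K_y)
    alpha = K_inv @ y
    diag = np.diag(K_inv)
    loo_mean = y - alpha / diag           # LOO predictive mean for each point
    loo_var = 1.0 / diag                  # LOO predictive variance for each point
    z = np.abs(y - loo_mean) / np.sqrt(loo_var)
    return z > z_threshold                # True where the label looks corrupted

# Toy data: a smooth function with a few deliberately corrupted labels
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=50)
y[[5, 20, 40]] += 2.0                                        # inject label noise
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2)
flags = loo_flags(K, y, noise_var=np.full(50, 0.05 ** 2))
print(np.where(flags)[0])
```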
Towards a Better Understanding of On- and Off-Target Effects of the Lymphocyte-Specific Kinase LCK for the Development of Novel and Safer Pharmaceuticals
In this work, we have developed a multi-tiered computational platform to study protein-drug interactions. At the beginning of the workflow, more efficient but less accurate methods are used so that large libraries of proteins in many conformations and massive chemical libraries can be screened. At each subsequent step of the workflow, a subset of the input data is investigated with more accurate and more computationally expensive methods. We demonstrate the developed workflow by investigating the lymphocyte-specific kinase LCK, which is implicated as a drug target in many cancers and is also known to have toxic effects when unintentionally targeted. Several LCK states and conformations are investigated using molecular docking and molecular mechanics with generalized Born and surface area continuum solvation (MM/GBSA). Different variations of the drug screening process provide distinct results that may help elucidate the biological mechanisms underlying the drug interactions.
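The tiered structure of such a workflow can be sketched as a screening funnel: a cheap scoring function ranks the full library, and only a shrinking shortlist is passed to progressively more expensive methods. The scoring functions below are random placeholders standing in for docking and MM/GBSA rescoring, and the compound names are hypothetical.

```python
import random

def cheap_docking_score(compound):
    """Placeholder for a fast, approximate scoring method (e.g. rigid docking)."""
    return random.random()

def expensive_rescoring(compound):
    """Placeholder for a slower, more accurate method (e.g. MM/GBSA rescoring)."""
    return random.random()

def tiered_screen(library, keep_fraction=0.1):
    """Score the whole library cheaply, then rescore only the top fraction."""
    ranked = sorted(library, key=cheap_docking_score)
    shortlist = ranked[: max(1, int(keep_fraction * len(ranked)))]
    return sorted(shortlist, key=expensive_rescoring)

compounds = [f"ligand_{i}" for i in range(1000)]
print(tiered_screen(compounds)[:5])       # best candidates after both tiers
```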