SPH Simulations with Reconfigurable Hardware Accelerator
We present a novel approach to accelerate astrophysical hydrodynamical
simulations. In astrophysical many-body simulations, the GRAPE (GRAvity piPE)
system has been widely used. However, the function of a GRAPE system is
completely fixed because a specially developed LSI is used as its
computing engine. Instead of using such an LSI, we are developing a
special-purpose computing system using Field Programmable Gate Array (FPGA) chips as
the computing engine. Together with our developed programming system, we have
implemented computing pipelines for the Smoothed Particle Hydrodynamics (SPH)
method on our PROGRAPE-3 system. The SPH pipelines running on the PROGRAPE-3
system have a peak speed of 85 GFLOPS, and in a realistic setup the SPH
calculation using one PROGRAPE-3 board is 5-10 times faster than the
calculation on the host computer. Our results clearly show for the first time
that we can accelerate SPH simulations of a simple astrophysical phenomenon
using the considerable computing power offered by the hardware.
Comment: 27 pages, 13 figures, submitted to PAS
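For readers unfamiliar with the SPH method named in this abstract, the density summation at its core can be sketched in a few lines. This is a minimal illustration on a host CPU, assuming the standard 3D cubic spline kernel and a single fixed smoothing length `h`; the function names are hypothetical, and nothing here reflects the PROGRAPE-3 pipeline design.

```python
import math


def cubic_spline_kernel(r, h):
    """Standard 3D cubic spline kernel W(r, h) with compact support 2h."""
    q = r / h
    sigma = 1.0 / (math.pi * h ** 3)  # 3D normalisation constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q ** 2 + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0


def sph_density(positions, masses, h):
    """Direct O(N^2) summation: rho_i = sum_j m_j W(|r_i - r_j|, h)."""
    densities = []
    for xi in positions:
        rho = 0.0
        for xj, mj in zip(positions, masses):
            rho += mj * cubic_spline_kernel(math.dist(xi, xj), h)
        densities.append(rho)
    return densities
```

The double loop over particle pairs is the kind of regular, arithmetic-heavy work that a hardware pipeline is suited to; a production code would restrict the inner loop to neighbours within `2h`.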
Modern Approaches to Exact Diagonalization and Selected Configuration Interaction with the Adaptive Sampling CI Method.
Recent advances in selected configuration interaction methods have made them competitive with the most accurate techniques available and, hence, have created an increasingly powerful tool for solving quantum Hamiltonians. In this work, we build on recent advances from the adaptive sampling configuration interaction (ASCI) algorithm. We show that a useful paradigm for generating efficient selected CI/exact diagonalization algorithms is driven by fast sorting algorithms, much in the same way iterative diagonalization is based on the paradigm of matrix vector multiplication. We present several new algorithms for all parts of performing a selected CI, including a new ASCI search, dynamic bit masking, fast orbital rotations, fast diagonal matrix elements, and residue arrays. The ASCI search algorithm can be used in several different modes, including an integral driven search and a coefficient driven search. The algorithms presented here are fast and scalable, and we find that because they are built on fast sorting algorithms they are more efficient than all other approaches we considered. After introducing these techniques, we present ASCI results applied to a large range of systems and basis sets to demonstrate the types of simulations that can be practically treated at the full-CI level with modern methods and hardware, presenting double- and triple-ζ benchmark data for the G1 data set. The largest of these calculations is Si2H6, which is a simulation of 34 electrons in 152 orbitals. We also present some preliminary results for fast deterministic perturbation theory simulations that use hash functions to maintain high efficiency for treating large basis sets.
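The idea that a selected CI step can be organised around sorting can be illustrated with a small sketch. It assumes determinants are encoded as integer bit strings and that a search step emits `(determinant, contribution)` pairs with duplicates; sorting brings equal determinants together so their contributions can be summed in one pass. The function names and the ranking by absolute value are illustrative assumptions, not the ASCI implementation.

```python
from itertools import groupby
from operator import itemgetter


def merge_contributions(pairs):
    """Sort (determinant, value) pairs by determinant and sum duplicates."""
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (det, sum(value for _, value in group))
        for det, group in groupby(ordered, key=itemgetter(0))
    ]


def select_top(merged, k):
    """Keep the k determinants with the largest accumulated magnitude."""
    return sorted(merged, key=lambda item: abs(item[1]), reverse=True)[:k]


# Usage: two contributions reach determinant 0b0101 and are merged.
pairs = [(0b0101, 0.25), (0b0011, 0.5), (0b0101, 0.5)]
merged = merge_contributions(pairs)
best = select_top(merged, 1)
```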
Kinetic Monte Carlo Simulations of Crystal Growth in Ferroelectric Alloys
The growth rates and chemical ordering of ferroelectric alloys are studied
with kinetic Monte Carlo (KMC) simulations using an electrostatic model with
long-range Coulomb interactions, as a function of temperature, chemical
composition, and substrate orientation. Crystal growth is characterized by
thermodynamic processes involving adsorption and evaporation, with
solid-on-solid restrictions and excluding diffusion. A KMC algorithm is
formulated to simulate this model efficiently in the presence of long-range
interactions. Simulations were carried out on Ba(Mg_{1/3}Nb_{2/3})O_3 (BMN)
type materials. Compared to the simple rocksalt ordered structures, ordered BMN
grows only at very low temperatures and only under finely tuned conditions. For
materials with tetravalent compositions, such as (1-x)Ba(Mg_{1/3}Nb_{2/3})O_3 +
xBaZrO_3 (BMN-BZ), the model does not incorporate tetravalent ions at
low-temperature, exhibiting a phase-separated ground state instead. At higher
temperatures, tetravalent ions can be incorporated, but the resulting crystals
show no chemical ordering in the absence of diffusive mechanisms.
Comment: 13 pages, 16 postscript figures, submitted to Physics Review B
Journa
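A single step of a generic rejection-free kinetic Monte Carlo scheme, of the kind used for adsorption and evaporation events, can be sketched as follows. This is an assumption-laden illustration: the rates are supplied by the caller as a plain list, and the efficient treatment of long-range Coulomb interactions described in the abstract is not reproduced. The names `select_event` and `kmc_step` are hypothetical.

```python
import math
import random


def select_event(rates, u):
    """Pick an event index with probability rate_i / sum(rates), given u in [0, 1)."""
    target = u * sum(rates)
    cumulative = 0.0
    for index, rate in enumerate(rates):
        cumulative += rate
        if target < cumulative:
            return index
    return len(rates) - 1  # guard against floating-point round-off


def kmc_step(rates, rng):
    """Return (event index, time increment) for one rejection-free KMC step."""
    index = select_event(rates, rng.random())
    # Waiting time is exponentially distributed with mean 1 / sum(rates).
    dt = -math.log(1.0 - rng.random()) / sum(rates)
    return index, dt


# Usage with two hypothetical events (e.g. adsorption, evaporation).
event, dt = kmc_step([1.0, 3.0], random.Random(0))
```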
TRIQS/CTHYB: A Continuous-Time Quantum Monte Carlo Hybridization Expansion Solver for Quantum Impurity Problems
We present TRIQS/CTHYB, a state-of-the-art open-source implementation of the
continuous-time hybridisation expansion quantum impurity solver of the TRIQS
package. This code is mainly designed to be used with the TRIQS library in
order to solve the self-consistent quantum impurity problem in a multi-orbital
dynamical mean field theory approach to strongly-correlated electrons, in
particular in the context of realistic calculations. It is implemented in C++
for efficiency and is provided with a high-level Python interface. The code
ships with a new partitioning algorithm that divides the local Hilbert space
without any user knowledge of the symmetries and quantum numbers of the
Hamiltonian. Furthermore, we implement higher-order configuration moves and
show that such moves are necessary to ensure ergodicity of the Monte Carlo in
common Hamiltonians even without symmetry-breaking.
Comment: 19 pages, this is a companion article to that describing the TRIQS
librar
Quantum Monte Carlo with very large multideterminant wavefunctions
An algorithm to compute efficiently the first two derivatives of (very) large
multideterminant wavefunctions for quantum Monte Carlo calculations is
presented. The calculation of determinants and their derivatives is performed
using the Sherman-Morrison formula for updating the inverse Slater matrix. An
improved implementation based on the reduction of the number of column
substitutions and on a very efficient implementation of the calculation of the
scalar products involved is presented. It is emphasized that multideterminant
expansions contain in general a large number of identical spin-specific
determinants: for typical configuration interaction-type wavefunctions the
number of unique spin-specific determinants
$N_{\rm dets}^{\sigma}$ ($\sigma=\uparrow,\downarrow$) with a non-negligible weight in the expansion is
of order $\mathcal{O}(\sqrt{N_{\rm dets}})$. We show that a careful implementation
of the calculation of the $N_{\rm dets}$-dependent contributions can make this
step negligible enough so that in practice the algorithm scales as the total
number of unique spin-specific determinants, $N_{\rm dets}^{\uparrow} + N_{\rm dets}^{\downarrow}$, over a wide range of total number of determinants (here,
$N_{\rm dets}$ up to about one million), thus greatly reducing the total
computational cost. Finally, a new truncation scheme for the multideterminant
expansion is proposed so that larger expansions can be considered without
increasing the computational time. The algorithm is illustrated with
all-electron Fixed-Node Diffusion Monte Carlo calculations of the total energy
of the chlorine atom. Calculations using a trial wavefunction including about
750 000 determinants with a computational cost increase of a factor of 400 compared to a
single-determinant calculation are shown to be feasible.
Comment: 9 pages, 3 figure
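The Sherman-Morrison update mentioned in this abstract can be sketched for the case of a single column substitution. Given the inverse of a matrix $A$ and a new column $c$ for position $j$, the determinant ratio is $(A^{-1}c)_j$ and the inverse is updated in $O(n^2)$ operations instead of being recomputed. This is a minimal pure-Python sketch with a hypothetical function name; it does not include the paper's reduction of the number of column substitutions or its optimised scalar products.

```python
def replace_column(a_inv, j, new_col):
    """Sherman-Morrison update of a_inv after column j of A becomes new_col.

    Returns (det_ratio, new_inverse), where det_ratio = det(A_new) / det(A).
    """
    n = len(a_inv)
    # w = A^{-1} c; its j-th component is the determinant ratio.
    w = [sum(a_inv[i][k] * new_col[k] for k in range(n)) for i in range(n)]
    ratio = w[j]
    if ratio == 0.0:
        raise ZeroDivisionError("updated matrix is singular")
    pivot_row = [x / ratio for x in a_inv[j]]
    new_inv = []
    for i in range(n):
        if i == j:
            new_inv.append(pivot_row)
        else:
            new_inv.append([a_inv[i][k] - w[i] * pivot_row[k] for k in range(n)])
    return ratio, new_inv


# Usage: start from the 2x2 identity and replace column 0 with (2, 1).
ratio, inv = replace_column([[1.0, 0.0], [0.0, 1.0]], 0, [2.0, 1.0])
```

In the usage above the updated matrix is `[[2, 0], [1, 1]]`, whose determinant is 2 and whose inverse is `[[0.5, 0], [-0.5, 1]]`, which is what the update returns.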