101,638 research outputs found

    SPH Simulations with Reconfigurable Hardware Accelerator

    We present a novel approach to accelerating astrophysical hydrodynamical simulations. In astrophysical many-body simulations, the GRAPE (GRAvity piPE) system has been widely used by many researchers. However, the function of a GRAPE system is completely fixed because a specially developed LSI is used as the computing engine. Instead of using such an LSI, we are developing a special-purpose computing system that uses Field Programmable Gate Array (FPGA) chips as the computing engine. Together with the programming system we developed, we have implemented computing pipelines for the Smoothed Particle Hydrodynamics (SPH) method on our PROGRAPE-3 system. The SPH pipelines running on the PROGRAPE-3 system have a peak speed of 85 GFLOPS, and in a realistic setup the SPH calculation using one PROGRAPE-3 board is 5-10 times faster than the calculation on the host computer. Our results clearly show for the first time that we can accelerate SPH simulations of simple astrophysical phenomena using the considerable computing power offered by the hardware. Comment: 27 pages, 13 figures, submitted to PAS

    Modern Approaches to Exact Diagonalization and Selected Configuration Interaction with the Adaptive Sampling CI Method.

    Recent advances in selected configuration interaction methods have made them competitive with the most accurate techniques available and, hence, have created an increasingly powerful tool for solving quantum Hamiltonians. In this work, we build on recent advances from the adaptive sampling configuration interaction (ASCI) algorithm. We show that a useful paradigm for generating efficient selected CI/exact diagonalization algorithms is driven by fast sorting algorithms, much in the same way iterative diagonalization is based on the paradigm of matrix-vector multiplication. We present several new algorithms for all parts of performing a selected CI, which include a new ASCI search, dynamic bit masking, fast orbital rotations, fast diagonal matrix elements, and residue arrays. The ASCI search algorithm can be used in several different modes, which include an integral-driven search and a coefficient-driven search. The algorithms presented here are fast and scalable, and we find that because they are built on fast sorting algorithms they are more efficient than all other approaches we considered. After introducing these techniques, we present ASCI results applied to a large range of systems and basis sets to demonstrate the types of simulations that can be practically treated at the full-CI level with modern methods and hardware, presenting double- and triple-ζ benchmark data for the G1 data set. The largest of these calculations is Si2H6, which is a simulation of 34 electrons in 152 orbitals. We also present some preliminary results for fast deterministic perturbation theory simulations that use hash functions to maintain high efficiency for treating large basis sets.
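
The sorting paradigm mentioned in this abstract can be illustrated with a generic sort-and-merge pattern. The sketch below is not the ASCI implementation. It assumes determinants are encoded as integer bit strings (one bit per occupied spin orbital) and that a search step has produced a list of (determinant, contribution) pairs containing duplicates. The function names `merge_contributions`, `top_determinants`, and `excitation_degree` are hypothetical and chosen only for this illustration.

```python
from itertools import groupby
from operator import itemgetter


def merge_contributions(pairs):
    """Sort (determinant_bits, contribution) pairs by determinant and sum
    the contributions of identical determinants.

    Sorting places duplicates next to each other, so one linear pass is
    enough to merge them. This is a generic pattern, assumed here for
    illustration only.
    """
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (det, sum(contribution for _, contribution in group))
        for det, group in groupby(ordered, key=itemgetter(0))
    ]


def top_determinants(pairs, keep):
    """Return the `keep` merged determinants with the largest |contribution|."""
    merged = merge_contributions(pairs)
    merged.sort(key=lambda item: abs(item[1]), reverse=True)
    return merged[:keep]


def excitation_degree(det_a, det_b):
    """Number of orbital substitutions between two determinants that have
    the same electron count, computed from the XOR of their bit strings."""
    return bin(det_a ^ det_b).count("1") // 2


if __name__ == "__main__":
    # Hypothetical candidate list: determinant 0b0011 is reached twice.
    candidates = [(0b0011, 0.5), (0b0101, 0.25), (0b0011, 0.25), (0b1001, -1.0)]
    print(top_determinants(candidates, keep=2))
    print(excitation_degree(0b0011, 0b0101))
```

A hash map would merge duplicates equally well. The sorted form is used here only because it mirrors the sorting-based paradigm the abstract describes.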

    Kinetic Monte Carlo Simulations of Crystal Growth in Ferroelectric Alloys

    The growth rates and chemical ordering of ferroelectric alloys are studied with kinetic Monte Carlo (KMC) simulations using an electrostatic model with long-range Coulomb interactions, as a function of temperature, chemical composition, and substrate orientation. Crystal growth is characterized by thermodynamic processes involving adsorption and evaporation, with solid-on-solid restrictions and excluding diffusion. A KMC algorithm is formulated to simulate this model efficiently in the presence of long-range interactions. Simulations were carried out on Ba(Mg_{1/3}Nb_{2/3})O_3 (BMN) type materials. Compared to the simple rocksalt ordered structures, ordered BMN grows only at very low temperatures and only under finely tuned conditions. For materials with tetravalent compositions, such as (1-x)Ba(Mg_{1/3}Nb_{2/3})O_3 + xBaZrO_3 (BMN-BZ), the model does not incorporate tetravalent ions at low temperature, exhibiting a phase-separated ground state instead. At higher temperatures, tetravalent ions can be incorporated, but the resulting crystals show no chemical ordering in the absence of diffusive mechanisms. Comment: 13 pages, 16 postscript figures, submitted to Physics Review B Journa

    TRIQS/CTHYB: A Continuous-Time Quantum Monte Carlo Hybridization Expansion Solver for Quantum Impurity Problems

    We present TRIQS/CTHYB, a state-of-the-art open-source implementation of the continuous-time hybridisation expansion quantum impurity solver of the TRIQS package. This code is mainly designed to be used with the TRIQS library in order to solve the self-consistent quantum impurity problem in a multi-orbital dynamical mean field theory approach to strongly-correlated electrons, in particular in the context of realistic calculations. It is implemented in C++ for efficiency and is provided with a high-level Python interface. The code ships with a new partitioning algorithm that divides the local Hilbert space without any user knowledge of the symmetries and quantum numbers of the Hamiltonian. Furthermore, we implement higher-order configuration moves and show that such moves are necessary to ensure ergodicity of the Monte Carlo in common Hamiltonians even without symmetry-breaking. Comment: 19 pages, this is a companion article to that describing the TRIQS librar

    Quantum Monte Carlo with very large multideterminant wavefunctions

    An algorithm to compute efficiently the first two derivatives of (very) large multideterminant wavefunctions for quantum Monte Carlo calculations is presented. The calculation of determinants and their derivatives is performed using the Sherman-Morrison formula for updating the inverse Slater matrix. An improved implementation based on the reduction of the number of column substitutions and on a very efficient implementation of the calculation of the scalar products involved is presented. It is emphasized that multideterminant expansions contain in general a large number of identical spin-specific determinants: for typical configuration interaction-type wavefunctions the number of unique spin-specific determinants $N_{\rm det}^\sigma$ ($\sigma=\uparrow,\downarrow$) with a non-negligible weight in the expansion is of order ${\cal O}(\sqrt{N_{\rm det}})$. We show that a careful implementation of the calculation of the $N_{\rm det}$-dependent contributions can make this step negligible enough so that in practice the algorithm scales as the total number of unique spin-specific determinants, $N_{\rm det}^\uparrow + N_{\rm det}^\downarrow$, over a wide range of total number of determinants (here, $N_{\rm det}$ up to about one million), thus greatly reducing the total computational cost. Finally, a new truncation scheme for the multideterminant expansion is proposed so that larger expansions can be considered without increasing the computational time. The algorithm is illustrated with all-electron Fixed-Node Diffusion Monte Carlo calculations of the total energy of the chlorine atom. Calculations using a trial wavefunction including about 750 000 determinants with a computational increase of $\sim 400$ compared to a single-determinant calculation are shown to be feasible. Comment: 9 pages, 3 figure
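
The Sherman-Morrison update named in this abstract can be sketched for a single column substitution. This is a minimal textbook illustration, not the authors' optimized implementation. It assumes a small dense matrix stored as nested Python lists, and the function name `replace_column` is hypothetical. Replacing column `j` of a matrix A by a new column u gives the determinant ratio det(A')/det(A) = (A^{-1} u)_j, and the inverse is updated in O(n^2) operations rather than recomputed from scratch.

```python
def replace_column(a_inv, new_col, j):
    """Sherman-Morrison update for substituting column j of a matrix A.

    a_inv   : inverse of the current matrix A, as a list of rows
    new_col : the column that replaces column j of A
    Returns (ratio, updated_inverse), where ratio = det(A') / det(A).
    """
    n = len(a_inv)
    # w = A^{-1} u; its j-th component is the determinant ratio.
    w = [sum(a_inv[i][k] * new_col[k] for k in range(n)) for i in range(n)]
    ratio = w[j]
    if abs(ratio) < 1e-12:
        raise ValueError("substitution makes the matrix (nearly) singular")

    # Row j of the new inverse is the old row j scaled by 1 / ratio.
    row_j = [x / ratio for x in a_inv[j]]
    # Every other row i has w[i] times the new row j subtracted from it.
    updated = [
        row_j if i == j
        else [a_inv[i][k] - w[i] * row_j[k] for k in range(n)]
        for i in range(n)
    ]
    return ratio, updated


if __name__ == "__main__":
    # A = [[2, 0], [0, 1]], so A^{-1} = [[0.5, 0], [0, 1]] and det(A) = 2.
    a_inv = [[0.5, 0.0], [0.0, 1.0]]
    # Replace column 0 with (1, 1): A' = [[1, 0], [1, 1]], det(A') = 1.
    ratio, new_inv = replace_column(a_inv, [1.0, 1.0], 0)
    print(ratio)    # determinant ratio det(A') / det(A)
    print(new_inv)  # inverse of A'
```

In a multideterminant expansion, each excited determinant differs from a reference by a few such substitutions, which is why reducing the number of column substitutions, as the abstract notes, lowers the cost.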