409 research outputs found
Improvements on non-equilibrium and transport Green function techniques: the next-generation transiesta
We present novel methods implemented within the non-equilibrium Green
function code (NEGF) transiesta based on density functional theory (DFT). Our
flexible, next-generation DFT-NEGF code handles devices with one or multiple
electrodes with individual chemical potentials and electronic
temperatures. We describe its novel methods for electrostatic gating, contour
optimizations, and assertion of charge conservation, as well as the newly
implemented algorithms for optimized and scalable matrix inversion,
performance-critical pivoting, and hybrid parallelization. Additionally, a
generic NEGF post-processing code (tbtrans/phtrans) for electron and phonon
transport is presented with several novelties such as Hamiltonian
interpolations, electrode capability, bond-currents, generalized
interface for user-defined tight-binding transport, transmission projection
using eigenstates of a projected Hamiltonian, and fast inversion algorithms for
large-scale simulations easily exceeding atoms on workstation computers.
The new features of both codes are demonstrated and benchmarked for relevant
test systems. Comment: 24 pages, 19 figures
Simulations of astronomical imaging phased arrays
We describe a theoretical procedure for analyzing astronomical phased arrays
with overlapping beams, and apply the procedure to simulate a simple example.
We demonstrate the effect of overlapping beams on the number of degrees of
freedom of the array, and on the ability of the array to recover a source. We
show that the best images are obtained using overlapping beams, contrary to
common practice, and show how the dynamic range of a phased array directly
affects the image quality. Comment: 16 pages, 26 figures, submitted to Journal of the Optical Society of
America
spBayes for Large Univariate and Multivariate Point-Referenced Spatio-Temporal Data Models
In this paper we detail the reformulation and rewrite of core functions in the spBayes R package. These efforts have focused on improving computational efficiency, flexibility, and usability for point-referenced data models. Attention is given to algorithm and computing developments that result in improved sampler convergence rate and efficiency by reducing parameter space; decreased sampler run-time by avoiding expensive matrix computations; and increased scalability to large datasets by implementing a class of predictive process models that attempt to overcome computational hurdles by representing spatial processes in terms of lower-dimensional realizations. Beyond these general computational improvements for existing model functions, we detail new functions for modeling data indexed in both space and time. These new functions implement a class of dynamic spatio-temporal models for settings where space is viewed as continuous and time is taken as discrete.
The LAPW method with eigendecomposition based on the Hari–Zimmermann generalized hyperbolic SVD
In this paper we propose an accurate, highly parallel algorithm for the
generalized eigendecomposition of a matrix pair , given in a factored
form . Matrices and are generally complex
and Hermitian, and is positive definite. Matrices of this type emerge from
the representation of the Hamiltonian of a quantum mechanical system in terms
of an overcomplete set of basis functions. This expansion is part of a class of
models within the broad field of Density Functional Theory, which is considered
the gold standard in condensed matter physics. The overall algorithm consists
of four phases, the second and the fourth being optional, where the last two
phases are computation of the generalized hyperbolic SVD of a complex matrix
pair , according to a given matrix defining the hyperbolic scalar
product. If , then these two phases compute the GSVD in parallel very
accurately and efficiently. Comment: The supplementary material is available at
https://web.math.pmf.unizg.hr/mfbda/papers/sm-SISC.pdf due to its size. This
revised manuscript is currently being considered for publication.
Approximate integrals of motion
We determine approximate numerical integrals of motion of 2D symmetric
Hamiltonian systems. We detail for a few gravitational potentials the
conditions under which quasi-integrals are obtained as polynomial series. We
show that each of these potentials has a wide range of regular orbits that are
accurately modelled with a unique approximate integral of motion. Comment: 11 pages, 11 figures, accepted for publication in Astronomy and
Astrophysics
The HPCG benchmark: analysis, shared memory preliminary improvements and evaluation on an Arm-based platform
The High-Performance Conjugate Gradient (HPCG) benchmark complements the LINPACK benchmark in the performance evaluation coverage of large High-Performance Computing (HPC) systems. Due to its lower arithmetic intensity and higher memory pressure, HPCG is recognized as a more representative benchmark for data-center workloads and workloads with irregular memory access patterns, so its popularity and acceptance are rising within the HPC community. As only a small fraction of the reference version of the HPCG benchmark is parallelized with shared memory techniques (OpenMP), we introduce in this report two OpenMP parallelization methods. Due to the increasing importance of the Arm architecture in HPC, we evaluate our HPCG code at scale on a state-of-the-art HPC system based on the Cavium ThunderX2 SoC. We consider our work a contribution to the Arm ecosystem: along with this technical report, we plan to release our code to support the tuning of the HPCG benchmark within the Arm community. Postprint (author's final draft)