Finite-volume Hamiltonian method for coupled channel interactions in lattice QCD
Within a multi-channel formulation of scattering, we investigate the
use of the finite-volume Hamiltonian approach to resolve scattering observables
from lattice QCD spectra. The asymptotic matching of the well-known Lüscher
formalism encodes a unique finite-volume spectrum. Nevertheless, in many
practical situations, such as coupled-channel systems, it is advantageous to
interpolate isolated lattice spectra in order to extract physical scattering
parameters. Here we study the use of the Hamiltonian framework as a
parameterisation that can be fit directly to lattice spectra. We find that with
a modest amount of lattice data, the scattering parameters can be reproduced
rather well, with only a minor degree of model dependence.
Comment: 25 pages, 16 figures
Electronic structure and bonding properties of cobalt oxide in the spinel structure
The spinel cobalt oxide Co3O4 is a magnetic semiconductor containing cobalt
ions in Co2+ and Co3+ oxidation states. We have studied the electronic,
magnetic and bonding properties of Co3O4 using density functional theory (DFT)
at the Generalized Gradient Approximation (GGA), GGA+U, and PBE0 hybrid
functional levels. The GGA correctly predicts Co3O4 to be a semiconductor, but
severely underestimates the band gap. The GGA+U band gap (1.96 eV) agrees well
with the available experimental value (~ 1.6 eV), whereas the band gap obtained
using the PBE0 hybrid functional (3.42 eV) is strongly overestimated. All the
employed exchange-correlation functionals predict 3 unpaired d electrons on the
Co2+ ions, in agreement with crystal field theory, but the values of the
magnetic moments given by GGA+U and PBE0 are in closer agreement with the
experiment than the GGA value, indicating a better description of the cobalt
localized d states. Bonding properties are studied by means of Maximally
Localized Wannier Functions (MLWFs). We find d-type MLWFs on the cobalt ions,
as well as Wannier functions with the character of sp3d bonds between cobalt
and oxygen ions. Such hybridized bonding states indicate the presence of a
small covalent component in the primarily ionic bonding mechanism of this
compound.
Comment: 24 pages, 8 figures
The CJT calculation in studying nuclear matter beyond mean field approximation
We have introduced a CJT calculation for studying nuclear matter beyond the
mean-field approximation. Based on the CJT formalism and using the Walecka
model, we have derived a set of coupled Dyson equations for nucleons and mesons.
Neglecting the medium effects of the mesons, the usual MFT results can be
reproduced. The beyond-MFT calculations have been performed by determining the
meson effective masses in a thermodynamically consistent way and solving the
coupled gap equations for nucleons and mesons. The numerical results for the
nucleon and meson effective masses at finite temperature and chemical potential
in nuclear matter are discussed.
Comment: 8 pages, 8 figures
Gunrock: A High-Performance Graph Processing Library on the GPU
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs have been two
significant challenges for developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We evaluate Gunrock on five key graph
primitives and show that Gunrock has on average at least an order of magnitude
speedup over Boost and PowerGraph, comparable performance to the fastest GPU
hardwired primitives, and better performance than any other GPU high-level
graph library.
Comment: 14 pages, accepted by PPoPP'16 (removed the text repetition in the
previous version v5)
Electronic structure and Jahn-Teller effect in GaN:Mn and ZnS:Cr
We present an ab-initio and analytical study of the Jahn-Teller effect in two
diluted magnetic semiconductors (DMS) with d4 impurities, namely Mn-doped GaN
and Cr-doped ZnS. We show that only the combined treatment of Jahn-Teller
distortion and strong electron correlation in the 3d shell may lead to the
correct insulating electronic structure. Using the LSDA+U approach we obtain
the Jahn-Teller energy gain in reasonable agreement with the available
experimental data. The ab-initio results are complemented by a more
phenomenological ligand field theory.
Comment: 15 pages, 5 figures
Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL?
Dense Multi-GPU systems have recently gained a lot of attention in the HPC
arena. Traditionally, MPI runtimes have been primarily designed for clusters
with a large number of nodes. However, with the advent of MPI+CUDA applications
and CUDA-Aware MPI runtimes like MVAPICH2 and OpenMPI, it has become important
to address efficient communication schemes for such dense Multi-GPU nodes. This,
coupled with new application workloads brought forward by Deep Learning
frameworks like Caffe and Microsoft CNTK, poses additional design constraints due
to very large message communication of GPU buffers during the training phase.
In this context, special-purpose libraries like NVIDIA NCCL have been proposed
for GPU-based collective communication on dense GPU systems. In this paper, we
propose a pipelined chain (ring) design for the MPI_Bcast collective operation
along with an enhanced collective tuning framework in MVAPICH2-GDR that enables
efficient intra-/inter-node multi-GPU communication. We present an in-depth
performance landscape for the proposed MPI_Bcast schemes along with a
comparative analysis of NVIDIA NCCL Broadcast and NCCL-based MPI_Bcast. The
proposed designs for MVAPICH2-GDR enable up to 14X and 16.6X improvement,
compared to NCCL-based solutions, for intra- and inter-node broadcast latency,
respectively. In addition, the proposed designs provide up to 7% improvement
over NCCL-based solutions for data parallel training of the VGG network on 128
GPUs using Microsoft CNTK.
Comment: 8 pages, 3 figures
Relaxed 2-D Principal Component Analysis by $L_p$ Norm for Face Recognition
A relaxed two-dimensional principal component analysis (R2DPCA) approach is
proposed for face recognition. Unlike 2DPCA, 2DPCA-$L_1$ and G2DPCA,
the R2DPCA utilizes the label information (if known) of training samples to
calculate a relaxation vector and assigns a weight to each subset of training
data. A new relaxed scatter matrix is defined and the computed projection axes
are able to increase the accuracy of face recognition. The optimal $L_p$-norms
are selected in a reasonable range. Numerical experiments on practical face
databases indicate that the R2DPCA has high generalization ability and can
achieve a higher recognition rate than state-of-the-art methods.
Comment: 19 pages, 11 figures
The NN phase shifts in the extended quark-delocalization, color-screening model
An alternative method is applied to the study of nucleon-nucleon (NN)
scattering phase shifts in the framework of the extended quark-delocalization,
color-screening model (QDCSM), where the one-pion-exchange (OPE) with short-range
cutoff is included.
Comment: 5 pages, 3 figures, two-column