FeCaffe: FPGA-enabled Caffe with OpenCL for Deep Learning Training and Inference on Intel Stratix 10
Deep learning and convolutional neural networks (CNNs) have become
increasingly popular and important in both academia and industry
in recent years, because they achieve better accuracy in
classification, detection, and recognition tasks than traditional
approaches. Many popular frameworks for deep learning development are
currently available, such as Caffe, TensorFlow, and PyTorch; most of these
frameworks natively support CPUs and treat the GPU as the mainline accelerator by
default. FPGA devices, viewed as a potential heterogeneous platform, still
lack comprehensive support for CNN development in popular
frameworks, particularly for the training phase. In this paper, we first
propose FeCaffe, i.e. FPGA-enabled Caffe, a hierarchical software and
hardware design methodology based on Caffe that enables FPGAs to support
mainline deep learning development features, e.g. training and inference with
Caffe. Furthermore, we provide benchmarks with FeCaffe on several
classical CNN networks and analyze kernel execution
time in detail. Finally, we propose optimization directions at the
FPGA kernel design, system pipeline, network architecture, user
application, and heterogeneous platform levels to
improve FeCaffe performance and efficiency. The results demonstrate that
FeCaffe is capable of supporting almost all features of CNN
training and inference, with a high degree of design
flexibility, extensibility, and reusability for deep learning development.
Compared to prior studies, our architecture supports more networks and
training settings, and the current configuration achieves 6.4x and 8.4x average
execution time improvements for the forward and backward passes of LeNet, respectively.
Comment: 11 pages, 7 figures and 4 tables
Relation Between Gravitational Mass and Baryonic Mass for Non-Rotating and Rapidly Rotating Neutron Stars
With a selected sample of neutron star (NS) equations of state (EOSs) that are consistent with current observations and span a range of maximum masses, we investigate the relations between NS gravitational mass Mg and baryonic mass Mb, and between the maximum NS mass supported through uniform rotation (Mmax) and that of non-rotating NSs (MTOV). We find that for an EOS-independent quadratic, universal transformation formula Mb = Mg + A × Mg^2, the best-fit A value is 0.080 for non-rotating NSs, 0.064 for maximally rotating NSs, and 0.073 when NSs with arbitrary rotation are considered. The residual error of the transformation is ∼0.1 M⊙ for no spin or maximum spin, but is as large as ∼0.2 M⊙ when all spins are considered. Across EOSs, we find that the parameter A for non-rotating NSs is proportional to R_1.4^−1 (where R_1.4 is the NS radius for 1.4 M⊙, in units of km). For a particular EOS, if one adopts the best-fit parameters for different spin periods, the residual error of the transformation is smaller: of the order of 0.01 M⊙ for the quadratic form and less than 0.01 M⊙ for the cubic form Mb = Mg + A1 × Mg^2 + A2 × Mg^3. We also find a very tight and general correlation between the normalized mass gain due to spin, Δm = (Mmax − MTOV)/MTOV, and the spin period normalized to the Keplerian period, P, i.e., log10 Δm = (−2.74 ± 0.05) log10 P + log10(0.20 ± 0.01), which is independent of EOS models. These empirical relations are helpful for studying NS-NS mergers with a long-lived NS merger product using multi-messenger data. The application of our results to GW170817 is discussed.
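The empirical relations quoted in this abstract are simple enough to evaluate directly. The sketch below uses only the central best-fit coefficients stated above (A = 0.080/0.064/0.073 and the Δm–P power law); the function names are illustrative and not from the paper, and uncertainties on the fit parameters are ignored.

```python
def baryonic_mass(mg, a=0.073):
    """Quadratic universal relation Mb = Mg + A * Mg^2 (masses in Msun).
    Best-fit A from the abstract: 0.080 (non-rotating), 0.064 (maximally
    rotating), 0.073 (arbitrary rotation); residual ~0.1-0.2 Msun."""
    return mg + a * mg**2

def spin_mass_gain(p_norm):
    """Normalized mass gain Delta_m = (Mmax - MTOV)/MTOV as a function of
    the spin period normalized to the Keplerian period, using the central
    fit values: log10 Dm = -2.74 * log10 P + log10(0.20)."""
    return 0.20 * p_norm ** (-2.74)

# Baryonic mass of a non-rotating 1.4 Msun NS:
mb = baryonic_mass(1.4, a=0.080)   # 1.4 + 0.080 * 1.96 = 1.5568 Msun

# At the Keplerian (mass-shedding) limit, P = 1:
dm = spin_mass_gain(1.0)           # 0.20, i.e. ~20% above MTOV
```

As a sanity check, the power law reduces to Δm ≈ 0.20 at the Keplerian limit (P = 1), consistent with the familiar ~20% maximum-mass enhancement from uniform rotation, and Δm falls off steeply for slower spin.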