27 research outputs found

    FP8 Formats for Deep Learning

    Full text link
    FP8 is a natural progression for accelerating deep learning training and inference beyond the 16-bit formats common in modern processors. In this paper we propose an 8-bit floating point (FP8) binary interchange format consisting of two encodings - E4M3 (4-bit exponent and 3-bit mantissa) and E5M2 (5-bit exponent and 2-bit mantissa). While E5M2 follows IEEE 754 conventions for representation of special values, E4M3's dynamic range is extended by not representing infinities and having only one mantissa bit-pattern for NaNs. We demonstrate the efficacy of the FP8 format on a variety of image and language tasks, effectively matching the result quality achieved by 16-bit training sessions. Our study covers the main modern neural network architectures - CNNs, RNNs, and Transformer-based models, leaving all the hyperparameters unchanged from the 16-bit baseline training sessions. Our training experiments include large, up to 175B parameter, language models. We also examine FP8 post-training-quantization of language models trained using 16-bit formats that resisted fixed point int8 quantization.
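The two encodings described in the abstract can be illustrated with a small decoder. This is a minimal sketch, not the authors' reference implementation: the function name `decode_fp8` is hypothetical, and the exponent biases (7 for E4M3, 15 for E5M2) and the choice of the all-ones mantissa as the single E4M3 NaN pattern are assumptions that the abstract itself does not state.

```python
import math


def decode_fp8(byte: int, fmt: str = "E4M3") -> float:
    """Decode one FP8 byte (layout: sign | exponent | mantissa) to a Python float.

    Assumed, not stated in the abstract: bias 7 for E4M3, bias 15 for E5M2,
    and the all-ones mantissa as the single E4M3 NaN pattern.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    if fmt == "E4M3":
        exp_bits, man_bits, bias = 4, 3, 7
    elif fmt == "E5M2":
        exp_bits, man_bits, bias = 5, 2, 15
    else:
        raise ValueError("fmt must be 'E4M3' or 'E5M2'")

    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1
    exp = (byte >> man_bits) & exp_max
    man = byte & man_max

    if fmt == "E5M2" and exp == exp_max:
        # IEEE 754 style: all-ones exponent encodes infinity or NaN.
        return sign * math.inf if man == 0 else math.nan
    if fmt == "E4M3" and exp == exp_max and man == man_max:
        # No infinities; a single mantissa pattern is reserved for NaN.
        return math.nan
    if exp == 0:
        # Zero and subnormals: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    # Normal numbers: implicit leading one.
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)
```

Under these assumptions, reserving only one exponent/mantissa combination for NaN lets E4M3 use the remaining all-ones-exponent patterns for ordinary values, which is where its extended dynamic range comes from; for example, `decode_fp8(0x7E, "E4M3")` is a finite number, whereas `decode_fp8(0x7C, "E5M2")` is infinity.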

    Pulseq-CEST : Towards multi-site multi-vendor compatibility and reproducibility of CEST experiments using an open-source sequence standard

    No full text
    PURPOSE: As the field of CEST grows, various novel preparation periods using different parameters are being introduced. At the same time, large, multisite clinical studies require clearly defined protocols, especially across different vendors. Here, we propose a CEST definition standard using the open Pulseq format for a shareable, simple, and exact definition of CEST protocols. METHODS: We present the benefits of such a standard in three ways: (1) an open database on GitHub, where fully defined, human-readable CEST protocols can be shared; (2) an open-source Bloch-McConnell simulation to test and optimize CEST preparation periods in silico; and (3) a hybrid MR sequence that plays out the CEST preparation period and can be combined with any existing readout module. RESULTS: The exact definition of the CEST preparation period, in combination with the flexible simulation, leads to a good match between simulations and measurements. The standard allowed finding consensus on three amide proton transfer-weighted protocols that could be compared in healthy subjects and a tumor patient. In addition, we could show coherent multisite results for a sophisticated CEST method, highlighting the benefits regarding protocol sharing and reproducibility. CONCLUSION: With Pulseq-CEST, we provide a straightforward approach to standardize, share, simulate, and measure different CEST preparation schemes, which are inherently completely defined.