10 research outputs found

    Efficient Post-training Quantization with FP8 Formats

    Recent advances in deep learning methods such as LLMs and Diffusion models have created a need for improved quantization methods that can meet the computational demands of these modern architectures while maintaining accuracy. Towards this goal, we study the advantages of FP8 data formats for post-training quantization across 75 unique network architectures covering a wide range of tasks, including machine translation, language modeling, text generation, image classification, generation, and segmentation. We examine three different FP8 representations (E5M2, E4M3, and E3M4) to study the effects of varying degrees of trade-off between dynamic range and precision on model accuracy. Based on our extensive study, we developed a quantization workflow that generalizes across different network architectures. Our empirical results show that FP8 formats outperform INT8 in multiple aspects, including workload coverage (92.64% vs. 65.87%), model accuracy and suitability for a broader range of operations. Furthermore, our findings suggest that E4M3 is better suited for NLP models, whereas E3M4 performs marginally better than E4M3 on computer vision tasks. The code is publicly available on Intel Neural Compressor: https://github.com/intel/neural-compressor
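
The trade-off between dynamic range and precision across the three formats can be illustrated with a short calculation. This is a minimal sketch that assumes a plain IEEE-754-style layout (one sign bit, exponent bias of 2^(e-1) - 1, top exponent code reserved for infinities and NaN). The function name `fp8_stats` is hypothetical, and deployed FP8 encodings may differ, for example by reclaiming the top exponent code to extend the maximum value, so the numbers are illustrative and not necessarily those used in the paper.

```python
def fp8_stats(exp_bits, man_bits):
    """Range and precision of an 8-bit float under an assumed IEEE-style layout."""
    assert 1 + exp_bits + man_bits == 8, "sign + exponent + mantissa must be 8 bits"
    bias = 2 ** (exp_bits - 1) - 1
    # Assumption: the all-ones exponent code is reserved for inf/NaN.
    max_exp = (2 ** exp_bits - 2) - bias
    return {
        "bias": bias,
        "max_normal": 2.0 ** max_exp * (2 - 2.0 ** -man_bits),
        "min_normal": 2.0 ** (1 - bias),
        "min_subnormal": 2.0 ** (1 - bias - man_bits),
        # Relative spacing between adjacent normal values (precision).
        "epsilon": 2.0 ** -man_bits,
    }


for name, (e, m) in {"E5M2": (5, 2), "E4M3": (4, 3), "E3M4": (3, 4)}.items():
    s = fp8_stats(e, m)
    print(f"{name}: max={s['max_normal']:g}  min_subnormal={s['min_subnormal']:g}  "
          f"epsilon={s['epsilon']:g}")
```

Under these assumptions, each mantissa bit moved to the exponent roughly squares the representable range while doubling the relative spacing between values, which is the trade-off the abstract refers to.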

    Total body bone mineral density and various spinal disorders: a Mendelian randomization study

    Introduction: Observational studies have yielded inconsistent findings regarding the correlation between bone mineral density (BMD) and various spinal disorders. To explore the relationship between total-body BMD and various spinal disorders further, we conducted a Mendelian randomization analysis to assess this association. Methods: Two-sample bidirectional Mendelian randomization (MR) analysis was employed to investigate the association between total-body BMD and various spinal disorders. The inverse-variance weighted (IVW) method was used as the primary effect estimate, and additional methods, including weighted median, MR-Egger, simple mode, and weighted mode, were used to assess the reliability of the results. To examine the robustness of the data further, we conducted a sensitivity analysis using alternative bone-density databases, validating the outcome data. Results: MR revealed a significant positive association between total-body BMD and the prevalence of spondylosis and spinal stenosis. When total-body BMD was considered as the exposure factor, the analysis demonstrated an increased risk of spinal stenosis (IVW odds ratio [OR] 1.23; 95% confidence interval [CI], 1.14–1.32; P < 0.001) and spondylosis (IVW OR 1.24; 95% CI, 1.16–1.33; P < 0.001). Similarly, when focusing solely on heel BMD as the exposure factor, we found a positive correlation with the development of both spinal stenosis (IVW OR 1.13; 95% CI, 1.05–1.21; P < 0.001) and spondylosis (IVW OR 1.10; 95% CI, 1.03–1.18; P = 0.0048). However, no significant associations were found between total-body BMD and other spinal disorders, including spinal instability, spondylolisthesis/spondylolysis, and scoliosis (P > 0.05). Conclusion: This study verified an association of total-body BMD with spinal stenosis and with spondylosis. Our results imply that when an increasing trend in BMD is detected during patient examinations and if the patient complains of numbness and pain, the potential occurrence of conditions such as spondylosis or spinal stenosis should be investigated and treated appropriately.
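
The IVW estimate named in the Methods can be sketched as follows. This is a minimal fixed-effect version using the common first-order weights (exposure effect squared over outcome variance). The function names and the per-variant numbers are hypothetical and synthetic, not data from the study, and the authors' actual pipeline may differ.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate (log-odds scale)."""
    # Per-variant Wald ratio: outcome effect divided by exposure effect.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order inverse-variance weight of each ratio.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    return beta, math.sqrt(1.0 / total)


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with an approximate 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Synthetic summary statistics for three variants (illustration only).
bx = [0.10, 0.20, 0.15]     # variant -> exposure (BMD) effects
by = [0.02, 0.05, 0.03]     # variant -> outcome log-odds effects
se_y = [0.01, 0.02, 0.015]  # standard errors of the outcome effects

beta, se = ivw_estimate(bx, by, se_y)
or_, lo, hi = odds_ratio_ci(beta, se)
print(f"IVW OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```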

    Neuroblastoma of the lumbosacral canal in an adult: a case report and literature review

    Neuroblastoma (NB) is a leading cause of death in children. It usually occurs in the adrenal gland and rarely in the spinal canal. Here, we report the case of a 48-year-old male patient with abnormal thickening of the cauda equina nerve as revealed by lumbosacral magnetic resonance imaging. The patient’s main clinical manifestations were numbness and pain in both lower limbs. The patient underwent surgical treatment; however, intraoperatively, an unclear border was observed between the cauda equina nerve and the tumor; therefore, the tumor was not forcibly excised. The postoperative pathological results were reported as NB. NB is an extremely rare disease. We believe that a pathological biopsy is vital for diagnosing NB, and aggressive postoperative radio-chemotherapy could potentially prolong the patient’s survival time.

    An essential role for sulfur in sulfide-silicate melt partitioning of gold and magmatic gold transport at subduction settings

    Sulfide-silicate melt partitioning controls the behavior of gold in magmas, which is critical for understanding the Earth's deep gold cycle and formation of gold deposits. However, the mechanisms that control the sulfide-silicate melt partitioning of gold remain largely unknown. Here we present constraints from laboratory experiments on the partition coefficient of gold between monosulfide-solid-solution (MSS) and silicate melt (D_Au^MSS/SM) under conditions relevant for magmatism at subduction settings. Thirty-five experiments were performed in Au capsules to determine D_Au^MSS/SM at 950-1050°C, 0.5-3 GPa, oxygen fugacity (fO2) of ∼FMQ-1.7 to FMQ+2.7 (FMQ refers to the fayalite-magnetite-quartz buffer), and sulfur fugacity (fS2) of −2.2 to 2.1, using a piston cylinder apparatus. The silicate melt composition changes from dry to hydrous andesite to rhyolite. The results obtained from electron microprobe and laser-ablation ICP-MS analyses show that the gold solubility in silicate melts ranges from 0.01 to 55.3 ppm and is strongly correlated with the melt sulfur content [S]melt at fO2 of ∼FMQ-1.7 to FMQ+1.6, which can be explained by the formation of complex Au-S species in the silicate melts. The gold solubility in MSS ranges from 130 to 2800 ppm, which is mainly controlled by fS2. D_Au^MSS/SM ranges from 10 to 14000 at fO2 of ∼FMQ-1.7 to FMQ+1.6, the large variation of which can be fully explained by combined [S]melt and fS2. Therefore, all of the parameters that can directly affect [S]melt and fS2, such as alkali metals, water, FeO, and fO2, can indirectly affect D_Au^MSS/SM. The mechanisms that control the sulfide-silicate melt partitioning of gold and the other chalcophile elements, such as Ni, Re, and Mo, differ significantly. This is because gold is dissolved mainly as Au-S species in the silicate melts, while the other chalcophile elements are dissolved mainly as metal oxides in the silicate melts. Applying the correlation between D_Au^MSS/SM and [S]melt to slab melting and arc magmatic differentiation under different redox conditions, we find that ancient to modern slab melts carry negligible to less than 25% of the slab gold to the subarc mantle; however, gold-enrichment can occur in MSS-saturated arc magmas that have differentiated under moderately oxidized conditions with fO2 between FMQ and FMQ+1.6, in particular if the magmatic crystallization follows a fractional crystallization model. We conclude that moderately oxidized magmas with high contents of alkali metals, sulfur, and water, owing to their low D_Au^MSS/SM and efficient magma-to-fluid transfer of gold and sulfur, have a high potential to form gold deposits.

    An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs

    In recent years, Transformer-based language models have become the standard approach for natural language processing tasks. However, stringent throughput and latency requirements in industrial applications are limiting their adoption. To mitigate the gap, model compression techniques such as structured pruning are being used to improve inference efficiency. However, most existing neural network inference runtimes lack adequate support for structured sparsity. In this paper, we propose an efficient sparse deep learning inference software stack for Transformer-based language models where the weights are pruned with constant block size. Our sparse software accelerator leverages Intel Deep Learning Boost to maximize the performance of sparse matrix-dense matrix multiplication (commonly abbreviated as SpMM) on CPUs. Our SpMM kernel outperforms the existing sparse libraries (oneMKL, TVM, and LIBXSMM) by an order of magnitude on a wide range of GEMM shapes under 5 representative sparsity ratios (70%, 75%, 80%, 85%, 90%). Moreover, our SpMM kernel shows up to 5x speedup over the dense GEMM kernel of oneDNN, a well-optimized dense library widely used in industry. We apply our sparse accelerator to widely used Transformer-based language models including Bert-Mini, DistilBERT, Bert-Base, and BERT-Large. Our sparse inference software shows up to 1.5x speedup over Neural Magic's Deepsparse under the same configurations on Xeon on Amazon Web Services under proxy production latency constraints. We also compare our solution with two framework-based inference solutions, ONNX Runtime and PyTorch, and demonstrate up to 37x speedup over ONNX Runtime and 345x over PyTorch on Xeon under the latency constraints. All the source code is publicly available on GitHub: https://github.com/intel/intel-extension-for-transformers
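
The idea of SpMM over weights pruned with a constant block size can be sketched in NumPy. This is a didactic reference only, not the paper's kernel, which targets Intel Deep Learning Boost instructions. The helper names `to_blocks` and `block_spmm`, the 4x1 block shape, and the 80% block sparsity in the demo are all assumptions chosen for illustration.

```python
import numpy as np


def to_blocks(W, bh, bw):
    """Collect the nonzero (bh x bw) blocks of W as (row, col, block) triples."""
    rows, cols = W.shape
    assert rows % bh == 0 and cols % bw == 0, "shape must be a multiple of the block"
    blocks = []
    for i in range(0, rows, bh):
        for j in range(0, cols, bw):
            blk = W[i:i + bh, j:j + bw]
            if np.any(blk):  # pruned (all-zero) blocks are skipped entirely
                blocks.append((i, j, blk.copy()))
    return blocks


def block_spmm(blocks, n_rows, X):
    """Compute W @ X using only the stored nonzero blocks of W."""
    out = np.zeros((n_rows, X.shape[1]), dtype=X.dtype)
    for i, j, blk in blocks:
        bh, bw = blk.shape
        # Each block touches bh output rows and bw rows of the dense operand.
        out[i:i + bh] += blk @ X[j:j + bw]
    return out


# Demo: prune about 80% of 4x1 blocks at random (hypothetical pattern).
rng = np.random.default_rng(0)
keep = (rng.random((2, 8)) < 0.2).astype(float)
W = rng.standard_normal((8, 8)) * np.kron(keep, np.ones((4, 1)))
X = rng.standard_normal((8, 3))

blocks = to_blocks(W, 4, 1)
Y = block_spmm(blocks, W.shape[0], X)
print(len(blocks), "nonzero blocks; matches dense:", np.allclose(Y, W @ X))
```

Because every stored block has the same shape, the inner multiply has a fixed, small size, which is what makes this kind of structured sparsity easier to map onto vector instructions than unstructured sparsity.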

    Single-cell transcriptomics links malignant T cells to the tumor immune landscape in cutaneous T cell lymphoma.

    Cutaneous T cell lymphoma (CTCL) represents a heterogeneous group of non-Hodgkin lymphomas distinguished by the presence of clonal malignant T cells. The heterogeneity of malignant T cells and the complex tumor microenvironment remain poorly characterized. With single-cell RNA analysis and bulk whole-exome sequencing on 19 skin lesions from 15 CTCL patients, we decipher the intra-tumor and inter-lesion diversity of CTCL patients and propose a multi-step tumor evolution model. We further establish a subtyping scheme based on the molecular features of malignant T cells and their pro-tumorigenic microenvironments: the TCyEM group, demonstrating a cytotoxic effector memory T cell phenotype, shows more M2 macrophage infiltration, while the TCM group, characterized by a central memory T cell phenotype and adverse patient outcome, is infiltrated by highly exhausted CD8+ reactive T cells, B cells and Tregs with suppressive activities. Our results establish a solid basis for understanding the nature of CTCL and pave the way for future precision medicine for CTCL patients.