Efficient Post-training Quantization with FP8 Formats
Recent advances in deep learning methods such as LLMs and Diffusion models
have created a need for improved quantization methods that can meet the
computational demands of these modern architectures while maintaining accuracy.
Towards this goal, we study the advantages of FP8 data formats for
post-training quantization across 75 unique network architectures covering a
wide range of tasks, including machine translation, language modeling, text
generation, image classification, generation, and segmentation. We examine
three different FP8 representations (E5M2, E4M3, and E3M4) to study the effects
of varying degrees of trade-off between dynamic range and precision on model
accuracy. Based on our extensive study, we developed a quantization workflow
that generalizes across different network architectures. Our empirical results
show that FP8 formats outperform INT8 in multiple aspects, including workload
coverage (92.64% vs. 65.87%), model accuracy and suitability for a broader
range of operations. Furthermore, our findings suggest that E4M3 is better
suited for NLP models, whereas E3M4 performs marginally better than E4M3 on
computer vision tasks. The code is publicly available on Intel Neural
Compressor: https://github.com/intel/neural-compressor
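
The trade-off between dynamic range and precision across the three formats can be illustrated with a minimal sketch. It assumes an IEEE-754-style layout (one sign bit, bias of 2^(e-1) - 1, top exponent code reserved for Inf/NaN); this is an assumption for illustration and not necessarily the encoding used in the paper, since some FP8 definitions reclaim the top exponent code to extend the range of E4M3. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FP8Format:
    """An 8-bit float with 1 sign bit, exp_bits exponent bits, man_bits mantissa bits.

    Assumes IEEE-style conventions (an illustration, not the paper's exact encoding).
    """
    name: str
    exp_bits: int
    man_bits: int

    @property
    def bias(self) -> int:
        return 2 ** (self.exp_bits - 1) - 1

    @property
    def max_normal(self) -> float:
        # Largest exponent code is reserved for Inf/NaN under this assumption.
        max_exp = (2 ** self.exp_bits - 2) - self.bias
        return (2 - 2.0 ** -self.man_bits) * 2.0 ** max_exp

    @property
    def min_subnormal(self) -> float:
        return 2.0 ** (1 - self.bias - self.man_bits)

    @property
    def epsilon(self) -> float:
        # Relative spacing between 1.0 and the next representable value.
        return 2.0 ** -self.man_bits


FORMATS = [FP8Format("E5M2", 5, 2), FP8Format("E4M3", 4, 3), FP8Format("E3M4", 3, 4)]

for fmt in FORMATS:
    assert 1 + fmt.exp_bits + fmt.man_bits == 8
    print(f"{fmt.name}: max={fmt.max_normal}, "
          f"min_subnormal={fmt.min_subnormal}, epsilon={fmt.epsilon}")
```

Under these assumptions, each mantissa bit moved to the exponent field widens the representable range considerably while doubling the relative rounding step, which is the trade-off the abstract studies.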
Total body bone mineral density and various spinal disorders: a Mendelian randomization study
Introduction: Observational studies have yielded inconsistent findings regarding the correlation between bone mineral density (BMD) and various spinal disorders. To explore the relationship between total-body BMD and various spinal disorders further, we conducted a Mendelian randomization analysis to assess this association. Methods: Two-sample bidirectional Mendelian randomization (MR) analysis was employed to investigate the association between total-body BMD and various spinal disorders. The inverse-variance weighted (IVW) method was used as the primary effect estimate, and additional methods, including weighted median, MR-Egger, simple mode, and weighted mode, were used to assess the reliability of the results. To examine the robustness of the data further, we conducted a sensitivity analysis using alternative bone-density databases, validating the outcome data. Results: MR revealed a significant positive association between total-body BMD and the prevalence of spondylosis and spinal stenosis. When total-body BMD was considered as the exposure factor, the analysis demonstrated an increased risk of spinal stenosis (IVW odds ratio [OR] 1.23; 95% confidence interval [CI], 1.14–1.32; P < 0.001) and spondylosis (IVW OR 1.24; 95% CI, 1.16–1.33; P < 0.001). Similarly, when focusing solely on heel BMD as the exposure factor, we found a positive correlation with the development of both spinal stenosis (IVW OR 1.13; 95% CI, 1.05–1.21; P < 0.001) and spondylosis (IVW OR 1.10; 95% CI, 1.03–1.18; P = 0.0048). However, no significant associations were found between total-body BMD and other spinal disorders, including spinal instability, spondylolisthesis/spondylolysis, and scoliosis (P > 0.05). Conclusion: This study verified an association of total-body BMD with spinal stenosis and with spondylosis.
Our results imply that when an increasing trend in BMD is detected during patient examinations and the patient complains of numbness and pain, the potential occurrence of conditions such as spondylosis or spinal stenosis should be investigated and treated appropriately.
Neuroblastoma of the lumbosacral canal in an adult: a case report and literature review
Neuroblastoma (NB) is a leading cause of death in children. It usually occurs in the adrenal gland and rarely in the spinal canal. Here, we report the case of a 48-year-old male patient with abnormal thickening of the cauda equina nerve as revealed by lumbosacral magnetic resonance imaging. The patient’s main clinical manifestations were numbness and pain in both lower limbs. The patient underwent surgical treatment; however, intraoperatively, an unclear border was observed between the cauda equina nerve and the tumor; therefore, the tumor was not forcibly excised. The postoperative pathological results were reported as NB, an extremely rare disease. We believe that a pathological biopsy is vital for diagnosing NB, and aggressive post-operative radio-chemotherapy could potentially prolong the patient’s survival time.
An essential role for sulfur in sulfide-silicate melt partitioning of gold and magmatic gold transport at subduction settings
Sulfide-silicate melt partitioning controls the behavior of gold in magmas, which is critical for understanding the Earth's deep gold cycle and formation of gold deposits. However, the mechanisms that control the sulfide-silicate melt partitioning of gold remain largely unknown. Here we present constraints from laboratory experiments on the partition coefficient of gold between monosulfide-solid-solution (MSS) and silicate melt (D_Au^MSS/SM) under conditions relevant for magmatism at subduction settings. Thirty-five experiments were performed in Au capsules to determine D_Au^MSS/SM at 950-1050°C, 0.5-3 GPa, oxygen fugacity (fO2) of ∼FMQ-1.7 to FMQ+2.7 (FMQ refers to the fayalite-magnetite-quartz buffer), and sulfur fugacity (fS2) of −2.2 to 2.1, using a piston cylinder apparatus. The silicate melt composition changes from dry to hydrous andesite to rhyolite. The results obtained from electron microprobe and laser-ablation ICP-MS analyses show that the gold solubility in silicate melts ranges from 0.01 to 55.3 ppm and is strongly correlated with the melt sulfur content [S]_melt at fO2 of ∼FMQ-1.7 to FMQ+1.6, which can be explained by the formation of complex Au-S species in the silicate melts. The gold solubility in MSS ranges from 130 to 2800 ppm, which is mainly controlled by fS2. D_Au^MSS/SM ranges from 10 to 14000 at fO2 of ∼FMQ-1.7 to FMQ+1.6, the large variation of which can be fully explained by combined [S]_melt and fS2. Therefore, all of the parameters that can directly affect [S]_melt and fS2, such as alkali metals, water, FeO, and fO2, can indirectly affect D_Au^MSS/SM. The mechanisms that control the sulfide-silicate melt partitioning of gold and the other chalcophile elements, such as Ni, Re, and Mo, differ significantly. This is because gold is dissolved mainly as Au-S species in the silicate melts, while the other chalcophile elements are dissolved mainly as metal oxides in the silicate melts.
Applying the correlation between D_Au^MSS/SM and [S]_melt to slab melting and arc magmatic differentiation under different redox conditions, we find that ancient to modern slab melts carry negligible to less than 25% of the slab gold to the subarc mantle; however, gold-enrichment can occur in MSS-saturated arc magmas that have differentiated under moderately oxidized conditions with fO2 between FMQ and FMQ+1.6, in particular if the magmatic crystallization follows a fractional crystallization model. We conclude that moderately oxidized magmas with high contents of alkali metals, sulfur, and water, owing to their low D_Au^MSS/SM and efficient magma-to-fluid transfer of gold and sulfur, have a high potential to form gold deposits.
An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs
In recent years, Transformer-based language models have become the standard
approach for natural language processing tasks. However, stringent throughput
and latency requirements in industrial applications are limiting their
adoption. To mitigate the gap, model compression techniques such as structured
pruning are being used to improve inference efficiency. However, most existing
neural network inference runtimes lack adequate support for structured
sparsity. In this paper, we propose an efficient sparse deep learning inference
software stack for Transformer-based language models where the weights are
pruned with constant block size. Our sparse software accelerator leverages
Intel Deep Learning Boost to maximize the performance of sparse matrix - dense
matrix multiplication (commonly abbreviated as SpMM) on CPUs. Our SpMM kernel
outperforms the existing sparse libraries (oneMKL, TVM, and LIBXSMM) by an
order of magnitude on a wide range of GEMM shapes under 5 representative
sparsity ratios (70%, 75%, 80%, 85%, 90%). Moreover, our SpMM kernel shows up
to 5x speedup over the dense GEMM kernel of oneDNN, a well-optimized dense library
widely used in industry. We apply our sparse accelerator to widely-used
Transformer-based language models including Bert-Mini, DistilBERT, Bert-Base,
and BERT-Large. Our sparse inference software shows up to 1.5x speedup over
Neural Magic's Deepsparse under the same configurations on Xeon on Amazon Web
Services under proxy production latency constraints. We also compare our
solution with two framework-based inference solutions, ONNX Runtime and
PyTorch, and demonstrate up to 37x speedup over ONNX Runtime and 345x over
PyTorch on Xeon under the latency constraints. All the source code is publicly
available on GitHub: https://github.com/intel/intel-extension-for-transformers
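
The idea of multiplying a block-pruned weight matrix by a dense matrix (SpMM) can be sketched in NumPy as follows. This is only an illustrative reference, not the paper's kernel: the actual implementation relies on Intel Deep Learning Boost instructions, whereas this sketch simply skips all-zero blocks. The function names, the block shape, and the storage layout (a list of row offset, column offset, and tile) are assumptions made for the example.

```python
import numpy as np


def to_block_sparse(weight, block_rows, block_cols):
    """Keep only the non-zero tiles of a weight matrix pruned with a constant block size."""
    rows, cols = weight.shape
    if rows % block_rows or cols % block_cols:
        raise ValueError("weight shape must be a multiple of the block shape")
    blocks = []
    for r in range(0, rows, block_rows):
        for c in range(0, cols, block_cols):
            tile = weight[r:r + block_rows, c:c + block_cols]
            if np.any(tile):
                blocks.append((r, c, tile.copy()))
    return blocks


def block_spmm(blocks, out_rows, dense):
    """Compute sparse_weight @ dense, touching only the stored non-zero tiles."""
    out = np.zeros((out_rows, dense.shape[1]), dtype=np.float64)
    for r, c, tile in blocks:
        # Each tile contributes to a slab of output rows.
        out[r:r + tile.shape[0]] += tile @ dense[c:c + tile.shape[1]]
    return out


# Example: prune roughly 75% of the 4x1 blocks of a random 16x8 weight matrix.
rng = np.random.default_rng(0)
weight = rng.standard_normal((16, 8))
keep = rng.random((4, 8)) < 0.25            # one keep/drop flag per 4x1 block
weight *= np.repeat(keep, 4, axis=0)        # zero out the dropped blocks
activations = rng.standard_normal((8, 5))

blocks = to_block_sparse(weight, block_rows=4, block_cols=1)
result = block_spmm(blocks, weight.shape[0], activations)
assert np.allclose(result, weight @ activations)
```

The work done scales with the number of stored tiles rather than with the full matrix size, which is why higher sparsity ratios can translate into speedups when the kernel handles each block efficiently.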
Single-cell transcriptomics links malignant T cells to the tumor immune landscape in cutaneous T cell lymphoma.
Cutaneous T cell lymphoma (CTCL) represents a heterogeneous group of non-Hodgkin lymphomas distinguished by the presence of clonal malignant T cells. The heterogeneity of malignant T cells and the complex tumor microenvironment remain poorly characterized. With single-cell RNA analysis and bulk whole-exome sequencing on 19 skin lesions from 15 CTCL patients, we decipher the intra-tumor and inter-lesion diversity of CTCL patients and propose a multi-step tumor evolution model. We further establish a subtyping scheme based on the molecular features of malignant T cells and their pro-tumorigenic microenvironments: the TCyEM group, demonstrating a cytotoxic effector memory T cell phenotype, shows more M2 macrophage infiltration, while the TCM group, characterized by a central memory T cell phenotype and adverse patient outcome, is infiltrated by highly exhausted CD8+ reactive T cells, B cells and Tregs with suppressive activities. Our results establish a solid basis for understanding the nature of CTCL and pave the way for future precision medicine for CTCL patients.