631 research outputs found

    NVIDIA Tensor Core Programmability, Performance & Precision

    The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called "Tensor Core", that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to programming NVIDIA Tensor Cores, their performance, and the precision loss due to computation in mixed precision. Currently, NVIDIA provides three different ways of programming matrix-multiply-and-accumulate on Tensor Cores: the CUDA Warp Matrix Multiply Accumulate (WMMA) API, CUTLASS, a templated library based on WMMA, and cuBLAS GEMM. After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflops/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision respectively. A WMMA implementation of batched GEMM reaches a performance of 4 Tflops/s. While precision loss due to matrix multiplication with half precision input might be critical in many HPC applications, it can be considerably reduced at the cost of increased computation. Our results indicate that HPC applications using matrix multiplications can strongly benefit from the use of NVIDIA Tensor Cores. Comment: This paper has been accepted by the Eighth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) 201
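
The abstract above notes that precision lost to half-precision inputs can be reduced at the cost of extra computation, without saying how. The following is a minimal CPU-side sketch of one way this could work, not necessarily the method used in the paper: each input is split into a half-precision value plus a half-precision residual, and the first-order cross terms are added back. The function names (`to_half`, `dot_mixed`, `dot_refined`) are hypothetical, and the Tensor Core's single-precision accumulation is only emulated by rounding through `struct`.

```python
import math
import struct


def to_half(x: float) -> float:
    """Round to the nearest IEEE 754 half-precision value (raises OverflowError above 65504)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_exact(a, b):
    """Reference dot product in double precision with compensated summation."""
    return math.fsum(x * y for x, y in zip(a, b))


def dot_mixed(a, b):
    """Half-precision inputs, emulated single-precision accumulation."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_single(acc + to_half(x) * to_half(y))
    return acc


def dot_refined(a, b):
    """Same as dot_mixed, plus first-order residual corrections.

    Assumed scheme: x = xh + rx and y = yh + ry, so
    x*y ~= xh*yh + xh*ry + rx*yh (the rx*ry term is dropped).
    This costs three half-precision products per element instead of one.
    """
    acc = 0.0
    for x, y in zip(a, b):
        xh, yh = to_half(x), to_half(y)
        rx, ry = to_half(x - xh), to_half(y - yh)  # rounding residuals, stored as halves
        acc = to_single(acc + xh * yh)
        acc = to_single(acc + xh * ry)
        acc = to_single(acc + rx * yh)
    return acc


if __name__ == "__main__":
    # 1 + 2**-12 is not representable in half precision and rounds to 1.0.
    a = [1.0 + 2.0 ** -12] * 16
    print("exact  :", dot_exact(a, a))
    print("mixed  :", dot_mixed(a, a))
    print("refined:", dot_refined(a, a))
```

With this input the plain mixed-precision sum is 16.0 and the refined sum is 16.0078125, against an exact value of about 16.0078135, so the remaining error comes only from the dropped second-order term.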

    A Cascaded Approach for ultraly High Performance Lesion Detection and False Positive Removal in Liver CT Scans

    Liver cancer has high morbidity and mortality rates worldwide. Multi-phase CT is a main medical imaging modality for detecting/identifying and diagnosing liver tumors. Automatically detecting and classifying liver lesions in CT images has the potential to improve the clinical workflow. This task remains challenging due to liver lesions' large variations in size, appearance, image contrast, and the complexities of tumor types or subtypes. In this work, we customize a multi-object labeling tool for multi-phase CT images, which is used to curate a large-scale dataset containing 1,631 patients with four-phase CT images, multi-organ masks, and multi-lesion (six major types of liver lesions confirmed by pathology) masks. We develop a two-stage liver lesion detection pipeline, where the high-sensitivity detecting algorithms in the first stage discover as many lesion proposals as possible, and the lesion-reclassification algorithms in the second stage remove as many false alarms as possible. The multi-sensitivity lesion detection algorithm maximizes the information utilization of the individual probability maps of segmentation, and the lesion-shuffle augmentation effectively explores the texture contrast between lesions and the liver. Independently tested on 331 patient cases, the proposed model achieves high sensitivity and specificity for malignancy classification in the multi-phase contrast-enhanced CT (99.2%, 97.1%, diagnosis setting) and in the noncontrast CT (97.3%, 95.7%, screening setting).
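
The two-stage idea described above (a permissive first stage that keeps many proposals, then a second stage that removes false alarms) can be illustrated with a toy sketch. Everything here is hypothetical: the `Proposal` fields, the thresholds, and the scores are made up for illustration, and the metrics are computed per proposal rather than per patient as in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    detect_score: float   # hypothetical stage-1 detection probability
    reclass_score: float  # hypothetical stage-2 lesion probability
    is_lesion: bool       # ground-truth label


def cascade(proposals, detect_thr=0.1, reclass_thr=0.5):
    """Stage 1 keeps anything above a low threshold; stage 2 filters the survivors."""
    stage1 = [p for p in proposals if p.detect_score >= detect_thr]
    stage2 = [p for p in stage1 if p.reclass_score >= reclass_thr]
    return stage1, stage2


def sensitivity_specificity(kept, all_items):
    """Treat 'kept' as predicted positive and everything else as predicted negative."""
    positives = sum(p.is_lesion for p in all_items)
    negatives = len(all_items) - positives
    true_pos = sum(p.is_lesion for p in kept)
    false_pos = len(kept) - true_pos
    return true_pos / positives, (negatives - false_pos) / negatives


# Made-up candidates: two true lesions and three false alarms.
candidates = [
    Proposal(0.90, 0.95, True),
    Proposal(0.20, 0.80, True),
    Proposal(0.15, 0.10, False),
    Proposal(0.60, 0.30, False),
    Proposal(0.05, 0.02, False),
]

stage1, stage2 = cascade(candidates)
print("after stage 1:", sensitivity_specificity(stage1, candidates))
print("after stage 2:", sensitivity_specificity(stage2, candidates))
```

In this toy data the low first-stage threshold keeps both lesions but also two false alarms, and the second stage removes those false alarms without dropping a lesion.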

    Transformer-based Image Compression with Variable Image Quality Objectives

    This paper presents a Transformer-based image compression system that allows for a variable image quality objective according to the user's preference. Optimizing a learned codec for different quality objectives leads to reconstructed images with varying visual characteristics. Our method provides the user with the flexibility to choose a trade-off between two image quality objectives using a single, shared model. Motivated by the success of prompt-tuning techniques, we introduce prompt tokens to condition our Transformer-based autoencoder. These prompt tokens are generated adaptively based on the user's preference and input image through learning a prompt generation network. Extensive experiments on commonly used quality metrics demonstrate the effectiveness of our method in adapting the encoding and/or decoding processes to a variable quality objective. While offering the additional flexibility, our proposed method performs comparably to the single-objective methods in terms of rate-distortion performance

    TransTIC: Transferring Transformer-based Image Compression from Human Visualization to Machine Perception

    This work aims to transfer a Transformer-based image compression codec from human vision to machine perception without fine-tuning the codec. We propose a transferable Transformer-based image compression framework, termed TransTIC. Inspired by visual prompt tuning, we propose an instance-specific prompt generator to inject instance-specific prompts into the encoder and task-specific prompts into the decoder. Extensive experiments show that our proposed method is capable of transferring the codec to various machine tasks and significantly outperforming the competing methods. To the best of our knowledge, this work is the first attempt to utilize prompting on the low-level image compression task.

    Utilization of statins and aspirin among patients with diabetes and hyperlipidemia: Taiwan, 1998–2006

    Background: The proper use of statins and aspirin decreases the risk of coronary heart disease (CHD) among patients with diabetes (DM) and hyperlipidemia. The purpose of this study was to analyze the time trends and determinants of prescribing statins and aspirin among patients with DM and hyperlipidemia in medical practice in Taiwan. Methods: A cohort of 21,667 patients with DM and hyperlipidemia during the period from 1998 to 2006 was identified by using data of ambulatory care claims from Taiwan's National Health Insurance Database. The dataset was categorized into two equal calendar periods: Period 1 (September 1998–June 2002) and Period 2 (July 2002–April 2006). Multivariate logistic regression analyses were used to determine the independent determinants associated with receipt of lipid-lowering agents and aspirin among these patients. Results: There were significant increases in the prescribing of statins (OR 1.78; 95% CI 1.66−1.91) and aspirin (OR 1.47, 95% CI 1.50−1.59) in Period 2 as compared with Period 1. Nevertheless, 30% of patients with coexisting CHD received neither statins nor aspirin. Only 15% to 25% of DM patients with hyperlipidemia and CHD received the combined treatment with aspirin and statin. In multivariate logistic regression, we found that women received aspirin less frequently than men. Older patients (>45 years) with concomitant CHD were more likely to receive statins and aspirin. Conclusion: Despite the increasing trend in the use of statins and aspirin in DM patients with hyperlipidemia in Taiwan, the improvements were at best modest, particularly for secondary prevention. Our data indicate the need for continued efforts to improve the utilization of these drugs in daily practice.

    A Reinforcement Learning Badminton Environment for Simulating Player Tactics (Student Abstract)

    Recent techniques for analyzing sports precisely have stimulated various approaches to improve player performance and fan engagement. However, existing approaches are only able to evaluate offline performance, since testing in real-time matches incurs prohibitive costs and cannot be replicated. To test in a safe and reproducible simulator, we focus on turn-based sports and introduce a badminton environment by simulating rallies with different angles of view and designing the states, actions, and training procedures. This benefits not only coaches and players, who can simulate past matches for tactic investigation, but also researchers, who can rapidly evaluate their novel algorithms. Comment: Accepted by AAAI 2023 Student Abstract, code is available at https://github.com/wywyWang/CoachAI-Projects/tree/main/Strategic%20Environmen

    Enhanced photo-excitation and angular-momentum imprint of gray excitons in WSe2 monolayers by spin-orbit-coupled vector vortex beams

    A light beam can be spatially structured in the complex amplitude to possess orbital angular momentum (OAM), which introduces a new degree of freedom alongside the intrinsic spin angular momentum (SAM) associated with circular polarization. Moreover, superimposing two twisted lights with distinct SAM and OAM produces a vector vortex beam (VVB) in non-separable states where not only complex amplitude but also polarization are spatially structured and entangled with each other. In addition to the non-separability, the SAM and OAM in a VVB are intrinsically coupled by the optical spin-orbit interaction and constitute the profound spin-orbit physics in photonics. In this work, we present a comprehensive theoretical investigation, implemented on a first-principles basis, of the intriguing light-matter interaction between VVBs and WSe2 monolayers (WSe2-MLs), one of the best-known and most promising two-dimensional (2D) materials in optoelectronics dictated by excitons, encompassing the bright exciton (BX) as well as various dark excitons (DXs). One of the key findings of our study is the substantial enhancement of the photo-excitation of gray excitons (GXs), a type of spin-forbidden dark exciton, in a WSe2-ML through the utilization of a twisted light that possesses a longitudinal field associated with the optical spin-orbit interaction. Our research demonstrates that a spin-orbit-coupled VVB surprisingly allows for the imprinting of the carried optical information onto gray excitons in 2D materials, which is robust against the decoherence mechanisms in materials. This observation suggests a promising method for deciphering the transferred angular momentum from structured lights to excitons.

    Comparative functional genomic analysis of Alzheimer’s affected and naturally aging brains

    Background: Alzheimer’s disease (AD) is a prevalent progressive neurodegenerative human disease whose cause remains unclear. Numerous initially highly hopeful anti-AD drugs based on the amyloid-β (Aβ) hypothesis of AD have failed recent late-phase tests. Natural aging (AG) is a high-risk factor for AD. Here, we aim to gain insights into AD that may lead to novel therapeutic treatments by conducting meta-analyses of gene expression microarray data from AG and AD-affected brain. Methods: Five sets of gene expression microarray data from different regions of AD (hereafter, ALZ when referring to data)-affected brain, and one set from AG, were analyzed using the methods of differentially expressed genes and differentially co-expressed gene pairs to identify putatively disrupted biological pathways and associated abnormal molecular contents. Results: Brain-region specificity among ALZ cases and AG-ALZ differences in gene expression and in KEGG pathway disruption were identified. Strong heterogeneity in AD signatures among the five brain regions was observed: HC/PC/SFG showed clear and pronounced AD signatures, MTG moderately so, and EC showed essentially none. There were stark differences between ALZ and AG. OXPHOS and Proteasome were the most disrupted pathways in HC/PC/SFG, while AG showed no OXPHOS disruption and relatively weak Proteasome disruption. Metabolic related pathways including TCA cycle and Pyruvate metabolism were disrupted in ALZ but not in AG. Three pathogenic infection related pathways were disrupted in ALZ. Many cancer and signaling related pathways were shown to be disrupted in AG but far less so in ALZ, and not at all in HC. We identified 54 “ALZ-only” differentially expressed genes, all down-regulated, which, when used to augment the gene list of the KEGG AD pathway, made it significantly more AD-specific.

    Controlled Synthesis of Organic/Inorganic van der Waals Solid for Tunable Light-matter Interactions

    Van der Waals (vdW) solids, as a new type of artificial material consisting of alternating layers bonded by weak interactions, have shed light on fascinating optoelectronic device concepts. As a result, a large variety of vdW devices have been engineered via layer-by-layer stacking of two-dimensional materials, although shadowed by the difficulties of fabrication. Alternatively, direct growth of vdW solids has proven to be a scalable and swift approach, highlighted by the successful synthesis of graphene/h-BN and transition metal dichalcogenides (TMDs) vertical heterostructures from controlled vapor deposition. Here, we realize high-quality organic and inorganic vdW solids, using methylammonium lead halide (CH3NH3PbI3) as the organic part (organic perovskite) and 2D inorganic monolayers as counterparts. When stacked on various 2D monolayers, the vdW solids behave dramatically differently in light emission. Our studies demonstrate that the h-BN monolayer is a great complement to organic perovskite for preserving its original optical properties. As a result, organic/h-BN vdW solid arrays are patterned for red light emission. This work paves the way for designing unprecedented vdW solids with great potential for a wide spectrum of applications in optoelectronics.