
    NVIDIA Tensor Core Programmability, Performance & Precision

    The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called the "Tensor Core", that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflop/s in mixed precision. In this paper, we investigate current approaches to programming NVIDIA Tensor Cores, their performance, and the precision loss due to computation in mixed precision. Currently, NVIDIA provides three different ways of programming matrix-multiply-and-accumulate on Tensor Cores: the CUDA Warp Matrix Multiply Accumulate (WMMA) API; CUTLASS, a templated library based on WMMA; and cuBLAS GEMM. After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflop/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision, respectively. A WMMA implementation of batched GEMM reaches a performance of 4 Tflop/s. While precision loss due to matrix multiplication with half-precision input might be critical in many HPC applications, it can be considerably reduced at the cost of increased computation. Our results indicate that HPC applications using matrix multiplications can strongly benefit from using NVIDIA Tensor Cores. Comment: This paper has been accepted by the Eighth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) 201
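The peak figure and the precision trade-off in this abstract can be illustrated with a short sketch. It rests on assumptions that are not in the abstract: a boost clock of 1.53 GHz for the V100, Python's double-precision float standing in for the single-precision accumulator, and a residual-splitting scheme as one possible way to trade extra multiplications for accuracy (not necessarily the authors' method). All function names are hypothetical.

```python
import random
import struct


def peak_tflops(tensor_cores=640, clock_hz=1.53e9):
    """Theoretical mixed-precision peak in Tflop/s.

    One 4x4 matrix multiply-accumulate per clock is 4*4*4 = 64 fused
    multiply-adds, i.e. 128 floating-point operations per Tensor Core.
    The 1.53 GHz clock is an assumption, not a value from the abstract.
    """
    flops_per_core_per_clock = 4 * 4 * 4 * 2
    return tensor_cores * flops_per_core_per_clock * clock_hz / 1e12


def to_half(x):
    """Round a Python float to IEEE half precision and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


def dot_half_inputs(a, b):
    """Dot product with half-precision inputs and a wider accumulator."""
    return sum(to_half(x) * to_half(y) for x, y in zip(a, b))


def dot_refined(a, b):
    """Same product, but each input is split into a half-precision value
    plus a half-precision residual. Three products per element instead
    of one: more computation, less rounding error."""
    total = 0.0
    for x, y in zip(a, b):
        xh, yh = to_half(x), to_half(y)
        xr, yr = to_half(x - xh), to_half(y - yh)
        total += xh * yh + xh * yr + xr * yh
    return total


rng = random.Random(0)  # fixed seed so the illustration is repeatable
a = [rng.random() for _ in range(1000)]
b = [rng.random() for _ in range(1000)]
exact = sum(x * y for x, y in zip(a, b))

peak = peak_tflops()
err_plain = abs(dot_half_inputs(a, b) - exact)
err_refined = abs(dot_refined(a, b) - exact)
print(f"peak = {peak:.1f} Tflop/s, measured fraction = {83 / peak:.2f}")
print(f"error with half inputs: {err_plain:.3e}, refined: {err_refined:.3e}")
```

Under the assumed clock, the computed peak comes out close to the quoted 125 Tflop/s, and the reported 83 Tflop/s corresponds to roughly two thirds of it. The refined dot product uses three multiplications per element, which is the kind of extra cost the abstract alludes to.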

    Soil Moisture Active/Passive (SMAP) Forward Brightness Temperature Simulator

    The SMAP is one of four first-tier missions recommended by the US National Research Council's Committee on Earth Science and Applications from Space (Earth Science and Applications from Space: National Imperatives for the Next Decade and Beyond, Space Studies Board, National Academies Press, 2007) [1]. Its purpose is to measure global soil moisture and freeze/thaw state from space. One of the spaceborne instruments is an L-band radiometer with a shared single feedhorn and parabolic mesh reflector. While the radiometer measures the emission over a footprint of interest, unwanted emissions are also received through the antenna sidelobes from the cosmic background and other error sources such as the Sun, the Moon, and the galaxy. Their effects need to be considered accurately, and the analysis of the overall performance of the radiometer requires end-to-end performance simulation from Earth emission to antenna brightness temperature, such as the global simulation of L-band brightness temperature over land and sea [2]. To assist with the SMAP radiometer level 1B algorithm development, the SMAP forward brightness temperature simulator is being developed by adapting the Aquarius simulator [2] with the necessary modifications. This poster presents the current status of the SMAP forward brightness simulator's development, including the incorporation of the land microwave emission model and its input datasets, and a simplified atmospheric radiative transfer model. The latest simulation results are also presented to demonstrate the simulator's ability to support SMAP L1B algorithm development.
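The effect of sidelobe contributions on the measured antenna temperature can be sketched as a gain-weighted average of the brightness temperatures seen by each part of the antenna pattern. This is a generic textbook simplification, not the SMAP or Aquarius simulator. The function name, the beam fractions, and the scene temperatures below are hypothetical values chosen only for illustration.

```python
def antenna_temperature(contributions):
    """Gain-weighted mean brightness temperature in kelvin.

    contributions: list of (beam_power_fraction, brightness_temperature_K)
    pairs. Fractions are normalized here, so they need not sum to 1.
    """
    total_weight = sum(weight for weight, _ in contributions)
    return sum(weight * tb for weight, tb in contributions) / total_weight


# Hypothetical scene: main beam on land, sidelobes on cold sky and ocean.
scene = [
    (0.90, 250.0),  # main beam over the footprint of interest
    (0.07, 2.7),    # sidelobes seeing the cosmic background
    (0.03, 100.0),  # sidelobes seeing a colder surface such as ocean
]

t_antenna = antenna_temperature(scene)
t_footprint = scene[0][1]
print(f"antenna temperature: {t_antenna:.3f} K")
print(f"bias relative to footprint: {t_antenna - t_footprint:.3f} K")
```

Even with 90% of the power in the main beam, the hypothetical sidelobe terms pull the antenna temperature well away from the footprint value, which is why a forward simulator has to model them explicitly.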

    Elemental tellurium as a chiral p-type thermoelectric material

    The thermoelectric transport properties of elemental tellurium are investigated by density functional theory combined with the Boltzmann transport equation in the rigid band approximation. We find that the thermoelectric transport properties parallel and perpendicular to the helical chains are highly asymmetric (almost symmetric) for p- (n-) type doped tellurium due to the anisotropic (isotropic) hole (electron) pockets of the Fermi surface. The electronic band structure shows that the lone-pair-derived uppermost heavy-hole and extremely light-hole lower valence bands offer the opportunity to obtain both a high Seebeck coefficient and a high electrical conductivity along the chains through Sb or Bi doping. Furthermore, the stairlike density of states yields a large asymmetry for the transport distribution function relative to the Fermi energy, which leads to large thermopower. The calculations reveal that tellurium has the potential to be a good p-type thermoelectric material with an optimum figure of merit zT of 0.31 (0.56) at room temperature (500 K) at a hole concentration around 1×10^19 cm^−3. Exploiting the rich chemistry of lone pairs in chiral solids may have important implications for the discovery of high-zT polychalcogenide-based thermoelectric materials.
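For readers unfamiliar with the figure of merit, zT is conventionally defined as S²σT/κ, where S is the Seebeck coefficient, σ the electrical conductivity, κ the total thermal conductivity, and T the absolute temperature. The sketch below evaluates that definition. The function name and the input values are hypothetical round numbers, not results from the paper.

```python
def figure_of_merit(seebeck_v_per_k, conductivity_s_per_m,
                    thermal_conductivity_w_per_m_k, temperature_k):
    """Dimensionless thermoelectric figure of merit zT = S^2 * sigma * T / kappa."""
    power_factor = seebeck_v_per_k ** 2 * conductivity_s_per_m  # W / (m K^2)
    return power_factor * temperature_k / thermal_conductivity_w_per_m_k


# Hypothetical material parameters, chosen only to make the arithmetic easy.
zt_example = figure_of_merit(
    seebeck_v_per_k=200e-6,              # 200 microvolts per kelvin
    conductivity_s_per_m=5.0e4,
    thermal_conductivity_w_per_m_k=2.0,
    temperature_k=300.0,
)
print(f"zT = {zt_example:.2f}")
```

The definition shows why the abstract stresses having a high Seebeck coefficient and a high electrical conductivity at the same time: S enters squared, and both appear in the numerator.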

    Freeze-drying Silica Based Aerogels Using Cryoprotectants and Eutectic Solvent Mixtures

    Silica-based aerogels have unique properties, including good thermal insulation and convective inhibition. A sol-gel process can be used to produce semi-opaque, monolithic gels, which can then be dried to produce aerogels. Multiple drying methods are available industrially; however, these methods require high temperatures and pressures and specialized equipment, and are time-consuming. This project aims to experimentally study the possibility of a new method for drying wet gels through a freeze-drying process, with the use of cryoprotectants, eutectics, and polymers to inhibit and control ice formation and growth during drying. Silica wet gels were produced using tetraethylorthosilicate (TEOS), ethanol, water, and hydrochloric acid/ammonium hydroxide. After gelation, the gels were subjected to solvent exchanges with varying concentrations of cryoprotectants, eutectics, polymers, and combinations of the three. A customized freeze-dryer was used to obtain silica aerogels from wet gels, with the monolithicity and porosity of the resulting aerogels measured by SEM and BET. The results indicated that the addition of cryoprotectants, eutectics, and polymers yielded monolithic foams which were structurally stable and had measurable porosity and surface area. The processes developed in this work would allow simpler, more cost-effective methods for drying wet gels to be developed; these methods could be used to produce freeze-dried aerogels with better properties and have potential for industrial implementation.

    Optimization principles and the figure of merit for triboelectric generators

    Energy harvesting with triboelectric nanogenerators is a burgeoning field, with a growing portfolio of creative application schemes attracting much interest. Although power generation capability and its optimization are among the most important subjects, a satisfactory elemental model that illustrates the basic principles and sets the optimization guidelines remains elusive. We use a simple model to clarify that the energy generation mechanism is electrostatic induction, but with a time-varying character that makes the optimal matching for power generation more restrictive. By combining multiple parameters into dimensionless variables, we pinpoint the optimum condition with only two independent parameters, leading to predictions of the maximum limit of power density, which allows us to derive the triboelectric material and device figure of merit. We reveal the importance of optimizing device capacitance, not only load resistance, and of minimizing the impact of parasitic capacitance. Optimized capacitances can lead to an overall increase in power density of more than 10 times.
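The idea of load matching for a capacitive generator can be sketched with a much simpler stand-in than the paper's model: a sinusoidal voltage source in series with a fixed capacitance and a resistive load. In that circuit the average load power peaks when the load resistance equals the capacitive reactance 1/(ωC). The paper's time-varying capacitance model is more restrictive, so this is only a baseline illustration. All names and component values are hypothetical.

```python
import math


def capacitive_reactance(freq_hz, capacitance_f):
    """Magnitude of the capacitor's reactance, 1 / (omega * C), in ohms."""
    return 1.0 / (2.0 * math.pi * freq_hz * capacitance_f)


def average_load_power(v_amplitude, freq_hz, capacitance_f, load_ohm):
    """Average power in a resistor fed through a fixed series capacitor
    by a sinusoidal source of the given amplitude (simplifying assumption)."""
    xc = capacitive_reactance(freq_hz, capacitance_f)
    return 0.5 * v_amplitude ** 2 * load_ohm / (load_ohm ** 2 + xc ** 2)


# Hypothetical device: 100 V amplitude, 5 Hz motion, 100 pF capacitance.
V_AMP, FREQ, CAP = 100.0, 5.0, 100e-12

r_matched = capacitive_reactance(FREQ, CAP)
p_matched = average_load_power(V_AMP, FREQ, CAP, r_matched)
p_half = average_load_power(V_AMP, FREQ, CAP, 0.5 * r_matched)
p_double = average_load_power(V_AMP, FREQ, CAP, 2.0 * r_matched)

print(f"matched load: {r_matched:.3e} ohm")
print(f"power at 0.5x, 1x, 2x matched load: "
      f"{p_half:.3e}, {p_matched:.3e}, {p_double:.3e} W")
```

In this simplified circuit the matched-load power scales with ωC, which gives some intuition for why the abstract treats device capacitance, and not only load resistance, as a quantity worth optimizing.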