413 research outputs found

    Exploring Scientific Application Performance Using Large Scale Object Storage

    One of the major performance and scalability bottlenecks in large scientific applications is parallel reading and writing to supercomputer I/O systems. The use of parallel file systems and the consistency requirements of POSIX, to which all the traditional HPC parallel I/O interfaces adhere, limit the scalability of scientific applications. Object storage is a widely used storage technology in cloud computing and is increasingly proposed for HPC workloads to improve the scalability and performance of I/O in scientific applications. While object storage is a promising technology, it is still unclear how scientific applications will use object storage and what the main performance benefits will be. This work addresses these questions by emulating an object store used by a traditional scientific application and evaluating the potential performance benefits. We show that scientific applications can benefit from the use of object storage at large scale. Comment: Preprint submitted to WOPSSS workshop at ISC 201

    NVIDIA Tensor Core Programmability, Performance & Precision

    The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called "Tensor Core", that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to programming NVIDIA Tensor Cores, their performance, and the precision loss due to computation in mixed precision. Currently, NVIDIA provides three different ways of programming matrix-multiply-and-accumulate on Tensor Cores: the CUDA Warp Matrix Multiply Accumulate (WMMA) API; CUTLASS, a templated library based on WMMA; and cuBLAS GEMM. After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflops/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision respectively. A WMMA implementation of batched GEMM reaches a performance of 4 Tflops/s. While precision loss due to matrix multiplication with half precision input might be critical in many HPC applications, it can be considerably reduced at the cost of increased computation. Our results indicate that HPC applications using matrix multiplications can strongly benefit from using NVIDIA Tensor Cores. Comment: This paper has been accepted by the Eighth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) 201
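
The precision trade-off described in this abstract can be illustrated on a CPU. The sketch below is a pure-Python emulation, not Tensor Core code: inputs are rounded to half precision and the running sum is rounded to single precision. The refinement step (adding products of half-precision residuals) is an assumption about one way to trade extra computation for accuracy; the abstract does not say which technique the authors use. All function names are hypothetical.

```python
import random
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x):
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def matmul_fp32_accumulate(a, b):
    """Multiply square matrices, rounding the running sum to binary32 after each add."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc = to_single(acc + a[i][k] * b[k][j])
            c[i][j] = acc
    return c


def matmul_reference(a, b):
    """Reference product computed entirely in double precision."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def half_matrix(m):
    """Element-wise rounding of a matrix to half precision."""
    return [[to_half(v) for v in row] for row in m]


def residual_matrix(m, m_half):
    """Half-precision copy of the rounding error m - m_half (assumed refinement input)."""
    return [[to_half(v - h) for v, h in zip(row, row_h)]
            for row, row_h in zip(m, m_half)]


def max_abs_error(c, ref):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(x - y) for row, ref_row in zip(c, ref)
               for x, y in zip(row, ref_row))


def compare_errors(n=8, seed=0):
    """Return (error with half inputs, error after residual refinement)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    b = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    ref = matmul_reference(a, b)
    a_h, b_h = half_matrix(a), half_matrix(b)
    r_a, r_b = residual_matrix(a, a_h), residual_matrix(b, b_h)
    plain = matmul_fp32_accumulate(a_h, b_h)
    # Two extra products recover most of what was lost when rounding the inputs.
    corr_a = matmul_fp32_accumulate(r_a, b_h)
    corr_b = matmul_fp32_accumulate(a_h, r_b)
    refined = [[plain[i][j] + corr_a[i][j] + corr_b[i][j] for j in range(n)]
               for i in range(n)]
    return max_abs_error(plain, ref), max_abs_error(refined, ref)


if __name__ == "__main__":
    plain_err, refined_err = compare_errors()
    print(f"half inputs: {plain_err:.3e}  refined: {refined_err:.3e}")
```

In this emulation the refined product costs three matrix multiplications instead of one, which mirrors the "increased computation" the abstract mentions.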

    Baseband Transceiver Design of a High Definition Radio FM System Using Joint Theoretical Analysis and FPGA Implementation

    Advances in wireless communications have enabled various technologies for wireless digital communication. In the field of digital radio broadcasting, several specifications have been proposed, such as Eureka-147 and digital radio mondiale (DRM). These systems require a new spectrum assignment, which incurs heavy cost due to the depletion of the available spectrum. Therefore, the in-band on-channel (IBOC) system has been developed to work in the same band as conventional analog radio and to provide digital broadcasting services. This paper discusses the functions and algorithms of the high definition (HD) radio frequency modulation (FM) digital radio broadcasting system. For the transmitter, the content includes data format allocation, constellation mapping, and orthogonal frequency division multiplexing (OFDM) modulation; for the receiver, it includes timing synchronization, OFDM demodulation, integer and fractional carrier frequency offset (ICFO and FCFO) estimation, and channel estimation. In implementing this system on a field programmable gate array (FPGA) based hardware platform, both theoretical and practical aspects have been considered to accommodate the available hardware resources.

    PolyPIC: the Polymorphic-Particle-in-Cell Method for Fluid-Kinetic Coupling

    Particle-in-Cell (PIC) methods are widely used computational tools for fluid and kinetic plasma modeling. While both the fluid and kinetic PIC approaches have been successfully used to target either kinetic or fluid simulations, little has been done to combine fluid and kinetic particles under the same PIC framework. This work addresses this issue by proposing a new PIC method, PolyPIC, that uses polymorphic computational particles. In this numerical scheme, particles can be either kinetic or fluid, and fluid particles can become kinetic when necessary, e.g. particles undergoing a strong acceleration. We design and implement the PolyPIC method, and test it against the Landau damping of Langmuir and ion acoustic waves, the two-stream instability, and sheath formation. We unify the fluid and kinetic PIC methods under one common framework comprising both fluid and kinetic particles, providing a tool for adaptive fluid-kinetic coupling in plasma simulations. Comment: Submitted to Frontier

    The role of interactive super-computing in using HPC for urgent decision making

    Technological advances are creating exciting new opportunities that have the potential to move HPC well beyond traditional computational workloads. In this paper we focus on the potential for HPC to be instrumental in responding to disasters such as wildfires, hurricanes, extreme flooding, earthquakes, tsunamis, winter weather conditions, and accidents. Driven by the VESTEC EU funded H2020 project, our research looks to prove HPC as a tool not only capable of simulating disasters once they have happened, but also one which is able to operate in a responsive mode, supporting disaster response teams making urgent decisions in real-time. Whilst this has the potential to revolutionise disaster response, it requires the ability to drive HPC interactively, both from the user's perspective and also based upon the arrival of data. As such, interactivity is a critical component in enabling HPC to be exploited in the role of supporting disaster response teams so that urgent decision makers can make the correct decision first time, every time.

    Chemical-free inactivated whole influenza virus vaccine prepared by ultrashort pulsed laser treatment

    There is an urgent need for rapid methods to develop vaccines in response to emerging viral pathogens. Whole inactivated virus (WIV) vaccines represent an ideal strategy for this purpose; however, a universal method for producing safe and immunogenic inactivated vaccines is lacking. Conventional pathogen inactivation methods such as formalin, heat, ultraviolet light, and gamma rays cause structural alterations in vaccines that lead to reduced neutralizing antibody specificity, and in some cases, disastrous T helper type 2-mediated immune pathology. We have evaluated the potential of a visible ultrashort pulsed (USP) laser method to generate safe and immunogenic WIV vaccines without adjuvants. Specifically, we demonstrate that vaccination of mice with laser-inactivated H1N1 influenza virus at about a 10-fold lower dose than that required using conventional formalin-inactivated influenza vaccines results in protection against lethal H1N1 challenge. The virus, inactivated by the USP laser irradiation, has been shown to retain its surface protein structure through hemagglutination assay. Unlike conventional inactivation methods, laser treatment did not generate carbonyl groups in protein, thereby reducing the risk of adverse vaccine-elicited T helper type 2 responses. Therefore, USP laser treatment is an attractive potential strategy to generate WIV vaccines with greater potency and safety than vaccines produced by current inactivation techniques.

    EFFECTS OF POLYMER MORPHOLOGY ON THE RHEOLOGICAL BEHAVIOR OF MELT WITHIN MICRO-CHANNELS

    Micro molding has shown great commercial potential in recent years, and determining the rheological behavior of the polymer melt within micro-structured geometry is vital for accurate simulation of micro molding. The lack of commercial equipment is one of the main hurdles in investigating micro melt rheology. In this study, a melt viscosity measurement system for low and high density polyethylene melts flowing through micro-channels was established using a micro-channel mold operated at a mold temperature as high as the melt temperature. From the measured pressure drop and volumetric flow rate, viscosity was calculated with a capillary flow model using the Rabinowitsch correction. The results for low-crystallinity LDPE resin were compared with those for high-crystallinity HDPE resin to discuss the effect of the degree of crystallinity on the viscosity characteristics of polymer within micro-channels. The measured LDPE and HDPE viscosity values in the test ranges were significantly lower (about 40 to 56% and 22 to 29% for LDPE and HDPE, respectively, flowing through a channel size of 150 μm) than those obtained with a traditional capillary rheometer. Meanwhile, the percentage reduction in viscosity and the ratio of slip velocity to mean velocity both increase with decreasing micro-channel size. The rheological behavior of both the high-crystallinity HDPE and the low-crystallinity LDPE resins at the microscopic scale differs from that at the macroscopic scale, but HDPE displays a less significant reduction in viscosity. This can be attributed to LDPE resin within the micro-channel creating a higher extra bonding force between the bulk chains than HDPE resin. As a result, LDPE has a lower adhesive force between the bulk chains and the micro-channel wall, resulting in a higher degree of wall slip.
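
The viscosity calculation named in this abstract (capillary flow model with Rabinowitsch correction) can be sketched as follows. This is a minimal textbook version under stated assumptions, not the authors' procedure: it assumes a circular channel of radius `radius` and length `length`, ignores entrance-pressure and wall-slip corrections, and takes a single power-law index from a log-log least-squares fit. The function name and all numerical values in the usage note are hypothetical.

```python
import math


def rabinowitsch_viscosity(pressure_drops, flow_rates, radius, length):
    """Return (n, true wall shear rates, viscosities) for capillary flow data.

    pressure_drops: pressure drop along the channel for each run, in Pa
    flow_rates:     volumetric flow rate for each run, in m^3/s
    radius, length: capillary radius and length, in m
    """
    # Wall shear stress and apparent (Newtonian) wall shear rate.
    tau_w = [dp * radius / (2.0 * length) for dp in pressure_drops]
    gamma_app = [4.0 * q / (math.pi * radius ** 3) for q in flow_rates]

    # Power-law index n = d ln(tau_w) / d ln(gamma_app), from a least-squares fit.
    xs = [math.log(g) for g in gamma_app]
    ys = [math.log(t) for t in tau_w]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    n = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))

    # Rabinowitsch correction: true wall shear rate for a non-Newtonian melt.
    correction = (3.0 * n + 1.0) / (4.0 * n)
    gamma_true = [g * correction for g in gamma_app]
    viscosity = [t / g for t, g in zip(tau_w, gamma_true)]
    return n, gamma_true, viscosity
```

As a usage check, synthetic data for a power-law fluid with stress `K * rate**n` should return the same `n`, and viscosities equal to `K * rate**(n - 1)` at the corrected shear rates.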

    Anti-Bladder-Tumor Effect of Baicalein from Scutellaria baicalensis Georgi and Its Application In Vivo

    Some phytochemicals with cytotoxic and/or antimetastatic properties have generated intense interest in anticancer studies. In this study, the natural flavonoid baicalein was evaluated in bladder cancer in vitro and in vivo. Baicalein inhibits 5637 cell proliferation. It arrests cells in G1 phase at 100 μM and in S phase below 75 μM. The protein expression of cyclin B1 and cyclin D1 is reduced by baicalein. Baicalein-induced p-ERK plays a minor role in cyclin B1 reduction. Baicalein-inhibited p65NF-κB results in reduction of cell growth. Baicalein-induced pGSK(ser9) instead has a small effect in increasing cyclin B1/D1 expression. The translation inhibitor cycloheximide blocks baicalein-reduced cyclin B1, suggesting that the reduction is caused by protein synthesis inhibition. On the other hand, neither cycloheximide nor the proteasome inhibitor MG132 completely blocks baicalein-reduced cyclin D1, suggesting that baicalein reduces cyclin D1 through both protein synthesis inhibition and proteasomal degradation activation. In addition, baicalein inhibits cell invasion by inhibiting MMP-2 and MMP-9 mRNA expression and activity. In a mouse orthotopic bladder tumor model, baicalein slightly reduces tumor size but with some hepatic toxicity. In summary, these results demonstrate the anti-bladder-tumor properties of the natural compound baicalein, which shows a slight anti-bladder-tumor effect in vivo.

    Aberrant Sensory Gating of the Primary Somatosensory Cortex Contributes to the Motor Circuit Dysfunction in Paroxysmal Kinesigenic Dyskinesia

    Paroxysmal kinesigenic dyskinesia (PKD) is conventionally regarded as a movement disorder (MD) and is characterized by episodic hyperkinesia triggered by sudden movements. However, patients with PKD often have a sensory aura and respond very well to antiepileptic agents. PRRT2 mutations, the most common genetic etiology of PKD, can also cause epilepsy syndromes. Standing in the twilight zone between MDs and epilepsy, the pathogenesis of PKD is unclear. Gamma oscillations arise from the inhibitory interneurons, which are crucial in the thalamocortical circuits. Synchronized gamma oscillations play a role in sensory gating, an important mechanism of automatic cortical inhibition. The patterns of gamma oscillations have been used to characterize neurophysiological features of many neurological diseases, including epilepsy and MDs. This study aimed to investigate the features of gamma synchronization in PKD. In a paired-pulse electrical-stimulation task, we recorded magnetoencephalographic data with distributed source modeling and time-frequency analysis in 19 patients with newly diagnosed PKD who had not received pharmacotherapy and in 18 healthy controls. In combination with magnetic resonance imaging, the source of gamma oscillations was localized in the primary somatosensory cortex. Somatosensory evoked fields of PKD patients had a reduced peak frequency (p < 0.001 for the first and the second response) and a prolonged peak latency (the first response p = 0.02, the second response p = 0.002), indicating that the synchronization of gamma oscillations is significantly attenuated. The power ratio between the two responses was much higher in the PKD group (p = 0.013), indicating impaired activity suppression. Aberrant gamma synchronization revealed that defective sensory gating of the somatosensory area contributes to the pathogenesis of PKD.
    Our findings document that disinhibited cortical function is a pathomechanism common to PKD and epilepsy, which explains the clinical overlap of these two diseases and the therapeutic effect of antiepileptic agents in PKD. The reduction in peak gamma frequency was greater in PRRT2-related PKD than in the non-PRRT2 PKD group (p = 0.028 for the first response, p = 0.004 for the second response). Loss-of-function PRRT2 mutations could lead to synaptic dysfunction. The observed disinhibition reflects the impact of PRRT2 mutations on human neurophysiology.