171 research outputs found

    Linpack evaluation on a supercomputer with heterogeneous accelerators

    We report Linpack benchmark results on the TSUBAME supercomputer, a large-scale heterogeneous system equipped with NVIDIA Tesla GPUs and ClearSpeed SIMD accelerators. Using all 10,480 Opteron cores, 640 Xeon cores, 648 ClearSpeed accelerators and 624 NVIDIA Tesla GPUs, we achieved 87.01 TFlops, which ranks third in the world among heterogeneous systems. This paper describes the careful tuning and load-balancing methods required to achieve this performance. However, since the peak speed is 163 TFlops, the efficiency is 53%, which is lower than that of other systems. This paper also analyzes this gap from the perspective of system architecture.
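The efficiency figure quoted in this abstract is the ratio of achieved to peak performance. A minimal sketch of that arithmetic follows; the function name `linpack_efficiency` is a hypothetical helper introduced here for illustration, and only the two TFlops values come from the abstract.

```python
def linpack_efficiency(achieved_tflops: float, peak_tflops: float) -> float:
    """Return achieved performance as a fraction of peak performance."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return achieved_tflops / peak_tflops


# Values taken from the abstract: 87.01 TFlops achieved, 163 TFlops peak.
efficiency = linpack_efficiency(87.01, 163.0)
print(f"Efficiency: {efficiency:.1%}")  # prints "Efficiency: 53.4%"
```

The result rounds to the 53% stated in the abstract.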

    Scalable Reed-Solomon-based Reliable Local Storage for HPC Applications on IaaS Clouds

    With increasing interest among mainstream users in running HPC applications, Infrastructure-as-a-Service (IaaS) cloud computing platforms represent a viable alternative to the acquisition and maintenance of expensive hardware, which is often beyond the financial means of such users. One of the critical needs of HPC applications is efficient, scalable and persistent storage. Unfortunately, storage options proposed by cloud providers are not standardized and typically use a different access model. In this context, the local disks on the compute nodes can be used to save large data sets such as the data generated by Checkpoint-Restart (CR). This local storage offers high throughput and scalability, but it needs to be combined with persistency techniques, such as block replication or erasure codes. One of the main challenges that such techniques face is to minimize the overhead in performance and I/O resource utilization (i.e., storage space and bandwidth), while at the same time guaranteeing high reliability of the saved data. This paper introduces a novel persistency technique that leverages Reed-Solomon (RS) encoding to save data in a reliable fashion. Compared to traditional approaches that rely on block replication, we demonstrate about 50% higher throughput while reducing network bandwidth and storage utilization by a factor of 2 for the same targeted reliability level. This is achieved both by modeling and by real-life experimentation on hundreds of nodes.
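To illustrate why erasure coding can halve storage use at equal fault tolerance, the sketch below compares the storage overhead of plain replication with that of a generic RS(k, m) code. The parameters (3 replicas, k = 4 data blocks, m = 2 parity blocks) are assumptions chosen for illustration and are not taken from the paper; the helper names are hypothetical.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Bytes stored per byte of data when keeping `copies` full replicas."""
    return Fraction(copies)


def rs_overhead(k: int, m: int) -> Fraction:
    """Bytes stored per byte of data for an RS code with k data and m parity blocks."""
    return Fraction(k + m, k)


# Assumed configuration: both schemes tolerate the loss of any 2 blocks.
# 3 replicas survive 2 lost copies; RS(4, 2) survives 2 lost blocks out of 6.
rep = replication_overhead(3)   # 3 bytes stored per data byte
rs = rs_overhead(4, 2)          # 3/2 bytes stored per data byte
saving_factor = rep / rs        # 2

print(f"replication: {float(rep):.1f}x, RS(4,2): {float(rs):.1f}x, "
      f"saving factor: {float(saving_factor):.1f}")
```

Under these assumed parameters the storage saving happens to match the factor of 2 reported in the abstract, but the configuration actually used by the authors may differ.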

    The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism

    We present scalable hybrid-parallel algorithms for training large-scale 3D convolutional neural networks. Emerging scientific workflows based on deep learning often require model training with large, high-dimensional samples, which can make training much more costly and even infeasible due to excessive memory usage. We solve these challenges by extensively applying hybrid parallelism throughout the end-to-end training pipeline, including both computations and I/O. Our hybrid-parallel algorithm extends the standard data parallelism with spatial parallelism, which partitions a single sample in the spatial domain, realizing strong scaling beyond the mini-batch dimension with a larger aggregated memory capacity. We evaluate our proposed training algorithms with two challenging 3D CNNs, CosmoFlow and 3D U-Net. Our comprehensive performance studies show that good weak and strong scaling can be achieved for both networks using up to 2K GPUs. More importantly, we enable training of CosmoFlow with much larger samples than previously possible, realizing an order-of-magnitude improvement in prediction accuracy.
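Spatial parallelism, as described in this abstract, splits a single sample across devices along a spatial axis. The sketch below shows one plausible way to compute per-device slice bounds along one axis, with a halo of neighboring voxels that a convolution would need at shard boundaries. The function name, the 512-voxel depth, the 4 shards and the halo width of 1 are all assumptions for illustration, not the authors' implementation.

```python
def shard_bounds(depth: int, num_shards: int, halo: int) -> list[tuple[int, int]]:
    """Return (start, stop) index ranges along one axis, one per shard.

    Each range covers the shard's own slab plus `halo` extra voxels on each
    side, clamped to the sample boundary.
    """
    base, extra = divmod(depth, num_shards)
    bounds = []
    start = 0
    for rank in range(num_shards):
        # Spread any remainder over the first `extra` shards.
        size = base + (1 if rank < extra else 0)
        stop = start + size
        bounds.append((max(0, start - halo), min(depth, stop + halo)))
        start = stop
    return bounds


# Assumed example: a sample 512 voxels deep, split over 4 devices,
# with a halo of 1 voxel (enough for a 3-wide convolution kernel).
bounds = shard_bounds(512, 4, 1)
print(bounds)  # [(0, 129), (127, 257), (255, 385), (383, 512)]
```

Each device then holds roughly a quarter of the sample, which is how partitioning in the spatial domain increases the aggregate memory available to one sample.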

    REMODEL: Rethinking Deep CNN Models to Detect and Count on a NeuroSynaptic System

    In this work, we analyze the detection and counting of cars using a low-power IBM TrueNorth Neurosynaptic System. For our evaluation we used a publicly available dataset of overhead imagery of cars with context present in the image. The trained neural network for image analysis was deployed on the NS16e system using IBM's EEDN training framework. Through multiple experiments we identify the architectural bottlenecks in the TrueNorth system that prevent us from deploying large neural network structures. Following these experiments, we propose changes to the CNN model to circumvent these architectural bottlenecks. The results of these evaluations have been compared with Caffe-based implementations of standard neural networks deployed on a Titan-X GPU. Results showed that TrueNorth can detect cars from the dataset with 97.60% accuracy and can count the number of cars in the image with 69.04% accuracy. The car detection accuracy and car count (±2 error margin) accuracy are comparable to high-precision neural networks like AlexNet, GoogLeNet, and ResCeption, but show a manifold improvement in power consumption.

    Circadian Rhythms Fluctuate the Treatment Effects of Intravesical Treatments on Rat Urinary Frequency Models

    Objectives. It is still not clear how the intravesical instillation of drugs affects rat urinary frequency. This study aimed to examine the dynamics of the effect of intravesical treatments on rat urinary frequency models by real-time and extended monitoring using a novel continuous urination monitoring system. Methods. Nine eleven-week-old female Wistar rats were divided into three groups to receive intravesical instillation of 0.1% acetic acid (AA), 1.0% AA, or phosphate-buffered saline (PBS). Thirty minutes later, these drugs were voided, and rats were moved to a continuous urination monitoring system, UM-100. UM-100 monitored rat urination quantitatively and continuously for 24 hours. Rats were then euthanized, and histopathologic examinations using a damage score validated the severity of bladder inflammation. We used nine additional rats to determine the treatment effect of various drugs against the urinary frequency. These rats were also treated with 1.0% AA in the same way and divided into three groups (n = 3 each) to receive intravesical instillation of lidocaine, silver nitrate (AgNO3), or dimethyl sulfoxide (DMSO), respectively. Thirty minutes later, rats were catheterized again and moved to the UM-100, and their voiding was monitored for 24 hours. Results. Intravesical instillation of AA increased the urinary frequency and decreased the mean voided volume (VV) in a concentration-dependent manner, with statistical significance at a concentration of 1.0% compared with PBS (urinary frequency: p = 0.0007; mean VV: p = 0.0032). Histopathological analysis of these models demonstrated a significantly higher damage score of bladder mucosa in both 0.1% AA and 1.0% AA compared with PBS, with the severity in concordance with the clinical severity of urinary frequency (0.1% AA: p < 0.0001; 1.0% AA: p < 0.0001). Moreover, intravesical instillation of lidocaine, AgNO3, and DMSO decreased the urinary frequency. Continuous monitoring with UM-100 also demonstrated that the treatment effect of these intravesically instilled drugs occurred only at night. Conclusions. The extended monitoring of rat urination by UM-100 revealed a significant fluctuation in the treatment effect of intravesically instilled drugs between day and night. These findings may help establish novel therapies for urinary frequency.

    Comparison of EGFR Expression in Peritoneal and Pleural Malignant Mesothelioma

    An evaluation of epidermal growth factor receptor (EGFR) phenotypic expression in malignant pleural and peritoneal mesothelioma was undertaken, using immunohistochemical (IHC) and fluorescence in situ hybridization (FISH) analysis. Thirty-eight malignant mesothelioma (MM) specimens were subjected to IHC staining and FISH to evaluate the expression of EGFR protein and gene status. Overall, a positive IHC reaction was detected in 20/38 (53%) cases: in 11/22 (50%) pleural MM and in 9/16 (56%) peritoneal MM. Our study confirmed that EGFR membranous expression is a common feature in MM, but not in benign mesothelial lesions. Thirty-seven cases did not show a gene copy number gain. Only one case showed a copy number gain. The protein overexpression of EGFR was not related to a gene copy number gain. Doctor of Medicine, Otsu No. 1299, May 28, 2012 (Heisei 24). © 2012 The Authors. Pathology International © 2012 Japanese Society of Pathology and Blackwell Publishing Asia Pty Ltd