
    Linpack evaluation on a supercomputer with heterogeneous accelerators

    Abstract: We report Linpack benchmark results on the TSUBAME supercomputer, a large-scale heterogeneous system equipped with NVIDIA Tesla GPUs and ClearSpeed SIMD accelerators. Using all 10,480 Opteron cores, 640 Xeon cores, 648 ClearSpeed accelerators and 624 NVIDIA Tesla GPUs, we achieved 87.01 TFlops, which is the third record as a heterogeneous system in the world. This paper describes the careful tuning and load-balancing method required to achieve this performance. On the other hand, since the peak speed is 163 TFlops, the efficiency is 53%, which is lower than that of other systems. This paper also analyses this gap from the perspective of system architecture.

    Geochemical Clogging in Fracture and Porous Rock for CO2 Mineral Trapping

    Abstract: Geochemical trapping is regarded as one of the promising approaches to geologic sequestration of carbon dioxide (CO2). Carbonate mineralization also exploits permeability reduction to seal formations, decreasing the risk of CO2 leakage and increasing storage safety. Because precipitation rates tend to be faster and the solubility product is lower at higher temperatures, the calcite- and kaolinite-rich rock produced through CO2-water-rock interaction is expected to form scale in geothermal reservoirs. Ca2+ released from rocks could be removed as carbonate minerals (CaCO3) during CO2 sequestration into aquifer rocks. However, when, where, and how much calcite deposits in the reservoir remain unclear. For this reason, flow experiments and numerical calculations with an advection-reaction model were carried out to predict where and when the mineral deposits and the permeability changes. The experimental and numerical results showed that the change in fluid velocity differs by more than one order of magnitude between fracture and porous media under isothermal conditions. When the fluid velocity in a fracture exceeds the critical velocity, surface erosion allows re-entrainment. The critical velocity in porous media is likely to be larger than that in a fracture because internal erosion might interrupt the migration of deposits through re-settlement in pore spaces.

    Efficient high-precision integer multiplication on the GPU

    Dieguez AP, Amor M, Doallo R, Nukada A, Matsuoka S. Efficient high precision integer multiplication on the GPU. The International Journal of High Performance Computing Applications. 2022;36(3):356-369. © The Author(s) 2022. Publisher: SAGE Publications. https://doi.org/10.1177/10943420221077964

    [Abstract]: The multiplication of large integers, which has many applications in computer science, is an operation that can be expressed as a polynomial multiplication followed by a carry normalization. This work develops two approaches for efficient polynomial multiplication. The first is based on tiling the classical convolution algorithm while taking advantage of new CUDA architectures, a novel approach that computes the multiplication using integers without loss of accuracy. The second is based on the Strassen algorithm, which multiplies large polynomials using the FFT operation, adapted to the fastest FFT libraries for current GPUs and working on the complex field. Previous studies reported that the Strassen algorithm is an effective implementation for “large enough” integers on GPUs. Additionally, most previous studies do not examine the implementation of the carry normalization, whereas this work describes a parallel implementation of this operation. Our results show the efficiency of our approaches for short, medium, and large sizes.

    The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work has been supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00), by the Galician Government and FEDER funds under the Consolidation Program of Competitive Reference Groups (UDC/GI-000265, ref. ED431C 2021/30), by the Consolidation Program of Competitive Research Units (ED431G2019/01), and by the FPU Program of the Ministry of Education of Spain (FPU14/02801). It is also partially supported by JST CREST [JPMJCR1303 and JPMJCR1687] and NVIDIA GPU Center of Excellence and conducted as research activities of AIST-TokyoTech Real World Big-Data Computation Open Innovation Laboratory (RWBC-OIL). Xunta de Galicia; ED431C 2021/3
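
    The decomposition named in this abstract, a polynomial multiplication followed by a carry normalization, can be illustrated with a minimal sequential sketch. This is not the authors' GPU implementation: it uses the untiled classical convolution on the CPU, and the function names and the limb base of 10^4 are assumptions chosen only for illustration.

```python
# Illustrative sketch only: sequential, CPU-side, classical convolution.
# All names and the limb base are hypothetical, not taken from the paper.

def to_limbs(n, base):
    """Split a non-negative integer into little-endian digits (limbs)."""
    limbs = []
    while n:
        n, r = divmod(n, base)
        limbs.append(r)
    return limbs or [0]


def convolve(a, b):
    """Classical polynomial multiplication: out[k] = sum of a[i] * b[j] with i + j = k."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def normalize_carries(coeffs, base):
    """Propagate carries so that every coefficient becomes a digit below the base."""
    digits = []
    carry = 0
    for c in coeffs:
        carry, digit = divmod(c + carry, base)
        digits.append(digit)
    while carry:
        carry, digit = divmod(carry, base)
        digits.append(digit)
    return digits


def from_limbs(limbs, base):
    """Rebuild the integer from little-endian limbs."""
    value = 0
    for d in reversed(limbs):
        value = value * base + d
    return value


def multiply(x, y, base=10**4):
    """Multiply two non-negative integers via convolution plus carry normalization."""
    product = convolve(to_limbs(x, base), to_limbs(y, base))
    return from_limbs(normalize_carries(product, base), base)
```

    Calling `multiply(x, y)` on two non-negative integers should agree with Python's built-in `x * y`. The convolution step leaves coefficients that may exceed the base, which is why the separate normalization pass is needed.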

    Long-term outcomes of SN identification

    Background: Sentinel node (SN) biopsy is used in the management of numerous cancers to avoid unnecessary lymphadenectomy. This was a clinical exploration/feasibility study of a novel identification technique for SN biopsy using indocyanine green (ICG) fluorescence imaging during lung cancer surgery. Methods: SN biopsy using ICG was performed on 22 patients who had cT1 or T2N0M0 lung cancer. ICG was injected just around the primary tumor. The fluorescence imaging system enabled visualization of the lymphatic vessels draining from the primary tumor toward the lymph nodes. Fluorescently labeled nodes were dissected, and patients were followed up for prognosis and recurrence to confirm the pattern of lymph node metastasis after surgery. Results: SNs were successfully identified in 16 (72.7%) of 22 patients. A total of 13 of 16 patients had pathological N0 and three had SN metastasis. The median follow-up time was 92.7 months. Only one patient had no SN metastasis at the postoperative pathological examination but developed lymph node metastasis during the follow-up period. The accuracy rate was 93.8% (15/16) and the false-negative rate was 7.7% (1/13). Conclusions: SNs were identified by ICG fluorescence imaging, and this technique had good identification and accuracy rates during lung cancer surgery throughout the follow-up period.
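
    The percentages in the Results can be rechecked from the counts the abstract prints. The short sketch below only reproduces the stated fractions (16/22, 15/16 and 1/13); the variable names are illustrative, and the denominators are taken as given in the abstract rather than derived from any separate definition.

```python
# Counts as stated in the abstract; variable names are hypothetical.
enrolled = 22          # patients who underwent SN biopsy with ICG
identified = 16        # patients in whom an SN was identified
pathological_n0 = 13   # identified patients with pathological N0
false_negatives = 1    # SN-negative patient with nodal metastasis at follow-up

identification_rate = identified / enrolled                # 16/22
accuracy = (identified - false_negatives) / identified     # 15/16
false_negative_rate = false_negatives / pathological_n0    # 1/13, as printed

for label, value in [
    ("identification rate", identification_rate),
    ("accuracy rate", accuracy),
    ("false-negative rate", false_negative_rate),
]:
    print(f"{label}: {value:.1%}")
```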