304 research outputs found

    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)

    Get PDF
    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016) Timisoara, Romania. February 8-11, 2016.The PhD Symposium was a very good opportunity for the young researchers to share information and knowledge, to present their current research, and to discuss topics with other students in order to look for synergies and common research topics. The idea was very successful and the assessment made by the PhD Student was very good. It also helped to achieve one of the major goals of the NESUS Action: to establish an open European research network targeting sustainable solutions for ultrascale computing aiming at cross fertilization among HPC, large scale distributed systems, and big data management, training, contributing to glue disparate researchers working across different areas and provide a meeting ground for researchers in these separate areas to exchange ideas, to identify synergies, and to pursue common activities in research topics such as sustainable software solutions (applications and system software stack), data management, energy efficiency, and resilience.European Cooperation in Science and Technology. COS

    Sensing and Signal Processing in Smart Healthcare

    Get PDF
    In the last decade, we have witnessed the rapid development of electronic technologies that are transforming our daily lives. Such technologies are often integrated with various sensors that facilitate the collection of human motion and physiological data and are equipped with wireless communication modules such as Bluetooth, radio frequency identification, and near-field communication. In smart healthcare applications, designing ergonomic and intuitive human–computer interfaces is crucial because a system that is not easy to use will create a huge obstacle to adoption and may significantly reduce the efficacy of the solution. Signal and data processing is another important consideration in smart healthcare applications because it must ensure high accuracy with a high level of confidence in order for the applications to be useful for clinicians in making diagnosis and treatment decisions. This Special Issue is a collection of 10 articles selected from a total of 26 contributions. These contributions span the areas of signal processing and smart healthcare systems mostly contributed by authors from Europe, including Italy, Spain, France, Portugal, Romania, Sweden, and Netherlands. Authors from China, Korea, Taiwan, Indonesia, and Ecuador are also included

    Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams

    Get PDF
    The paper investigates parallel data processing in a hybrid CPU+GPU(s) system using multiple CUDA streams for overlapping communication and computations. This is crucial for efficient processing of data, in particular incoming data stream processing that would naturally be forwarded using multiple CUDA streams to GPUs. Performance is evaluated for various compute time to host-device communication time ratios, numbers of CUDA streams, for various numbers of threads managing computations on GPUs. Tests also reveal benefits of using CUDA MPS for overlapping communication and computations when using multiple processes. Furthermore, using standard memory allocation on a GPU and Unified Memory versions are compared, the latter including programmer added prefetching. Performance of a hybrid CPU+GPU version as well as scaling across multiple GPUs are demonstrated showing good speed-ups of the approach. Finally, the performance per power consumption of selected configurations are presented for various numbers of streams and various relative performances of GPUs and CPUs

    Optimization of atmospheric transport models on HPC platforms

    Get PDF
    The performance and scalability of atmospheric transport models on high performance computing environments is often far from optimal for multiple reasons including, for example, sequential input and output, synchronous communications, work unbalance, memory access latency or lack of task overlapping. We investigate how different software optimizations and porting to non general-purpose hardware architectures improve code scalability and execution times considering, as an example, the FALL3D volcanic ash transport model. To this purpose, we implement the FALL3D model equations in the WARIS framework, a software designed from scratch to solve in a parallel and efficient way different geoscience problems on a wide variety of architectures. In addition, we consider further improvements in WARIS such as hybrid MPI-OMP parallelization, spatial blocking, auto-tuning and thread affinity. Considering all these aspects together, the FALL3D execution times for a realistic test case running on general-purpose cluster architectures (Intel Sandy Bridge) decrease by a factor between 7 and 40 depending on the grid resolution. Finally, we port the application to Intel Xeon Phi (MIC) and NVIDIA GPUs (CUDA) accelerator-based architectures and compare performance, cost and power consumption on all the architectures. Implications on time-constrained operational model configurations are discussed.We thank M.S. Osores from the Argentinean National Scientific and Technical Research Council (CONICET) for providing hourly column heights for the CordĂłn Caulle eruption simulation and the constructive comments from two anonymous reviewers. This work was supported by NVIDIA through the UPC/BSC GPU Center of Excellence, and the Spanish Ministry of Science and Technology through the TIN2012-34557 project. Finally, we dedicate this work to our colleague and co-author Nacho Navarro, who sadly passed away during the reviewing process.Peer ReviewedPostprint (author's final draft

    Integrated Development and Parallelization of Automated Dicentric Chromosome Identification Software to Expedite Biodosimetry Analysis

    Get PDF
    Manual cytogenetic biodosimetry lacks the ability to handle mass casualty events. We present an automated dicentric chromosome identification (ADCI) software utilizing parallel computing technology. A parallelization strategy combining data and task parallelism, as well as optimization of I/O operations, has been designed, implemented, and incorporated in ADCI. Experiments on an eight-core desktop show that our algorithm can expedite the process of ADCI by at least four folds. Experiments on Symmetric Computing, SHARCNET, Blue Gene/Q multi-processor computers demonstrate the capability of parallelized ADCI to process thousands of samples for cytogenetic biodosimetry in a few hours. This increase in speed underscores the effectiveness of parallelization in accelerating ADCI. Our software will be an important tool to handle the magnitude of mass casualty ionizing radiation events by expediting accurate detection of dicentric chromosomes

    Supercomputing Frontiers

    Get PDF
    This open access book constitutes the refereed proceedings of the 7th Asian Conference Supercomputing Conference, SCFA 2022, which took place in Singapore in March 2022. The 8 full papers presented in this book were carefully reviewed and selected from 21 submissions. They cover a range of topics including file systems, memory hierarchy, HPC cloud platform, container image configuration workflow, large-scale applications, and scheduling
    • 

    corecore