76 research outputs found

    Characterization and Acceleration of High Performance Compute Workloads

    PMT: Power Measurement Toolkit

    Efficient use of energy is essential for today's supercomputing systems, as energy cost is generally a major component of their operational cost. Research into "green computing" is needed to reduce the environmental impact of running these systems. As such, several scientific communities are evaluating the trade-off between time-to-solution and energy-to-solution. While the runtime of an application is typically easy to measure, power consumption is not. Therefore, we present the Power Measurement Toolkit (PMT), a high-level software library capable of collecting power consumption measurements on various hardware. The library provides a standard interface to easily measure the energy use of devices such as CPUs and GPUs in critical application sections
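
As a rough illustration of the energy-to-solution metric mentioned in this abstract, the sketch below integrates sampled power readings over time with the trapezoidal rule. It does not use or reproduce the PMT interface; the function name `energy_joules` and the sample values are assumptions made purely for illustration.

```python
def energy_joules(timestamps_s, power_w):
    """Approximate energy (J) from power samples (W) taken at given times (s).

    Hypothetical helper, not part of PMT: applies the trapezoidal rule to
    consecutive (time, power) samples.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    energy = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Area of the trapezoid between two neighbouring samples.
        energy += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy


if __name__ == "__main__":
    # Made-up samples: a device drawing a constant 100 W for 2 seconds.
    print(energy_joules([0.0, 1.0, 2.0], [100.0, 100.0, 100.0]))
```

Dividing the result by the elapsed time gives the average power, which is one way to relate energy-to-solution to time-to-solution for a code section.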

    Memory and Parallelism Analysis Using a Platform-Independent Approach

    Emerging computing architectures such as near-memory computing (NMC) promise improved performance for applications by reducing the data movement between CPU and memory. However, detecting such applications is not a trivial task. In this ongoing work, we extend a state-of-the-art platform-independent software analysis tool with NMC-related metrics such as memory entropy, spatial locality, data-level parallelism, and basic-block-level parallelism. These metrics help to identify the applications that are better suited to NMC architectures. Comment: 22nd ACM International Workshop on Software and Compilers for Embedded Systems (SCOPES '19), May 201
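
One of the metrics named in this abstract, memory entropy, can be sketched as the Shannon entropy of the distribution of accessed addresses. This is a minimal sketch under that assumption and is not the analysis tool's actual implementation; the function name `memory_entropy` and the `granularity_bits` parameter are hypothetical.

```python
import math
from collections import Counter


def memory_entropy(addresses, granularity_bits=0):
    """Shannon entropy (in bits) of an address trace.

    Hypothetical helper: addresses are grouped by dropping the lowest
    `granularity_bits` bits, so larger values measure entropy at a coarser
    granularity (for example, per cache line instead of per byte).
    """
    counts = Counter(addr >> granularity_bits for addr in addresses)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over every distinct (grouped) address.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    trace = [0, 64, 128, 192]  # made-up trace touching four distinct addresses
    print(memory_entropy(trace))                      # uniform over 4 addresses
    print(memory_entropy(trace, granularity_bits=8))  # all fall in one 256-byte block
```

A trace that keeps hitting the same few addresses yields low entropy, while accesses spread evenly over many addresses yield high entropy, which suggests poorer cache behaviour.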

    Near Memory Acceleration on High Resolution Radio Astronomy Imaging

    Modern radio telescopes like the Square Kilometer Array (SKA) will need to process exabytes of radio-astronomical signals in real time to construct a high-resolution map of the sky. Near-Memory Computing (NMC) could alleviate the performance bottlenecks due to frequent memory accesses in a state-of-the-art radio-astronomy imaging algorithm. In this paper, we show that a sub-module performing a two-dimensional fast Fourier transform (2D FFT) is memory bound using CPI breakdown analysis on IBM Power9. Then, we present an NMC approach on FPGA for 2D FFT that outperforms a CPU by up to a factor of 120x and performs comparably to a high-end GPU, while using less bandwidth and memory

    NMPO: Near-Memory Computing Profiling and Offloading

    Real-world applications now process big data sets and are often bottlenecked by the data movement between the compute units and the main memory. Near-memory computing (NMC), a modern data-centric computational paradigm, can alleviate these bottlenecks, thereby improving the performance of applications. The lack of NMC system availability makes simulators the primary evaluation tool for performance estimation. However, simulators are usually time-consuming, and methods that can reduce this overhead would accelerate the early-stage design process of NMC systems. This work proposes Near-Memory computing Profiling and Offloading (NMPO), a high-level framework capable of predicting NMC offloading suitability employing an ensemble machine learning model. NMPO predicts NMC suitability with an accuracy of 85.6% and, compared to prior works, can reduce the prediction time by up to three orders of magnitude by using hardware-dependent application features

    Quantum Radio Astronomy: Data Encodings and Quantum Image Processing

    We explore applications of quantum computing for radio interferometry and astronomy using recent developments in quantum image processing. We evaluate the suitability of different quantum image representations using a toy quantum computing image reconstruction pipeline, and compare its performance to the classical computing counterpart. For identifying and locating bright radio sources, quantum computing can offer an exponential speedup over classical algorithms, even when accounting for data encoding cost and repeated circuit evaluations. We also propose a novel variational quantum computing algorithm for self-calibration of interferometer visibilities, and discuss future developments and research that would be necessary to make quantum computing for radio astronomy a reality. Comment: 10 pages, 8 figure

    TDO-CIM: Transparent Detection and Offloading for Computation In-memory

    Computation in-memory is a promising non-von Neumann approach that aims to eliminate data transfer to and from the memory subsystem. Although many architectures have been proposed, compiler support for such architectures is still lagging behind. In this paper, we close this gap by proposing an end-to-end compilation flow for in-memory computing based on the LLVM compiler infrastructure. Starting from sequential code, our approach automatically detects, optimizes, and offloads kernels suitable for in-memory acceleration. We demonstrate our compiler tool-flow on the PolyBench/C benchmark suite and evaluate the benefits of our proposed in-memory architecture simulated in Gem5 by comparing it with a state-of-the-art von Neumann architecture. Comment: Full version of DATE2020 publicatio

    On the Effect of Complex Permeability and Thermal Material Properties for 3D-CFD Simulation of PEM Fuel Cells

    Fuel cells are considered a key technology to decarbonize the power generation sector, thanks to the absence of pollutant emissions related to the direct chemical-electric energy conversion, their high global efficiency, and the possibility of on-board electricity production, overcoming the storage limits of batteries. An example of the renewed interest in fuel cells is the research on Proton Exchange Membrane Fuel Cells (PEMFC) in the automotive sector, as a candidate alternative to fossil-fuel-fed internal combustion engines (ICEs). The complex interplay of electrochemical and physical phenomena occurring in PEMFC makes their understanding and optimization a challenging task. This is a field of active research thanks to the development of advanced CAE tools, e.g., 3D-CFD simulations of non-isothermal reactive flows, in which all the relevant physics is numerically solved, making it possible to identify governing mechanisms as well as system bottlenecks. Among the multiple complex aspects, the material property characterization of PEMFC components is one of the major modelling challenges for modern CAE tools. This is usually provided as a set of boundary conditions for the numerical model and has a large impact on the simulated results, often because material characteristics are oversimplified. Examples of commonly overlooked aspects are direction-independent thermal/flow properties for fibrous materials, the neglect of the deformed (compressed) state, and the simplified contact approach. All of these might alter the key parameters (e.g., water management) and mislead designers' conclusions on PEMFC optimization. In this paper, three-dimensional CFD simulations are used to weigh the impact of orthotropic diffusion layer properties on both flow distribution and heat transfer.
    In the first part, a simplified test case from the literature is created and used to investigate the flow convection/diffusion balance in the gas diffusion layer considering the orthotropic permeability typical of pressed fibrous layers. Differences with respect to the still widely used isotropic permeability will be assessed, and implications for channel bypass and mass transport to the catalyst layer will be provided. In the second part, the analysis moves to the use of orthotropic thermal conductivity for the fibrous gas diffusion layers, which is another commonly disregarded aspect despite being well documented in the literature. Heat transfer routes between parts of different heat capacity (membrane, diffusion layers, solid plates) and the thermal field for all the components will be critically analyzed. Finally, thermal contact resistance between adjacent pressed materials will be applied. The altered thermal pathways for heat removal will be critically analyzed, as well as the differences in temperature distribution and their implications for electricity production and water management. This hierarchical flow/thermal analysis will provide guidelines for more accurate 3D-CFD models for a deeper understanding of flow and heat dynamics in PEMFC