160 research outputs found
Data Parallel C++
Learn how to accelerate C++ programs using data parallelism. This open access book enables C++ programmers to be at the forefront of this exciting and important new development that is helping to push computing to new levels. It is full of practical advice, detailed explanations, and code examples to illustrate key topics. Data parallelism in C++ enables access to parallel resources in a modern heterogeneous system, freeing you from being locked into any particular computing device. Now a single C++ application can use any combination of devices—including GPUs, CPUs, FPGAs and AI ASICs—that are suitable to the problems at hand. This book begins by introducing data parallelism and foundational topics for effective use of the SYCL standard from the Khronos Group and Data Parallel C++ (DPC++), the open source compiler used in this book. Later chapters cover advanced topics including error handling, hardware-specific programming, communication and synchronization, and memory model considerations. Data Parallel C++ provides you with everything needed to use SYCL for programming heterogeneous systems. What You'll Learn Accelerate C++ programs using data-parallel programming Target multiple device types (e.g. CPU, GPU, FPGA) Use SYCL and SYCL compilers Connect with computing’s heterogeneous future via Intel’s oneAPI initiative Who This Book Is For Those new data-parallel programming and computer programmers interested in data-parallel programming using C++
A Performance-Portable SYCL Implementation of CRK-HACC for Exascale
The first generation of exascale systems will include a variety of machine
architectures, featuring GPUs from multiple vendors. As a result, many
developers are interested in adopting portable programming models to avoid
maintaining multiple versions of their code. It is necessary to document
experiences with such programming models to assist developers in understanding
the advantages and disadvantages of different approaches.
To this end, this paper evaluates the performance portability of a SYCL
implementation of a large-scale cosmology application (CRK-HACC) running on
GPUs from three different vendors: AMD, Intel, and NVIDIA. We detail the
process of migrating the original code from CUDA to SYCL and show that
specializing kernels for specific targets can greatly improve performance
portability without significantly impacting programmer productivity. The SYCL
version of CRK-HACC achieves a performance portability of 0.96 with a code
divergence of almost 0, demonstrating that SYCL is a viable programming model
for performance-portable applications.Comment: 12 pages, 13 figures, 2023 International Workshop on Performance,
Portability & Productivity in HP
Data Parallel C++
Learn how to accelerate C++ programs using data parallelism. This open access book enables C++ programmers to be at the forefront of this exciting and important new development that is helping to push computing to new levels. It is full of practical advice, detailed explanations, and code examples to illustrate key topics. Data parallelism in C++ enables access to parallel resources in a modern heterogeneous system, freeing you from being locked into any particular computing device. Now a single C++ application can use any combination of devices—including GPUs, CPUs, FPGAs and AI ASICs—that are suitable to the problems at hand. This book begins by introducing data parallelism and foundational topics for effective use of the SYCL standard from the Khronos Group and Data Parallel C++ (DPC++), the open source compiler used in this book. Later chapters cover advanced topics including error handling, hardware-specific programming, communication and synchronization, and memory model considerations. Data Parallel C++ provides you with everything needed to use SYCL for programming heterogeneous systems. What You'll Learn Accelerate C++ programs using data-parallel programming Target multiple device types (e.g. CPU, GPU, FPGA) Use SYCL and SYCL compilers Connect with computing’s heterogeneous future via Intel’s oneAPI initiative Who This Book Is For Those new data-parallel programming and computer programmers interested in data-parallel programming using C++
A multiscale generative model to understand disorder in domain boundaries
A continuing challenge in atomic resolution microscopy is to identify
significant structural motifs and their assembly rules in synthesized materials
with limited observations. Here we propose and validate a simple and effective
hybrid generative model capable of predicting unseen domain boundaries in a
potassium sodium niobate thin film from only a small number of observations,
without expensive first-principles calculation. Our results demonstrate that
complicated domain boundary structures can arise from simple interpretable
local rules, played out probabilistically. We also found new significant
tileable boundary motifs and evidence that our system creates domain boundaries
with the highest entropy. More broadly, our work shows that simple yet
interpretable machine learning models can help us describe and understand the
nature and origin of disorder in complex materials
- …