Updates on the Low-Level Abstraction of Memory Access
Choosing the best memory layout for each hardware architecture is
increasingly important as more and more programs become memory bound. For
portable codes that run across heterogeneous hardware architectures, the choice
of the memory layout for data structures is ideally decoupled from the rest of
a program. The low-level abstraction of memory access (LLAMA) is a C++ library
that provides a zero-runtime-overhead abstraction layer, underneath which
memory mappings can be freely exchanged to customize data layouts, memory
access and access instrumentation, focusing on multidimensional arrays of
nested, structured data.
After its scientific debut, several improvements and extensions have been
added to LLAMA. These include compile-time array extents for
zero-memory-overhead views, support for computations during memory access, new
mappings for bit-packing, switching types, byte-splitting, memory access
instrumentation, and explicit SIMD support. This contribution provides an
overview of recent developments in the LLAMA library.
Challenges and opportunities integrating LLAMA into AdePT
Particle transport simulations are a cornerstone of high-energy physics
(HEP), constituting a substantial part of the computing workload performed in
HEP. To boost simulation throughput and energy efficiency, GPUs have been
explored as accelerators in recent years, a trend further driven by the
increasing use of GPUs in HPC systems. The Accelerated demonstrator of electromagnetic
Particle Transport (AdePT) is an advanced prototype for offloading the
simulation of electromagnetic showers in Geant4 to GPUs, and still undergoes
continuous development and optimization. Improving memory layout and data
access is vital to use modern, massively parallel GPU hardware efficiently,
contributing to the challenge of migrating traditional CPU-based data
structures to GPUs in AdePT. The low-level abstraction of memory access (LLAMA)
is a C++ library that provides a zero-runtime-overhead data structure
abstraction layer, focusing on multidimensional arrays of nested, structured
data. It provides a framework for defining and switching custom memory mappings
at compile time to define data layouts and instrument data access, making LLAMA
an ideal tool to tackle the memory-related optimization challenges in AdePT.
Our contribution shares insights gained with LLAMA when instrumenting data
access inside AdePT, complementing traditional GPU profiler outputs. We
demonstrate traces of read/write counts to data structure elements as well as
memory heatmaps. The acquired knowledge allowed for subsequent data layout
optimizations.
ROOT for the HL-LHC: data format
This document discusses the state, roadmap, and risks of the foundational
components of ROOT with respect to the experiments at the HL-LHC (Run 4 and
beyond). As foundational components, the document considers in particular the
ROOT input/output (I/O) subsystem. The current HEP I/O is based on the TFile
container file format and the TTree binary event data format. The work going
into the new RNTuple event data format aims at superseding TTree, to make
RNTuple the production ROOT event data I/O that meets the requirements of Run 4
and beyond
Identification of regulatory variants associated with genetic susceptibility to meningococcal disease.
Non-coding genetic variants play an important role in driving susceptibility to complex diseases, but their characterization remains challenging. Here, we employed a novel approach to interrogate the genetic risk of such polymorphisms in a more systematic way by targeting specific regulatory regions relevant to the phenotype studied. We applied this method to meningococcal disease susceptibility, using the DNA binding pattern of RELA - an NF-kB subunit and master regulator of the response to infection - under bacterial stimuli in nasopharyngeal epithelial cells. We designed a custom panel to cover these RELA binding sites and used it for targeted sequencing in cases and controls. Variant calling and association analysis were performed, followed by validation of candidate polymorphisms by genotyping in three independent cohorts. We identified two new polymorphisms, rs4823231 and rs11913168, showing signs of association with meningococcal disease susceptibility. In addition, using our genomic data as well as publicly available resources, we found evidence that these SNPs have potential regulatory effects on the ATXN10 and LIF genes, respectively. The variants and related candidate genes are relevant for infectious diseases and may make an important contribution to meningococcal disease pathology. Finally, we described a novel genetic association approach that could be applied to other phenotypes.
LLAMA: The Low-Level Abstraction For Memory Access
The performance gap between CPU and memory widens continuously. Choosing the
best memory layout for each hardware architecture is increasingly important as
more and more programs become memory bound. For portable codes that run across
heterogeneous hardware architectures, the choice of the memory layout for data
structures is ideally decoupled from the rest of a program. This can be
accomplished via a zero-runtime-overhead abstraction layer, underneath which
memory layouts can be freely exchanged.
We present the Low-Level Abstraction of Memory Access (LLAMA), a C++ library
that provides such a data structure abstraction layer with example
implementations for multidimensional arrays of nested, structured data. LLAMA
provides fully C++ compliant methods for defining and switching custom memory
layouts for user-defined data types. The library is extensible with third-party
allocators.
Providing two close-to-life examples, we show that the LLAMA-generated AoS
(Array of Structs) and SoA (Struct of Arrays) layouts produce identical code
with the same performance characteristics as manually written data structures.
Integrations into the SPEC CPU® lbm benchmark
and the particle-in-cell simulation PIConGPU demonstrate LLAMA's abilities in
real-world applications. LLAMA's layout-aware copy routines can significantly
speed up transfer and reshuffling of data between layouts compared with naive
element-wise copying.
LLAMA provides a novel tool for the development of high-performance C++
applications in a heterogeneous environment.
Gradients of microstructure, stresses and mechanical properties in a multi-layered diamond thin film revealed by correlative cross-sectional nano-analytics
Thin diamond films deposited by chemical vapour deposition (CVD) usually feature cross-sectional gradients of microstructure, residual stress and mechanical properties, which decisively influence their functional properties. This work introduces a novel correlative cross-sectional nano-analytics approach, which is applied to a multi-layered CVD diamond film grown using microwave plasma-enhanced CVD and consisting of an ∼8 μm thick nanocrystalline diamond (NCD) base sublayer and a ∼14.5 μm thick polycrystalline diamond (PCD) top sublayer. Complementary cross-sectional 30 nm beam synchrotron X-ray diffraction, depth-resolved micro-cantilever and hardness testing, and electron microscopy analyses reveal correlations between microstructure, residual stress and mechanical properties. The NCD sublayer exhibits a 1.5 μm thick isotropic nucleation region with the highest stresses of ∼1.3 GPa and defect-rich nanocrystallites. With increasing sublayer thickness, a fibre texture evolves gradually, accompanied by an increase in crystallite size and a decrease in stress. At the NCD/PCD sublayer interface, texture, stresses and crystallite size change abruptly, and the PCD sublayer exhibits a Zone T competitive grain growth microstructure. The NCD and PCD sublayers differ in fracture stresses of ∼14 and ∼31 GPa, respectively, as well as in elastic moduli and hardness, which are correlated with their particular microstructures. In summary, the introduced nano-analytics approach reveals complex correlations between microstructure, stresses, functional properties and deposition conditions.
HL-LHC Analysis With ROOT
ROOT is high-energy physics' software for storing and mining data in a statistically sound way and for publishing results with scientific graphics. It has been evolving for 25 years and now provides the storage format for more than one exabyte of data; virtually all high-energy physics experiments use ROOT. With another significant increase in the amount of data to be handled scheduled to arrive in 2027, ROOT is preparing for a massive upgrade of its core ingredients. As part of a review of crucial software for high-energy physics, the ROOT team has documented its R&D plans for the coming years.
Software Training in HEP
The long-term sustainability of the high-energy physics (HEP) research software ecosystem is essential to the field. With new facilities and upgrades coming online throughout the 2020s, this will only become increasingly important. Meeting the sustainability challenge requires a workforce with a combination of HEP domain knowledge and advanced software skills. The required software skills fall into three broad groups. The first is fundamental and generic software engineering (e.g., Unix, version control, C++, and continuous integration). The second is knowledge of domain-specific HEP packages and practices (e.g., the ROOT data format and analysis framework). The third is more advanced knowledge involving specialized techniques, including parallel programming, machine learning and data science tools, and techniques to maintain software projects at all scales. This paper discusses the collective software training program in HEP led by the HEP Software Foundation (HSF) and the Institute for Research and Innovation in Software in HEP (IRIS-HEP). The program equips participants with an array of software skills that serve as ingredients for the solution of HEP computing challenges. Beyond serving the community by ensuring that members are able to pursue research goals, the program serves individuals by providing intellectual capital and transferable skills important to careers in the realm of software and computing, inside or outside HEP
CAESAR: Space Robotics Technology for Assembly, Maintenance, and Repair
The Compliant Assistance and Exploration SpAce Robot (CAESAR) is DLR's consistent continuation of the development of force/torque-controlled robot systems. Its basis is DLR's world-famous lightweight robot technology (LWR III), which was successfully transferred to KUKA, one of the world's leading suppliers of robotics. CAESAR is the space-qualified equivalent of the current service robot systems for manufacturing and human-robot cooperation. It is designed for a variety of on-orbit services, e.g., assembly, maintenance, repair, and debris removal in LEO/GEO. The dexterity and diversity of CAESAR will push the performance of space robotics to the next level, much as today's intelligent, sensor-based service robots changed robotics on Earth.