
    Enabling Radiative Transfer on AMR grids in CRASH

    We introduce CRASH-AMR, a new version of the cosmological Radiative Transfer (RT) code CRASH, extended to use refined grids. This new feature allows us to attain higher resolution in our RT simulations and thus to describe more accurately ionisation and temperature patterns in high-density regions. We have tested CRASH-AMR by simulating the evolution of an ionised region produced by a single source embedded in gas at constant density, as well as by a more realistic configuration of multiple sources in an inhomogeneous density field. We find excellent agreement with the previous version of CRASH when the AMR feature is disabled, showing that no numerical artefacts have been introduced in CRASH-AMR; when additional refinement levels are used, the code simulates the physics of ionised gas in high-density regions more accurately. This result comes at no computational cost, as RT simulations on AMR grids whose maximum resolution matches that of a uniform Cartesian grid run with a gain of up to 60% in computational time. Comment: 19 pages, 17 figures. MNRAS, in press.
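
    The single-source test in constant-density gas corresponds to the classical Strömgren sphere, whose equilibrium radius follows from balancing the source's ionising photon rate against recombinations in the ionised volume. A minimal sketch of that analytic check (the source rate, gas density, and recombination coefficient below are illustrative assumptions, not the paper's test values):

```python
# Minimal Stroemgren-sphere estimate for a single source in uniform gas.
# Illustrative parameters only, not the CRASH-AMR test configuration.
import math

Q_H = 5e48          # ionising photon rate [photons/s] (assumed)
n_H = 1.0           # hydrogen number density [cm^-3] (assumed)
alpha_B = 2.59e-13  # case-B recombination coefficient at 10^4 K [cm^3/s]

# Equilibrium: Q_H = (4/3) * pi * R_S^3 * n_H^2 * alpha_B
R_S = (3.0 * Q_H / (4.0 * math.pi * n_H**2 * alpha_B)) ** (1.0 / 3.0)

cm_per_pc = 3.086e18
print(f"Stroemgren radius: {R_S / cm_per_pc:.1f} pc")  # ~54 pc here
```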

    Cluster-based communication and load balancing for simulations on dynamically adaptive grids

    The present paper introduces a new communication and load-balancing scheme based on a clustering of the grid, which we use for the efficient parallelization of simulations on dynamically adaptive grids. With a partitioning based on space-filling curves (SFCs), this yields several advantageous properties regarding memory requirements and load balancing. However, for such an SFC-based partitioning, additional connectivity information has to be stored and updated for dynamically changing grids. In this work, we present our approach of keeping this connectivity information run-length encoded (RLE) only for the interfaces shared between partitions. Using special properties of the underlying grid traversal and of the communication scheme, we update this connectivity information implicitly for dynamically changing grids and can represent it as a sparse communication graph: graph nodes (partitions) represent bulks of connected grid cells, and each graph edge (RLE connectivity information) a unique relation between adjacent partitions. This directly leads to an efficient shared-memory parallelization with graph nodes assigned to computing cores and an efficient en bloc data exchange via graph edges. We refer to such a partitioning approach with RLE meta information as a cluster-based domain decomposition and to each partition as a cluster. With the sparse communication graph in mind, we then extend the connectivity information represented by the graph edges with MPI ranks, yielding en bloc communication for distributed-memory systems and a hybrid parallelization. The stack-based intra-cluster communication allows a very low memory footprint during data migration, and the RLE leads to efficient updates of the connectivity information. Our benchmark is based on a shallow-water simulation on a dynamically adaptive grid. We conducted performance studies for MPI-only and hybrid parallelizations, yielding an efficiency of over 90% on 256 cores. Furthermore, we demonstrate the applicability of cluster-based optimizations on distributed-memory systems. We would like to thank the Munich Centre of Advanced Computing for funding this project by providing computing time on the MAC Cluster. This work was partly supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89).
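
    The core idea can be illustrated in a few lines: along one cluster's boundary traversal, the sequence of neighbouring partitions is compressed into (rank, count) runs, and each run becomes one en bloc message. The data layout below is hypothetical; it mirrors the RLE meta information described above rather than the actual implementation.

```python
# Sketch: run-length encode the neighbouring partition of each interface
# cell met along a cluster's boundary traversal (hypothetical layout).
from itertools import groupby

def rle_encode(neighbors):
    """Compress [2, 2, 2, 5, 5] into [(2, 3), (5, 2)]."""
    return [(rank, sum(1 for _ in run)) for rank, run in groupby(neighbors)]

def rle_decode(runs):
    return [rank for rank, count in runs for _ in range(count)]

# Neighbouring partition of each interface cell, in traversal order:
boundary = [2, 2, 2, 2, 5, 5, 5, 2, 2]
runs = rle_encode(boundary)
print(runs)                                # [(2, 4), (5, 3), (2, 2)]
assert rle_decode(runs) == boundary

# One en bloc exchange per run instead of one message per cell interface.
for rank, count in runs:
    print(f"exchange {count} edge records with partition {rank}")
```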

    Efficient cosmological parameter sampling using sparse grids

    We present a novel method to significantly speed up cosmological parameter sampling. The method relies on constructing an interpolation of the CMB log-likelihood based on sparse grids, which is used as a shortcut for the likelihood evaluation. We obtain excellent results over a large region in parameter space, spanning about 25 log-likelihoods around the peak, and we reproduce the one-dimensional projections of the likelihood almost perfectly. In speed and accuracy, our technique is competitive with existing approaches that accelerate parameter estimation based on polynomial interpolation or neural networks, while having some advantages over them. In our method, there is no danger of creating unphysical wiggles, as can be the case for polynomial fits of high degree. Furthermore, we do not require a long training time as for neural networks; the construction of the interpolation is determined by the time it takes to evaluate the likelihood at the sampling points, which can be parallelised to an arbitrary degree. Our approach is completely general, and it can adaptively exploit the properties of the underlying function. We can thus apply it to any problem where an accurate interpolation of a function is needed. Comment: Submitted to MNRAS, 13 pages, 13 figures.
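
    The pattern can be sketched as follows: construct the interpolant once, then let a Metropolis sampler query only the cheap surrogate. For brevity, the sketch below substitutes a dense-grid interpolator and a toy Gaussian likelihood for the paper's sparse-grid construction and CMB likelihood; only the surrogate-in-the-sampler pattern is the point.

```python
# Sketch: replace expensive log-likelihood calls with a precomputed
# interpolant. Dense grid and toy likelihood are illustrative stand-ins.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def expensive_loglike(theta):            # toy stand-in for a CMB likelihood
    return -0.5 * np.sum((theta - 0.3) ** 2) / 0.05 ** 2

# Build the surrogate once; in the real method the evaluations at the
# sampling points are embarrassingly parallel.
axes = (np.linspace(0.0, 1.0, 33), np.linspace(0.0, 1.0, 33))
values = np.array([[expensive_loglike(np.array([x, y])) for y in axes[1]]
                   for x in axes[0]])
surrogate = RegularGridInterpolator(axes, values)

# Metropolis sampling that queries only the cheap surrogate.
rng = np.random.default_rng(0)
theta = np.array([0.5, 0.5])
logp = surrogate(theta[None, :])[0]
chain = []
for _ in range(5000):
    prop = theta + 0.05 * rng.standard_normal(2)
    if np.all((prop >= 0.0) & (prop <= 1.0)):
        lp = surrogate(prop[None, :])[0]
        if np.log(rng.uniform()) < lp - logp:
            theta, logp = prop, lp
    chain.append(theta.copy())
print(np.mean(chain, axis=0))            # recovers the peak near (0.3, 0.3)
```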

    SFC-based Communication Metadata Encoding for Adaptive Mesh Refinement

    This volume of the series “Advances in Parallel Computing” contains the proceedings of the International Conference on Parallel Programming – ParCo 2013 – held from 10 to 13 September 2013 in Garching, Germany. The conference was hosted by the Technische Universität München (Department of Informatics) and the Leibniz Supercomputing Centre. The present paper studies two adaptive mesh refinement (AMR) codes whose grids rely on recursive subdivision in combination with space-filling curves (SFCs). A non-overlapping domain decomposition based upon these SFCs yields several well-known advantageous properties with respect to communication demands, load balancing, and partition connectivity. However, the administration of the meta data, i.e. tracking which partitions exchange data in which cardinality, is nontrivial due to the SFC's fractal meandering and the dynamic adaptivity. We introduce an analysed tree grammar for the meta data that restricts it, without loss of information, hierarchically along the subdivision tree and applies run-length encoding. Hence, its meta-data memory footprint is very small, and it can be computed and maintained on the fly even for permanently changing grids. It facilitates a fork-join pattern for shared-memory data parallelism, and it facilitates replicated data parallelism that tackles latency and bandwidth constraints by communicating in the background, while reducing memory requirements by avoiding adjacency information stored per element. We demonstrate this with shared- and distributed-memory parallelized domain decompositions. This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89). It is partially based on work supported by Award No. UK-c0020, made by the King Abdullah University of Science and Technology (KAUST).
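
    As a toy stand-in for the recursive-subdivision SFCs studied here, the sketch below orders a small grid by Morton (Z-order) index and cuts the ordered cell list into contiguous partitions; grid size and partition count are illustrative assumptions.

```python
# Sketch: partition a 2^L x 2^L grid along a space-filling curve.
# Morton (Z-order) indexing stands in for the recursive-subdivision SFCs.

def morton(x, y, level):
    """Interleave the bits of (x, y) into a Z-order index."""
    z = 0
    for b in range(level):
        z |= ((x >> b) & 1) << (2 * b) | ((y >> b) & 1) << (2 * b + 1)
    return z

LEVEL, PARTS = 3, 4                        # 8x8 grid split into 4 partitions
cells = sorted((morton(x, y, LEVEL), (x, y))
               for x in range(2**LEVEL) for y in range(2**LEVEL))
chunk = len(cells) // PARTS
for p in range(PARTS):
    part = [xy for _, xy in cells[p * chunk:(p + 1) * chunk]]
    print(f"partition {p}: {len(part)} cells, starting at {part[0]}")
```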

    Invasive Computing in HPC with X10

    High-performance computing with thousands of cores relies on distributed memory for memory-consistency reasons. Resource management on such systems usually relies on a static assignment of resources at the start of each application. Such static scheduling cannot start an application while its required resources are in use by others, since resources cannot be withdrawn from running applications without stopping them. This lack of dynamically adaptive scheduling leaves resources idling until the remaining amount of requested resources becomes available. Additionally, applications with changing resource requirements lead to idling or less efficiently used resources. The invasive computing paradigm suggests dynamic resource scheduling and applications able to adapt dynamically to changing resource requirements. As a case study, we developed an invasive resource manager as well as a multigrid solver with dynamically changing resource demands. Such a multigrid solver has changing scalability behavior during its execution and requires data migration upon reallocation on distributed-memory systems. To counteract the additional complexity introduced by the additional interfaces, e.g. for data migration, we use the X10 programming language for improved programmability. Our results show improved application throughput and demonstrate the dynamic adaptivity. In addition, we show our extension of the distributed arrays of X10 to support data migration. This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89).
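
    The invade/retreat cycle at the heart of the paradigm can be mocked conceptually as follows; this is a plain-Python caricature of the application/resource-manager interaction, not the X10 implementation described above.

```python
# Conceptual mock of invasive resource management (assumed, simplified API).

class ResourceManager:
    def __init__(self, cores):
        self.free = cores

    def invade(self, requested):
        """Grant up to `requested` cores; the application adapts to the grant."""
        granted = min(requested, self.free)
        self.free -= granted
        return granted

    def retreat(self, released):
        """Return cores so other applications can start or grow."""
        self.free += released

rm = ResourceManager(cores=64)

# A well-scaling multigrid phase asks for many cores ...
cores = rm.invade(48)
print(f"smoothing on {cores} cores")

# ... and releases most of them before a poorly scaling coarse-grid phase.
# On distributed memory, this reallocation is what triggers data migration.
rm.retreat(cores - 8)
print(f"coarse-grid solve on 8 cores, {rm.free} cores free for other apps")
```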

    Smolyak's algorithm: A powerful black box for the acceleration of scientific computations

    We provide a general discussion of Smolyak's algorithm for the acceleration of scientific computations. The algorithm first appeared in Smolyak's work on multidimensional integration and interpolation. Since then, it has been generalized in multiple directions and has been associated with the keywords sparse grids, hyperbolic cross approximation, combination technique, and multilevel methods. Variants of Smolyak's algorithm have been employed in the computation of high-dimensional integrals in finance, chemistry, and physics, in the numerical solution of partial and stochastic differential equations, and in uncertainty quantification. Motivated by this broad and ever-increasing range of applications, we describe a general framework that summarizes fundamental results and assumptions in a concise, application-independent manner.
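
    A compact instance is the two-dimensional combination technique for quadrature: tensor products of one-dimensional rules are summed along two diagonals of the level space. The sketch below uses nested trapezoidal rules on the unit square, an illustrative choice, since Smolyak's construction applies to any family of one-dimensional rules.

```python
# Sketch: Smolyak's combination technique for 2D quadrature on [0, 1]^2.
import math

def trap_rule(level):
    """1D trapezoidal rule with 2^level + 1 equidistant points on [0, 1]."""
    n = 2**level + 1
    pts = [i / (n - 1) for i in range(n)]
    wts = [1.0 / (n - 1)] * n
    wts[0] = wts[-1] = 0.5 / (n - 1)
    return pts, wts

def tensor_quad(f, lx, ly):
    """Full tensor-product rule of levels (lx, ly)."""
    px, wx = trap_rule(lx)
    py, wy = trap_rule(ly)
    return sum(wx[i] * wy[j] * f(px[i], py[j])
               for i in range(len(px)) for j in range(len(py)))

def smolyak_2d(f, L):
    """Combination technique: sum over |l| = L minus sum over |l| = L - 1."""
    total = sum(tensor_quad(f, l, L - l) for l in range(L + 1))
    total -= sum(tensor_quad(f, l, L - 1 - l) for l in range(L))
    return total

f = lambda x, y: math.exp(x + y)            # exact integral: (e - 1)^2
exact = (math.e - 1.0) ** 2
for L in range(1, 7):                       # error shrinks as L grows
    print(L, abs(smolyak_2d(f, L) - exact))
```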

    A Dimension-Adaptive Multi-Index Monte Carlo Method Applied to a Model of a Heat Exchanger

    We present an adaptive version of the Multi-Index Monte Carlo method, introduced by Haji-Ali, Nobile and Tempone (2016), for simulating PDEs with coefficients that are random fields. A classical technique for sampling from these random fields is the Karhunen-Loève expansion. Our adaptive algorithm is based on the adaptive algorithm used in sparse grid cubature, as introduced by Gerstner and Griebel (2003), and automatically chooses the number of terms needed in this expansion, as well as the required spatial discretizations of the PDE model. We apply the method to a simplified model of a heat exchanger with random insulator material, where the stochastic characteristics are modeled as a lognormal random field, and we show consistent computational savings.
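
    A truncated Karhunen-Loève expansion can be sketched as below; the sine eigenbasis and algebraic eigenvalue decay are assumed for illustration and are not the heat-exchanger model's actual covariance. The number of retained terms is exactly what the dimension-adaptive algorithm described above chooses automatically.

```python
# Sketch: sample a 1D lognormal field from a truncated KL expansion.
# Eigenpairs below are assumed, not derived from the paper's covariance.
import numpy as np

def kl_sample(x, n_terms, rng, decay=2.0):
    """a(x) = exp(sum_k sqrt(lambda_k) * xi_k * phi_k(x)), xi_k ~ N(0, 1)."""
    xi = rng.standard_normal(n_terms)
    g = np.zeros_like(x)
    for k in range(1, n_terms + 1):
        lam = 1.0 / k ** decay                      # assumed eigenvalue decay
        phi = np.sqrt(2.0) * np.sin(k * np.pi * x)  # assumed eigenfunctions
        g += np.sqrt(lam) * xi[k - 1] * phi
    return np.exp(g)                                # lognormal field

x = np.linspace(0.0, 1.0, 201)
# Compare two fixed truncations of the same sample path by reusing the seed;
# the adaptive method would instead pick the truncation (and mesh) itself.
for n_terms in (4, 16):
    a = kl_sample(x, n_terms, np.random.default_rng(7))
    print(n_terms, round(float(a.min()), 4), round(float(a.max()), 4))
```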

    The Superposition Principle: A Conceptual Perspective on Pedestrian Stream Simulations

    Models using a superposition of scalar fields for navigation are prevalent in microscopic pedestrian stream simulations. However, classifications, differences, and similarities of models are not clear at the conceptual level of navigation mechanisms. In this paper, we describe the superposition of scalar fields as an approach to microscopic crowd modelling and the corresponding motion schemes. We use this background discussion to focus on the similarities and differences of models, and find that many models make use of similar mechanisms for the navigation of virtual agents. In some cases, the differences between models can be reduced to differences between discretisation schemes. The interpretation of scalar fields varies across models, but most of the time this variation does not have a large impact on simulation outcomes. The conceptual analysis of different models of pedestrian dynamics allows for a better understanding of their capabilities and limitations and may lead to better model development and validation.
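
    A minimal instance of the superposition approach: add an attractive distance-to-target field to Gaussian repulsion around other pedestrians, then move an agent by steepest descent. The field shapes, strengths, and motion scheme below are assumptions for illustration, not a specific model from the literature.

```python
# Sketch: navigation on a superposition of scalar fields (assumed shapes).
import numpy as np

N = 101
ys, xs = np.mgrid[0:N, 0:N] / (N - 1)       # unit square; axis 0 is y
target = np.array([0.9, 0.5])               # (x, y) of the agent's goal
others = [np.array([0.5, 0.6]), np.array([0.6, 0.38])]

# Attractive field: Euclidean distance to the target ...
field = np.hypot(xs - target[0], ys - target[1])
# ... superposed with Gaussian repulsion around each other pedestrian.
for p in others:
    field += 0.15 * np.exp(-((xs - p[0])**2 + (ys - p[1])**2) / 0.005)

# Motion scheme: normalized steepest descent on the superposed field.
gy, gx = np.gradient(field)                 # gradients along y and x
pos = np.array([0.1, 0.5])
for _ in range(400):
    i = int(round(pos[1] * (N - 1)))        # row index from y
    j = int(round(pos[0] * (N - 1)))        # column index from x
    step = np.array([gx[i, j], gy[i, j]])
    pos = np.clip(pos - 0.01 * step / (np.linalg.norm(step) + 1e-12), 0.0, 1.0)
    if np.hypot(*(pos - target)) < 0.02:    # close enough to the goal
        break
print("final position:", pos.round(3), "target:", target)
```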

    A posteriori error analysis and adaptive non-intrusive numerical schemes for systems of random conservation laws

    In this article we consider one-dimensional random systems of hyperbolic conservation laws. We first establish existence and uniqueness of random entropy admissible solutions for initial value problems of conservation laws which involve random initial data and random flux functions. Based on these results we present an a posteriori error analysis for a numerical approximation of the random entropy admissible solution. For the stochastic discretization, we consider a non-intrusive approach, the Stochastic Collocation method. The spatio-temporal discretization relies on the Runge-Kutta Discontinuous Galerkin method. We derive the a posteriori estimator using continuous reconstructions of the discrete solution. Combined with the relative entropy stability framework, this yields computable error bounds for the entire space-stochastic discretization error. The estimator admits a splitting into a stochastic and a deterministic (space-time) part, allowing for a novel residual-based space-stochastic adaptive mesh refinement algorithm. We conclude with various numerical examples investigating the scaling properties of the residuals and illustrating the efficiency of the proposed adaptive algorithm.
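
    The non-intrusive character of Stochastic Collocation can be sketched in a few lines: a deterministic solver is called unchanged at quadrature nodes of the random parameter, and its outputs are combined with the quadrature weights. The "solver" below is a smooth stand-in for a conservation-law solve, not a Runge-Kutta Discontinuous Galerkin scheme.

```python
# Sketch: non-intrusive stochastic collocation for one random parameter.
import numpy as np

def deterministic_solver(y):
    """Placeholder for a deterministic solve whose data depend on y."""
    return np.tanh(5.0 * y)                 # smooth quantity of interest

# Gauss-Legendre collocation nodes and weights for y ~ Uniform(-1, 1).
nodes, weights = np.polynomial.legendre.leggauss(9)
samples = np.array([deterministic_solver(y) for y in nodes])

mean = 0.5 * np.sum(weights * samples)      # E[q(y)]; factor 0.5 = density
second = 0.5 * np.sum(weights * samples**2)
variance = second - mean**2
print(f"E[q] = {mean:.4f}, Var[q] = {variance:.4f}")
```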

    Research and Education in Computational Science and Engineering

    Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society, and the CSE community is at the core of this transformation. However, a combination of disruptive developments, including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers, is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade. Comment: Major revision, to appear in SIAM Review.