Search CORE

1,324 research outputs found

A performance- and energy-oriented extended tuning process for time-step-based scientific applications

Author: Kalinnik Natalia
Kiesel Robert
Rauber Thomas
Richter Marcel
Rünger Gudula
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021
Field of study

DeepFlame: A deep learning empowered open-source platform for reacting flow simulations

Author: Chen Zhi X.
Lin Minqi
Mao Runze
Xu Zhi-Qin John
Zhang Tianhan
Zhang Yan
Publication venue
Publication date: 15/10/2022
Field of study

In this work, we introduce DeepFlame, an open-source C++ platform with the capabilities of utilising machine learning algorithms and pre-trained models to solve for reactive flows. We combine the individual strengths of the computational fluid dynamics library OpenFOAM, machine learning framework Torch, and chemical kinetics program Cantera. The complexity of cross-library function and data interfacing (the core of DeepFlame) is minimised to achieve a simple and clear workflow for code maintenance, extension and upgrading. As a demonstration, we apply our recent work on deep learning for predicting chemical kinetics (Zhang et al. Combust. Flame vol. 245 pp. 112319, 2022) to highlight the potential of machine learning in accelerating reacting flow simulation. A thorough code validation is conducted via a broad range of canonical cases to assess its accuracy and efficiency. The results demonstrate that the convection-diffusion-reaction algorithms implemented in DeepFlame are robust and accurate for both steady-state and transient processes. In addition, a number of methods aiming to further improve the computational efficiency, e.g. dynamic load balancing and adaptive mesh refinement, are explored. Their performances are also evaluated and reported. With the deep learning method implemented in this work, a speed-up of two orders of magnitude is achieved in a simple hydrogen ignition case when performed on a medium-end graphics processing unit (GPU). Further gain in computational efficiency is expected for hydrocarbon and other complex fuels. A similar level of acceleration is obtained on an AI-specific chip - deep computing unit (DCU), highlighting the potential of DeepFlame in leveraging the next-generation computing architecture and hardware

arXiv.org e-Print Archive

Recommended from our members

Hybrid Analog-Digital Co-Processing for Scientific Computation

Author: Huang Yipeng
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018
Field of study

In the past 10 years computer architecture research has moved to more heterogeneity and less adherence to conventional abstractions. Scientists and engineers hold an unshakable belief that computing holds keys to unlocking humanity's Grand Challenges. Acting on that belief they have looked deeper into computer architecture to find specialized support for their applications. Likewise, computer architects have looked deeper into circuits and devices in search of untapped performance and efficiency. The lines between computer architecture layers---applications, algorithms, architectures, microarchitectures, circuits and devices---have blurred. Against this backdrop, a menagerie of computer architectures are on the horizon, ones that forgo basic assumptions about computer hardware, and require new thinking of how such hardware supports problems and algorithms. This thesis is about revisiting hybrid analog-digital computing in support of diverse modern workloads. Hybrid computing had extensive applications in early computing history, and has been revisited for small-scale applications in embedded systems. But architectural support for using hybrid computing in modern workloads, at scale and with high accuracy solutions, has been lacking. I demonstrate solving a variety of scientific computing problems, including stochastic ODEs, partial differential equations, linear algebra, and nonlinear systems of equations, as case studies in hybrid computing. I solve these problems on a system of multiple prototype analog accelerator chips built by a team at Columbia University. On that team I made contributions toward programming the chips, building the digital interface, and validating the chips' functionality. The analog accelerator chip is intended for use in conjunction with a conventional digital host computer. The appeal and motivation for using an analog accelerator is efficiency and performance, but it comes with limitations in accuracy and problem sizes that we have to work around. The first problem is how to do problems in this unconventional computation model. Scientific computing phrases problems as differential equations and algebraic equations. Differential equations are a continuous view of the world, while algebraic equations are a discrete one. Prior work in analog computing mostly focused on differential equations; algebraic equations played a minor role in prior work in analog computing. The secret to using the analog accelerator to support modern workloads on conventional computers is that these two viewpoints are interchangeable. The algebraic equations that underlie most workloads can be solved as differential equations, and differential equations are naturally solvable in the analog accelerator chip. A hybrid analog-digital computer architecture can focus on solving linear and nonlinear algebra problems to support many workloads. The second problem is how to get accurate solutions using hybrid analog-digital computing. The reason that the analog computation model gives less accurate solutions is it gives up representing numbers as digital binary numbers, and instead uses the full range of analog voltage and current to represent real numbers. Prior work has established that encoding data in analog signals gives an energy efficiency advantage as long as the analog data precision is limited. While the analog accelerator alone may be useful for energy-constrained applications where inputs and outputs are imprecise, we are more interested in using analog in conjunction with digital for precise solutions. This thesis gives novel insight that the trick to do so is to solve nonlinear problems where low-precision guesses are useful for conventional digital algorithms. The third problem is how to solve large problems using hybrid analog-digital computing. The reason the analog computation model can't handle large problems is it gives up step-by-step discrete-time operation, instead allowing variables to evolve smoothly in continuous time. To make that happen the analog accelerator works by chaining hardware for mathematical operations end-to-end. During computation analog data flows through the hardware with no overheads in control logic and memory accesses. The downside is then the needed hardware size grows alongside problem sizes. While scientific computing researchers have for a long time split large problems into smaller subproblems to fit in digital computer constraints, this thesis is a first attempt to consider these divide-and-conquer algorithms as an essential tool in using the analog model of computation. As we enter the post-Moore’s law era of computing, unconventional architectures will offer specialized models of computation that uniquely support specific problem types. Two prominent examples are deep neural networks and quantum computers. Recent trends in computer science research show these unconventional architectures will soon have broad adoption. In this thesis I show another specialized, unconventional architecture is to use analog accelerators to solve problems in scientific computing. Computer architecture researchers will discover other important models of computation in the future. This thesis is an example of the discovery process, implementation, and evaluation of how an unconventional architecture supports specialized workloads

Columbia University Academic Commons

A Domain-Specific Language and Editor for Parallel Particle Methods

Author: Castrillon Jeronimo
Karol Sven
Nett Tobias
Sbalzarini Ivo F.
Publication venue
Publication date: 17/09/2017
Field of study

Domain-specific languages (DSLs) are of increasing importance in scientific high-performance computing to reduce development costs, raise the level of abstraction and, thus, ease scientific programming. However, designing and implementing DSLs is not an easy task, as it requires knowledge of the application domain and experience in language engineering and compilers. Consequently, many DSLs follow a weak approach using macros or text generators, which lack many of the features that make a DSL a comfortable for programmers. Some of these features---e.g., syntax highlighting, type inference, error reporting, and code completion---are easily provided by language workbenches, which combine language engineering techniques and tools in a common ecosystem. In this paper, we present the Parallel Particle-Mesh Environment (PPME), a DSL and development environment for numerical simulations based on particle methods and hybrid particle-mesh methods. PPME uses the meta programming system (MPS), a projectional language workbench. PPME is the successor of the Parallel Particle-Mesh Language (PPML), a Fortran-based DSL that used conventional implementation strategies. We analyze and compare both languages and demonstrate how the programmer's experience can be improved using static analyses and projectional editing. Furthermore, we present an explicit domain model for particle abstractions and the first formal type system for particle methods.Comment: Submitted to ACM Transactions on Mathematical Software on Dec. 25, 201

arXiv.org e-Print Archive

MPG.PuRe

Doctor of Philosophy

Author: Ha Linh Khanh
Publication venue: University of Utah
Publication date: 15/08/2011
Field of study

dissertationStochastic methods, dense free-form mapping, atlas construction, and total variation are examples of advanced image processing techniques which are robust but computationally demanding. These algorithms often require a large amount of computational power as well as massive memory bandwidth. These requirements used to be ful lled only by supercomputers. The development of heterogeneous parallel subsystems and computation-specialized devices such as Graphic Processing Units (GPUs) has brought the requisite power to commodity hardware, opening up opportunities for scientists to experiment and evaluate the in uence of these techniques on their research and practical applications. However, harnessing the processing power from modern hardware is challenging. The di fferences between multicore parallel processing systems and conventional models are signi ficant, often requiring algorithms and data structures to be redesigned signi ficantly for efficiency. It also demands in-depth knowledge about modern hardware architectures to optimize these implementations, sometimes on a per-architecture basis. The goal of this dissertation is to introduce a solution for this problem based on a 3D image processing framework, using high performance APIs at the core level to utilize parallel processing power of the GPUs. The design of the framework facilitates an efficient application development process, which does not require scientists to have extensive knowledge about GPU systems, and encourages them to harness this power to solve their computationally challenging problems. To present the development of this framework, four main problems are described, and the solutions are discussed and evaluated: (1) essential components of a general 3D image processing library: data structures and algorithms, as well as how to implement these building blocks on the GPU architecture for optimal performance; (2) an implementation of unbiased atlas construction algorithms|an illustration of how to solve a highly complex and computationally expensive algorithm using this framework; (3) an extension of the framework to account for geometry descriptors to solve registration challenges with large scale shape changes and high intensity-contrast di fferences; and (4) an out-of-core streaming model, which enables developers to implement multi-image processing techniques on commodity hardware

The University of Utah: J. Willard Marriott Digital Library

Workshop - Systems Design Meets Equation-based Languages

Author: Rantzer Anders
Åkesson Johan
Publication venue: Department of Automatic Control, Lund Institute of Technology, Lund University
Publication date: 01/01/2013
Field of study

Lund University Publications

On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver

Author: Huang Zhongzhan
Liang Senwei
Lin Liang
Yang Haizhao
Zhang Hong
Publication venue
Publication date: 19/09/2023
Field of study

The large-scale simulation of dynamical systems is critical in numerous scientific and engineering disciplines. However, traditional numerical solvers are limited by the choice of step sizes when estimating integration, resulting in a trade-off between accuracy and computational efficiency. To address this challenge, we introduce a deep learning-based corrector called Neural Vector (NeurVec), which can compensate for integration errors and enable larger time step sizes in simulations. Our extensive experiments on a variety of complex dynamical system benchmarks demonstrate that NeurVec exhibits remarkable generalization capability on a continuous phase space, even when trained using limited and discrete data. NeurVec significantly accelerates traditional solvers, achieving speeds tens to hundreds of times faster while maintaining high levels of accuracy and stability. Moreover, NeurVec's simple-yet-effective design, combined with its ease of implementation, has the potential to establish a new paradigm for fast-solving differential equations based on deep learning.Comment: Accepted by Scientific Repor

arXiv.org e-Print Archive

The OpenModelica integrated environment for modeling, simulation, and model-based development

Author: Abdelhak Karim
Asghar Adeel
Bachmann Bernhard
Bouskela Daniel
Braun Robert
Braun Willi
Buffoni Lena
Casella Francesco
Castro Rodrigo Daniel
Franke Rüdiger
Fritzson Dag
Fritzson Peter
Gebremedhin Mahder
Heuermann Andreas
Lie Bernt
Mengist Alachew
Mikelsons Lars
Moudgalya Kannan
Ochel Lennart
Ostlund Per
Palanisamy Arunkumar
Pop Adrian
Ruge Vitalij
Schamai Wladimir
Sjolund Martin
Thiele Bernhard
Tinnerholm John
Publication venue: Mic
Publication date: 01/03/2022
Field of study

OpenModelica is a unique large-scale integrated open-source Modelica- and FMI-based modeling, simulation, optimization, model-based analysis and development environment. Moreover, the OpenModelica environment provides a number of facilities such as debugging; optimization; visualization and 3D animation; web-based model editing and simulation; scripting from Modelica, Python, Julia, and Matlab; efficient simulation and co-simulation of FMI-based models; compilation for embedded systems; Modelica- UML integration; requirement verification; and generation of parallel code for multi-core architectures. The environment is based on the equation-based object-oriented Modelica language and currently uses the MetaModelica extended version of Modelica for its model compiler implementation. This overview paper gives an up-to-date description of the capabilities of the system, short overviews of used open source symbolic and numeric algorithms with pointers to published literature, tool integration aspects, some lessons learned, and the main vision behind its development.Fil: Fritzson, Peter. Linköping University; SueciaFil: Pop, Adrian. Linköping University; SueciaFil: Abdelhak, Karim. Fachhochschule Bielefeld; AlemaniaFil: Asghar, Adeel. Linköping University; SueciaFil: Bachmann, Bernhard. Fachhochschule Bielefeld; AlemaniaFil: Braun, Willi. Fachhochschule Bielefeld; AlemaniaFil: Bouskela, Daniel. Electricité de France; FranciaFil: Braun, Robert. Linköping University; SueciaFil: Buffoni, Lena. Linköping University; SueciaFil: Casella, Francesco. Politecnico di Milano; ItaliaFil: Castro, Rodrigo Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; ArgentinaFil: Franke, Rüdiger. Abb Group; AlemaniaFil: Fritzson, Dag. Linköping University; SueciaFil: Gebremedhin, Mahder. Linköping University; SueciaFil: Heuermann, Andreas. Linköping University; SueciaFil: Lie, Bernt. University of South-Eastern Norway; NoruegaFil: Mengist, Alachew. Linköping University; SueciaFil: Mikelsons, Lars. Linköping University; SueciaFil: Moudgalya, Kannan. Indian Institute Of Technology Bombay; IndiaFil: Ochel, Lennart. Linköping University; SueciaFil: Palanisamy, Arunkumar. Linköping University; SueciaFil: Ruge, Vitalij. Fachhochschule Bielefeld; AlemaniaFil: Schamai, Wladimir. Danfoss Power Solutions GmbH & Co; AlemaniaFil: Sjolund, Martin. Linköping University; SueciaFil: Thiele, Bernhard. Linköping University; SueciaFil: Tinnerholm, John. Linköping University; SueciaFil: Ostlund, Per. Linköping University; Sueci

CONICET Digital