Computer Architectures to Close the Loop in Real-time Optimization
© 2015 IEEE. Many modern control, automation, signal processing and machine learning applications rely on solving a sequence of optimization problems, which are updated with measurements of a real system that evolves in time. The solutions of these optimization problems are then used to make decisions, which may be followed by changing some parameters of the physical system, thereby resulting in a feedback loop between the computing and the physical system. Real-time optimization is not the same as fast optimization, because the computation is affected by an uncertain system that evolves in time. The suitability of a design should therefore not be judged from the optimality of a single optimization problem, but from the evolution of the entire cyber-physical system. The algorithms and hardware used for solving a single optimization problem in the office might therefore be far from ideal when solving a sequence of real-time optimization problems. Instead of there being a single, optimal design, one has to trade off a number of objectives, including performance, robustness, energy usage, size and cost. We therefore provide here a tutorial introduction to some of the questions and implementation issues that arise in real-time optimization applications. We concentrate on some of the decisions that have to be made when designing the computing architecture and algorithm, and argue that the choice of one informs the other.
Research in Applied Mathematics, Fluid Mechanics and Computer Science
This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science during the period October 1, 1998 through March 31, 1999
Efficient parallelization strategy for real-time FE simulations
This paper introduces an efficient and generic framework for finite-element
simulations under an implicit time integration scheme. Being compatible with
generic constitutive models, a fast matrix assembly method exploits the fact
that system matrices are created in a deterministic way as long as the mesh
topology remains constant. Using the sparsity pattern of the assembled system
brings about significant optimizations on the assembly stage. As a result,
developed techniques of GPU-based parallelization can be directly applied with
the assembled system. Moreover, an asynchronous Cholesky preconditioning scheme is
used to improve the convergence of the system solver. On this basis, a
GPU-based Cholesky preconditioner is developed, significantly reducing the data
transfer between the CPU/GPU during the solving stage. We evaluate the
performance of our method with different mesh elements and hyperelastic models
and compare it with typical approaches on the CPU and the GPU
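
The fixed-topology idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a hypothetical 1D chain of two-node elements, and the names `build_pattern` and `assemble` are invented for illustration. The sparsity pattern and the destination slot of every element entry are computed once; each later assembly is then a plain scatter-add into a value array of fixed length.

```python
def build_pattern(elements, n_nodes):
    """Build the CSR pattern once and record where each element entry lands."""
    # Columns coupled to each row by at least one element.
    rows = [set() for _ in range(n_nodes)]
    for elem in elements:
        for i in elem:
            rows[i].update(elem)
    indptr, indices, slot = [0], [], {}
    for i, cols in enumerate(rows):
        for j in sorted(cols):
            slot[(i, j)] = len(indices)
            indices.append(j)
        indptr.append(len(indices))
    # scatter[e][a][b]: position in the value array of local entry (a, b) of element e.
    scatter = [[[slot[(i, j)] for j in elem] for i in elem] for elem in elements]
    return indptr, indices, scatter


def assemble(scatter, element_matrices, nnz):
    """Scatter-add element matrices into the value array; no pattern search needed."""
    values = [0.0] * nnz
    for dest, ke in zip(scatter, element_matrices):
        for a, dest_row in enumerate(dest):
            for b, k in enumerate(dest_row):
                values[k] += ke[a][b]
    return values


def bar_element(stiffness):
    # Hypothetical two-node element matrix, used only to have numbers to assemble.
    return [[stiffness, -stiffness], [-stiffness, stiffness]]


elements = [(0, 1), (1, 2), (2, 3)]          # mesh topology, assumed constant
indptr, indices, scatter = build_pattern(elements, n_nodes=4)

# Two assemblies with different material states reuse the same pattern.
values_a = assemble(scatter, [bar_element(k) for k in (1.0, 1.0, 1.0)], len(indices))
values_b = assemble(scatter, [bar_element(k) for k in (1.0, 2.0, 3.0)], len(indices))
```

Because every write targets a precomputed slot, the inner loop has no data-dependent searching, which is the property that makes this kind of assembly a natural fit for parallel hardware. A parallel version would additionally have to handle concurrent writes to shared slots, which this sketch does not attempt.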
PERFORMANCE EVALUATION OF DIFFERENT LINEAR EQUATION SOLVERS FOR SOLVING NONLINEAR FE PROBLEMS ON MULTICORE ARCHITECTURES
The aim of this paper is to evaluate the performance of existing parallel linear equation solvers for solving large-scale, nonlinear finite element analysis problems on systems with distributed memory. The parallel approach allows us to take advantage of distributed memory, which enables the formation of large system matrices, and of multiple processing units to achieve significant speedups. Our study is based on a comparison of a parallel direct solver and a parallel iterative solver implemented in the SuperLU_DIST library from the Portable, Extensible Toolkit for Scientific Computation (PETSc). Both solvers are designed for a distributed memory model and are based on the Message Passing Interface (MPI). The efficiency of the individual solvers is evaluated on selected benchmark problems with different solution strategies by comparing computation times and obtained speedups.
Application of HPC in eddy current electromagnetic problem solution
As engineering problems are becoming more and more advanced, the size of an average model solved by partial differential equations is rapidly growing and, in order to keep simulation times within reasonable bounds, both faster computers and more efficient software implementations are needed.
In the first part of this thesis, the full potential of simulation software has been exploited through high performance parallel computing techniques. In particular, the simulation of induction heating processes is accomplished within reasonable solution times by implementing different parallel direct solvers for large sparse linear systems in the solution process of a commercial software package. The performance of this software on shared memory systems has been remarkably improved by implementing a multithreaded version of the MUMPS (MUltifrontal Massively Parallel Solver) library, which has been tested on benchmark matrices arising from typical induction heating process simulations.
A new multithreading approach and a low rank approximation technique have been developed and implemented by the MUMPS team in Lyon and Toulouse. In the context of a collaboration between the MUMPS team and DII-University of Padova, a preliminary version of these functionalities could be tested on induction heating benchmark problems, and a substantial reduction of the computational cost and memory requirements could be achieved.
In the second part of this thesis, some examples of design methodology by virtual prototyping have been described. Complex multiphysics simulations involving electromagnetic, circuital, thermal and mechanical problems have been performed by exploiting parallel solvers, as developed in the first part of this thesis. Finally, multiobjective stochastic optimization algorithms have been applied to multiphysics 3D model simulations in search of a set of improved induction heating device configurations
Parallel Algorithms for Time and Frequency Domain Circuit Simulation
As a most critical form of pre-silicon verification, transistor-level circuit simulation
is an indispensable step before committing to an expensive manufacturing process.
However, considering the nature of circuit simulation, it can be computationally
expensive, especially for ever-larger transistor circuits with more complex device models.
Therefore, it is becoming increasingly desirable to accelerate circuit simulation.
On the other hand, the emergence of multi-core machines offers a promising solution
to circuit simulation besides the known application of distributed-memory clustered
computing platforms, which provides abundant hardware computing resources. This
research addresses the limitations of traditional serial circuit simulations and proposes
new techniques for both time-domain and frequency-domain parallel circuit
simulations.
For time-domain simulation, this dissertation presents a parallel transient simulation
methodology. This new approach, called WavePipe, exploits coarse-grained
application-level parallelism by simultaneously computing circuit solutions at multiple
adjacent time points in a way resembling hardware pipelining. There are two
embodiments in WavePipe: backward and forward pipelining schemes. While the
former creates independent computing tasks that contribute to a larger future time
step, the latter performs predictive computing along the forward direction. Unlike
existing relaxation methods, WavePipe facilitates parallel circuit simulation without
jeopardizing convergence and accuracy. As a coarse-grained parallel approach, it requires
low parallel programming effort. Furthermore, it creates new avenues for fully
utilizing increasingly parallel hardware by going beyond conventional finer-grained
parallel device model evaluation and matrix solutions.
This dissertation also exploits the recently developed explicit telescopic projective
integration method for efficient parallel transient circuit simulation by addressing the
stability limitation of explicit numerical integration. The new method allows the
effective time step to be controlled by the accuracy requirement instead of the stability limitation.
Therefore, it not only leads to noticeable efficiency improvement, but also lends itself
to straightforward parallelization due to its explicit nature.
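
How projection relaxes the stability limit can be seen in a minimal sketch of single-level projective forward Euler. This is not the telescopic (nested) scheme the dissertation uses, and it is not a circuit simulator: the two-mode test system and the parameters `H`, `K`, `M` are assumptions chosen for illustration. A few small inner Euler steps damp the fast mode, after which the solution is extrapolated over a much longer interval.

```python
import math


def rhs(y):
    # Hypothetical stiff test system: a slow mode (rate 1) and a fast mode (rate 1000).
    return [-1.0 * y[0], -1000.0 * y[1]]


def euler_step(y, h):
    """One explicit forward Euler step of size h."""
    return [yi + h * fi for yi, fi in zip(y, rhs(y))]


def projective_step(y, h, k, m):
    """Take k + 1 inner Euler steps of size h, then extrapolate over m * h."""
    for _ in range(k):
        y = euler_step(y, h)
    y_next = euler_step(y, h)
    # Extrapolate along the chord through the last two inner iterates.
    return [b + m * (b - a) for a, b in zip(y, y_next)]


def integrate(step, y0, n_steps):
    y = list(y0)
    for _ in range(n_steps):
        y = step(y)
    return y


H, K, M = 0.0009, 2, 20          # inner step, damping steps, projection length
OUTER = (K + 1 + M) * H          # time covered by one projective step
N = 48                           # number of outer steps
T_END = N * OUTER

y_proj = integrate(lambda y: projective_step(y, H, K, M), [1.0, 1.0], N)
# Plain forward Euler with the same outer step violates its stability limit.
y_naive = integrate(lambda y: euler_step(y, OUTER), [1.0, 1.0], N)
exact_slow = math.exp(-T_END)
```

With these assumed values the outer step is about ten times larger than the forward Euler stability limit for the fast mode (2/1000), yet the projective iterate stays bounded and tracks the slow mode, while plain forward Euler at the same step size diverges. Each outer step costs only three right-hand-side evaluations here, and every operation is explicit.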
For frequency-domain simulation, this dissertation presents a parallel harmonic
balance approach, applicable to the steady-state and envelope-following analyses of
both driven and autonomous circuits. The new approach is centered on a naturally-parallelizable
preconditioning technique that speeds up the core computation in harmonic
balance based analysis. The proposed method facilitates parallel computing
via the use of domain knowledge and simplifies parallel programming compared with
fine-grained strategies. As a result, favorable runtime speedups are achieved
Modelling Fluid Structure Interaction problems using Boundary Element Method
This dissertation investigates the application of Boundary Element Methods (BEM)
to Fluid Structure Interaction (FSI) problems from three main perspectives.
The work is divided into three parts: i) the derivation of a BEM for the Laplace
equation and its application to the analysis of ship-wave interaction problems, ii) the
implementation of efficient and parallel BEM solvers addressing the newest challenges of
High Performance Computing, iii) the development of a BEM for the Stokes system and
its application to the study of micro-swimmers.
First we develop a BEM for the Laplace equation and apply it to predict ship-wave interactions, making use of an innovative coupling with Finite Element Method stabilization techniques. As is well known, the wave pattern around a body depends on the Froude number associated with the flow. Thus, we thoroughly investigate the robustness and accuracy of the developed methodology, assessing the dependence of the solution on this parameter.
To improve performance and tackle problems with a higher number of unknowns,
the BEM developed for the Laplace equation is parallelized using OpenSOURCE
technique in a hybrid distributed-shared memory environment. We perform several tests
to demonstrate both the accuracy and the performance of the parallel BEM developed.
In addition, we explore two different possibilities to reduce the overall computational
cost from O(N^2) to O(N). Firstly we couple the library with a Fast Multipole Method that allows us to reach a better order of complexity and efficiency. Then we perform a preliminary study on the implementation of a parallel Non Uniform Fast Fourier
Transform to be coupled with the newly developed Sparse Cardinal Sine
Decomposition (SCSD) algorithm.
Finally we consider the application of the BEM framework to a different kind of FSI problem, represented by the Stokes flow of a liquid medium surrounding swimming
micro-organisms. We maintain the parallel structure derived for the Laplace equation
even in the Stokes setting. Our implementation is able to simulate both prokaryotic and
eukaryotic organisms, matching literature and experimental benchmarks. We finally
present a deep analysis of the importance of hydrodynamic interactions between the
different parts of micro-swimmers in the prediction of optimal swimming conditions,
focusing our attention on the study of flagellated “robotic” composite swimmers
Real-time deformation and fracture in a game environment
This paper describes a simulation system that has been developed to model the deformation and fracture of solid objects in a real-time gaming context. Based around a corotational tetrahedral finite element method, this system has been constructed from components published in the graphics and computational physics literature. The goal of this paper is to describe how these components can be combined to produce an engine that is robust to unpredictable user interactions, fast enough to model reasonable scenarios at real-time speeds, suitable for use in the design of a game level, and equipped with appropriate controls allowing content creators to match artistic direction. Details concerning parallel implementation, solver design, rendering method, and other aspects of the simulation are elucidated with the intent of providing a guide to others wishing to implement similar systems. Examples from in-game scenes captured on the Xbox 360, PS3, and PC platforms are included. © 2009 ACM