184 research outputs found

    Computer Architectures to Close the Loop in Real-time Optimization

    Get PDF
    © 2015 IEEE.Many modern control, automation, signal processing and machine learning applications rely on solving a sequence of optimization problems, which are updated with measurements of a real system that evolves in time. The solutions of each of these optimization problems are then used to make decisions, which may be followed by changing some parameters of the physical system, thereby resulting in a feedback loop between the computing and the physical system. Real-time optimization is not the same as fast optimization, due to the fact that the computation is affected by an uncertain system that evolves in time. The suitability of a design should therefore not be judged from the optimality of a single optimization problem, but based on the evolution of the entire cyber-physical system. The algorithms and hardware used for solving a single optimization problem in the office might therefore be far from ideal when solving a sequence of real-time optimization problems. Instead of there being a single, optimal design, one has to trade-off a number of objectives, including performance, robustness, energy usage, size and cost. We therefore provide here a tutorial introduction to some of the questions and implementation issues that arise in real-time optimization applications. We will concentrate on some of the decisions that have to be made when designing the computing architecture and algorithm and argue that the choice of one informs the other

    Research in Applied Mathematics, Fluid Mechanics and Computer Science

    Get PDF
    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science during the period October 1, 1998 through March 31, 1999

    Efficient parallelization strategy for real-time FE simulations

    Full text link
    This paper introduces an efficient and generic framework for finite-element simulations under an implicit time integration scheme. Being compatible with generic constitutive models, a fast matrix assembly method exploits the fact that system matrices are created in a deterministic way as long as the mesh topology remains constant. Using the sparsity pattern of the assembled system brings about significant optimizations on the assembly stage. As a result, developed techniques of GPU-based parallelization can be directly applied with the assembled system. Moreover, an asynchronous Cholesky precondition scheme is used to improve the convergence of the system solver. On this basis, a GPU-based Cholesky preconditioner is developed, significantly reducing the data transfer between the CPU/GPU during the solving stage. We evaluate the performance of our method with different mesh elements and hyperelastic models and compare it with typical approaches on the CPU and the GPU

    PERFORMANCE EVALUATION OF DIFFERENT LINEAR EQUATION SOLVERS FOR SOLVING NONLINEAR FE PROBLEMS ON MULTICORE ARCHITECTURES

    Get PDF
    The aim of this paper is to evaluate the performance of existing parallel linear equation solvers to solving large-scale, nonlinear finite element analysis problems on systems with distributed memory. The parallel approach allows us to take an advantage of the distributed memory enabling forming large system matrices and of multiple processing units to achieve significant speedups. Our study is based on comparison of parallel direct solver and parallel iterative solver implemented in SuperLU DIST library from Portable, Extensible Toolkit for Scientific Computation (PETSc). Both considered solvers are designed for distributed system memory model and are based on a Massage Passing Interface (MPI).The efficiency of individual solvers is evaluated on a selected benchmark problems, with different solution strategies by comparing computation times and obtained speedups.</p

    Application of HPC in eddy current electromagnetic problem solution

    Get PDF
    As engineering problems are becoming more and more advanced, the size of an average model solved by partial differential equations is rapidly growing and, in order to keep simulation times within reasonable bounds, both faster computers and more efficient software implementations are needed. In the first part of this thesis, the full potential of simulation software has been exploited through high performance parallel computing techniques. In particular, the simulation of induction heating processes is accomplished within reasonable solution times, by implementing different parallel direct solvers for large sparse linear system, in the solution process of a commercial software. The performance of such library on shared memory systems has been remarkably improved by implementing a multithreaded version of MUMPS (MUltifrontal Massively Parallel Solver) library, which have been tested on benchmark matrices arising from typical induction heating process simulations. A new multithreading approach and a low rank approximation technique have been implemented and developed by MUMPS team in Lyon and Toulouse. In the context of a collaboration between MUMPS team and DII-University of Padova, a preliminary version of such functionalities could be tested on induction heating benchmark problems, and a substantial reduction of the computational cost and memory requirements could be achieved. In the second part of this thesis, some examples of design methodology by virtual prototyping have been described. Complex multiphysics simulations involving electromagnetic, circuital, thermal and mechanical problems have been performed by exploiting parallel solvers, as developed in the first part of this thesis. Finally, multiobjective stochastic optimization algorithms have been applied to multiphysics 3D model simulations in search of a set of improved induction heating device configurations

    Parallel Algorithms for Time and Frequency Domain Circuit Simulation

    Get PDF
    As a most critical form of pre-silicon verification, transistor-level circuit simulation is an indispensable step before committing to an expensive manufacturing process. However, considering the nature of circuit simulation, it can be computationally expensive, especially for ever-larger transistor circuits with more complex device models. Therefore, it is becoming increasingly desirable to accelerate circuit simulation. On the other hand, the emergence of multi-core machines offers a promising solution to circuit simulation besides the known application of distributed-memory clustered computing platforms, which provides abundant hardware computing resources. This research addresses the limitations of traditional serial circuit simulations and proposes new techniques for both time-domain and frequency-domain parallel circuit simulations. For time-domain simulation, this dissertation presents a parallel transient simulation methodology. This new approach, called WavePipe, exploits coarse-grained application-level parallelism by simultaneously computing circuit solutions at multiple adjacent time points in a way resembling hardware pipelining. There are two embodiments in WavePipe: backward and forward pipelining schemes. While the former creates independent computing tasks that contribute to a larger future time step, the latter performs predictive computing along the forward direction. Unlike existing relaxation methods, WavePipe facilitates parallel circuit simulation without jeopardizing convergence and accuracy. As a coarse-grained parallel approach, it requires low parallel programming effort, furthermore it creates new avenues to have a full utilization of increasingly parallel hardware by going beyond conventional finer grained parallel device model evaluation and matrix solutions. This dissertation also exploits the recently developed explicit telescopic projective integration method for efficient parallel transient circuit simulation by addressing the stability limitation of explicit numerical integration. The new method allows the effective time step controlled by accuracy requirement instead of stability limitation. Therefore, it not only leads to noticeable efficiency improvement, but also lends itself to straightforward parallelization due to its explicit nature. For frequency-domain simulation, this dissertation presents a parallel harmonic balance approach, applicable to the steady-state and envelope-following analyses of both driven and autonomous circuits. The new approach is centered on a naturally-parallelizable preconditioning technique that speeds up the core computation in harmonic balance based analysis. The proposed method facilitates parallel computing via the use of domain knowledge and simplifies parallel programming compared with fine-grained strategies. As a result, favorable runtime speedups are achieved

    Modelling Fluid Structure Interaction problems using Boundary Element Method

    Get PDF
    This dissertation investigates the application of Boundary Element Methods (BEM) to Fluid Structure Interaction (FSI) problems under three main different perspectives. This work is divided in three main parts: i) the derivation of BEM for the Laplace equation and its application to analyze ship-wave interaction problems, ii) the imple- mentation of efficient and parallel BEM solvers addressing the newest challenges of High Performance Computing, iii) the developing of a BEM for the Stokes system and its application to study micro-swimmers.First we develop a BEM for the Laplace equation and we apply it to predict ship-wave interactions making use of an innovative coupling with Finite Element Method stabilization techniques. As well known, the wave pattern around a body depends on the Froude number associated to the flow. Thus, we throughly investigate the robustness and accuracy of the developed methodology assessing the solution dependence on such parameter. To improve the performance and tackle problems with higher number of unknowns, the BEM developed for the Laplace equation is parallelized using OpenSOURCE tech- nique in a hybrid distributed-shared memory environment. We perform several tests to demonstrate both the accuracy and the performance of the parallel BEM developed. In addition, we explore two different possibilities to reduce the overall computational cost from O(N2) to O(N). Firstly we couple the library with a Fast Multiple Method that allows us to reach for higher order of complexity and efficiency. Then we perform a preliminary study on the implementation of a parallel Non Uniform Fast Fourier Transform to be coupled with the newly developed algorithm Sparse Cardinal Sine De- composition (SCSD).Finally we consider the application of the BEM framework to a different kind of FSI problem represented by the Stokes flow of a liquid medium surrounding swimming micro-organisms. We maintain the parallel structure derived for the Laplace equation even in the Stokes setting. Our implementation is able to simulate both prokaryotic and eukaryotic organisms, matching literature and experimental benchmarks. We finally present a deep analysis of the importance of hydrodynamic interactions between the different parts of micro-swimmers in the prevision of optimal swimming conditions, focusing our attention on the study of flagellated \u201crobotic\u201d composite swimmers

    Real-time deformation and fracture in a game environment

    Full text link
    This paper describes a simulation system that has been developed to model the deformation and fracture of solid objects in a real-time gaming context. Based around a corotational tetrahedral finite element method, this system has been constructed from components published in the graphics and computational physics literatures. The goal of this paper is to describe how these components can be combined to produce an engine that is robust to unpredictable user interactions, fast enough to model reasonable scenarios at real-time speeds, suitable for use in the design of a game level, and with appropriate controls allowing content creators to match artistic direction. Details concerning parallel implementation, solver design, rendering method, and other aspects of the simulation are elucidated with the intent of providing a guide to others wishing to implement similar systems. Examples from in-game scenes captured on the Xbox 360, PS3, and PC platforms are included. © 2009 ACM
    corecore