86 research outputs found

    Factorized solution of generalized stable Sylvester equations using many-core GPU accelerators

    We investigate the factorized solution of generalized stable Sylvester equations such as those arising in model reduction, image restoration, and observer design. Our algorithms, based on the matrix sign function, take advantage of the current trend to integrate high-performance graphics accelerators (also known as GPUs) in computer systems. As a result, our realisations provide a valuable tool to solve large-scale problems on a variety of platforms.
    We acknowledge support of the ANII - MPG Independent Research Group: "Efficient Heterogeneous Computing" at UdelaR, a partner group of the Max Planck Institute in Magdeburg.
    Benner, P.; Dufrechou, E.; Ezzatti, P.; Gallardo, R.; Quintana-Ortí, ES. (2021). Factorized solution of generalized stable Sylvester equations using many-core GPU accelerators. The Journal of Supercomputing (Online). 77(9):10152-10164. https://doi.org/10.1007/s11227-021-03658-y
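To make the sign-function idea concrete, here is a minimal NumPy sketch for the standard (not generalized, not factorized) Sylvester equation A X + X B + W = 0 with stable A and B. It is an illustration under those assumptions, not the authors' GPU implementation; the function name `sylvester_sign` and the small test matrices are hypothetical. The sketch applies the Newton iteration Z <- (Z + Z^{-1})/2 to the block matrix [[A, W], [0, -B]], written out blockwise, and reads the solution from the off-diagonal block, which converges to 2X.

```python
import numpy as np

def sylvester_sign(A, B, W, tol=1e-12, max_iter=100):
    """Solve A X + X B + W = 0 for stable A and B with the Newton sign iteration."""
    A = np.array(A, dtype=float)
    B = np.array(B, dtype=float)
    W = np.array(W, dtype=float)
    n, m = A.shape[0], B.shape[0]
    for _ in range(max_iter):
        A_inv = np.linalg.inv(A)
        B_inv = np.linalg.inv(B)
        # Blockwise form of Z <- (Z + Z^{-1}) / 2 for Z = [[A, W], [0, -B]].
        W = 0.5 * (W + A_inv @ W @ B_inv)
        A = 0.5 * (A + A_inv)
        B = 0.5 * (B + B_inv)
        # For stable inputs, A and B both converge to -I.
        if np.linalg.norm(A + np.eye(n)) + np.linalg.norm(B + np.eye(m)) < tol:
            break
    return 0.5 * W  # the off-diagonal block converges to 2 X

# Hypothetical small example with stable (negative-eigenvalue) coefficient matrices.
A = np.array([[-2.0, 1.0], [0.0, -3.0]])
B = np.array([[-1.0, 0.0], [0.5, -4.0]])
W = np.array([[1.0, 2.0], [3.0, 4.0]])
X = sylvester_sign(A, B, W)
residual = np.linalg.norm(A @ X + X @ B + W)
```

Each step is dominated by matrix inversions and products, which is the kind of dense linear algebra that maps well to GPUs.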

    Massively parallel split-step Fourier techniques for simulating quantum systems on graphics processing units

    The split-step Fourier method is a powerful technique for solving partial differential equations and simulating ultracold atomic systems of various forms. In this body of work, we focus on several variations of this method to allow for simulations of one-, two-, and three-dimensional quantum systems, along with several notable methods for controlling these systems. In particular, we use quantum optimal control and shortcuts to adiabaticity to study the non-adiabatic generation of superposition states in strongly correlated one-dimensional systems, analyze chaotic vortex trajectories in two dimensions by using rotation and phase imprinting methods, and create stable, three-dimensional vortex structures in Bose–Einstein condensates through artificial magnetic fields generated by the evanescent field of an optical nanofiber. We also discuss algorithmic optimizations for implementing the split-step Fourier method on graphics processing units. All computational methods presented in this work are demonstrated on physical systems and have been incorporated into a state-of-the-art, open-source software suite known as GPUE, which is currently the fastest quantum simulator of its kind.
    Okinawa Institute of Science and Technology Graduate University
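As a rough illustration of the split-step Fourier method itself, the following NumPy sketch evolves a one-dimensional wavefunction under the linear Schrödinger equation using Strang splitting (half potential step, full kinetic step in Fourier space, half potential step). It is a CPU sketch under stated assumptions, not GPUE code: the function name `split_step_evolve`, the harmonic trap, and the grid parameters are all hypothetical choices made for the example.

```python
import numpy as np

def split_step_evolve(psi, x, potential, dt, steps, hbar=1.0, mass=1.0):
    """Evolve psi under i*hbar*dpsi/dt = [-(hbar^2/2m) d^2/dx^2 + V(x)] psi."""
    dx = x[1] - x[0]
    k = 2.0 * np.pi * np.fft.fftfreq(x.size, d=dx)  # angular wavenumbers of the grid
    half_potential = np.exp(-0.5j * potential * dt / hbar)
    kinetic = np.exp(-0.5j * hbar * k**2 * dt / mass)
    psi = psi.astype(complex)
    for _ in range(steps):
        psi = half_potential * psi                    # half step in position space
        psi = np.fft.ifft(kinetic * np.fft.fft(psi))  # full kinetic step in momentum space
        psi = half_potential * psi                    # second half step in position space
    return psi

# Hypothetical test case: harmonic-oscillator ground state, which should stay stationary.
x = np.linspace(-10.0, 10.0, 256, endpoint=False)
dx = x[1] - x[0]
psi0 = np.pi ** -0.25 * np.exp(-0.5 * x**2)
psi = split_step_evolve(psi0, x, 0.5 * x**2, dt=0.01, steps=200)
norm = np.sum(np.abs(psi) ** 2) * dx
```

Every step is a pointwise multiplication or an FFT, which is why the method parallelizes well on graphics processing units.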

    An ALE Approach for the Numerical Simulation of Insect Flight

    Thesis (PhD) -- İstanbul Technical University, Institute of Science and Technology, 2014
    In this work, an arbitrary Lagrangian-Eulerian (ALE) method based on a side-centered unstructured finite volume method is first developed for solving large-scale moving-boundary problems in fully coupled form. In this numerical method, the velocity vector components are defined at the mid-point of each element face, while the pressure values are defined at the center of each element. This arrangement of pressure and velocity leads to a stable numerical scheme, so no ad hoc modification is needed to achieve pressure coupling. The continuity equation is satisfied exactly within each element, and the sum of these discrete continuity equations yields the global continuity equation defined on the boundaries of the computational domain. Special care is taken to satisfy the geometric conservation law (GCL) at the discrete level. Mesh deformation is obtained at each time step by solving an indirect radial basis function interpolation; because this requires no remeshing, it improves the performance of the numerical method. For time-dependent flows with small time steps, the resulting algebraic equations are factorized into three matrices, as in the projection method, and the inverses of these matrices are used as a preconditioner. Two cycles of the HYPRE BoomerAMG preconditioner are used in place of the inverse of the resulting scaled discrete Laplacian. The PETSc and HYPRE libraries are used to improve the efficiency of the parallel preconditioned iterative methods. The following tests were performed on moving meshes: the decaying Taylor-Green vortex flow, the flow around an oscillating cylinder in a channel, and the flow around a sphere oscillating parallel to the ground inside a cube.
    An arbitrary Lagrangian-Eulerian (ALE) approach has been developed in order to investigate the near-wake structure of Drosophila flight. The numerical algorithm is based on a side-centered finite volume method in which the velocity vector components are defined at the mid-point of each cell face while the pressure is defined at the element centroid. This arrangement of the primitive variables leads to a stable numerical scheme and does not require any ad hoc modifications to enhance pressure coupling. Special attention is also given to satisfying the discrete global conservation law. An efficient and robust mesh-deformation algorithm based on the indirect radial basis function method is applied at each time level in order to enhance numerical robustness. For the algebraic solution of the resulting large-scale equations, a matrix factorization similar to that of the projection method is introduced for the whole coupled system, and two cycles of the BoomerAMG solver are used for the scaled discrete Laplacian; BoomerAMG is provided by the HYPRE library, which we access through the PETSc library. The numerical algorithm is initially validated for the decaying Taylor-Green vortex flow, the flow past an oscillating circular cylinder in a channel, and the flow induced by an oscillating sphere in a cubic cavity. The method is then applied to the numerical simulation of the flow field around a pair of flapping Drosophila wings in hover flight. Finally, numerical calculations with different wing kinematics are carried out to simulate the flow field around a pair of flapping Drosophila wings in hover.
    PhD

    Resilience for Asynchronous Iterative Methods for Sparse Linear Systems

    Large-scale simulations are used in a variety of application areas in science and engineering to help drive innovation. Many spend the vast majority of their computational time solving large systems of linear equations, which typically arise from discretizations of the partial differential equations used to model various phenomena. The algorithms used to solve these problems are typically iterative in nature, and making efficient use of computational time on High Performance Computing (HPC) clusters involves constantly improving these iterative algorithms. Future HPC platforms are expected to encounter three main problem areas: scalability of code, reliability of hardware, and energy efficiency of the platform. The HPC resources that are expected to run the large programs are planned to consist of billions of processing units that come from more traditional multicore processors as well as a variety of different hardware accelerators. This growth in parallelism leads to the presence of all three problems. Previously, work on algorithm development has focused primarily on creating fault tolerance mechanisms for traditional iterative solvers. Recent work has begun to revisit asynchronous methods for solving large-scale applications, and this dissertation presents research into fault tolerance for fine-grained methods that are asynchronous in nature. Classical convergence results for asynchronous methods are revisited and modified to account for the possible occurrence of a fault, and a variety of techniques for recovery from the effects of a fault are proposed. Examples of how these techniques can be used are shown for various algorithms, including an analysis of a fine-grained algorithm for computing incomplete factorizations. Lastly, numerous modeling and simulation tools for the further construction of iterative algorithms for HPC applications are developed, including numerical models for simulating faults and a simulation framework that can be used to extrapolate the performance of algorithms towards future HPC systems.
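The interplay between asynchronous relaxation and faults can be illustrated with a small, serial simulation. The sketch below is an assumption-laden toy, not the dissertation's framework: `async_jacobi` relaxes only a random subset of rows in each sweep (mimicking workers that finish at different times, though without modeling stale reads), and the optional `fault_at` argument overwrites one entry of the iterate with a large value to imitate a transient fault. For a strictly diagonally dominant matrix the relaxation is a contraction in the maximum norm, so the iteration recovers from such a corruption by simply continuing.

```python
import numpy as np

def async_jacobi(A, b, sweeps=500, update_fraction=0.5, fault_at=None, seed=0):
    """Jacobi-type relaxation where each sweep updates only a random subset of rows."""
    rng = np.random.default_rng(seed)
    n = b.size
    x = np.zeros(n)
    diag = np.diag(A)
    for sweep in range(sweeps):
        active = rng.random(n) < update_fraction   # rows that "finish" in this sweep
        residual = b - A @ x
        x[active] += residual[active] / diag[active]
        if fault_at is not None and sweep == fault_at:
            x[rng.integers(n)] = 1e6               # hypothetical transient fault in one entry
    return x

# Hypothetical strictly diagonally dominant tridiagonal test system.
n = 10
A = 10.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x_exact = np.linalg.solve(A, b)
x_clean = async_jacobi(A, b)
x_faulty = async_jacobi(A, b, fault_at=50)
```

Both runs end at the same solution; the faulty run only needs additional sweeps to damp the injected error.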

    Proceedings of the ECCOMAS Thematic Conference on Multibody Dynamics 2015

    This volume contains the full papers accepted for presentation at the ECCOMAS Thematic Conference on Multibody Dynamics 2015, held in the Barcelona School of Industrial Engineering, Universitat Politècnica de Catalunya, on June 29 - July 2, 2015. The ECCOMAS Thematic Conference on Multibody Dynamics is an international meeting held once every two years in a European country. Continuing the very successful series of past conferences organized in Lisbon (2003), Madrid (2005), Milan (2007), Warsaw (2009), Brussels (2011) and Zagreb (2013), this edition will once again serve as a meeting point for international researchers, scientists and experts from academia, research laboratories and industry working in the area of multibody dynamics. Applications are related to many fields of contemporary engineering, such as vehicle and railway systems, aeronautical and space vehicles, robotic manipulators, mechatronic and autonomous systems, smart structures, biomechanical systems and nanotechnologies. The topics of the conference include, but are not restricted to:
    ● Formulations and Numerical Methods
    ● Efficient Methods and Real-Time Applications
    ● Flexible Multibody Dynamics
    ● Contact Dynamics and Constraints
    ● Multiphysics and Coupled Problems
    ● Control and Optimization
    ● Software Development and Computer Technology
    ● Aerospace and Maritime Applications
    ● Biomechanics
    ● Railroad Vehicle Dynamics
    ● Road Vehicle Dynamics
    ● Robotics
    ● Benchmark Problems
    Postprint (published version)

    High performance implementation of MPC schemes for fast systems

    In recent years, the number of applications of model predictive control (MPC) has been increasing rapidly due to the better control performance that it provides in comparison to traditional control methods. However, the main limitation of MPC is the computational effort required for the online solution of an optimization problem. This shortcoming restricts the use of MPC for real-time control of dynamic systems with high sampling rates. This thesis aims to overcome this limitation by implementing high-performance MPC solvers for real-time control of fast systems. Hence, one of the objectives of this work is to take advantage of the particular mathematical structures that MPC schemes exhibit and to use parallel computing to improve computational efficiency. Firstly, this thesis focuses on implementing efficient parallel solvers for linear MPC (LMPC) problems, which are described by block-structured quadratic programming (QP) problems. Specifically, three parallel solvers are implemented: a primal-dual interior-point method with Schur-complement decomposition, a quasi-Newton method for solving the dual problem, and an operator splitting method based on the alternating direction method of multipliers (ADMM). All of these solvers are implemented in C++. The software package Eigen is used to implement the linear algebra operations. The Open Message Passing Interface (Open MPI) library is used for the communication between processors. Four case studies are presented to demonstrate the potential of the implementation. The implemented solvers show high performance on large-scale LMPC problems, providing solutions in computation times below milliseconds. Secondly, the thesis addresses the solution of nonlinear MPC (NMPC) problems, which are described by general optimal control problems (OCPs). More precisely, the combined multiple-shooting and collocation (CMSC) method is implemented using a parallelization scheme. The CMSC method transforms the OCP into a nonlinear optimization problem (NLP) and defines a set of underlying sub-problems for computing the sensitivities and discretized state values within the NLP solver. These underlying sub-problems are decoupled in the variables and can thus be solved in parallel. For the implementation, the software package IPOPT is used to solve the resulting NLP problems. The parallel solution of the sub-problems is based on MPI and Eigen. The computational performance of the parallel CMSC solver is tested using case studies for both OCPs and NMPC, showing very promising results. Finally, applications to autonomous navigation for the SUMMIT robot are presented. Specifically, reference tracking and obstacle avoidance problems are addressed using an NMPC approach. Both simulation and experimental results are presented and compared to previous work on the SUMMIT, showing much better computational efficiency and control performance.
    Thesis
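Of the three LMPC solvers named above, the ADMM-based operator splitting is the simplest to sketch. The following NumPy code is a serial illustration, assuming a generic box-constrained QP (minimize 0.5 x'Px + q'x subject to lower <= x <= upper) rather than the block-structured MPC formulation of the thesis; it does not reproduce the thesis's C++/Eigen/MPI implementation, and the function name `admm_box_qp` and the example data are hypothetical.

```python
import numpy as np

def admm_box_qp(P, q, lower, upper, rho=1.0, iterations=500):
    """Scaled-form ADMM for: minimize 0.5 x'Px + q'x subject to lower <= x <= upper."""
    n = q.size
    # Small dense example: invert once. A cached factorization would be used at scale.
    K = np.linalg.inv(P + rho * np.eye(n))
    z = np.zeros(n)
    u = np.zeros(n)
    for _ in range(iterations):
        x = K @ (rho * (z - u) - q)        # unconstrained quadratic subproblem
        z = np.clip(x + u, lower, upper)   # projection onto the box constraints
        u = u + x - z                      # scaled dual update
    return z

# Hypothetical 2-variable example whose second variable hits its upper bound.
P = np.array([[4.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, -1.0])
x_opt = admm_box_qp(P, q, lower=0.0, upper=0.3)
```

The quadratic subproblem reuses the same matrix in every iteration and the projection is elementwise, which is what makes this family of methods attractive for fast, parallel MPC solvers.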

    MATLAB

    This book is the final part of a three-volume set on MATLAB-based applications in almost every branch of science. It consists of 19 insightful articles whose results readers will find useful in their work. The book has three parts: the first is devoted to mathematical methods in the applied sciences using MATLAB, the second to MATLAB applications of general interest, and the third to MATLAB for educational purposes. This collection of high-quality articles covers a large range of professional fields and can be used for science as well as for various educational purposes.

    Generalized averaged Gaussian quadrature and applications

    A simple numerical method for constructing the optimal generalized averaged Gaussian quadrature formulas will be presented. These formulas exist in many cases in which real positive Gauss-Kronrod formulas do not exist, and can be used as an adequate alternative in order to estimate the error of a Gaussian rule. We also investigate the conditions under which the optimal averaged Gaussian quadrature formulas and their truncated variants are internal.
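The idea of estimating the error of a Gaussian rule with an averaged rule can be sketched with Laurie's classical averaged formula, which is a simpler relative of the optimal generalized averaged formulas discussed in this abstract and is swapped in here only for illustration. Assuming the Legendre weight on [-1, 1], the (n+1)-point anti-Gaussian rule is obtained from the Jacobi matrix of the Gauss rule by doubling the last recurrence coefficient, and the averaged rule is the mean of the two. All function names below are hypothetical.

```python
import numpy as np

def rule_from_jacobi(offdiag_sq, mu0=2.0):
    """Golub-Welsch: nodes and weights from a symmetric Jacobi matrix with zero diagonal."""
    off = np.sqrt(offdiag_sq)
    J = np.diag(off, 1) + np.diag(off, -1)
    nodes, vecs = np.linalg.eigh(J)
    return nodes, mu0 * vecs[0, :] ** 2

def gauss_and_averaged(f, n):
    """Return the n-point Gauss-Legendre value and Laurie's averaged value for f on [-1, 1]."""
    k = np.arange(1, n + 1)
    beta = k**2 / (4.0 * k**2 - 1.0)        # monic Legendre coefficients beta_1..beta_n
    xg, wg = rule_from_jacobi(beta[:-1])    # n-point Gauss rule
    anti = beta.copy()
    anti[-1] *= 2.0                         # anti-Gaussian rule: double beta_n
    xa, wa = rule_from_jacobi(anti)         # (n+1)-point anti-Gaussian rule
    gauss = wg @ f(xg)
    averaged = 0.5 * (gauss + wa @ f(xa))
    return gauss, averaged

# Hypothetical check: x^6 has degree 6, beyond the exactness of the 3-point Gauss rule.
gauss, averaged = gauss_and_averaged(lambda x: x**6, n=3)
error_estimate = abs(averaged - gauss)
```

The difference between the averaged value and the Gauss value serves as the error estimate for the Gauss rule, at the cost of n+1 extra function evaluations.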